fix: drain SSE stream to EOF to prevent ~260ms latency on keepalive connections#2780

Open
xlyoung wants to merge 1 commit into
modelcontextprotocol:mainfrom
xlyoung:fix/streamable-http-drain-sse-stream

xlyoung commented Jun 4, 2026

Problem

In _handle_sse_response, the client calls await response.aclose() immediately after receiving the first JSON-RPC response event. This early close leaves the underlying HTTP/1.1 keepalive connection in a half-drained state, causing the next request reusing the same connection to block for ~260ms before the server's response status arrives.

Measured impact (from #2707):

Path                                        Avg latency
Raw httpx to EOF                            ~5 ms
ClientSession.call_tool() (current code)    ~265 ms
ClientSession.call_tool() (with fix)        ~7 ms

That is roughly a 37x speedup per sequential call over streamable HTTP (~265 ms down to ~7 ms in the reporter's setup).
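For anyone who wants to reproduce numbers like these, here is a minimal timing sketch. It is not the script used in #2707; `average_latency_ms` and `fake_tool_call` are hypothetical names, and the placeholder coroutine would need to be swapped for a real call such as `session.call_tool(...)` against a running server.

```python
import asyncio
import time
from typing import Awaitable, Callable, List


async def average_latency_ms(
    call: Callable[[], Awaitable[object]], repeats: int = 20
) -> float:
    """Await `call` sequentially `repeats` times and return the mean latency in ms."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        await call()  # sequential on purpose: each call may reuse the previous connection
        samples.append((time.perf_counter() - start) * 1000.0)
    return sum(samples) / len(samples)


async def fake_tool_call() -> None:
    # Placeholder (assumption): stands in for a real request over streamable HTTP.
    await asyncio.sleep(0.005)


if __name__ == "__main__":
    print(f"avg: {asyncio.run(average_latency_ms(fake_tool_call)):.1f} ms")
```

Running the calls sequentially matters here, since the reported delay only shows up when a request reuses the connection left behind by the previous one.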

Root Cause

The early aclose() returns the connection to the pool without draining the SSE stream. The next POST on the same connection blocks waiting for the server-side SSE writer to finish.

Fix

Remove the early aclose() and let the SSE stream drain to EOF naturally:

# Before
if is_complete:
    await response.aclose()
    return

# After
if is_complete:
    return  # Stream drains to EOF naturally

The server closes the SSE stream after sending the response (sse_starlette.EventSourceResponse exits via break on JSONRPCResponse), so the loop exits naturally on EOF.
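To illustrate the difference between closing early and reading to EOF, here is a small self-contained sketch. It does not use httpx or the SDK, and it does not reproduce the latency itself; `fake_sse_stream` and `read_until_response` are hypothetical stand-ins, with an async generator playing the role of the server-side SSE writer.

```python
import asyncio
from typing import AsyncGenerator, List, Optional, Tuple


async def fake_sse_stream(log: List[str]) -> AsyncGenerator[str, None]:
    """Stand-in for the server's SSE writer: events, then a clean finish."""
    yield "notification"
    yield "response"
    # Reached only if the consumer keeps reading past the response event.
    log.append("server finished writing")


async def read_until_response(
    stream: AsyncGenerator[str, None], drain: bool
) -> Optional[str]:
    """Return the response event, either closing early or reading to EOF."""
    result = None
    async for event in stream:
        if event == "response":
            result = event
            if not drain:
                # Analogous to the early aclose(): the writer never finishes.
                await stream.aclose()
                break
            # drain=True: keep iterating until the stream ends on its own.
    return result


async def demo() -> Tuple[List[str], List[str]]:
    early_log: List[str] = []
    drained_log: List[str] = []
    early = await read_until_response(fake_sse_stream(early_log), drain=False)
    drained = await read_until_response(fake_sse_stream(drained_log), drain=True)
    assert early == drained == "response"  # same result either way
    return early_log, drained_log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Both paths return the same event; the only difference is whether the writer side gets to run to completion, which is the property this PR relies on.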

Testing

  • 28 existing streamable_http tests pass
  • Pre-existing test failures (socksio ImportError) are unrelated

Fixes #2707

…onnections

In _handle_sse_response, the client called await response.aclose()
immediately after receiving the first JSON-RPC response event. This
early close left the underlying HTTP/1.1 keepalive connection in a
half-drained state, causing the next request reusing the same connection
to block for ~260ms before the server's response status arrived.

Fix: remove the early aclose() and let the SSE stream drain to EOF
naturally. The server closes the SSE stream after sending the response
(sse_starlette.EventSourceResponse exits via break on JSONRPCResponse),
so the loop exits naturally on EOF.

Performance improvement: 37x speedup (265ms → 7ms per call in the
reporter's setup).

Fixes modelcontextprotocol#2707

Development

Successfully merging this pull request may close these issues.

streamable_http: early response.aclose() poisons keepalive connection, causes ~260ms latency on every subsequent tool call
