Skip to content

fix: wait for pending requests before aborting transports on close#1692

Closed
travisbreaks wants to merge 2 commits intomodelcontextprotocol:mainfrom
travisbreaks:fix/abort-controller-successful-requests
Closed

fix: wait for pending requests before aborting transports on close#1692
travisbreaks wants to merge 2 commits intomodelcontextprotocol:mainfrom
travisbreaks:fix/abort-controller-successful-requests

Conversation

@travisbreaks
Copy link
Copy Markdown

Summary

Fixes #1231.

When close() is called on SSEClientTransport or StreamableHTTPClientTransport, it immediately calls AbortController.abort(). If any send() POST requests are still in-flight (even if the server already returned 200 OK), Undici receives the abort signal and marks them as UND_ERR_ABORTED, corrupting OpenTelemetry traces and making successful requests appear failed.

Changes:

  • Added a _pendingRequests Set to track in-flight send operations in both transport classes
  • Modified close() to wait for pending requests to settle (Promise.allSettled) with a 2-second timeout before calling abort()
  • Split send() into a public method (tracks the promise) and _doSend() (does the actual work)
  • Tracked promises use .catch(() => {}) to prevent unhandled rejections while preserving the original rejection for callers

All 386 existing tests pass.

When close() is called on SSEClientTransport or
StreamableHTTPClientTransport, the immediate AbortController.abort()
causes Undici to mark in-flight requests as failed (UND_ERR_ABORTED),
even when the server has already returned 200 OK. This breaks
OpenTelemetry traces by misclassifying successful requests as errors.

Track pending send() requests and wait for them to settle (with a 2s
timeout to prevent hanging) before calling abort(). This allows
successful responses to complete processing, so Undici/OpenTelemetry
sees the actual 200 OK instead of an abort error. Stuck requests are
still cleaned up after the timeout.

Fixes modelcontextprotocol#1231
@travisbreaks travisbreaks requested a review from a team as a code owner March 17, 2026 04:15
@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Mar 17, 2026

⚠️ No Changeset found

Latest commit: ce43e03

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new bot commented Mar 17, 2026

Open in StackBlitz

@modelcontextprotocol/client

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/client@1692

@modelcontextprotocol/server

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/server@1692

@modelcontextprotocol/express

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/express@1692

@modelcontextprotocol/hono

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/hono@1692

@modelcontextprotocol/node

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/node@1692

commit: ce43e03

@KKonstantinov
Copy link
Copy Markdown
Contributor

@claude review

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks sound, but this modifies core transport close behavior used by all clients and has no new tests covering the graceful-close path. Worth a human look to confirm the 2-second timeout and tracking logic are appropriate for all use cases.

Extended reasoning...

Overview

This PR modifies SSEClientTransport and StreamableHTTPClientTransport in packages/client/src/client/ to add graceful close semantics. Instead of immediately calling abort(), close() now waits up to 2 seconds for in-flight requests to settle via Promise.allSettled before aborting. The send() method is split into a public tracking wrapper and a private _doSend() implementation. Tracked promises use .catch(() => {}) to prevent unhandled rejections.

Security risks

No security concerns. The changes are purely about transport lifecycle management — no auth, crypto, or permission logic is touched. The abort signal is still called after the grace period, so there is no risk of leaving connections open indefinitely.

Level of scrutiny

High. These are core transport classes that every MCP client depends on. A 2-second close delay could affect real-world usage patterns (e.g., CLI tools, test suites, serverless functions). The PR also has no new tests covering the graceful close path, and the changeset bot flagged a missing changeset. The timer leak nit (posted as inline comment) could cause test runners to report open handles.

Other factors

  • The _endpoint\! non-null assertion in sse.ts _doSend() is safe because send() guards with if (\!this._endpoint) before calling _doSend().
  • The tracked-promise pattern (.catch(() => {}) + .finally() cleanup) correctly avoids unhandled rejections while preserving the original rejection for callers.
  • No existing reviewer comments to address — this is the first substantive review.
  • CODEOWNERS assigns all files to @modelcontextprotocol/typescript-sdk by default.

Comment on lines +243 to +249
// successful responses are not marked as aborted by Undici/OpenTelemetry.
if (this._pendingRequests.size > 0) {
const GRACEFUL_CLOSE_TIMEOUT = 2000;
await Promise.race([
Promise.allSettled(this._pendingRequests),
new Promise<void>(resolve => setTimeout(resolve, GRACEFUL_CLOSE_TIMEOUT))
]);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Nit: The setTimeout fallback in Promise.race within close() is never cleared when Promise.allSettled resolves first. In Node.js, an active timer keeps the event loop alive, so the process cannot exit for up to 2 seconds after close() completes — this will cause test runners (Jest/Vitest) to report open handles. Fix by storing the timer ID and calling clearTimeout after the race, or by using setTimeout(...).unref(). Same issue exists in streamableHttp.ts line 462.

Extended reasoning...

What the bug is

In both sse.ts (line 246) and streamableHttp.ts (line 462), the close() method uses Promise.race between Promise.allSettled(this._pendingRequests) and a setTimeout-based promise as a 2-second fallback. However, the timer ID from setTimeout is never stored, so when Promise.allSettled wins the race (the common case — pending requests complete quickly), the timer is never cleared.

Why this matters

In Node.js, an active timer handle keeps the event loop alive. This means that even after close() has resolved and all cleanup is done, the process cannot exit for up to 2 seconds while waiting for the orphaned timer to fire. This is particularly problematic for:

  • Test suites: Jest and Vitest will report "open handles" warnings and may hang for 2 seconds after each test that calls close() with pending requests.
  • CLI tools: Short-lived processes will appear to hang briefly before exiting.
  • Any scenario requiring prompt process exit after closing the transport.

Step-by-step proof

  1. A client sends a request via send(), which adds a tracked promise to _pendingRequests.
  2. The server responds quickly (say, within 50ms).
  3. The caller invokes close(). Since _pendingRequests.size > 0, the code enters the Promise.race block.
  4. Promise.allSettled resolves almost immediately (the request already completed).
  5. Promise.race resolves, and close() proceeds to abort and clean up.
  6. However, the setTimeout(resolve, 2000) timer is still active in the event loop — nobody stored its ID or called clearTimeout.
  7. The process cannot exit for ~2 seconds until the timer fires.

How to fix

The simplest fix is to use .unref() on the timer, which tells Node.js not to keep the event loop alive for it:

new Promise<void>(resolve => setTimeout(resolve, GRACEFUL_CLOSE_TIMEOUT).unref())

Alternatively, store the timer ID and clear it after the race:

let timerId: ReturnType<typeof setTimeout>;
const timeout = new Promise<void>(resolve => { timerId = setTimeout(resolve, GRACEFUL_CLOSE_TIMEOUT); });
await Promise.race([Promise.allSettled(this._pendingRequests), timeout]);
clearTimeout(timerId\!);

Both approaches are straightforward and prevent the delayed-exit behavior.

@felixweinberger
Copy link
Copy Markdown
Contributor

Thanks for this. I think the pattern here is to await requests before closing:

const result = await client.callTool(...);
await client.close();

If close() is called while requests are in flight, the abort is what's being asked for, and the OTel span reflecting that seems accurate.

For the SIGINT / graceful-shutdown case where you don't know what's in flight, I'd rather see that as an opt-in on Client.close() (which already has the pending-request registry in _responseHandlers) than a mandatory transport-level drain that adds up to 2s to every close. Happy to look at a PR in that direction if there's appetite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Undici instrumentation marks successful MCP requests as aborted due to premature AbortController.abort() in SSE / StreamableHTTP transports

3 participants