fix(libp2p): release probe stream slot on ConnectionMonitor ping error by marshalleq · Pull Request #3464 · libp2p/js-libp2p

marshalleq · 2026-04-18T01:03:08Z

Motivation

The ConnectionMonitor heartbeat opens an outbound /ipfs/ping/1.0.0 stream, runs Promise.all([bs.write, bs.read]), then awaits stream.close({ signal }). If the signal aborts during write or read — which happens routinely under WAN latency or transient event-loop contention — the await stream.close(...) line is never reached. The outbound stream remains in the muxer's state and continues to count against libp2p's maxOutboundStreams limit for the protocol.

The default maxOutboundStreams for /ipfs/ping/1.0.0 is 1. A single leaked stream therefore permanently starves the monitor on that connection: the next probe 10s later fails with TooManyOutboundProtocolStreamsError, which itself leaks another slot via the same missing cleanup, and every subsequent probe on that connection fails fast with no bytes on the wire.

Evidence

Observed in production across a five-host deployment. With abortConnectionOnPingFailure: false set (see #3463), probe failures are no longer catastrophic, but the stream-slot leak causes the monitor to silently lose its ability to measure RTT on any connection where a probe has failed at least once.

Logs from teardown instrumentation show:

TooManyOutboundProtocolStreamsError: Too many outbound protocol streams for protocol "/ipfs/ping/1.0.0" - 2/1
  at Connection.newStream (libp2p/src/connection.ts:113)

at ~6 events/hour/host. At scale (1M nodes × several connections each) this is noticeable event-loop pressure from work that produces no useful output.

Change

Capture the stream reference at the top of the try block. In the catch, if close() was not reached (status is not already 'closed' or 'aborted'), call stream.abort(err) to release the slot. abort rather than close because at this point we already know the probe failed — we don't want to wait on a graceful close that may itself block on the aborted signal.

Tests

New regression test should abort the probe stream when the ping exchange fails stubs a stream whose send throws (simulating signal-abort during the exchange) and asserts stream.abort is called from the catch block.
All 9 existing connection-monitor tests still pass.

Notes

The status guard avoids double-aborting if close() had completed before a subsequent throw (rare but cheap to check).
Independent of fix(libp2p): default abortConnectionOnPingFailure to false #3463 — either can land first.

The ConnectionMonitor heartbeat opens an outbound /ipfs/ping/1.0.0 stream, runs Promise.all([bs.write, bs.read]), then awaits stream.close(). If the signal aborts during write or read (a routine outcome under WAN latency or transient event-loop contention), the await stream.close() line is never reached. The outbound stream remains in the muxer's state and continues to count against libp2p's maxOutboundStreams limit for the protocol. The default maxOutboundStreams for /ipfs/ping/1.0.0 is 1, so a single leaked stream permanently starves the monitor on that connection: the next probe 10s later fails with TooManyOutboundProtocolStreamsError, which itself leaks another slot via the same missing cleanup, and every subsequent probe on that connection fails fast with no bytes on the wire. This patch captures the stream reference at the top of the try block and, in the catch, aborts it if close() was not reached (status is not already 'closed' or 'aborted'). Releases the slot in exactly the error cases it was being leaked. stream.abort() rather than close() because at this point we already know the probe failed — we don't want to wait on a graceful close that may itself block on the aborted signal. Includes a regression test that stubs a stream whose write throws (simulating signal-abort during the exchange) and asserts that the catch block calls stream.abort(). Signed-off-by: Marshalleq <13308996+marshalleq@users.noreply.github.com>

Removed comments explaining stream handling in case of errors.

dozyio · 2026-04-18T19:24:35Z

Thanks @marshalleq - will get this merged shortly

marshalleq requested a review from a team as a code owner April 18, 2026 01:03

dozyio added 2 commits April 18, 2026 20:04

Merge branch 'main' into fix/connection-monitor-stream-leak

d227369

chore: remove comments on stream error handling

dce1aa7

Removed comments explaining stream handling in case of errors.

dozyio approved these changes Apr 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(libp2p): release probe stream slot on ConnectionMonitor ping error#3464

fix(libp2p): release probe stream slot on ConnectionMonitor ping error#3464
marshalleq wants to merge 3 commits intolibp2p:mainfrom
marshalleq:fix/connection-monitor-stream-leak

marshalleq commented Apr 18, 2026

Uh oh!

dozyio commented Apr 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

marshalleq commented Apr 18, 2026

Motivation

Evidence

Change

Tests

Notes

Uh oh!

dozyio commented Apr 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants