Symptom
Under concurrent streaming load, gateway logs show:
```
Error during stream processing: openai [ undefined, 'Response object has been garbage collected' ]
Failed to close the writer: openai TypeError [ERR_INVALID_STATE]: Invalid state: WritableStream is closed
```
Downstream consumers (Node fetch / undici / OpenAI SDK) see:
```
TypeError: terminated
at Fetch.onAborted (node:internal/deps/undici/undici:...)
at TLSSocket.onHttpSocketClose (...)
cause: { name: 'SocketError', message: 'other side closed' }
```
The consumer's finishReason becomes 'other' and the stream is truncated mid-token — often missing trailing tool-call payloads, which can break agent / function-calling flows.
The failure frequency scales with concurrent stream count (allocation churn from many concurrent Responses drives V8 GC).
Root cause (brief)
In src/handlers/streamHandler.ts → handleStreamingMode, the upstream Response is referenced only via response.body.getReader(). The async IIFE that pipes upstream → writer captures reader and writer, but not response itself. After the function returns, the caller only holds the new Response wrapping readable, so the upstream response becomes unreachable. Node's ReadableStreamDefaultReader doesn't keep its parent Response alive — the Response is what owns the network connection — so V8 GC can collect it mid-stream, and the next reader.read() throws.
The unawaited IIFE promise is also unanchored, a secondary GC hazard.
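The shape of the problem can be sketched as follows. This is a simplified, hypothetical reconstruction of the pattern described above, not the actual contents of streamHandler.ts; the function name `pipeUnanchored` and the log message are made up for illustration.

```typescript
// Hypothetical reconstruction of the unanchored pattern (not the gateway's real code).
function pipeUnanchored(upstream: Response): Response {
  if (!upstream.body) throw new Error("upstream response has no body");

  // The only link back to `upstream` from here on is this reader.
  const reader = upstream.body.getReader();
  const { readable, writable } = new TransformStream();
  const writer = writable.getWriter();

  // Fire-and-forget pump: the closure captures `reader` and `writer`,
  // but never mentions `upstream`, and the promise itself is not stored.
  (async () => {
    try {
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        await writer.write(value);
      }
    } catch (err) {
      // This is where a mid-stream read failure would surface.
      console.error("stream processing failed (illustrative log):", err);
    } finally {
      try {
        await writer.close();
      } catch {
        // Writer may already be closed or errored.
      }
    }
  })();

  // After this return, nothing the caller holds refers to `upstream`.
  return new Response(readable);
}
```

On a quiet process this pipes correctly; the point is only that no reference to `upstream` survives the return.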
Related prior work
PR #1306 ("fix: handle stream close failures", merged 2025-09-03) explicitly names this error in its description, but its scope was limited to wrapping the secondary writer.close() failure in try/catch. That stops the unhandled rejection from crashing Node, but it does not prevent the primary failure: GC of the upstream Response. The stream is still lost.
Proposed fix
PR #1658 — anchor the upstream response, reader, writer, and the piping promise on the returned readable so the caller's Response keeps the entire chain alive for the stream's lifetime.
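One way the anchoring could look is sketched below. This is an assumption about the general approach, not a copy of PR #1658: the symbol key `UPSTREAM_ANCHOR` and the function name `pipeAnchored` are hypothetical, and the real change may attach the references differently.

```typescript
// Hypothetical property key used to hang the upstream chain off the returned stream.
const UPSTREAM_ANCHOR = Symbol("upstreamAnchor");

function pipeAnchored(upstream: Response): Response {
  if (!upstream.body) throw new Error("upstream response has no body");

  const reader = upstream.body.getReader();
  const { readable, writable } = new TransformStream();
  const writer = writable.getWriter();

  // Keep the pump promise in a variable so it can be anchored too.
  const pump = (async () => {
    try {
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        await writer.write(value);
      }
      await writer.close();
    } catch (err) {
      // Propagate the failure downstream instead of ending the stream silently.
      await writer.abort(err).catch(() => {});
    }
  })();

  // The caller's Response holds `readable`; `readable` now holds everything else,
  // so the whole chain stays reachable for as long as the stream is in use.
  Object.defineProperty(readable, UPSTREAM_ANCHOR, {
    value: { upstream, reader, writer, pump },
    enumerable: false,
  });

  return new Response(readable);
}
```

A non-enumerable symbol property is used here so that the extra references do not show up when the stream is inspected or logged.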
Reproducing
Open enough concurrent streamed /v1/chat/completions requests against any provider (we observed it most clearly with OpenAI-compatible providers at 30+ concurrent in-flight streams) and watch gateway logs for the GC error. Frequency depends on Node version, stream durations, and how much per-stream allocation churn the response transformer adds.
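A minimal load driver for the steps above might look like this. It is a sketch under assumptions: `drainStream` and `runConcurrentStreams` are hypothetical helper names, and the URL, headers, and request body in the commented usage are placeholders to adapt to your deployment.

```typescript
type StreamOutcome = { ok: boolean; bytes: number; error?: string };

// Read one streamed response to the end and record whether it finished cleanly.
async function drainStream(open: () => Promise<Response>): Promise<StreamOutcome> {
  let bytes = 0;
  try {
    const res = await open();
    if (!res.body) return { ok: false, bytes, error: "no body" };
    const reader = res.body.getReader();
    while (true) {
      const { done, value } = await reader.read();
      if (done) return { ok: true, bytes };
      bytes += value.byteLength;
    }
  } catch (err) {
    // A truncated stream shows up here, e.g. "TypeError: terminated".
    return { ok: false, bytes, error: String(err) };
  }
}

// Open `count` streams at once and count how many were cut short.
async function runConcurrentStreams(open: () => Promise<Response>, count: number) {
  const outcomes = await Promise.all(
    Array.from({ length: count }, () => drainStream(open)),
  );
  const truncated = outcomes.filter((o) => !o.ok).length;
  return { total: count, truncated, outcomes };
}

// Example usage (placeholders, adjust to your gateway and provider):
// const open = () =>
//   fetch("http://localhost:8787/v1/chat/completions", {
//     method: "POST",
//     headers: { "content-type": "application/json" /* plus provider/auth headers */ },
//     body: JSON.stringify({
//       model: "your-model",
//       stream: true,
//       messages: [{ role: "user", content: "Write a long story." }],
//     }),
//   });
// runConcurrentStreams(open, 30).then((r) => console.log(r.truncated, "of", r.total, "truncated"));
```

Taking the request as an `open` callback keeps the driver independent of any particular provider or auth scheme.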
Environment
- Node 20.x (newer Node versions free unreferenced HTTP resources more aggressively, which makes this worse)
- Gateway main branch (also present on 1.15.x and older releases — same code structure)
- Any streaming OpenAI-compatible provider