What versions & operating system are you using?
```
System:
  OS: macOS 26.2
  CPU: (10) arm64 Apple M1 Max
  Shell: 5.9 - /bin/zsh
Binaries:
  Node: 24.1.0
  npm: 11.3.0
npmPackages:
  wrangler: ^4.83.0 => 4.84.0
```
Please provide a link to a minimal reproduction
https://gist.github.com/danieltroger/d8bbdafd3d92d93b4d82cb29f0774147
Describe the Bug
wrangler dev (via workerd) does not enforce the "six simultaneous open connections per invocation" limit that applies in production, as documented at Workers Platform Limits — Simultaneous open connections:
> Each Worker invocation can have up to six connections simultaneously waiting for response headers. ... If a seventh connection is attempted while six are already waiting for headers, it is queued until one of the existing connections receives its response headers.
Locally, a single Worker invocation can have arbitrarily many in‑flight fetches. This creates a silent works‑locally / breaks‑in‑prod gap that is hard to notice because nothing complains during development. Depending on the fetch target, it can also make the dev server unresponsive to unrelated concurrent requests (event loop / IO starvation), exhaust the host's ephemeral TCP ports on macOS, and drive up memory use as response bodies buffer.
The inverse mismatch matters too: code tuned to run happily in local dev can hit the "cancel LRU open connection to make room for the new one" behaviour in prod (see cloudflare/workerd#4471), which is hard to diagnose after the fact.
This is the same shape of gap as #11582 (closed + locked) and #11803 (which added limits.subrequests for the total subrequest count — a different limit from the concurrent in‑flight cap this issue is about).
Minimal, self‑contained repro
Full runnable repro: https://gist.github.com/danieltroger/d8bbdafd3d92d93b4d82cb29f0774147
The repro uses a small local Node HTTP server that reports the maximum number of in‑flight requests it observed, then asks the Worker to fire N parallel fetch()es at it.
```ts
// src/index.ts — excerpt
export default {
  async fetch(req: Request): Promise<Response> {
    const url = new URL(req.url);
    if (url.pathname !== "/storm") return Response.json({}, { status: 404 });

    const n = Number(url.searchParams.get("n") ?? "100");
    const target = url.searchParams.get("target") ?? "http://localhost:9192/";

    await fetch(new URL("/reset", target));
    await Promise.allSettled(
      Array.from({ length: n }, () => fetch(target).then((r) => r.body?.cancel())),
    );

    const stats = (await (await fetch(new URL("/stats", target))).json()) as {
      maxInFlight: number;
    };
    return Response.json({ n, maxInFlightObservedByTarget: stats.maxInFlight });
  },
};
```
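The echo server itself is not shown in the excerpt; the full echo-server.mjs is in the linked gist. For context, here is a minimal sketch of that kind of server — illustrative only, matching just the /reset and /stats paths the excerpt calls, not the gist's exact implementation:

```javascript
// echo-server.mjs (sketch) — usage: node echo-server.mjs [port] [delayMs]
// Tracks the peak number of simultaneously in-flight requests it observes.
import { createServer } from "node:http";

const port = Number(process.argv[2] ?? 9192);
const delayMs = Number(process.argv[3] ?? 1000);

let inFlight = 0;
let maxInFlight = 0;

createServer((req, res) => {
  if (req.url === "/reset") {
    maxInFlight = 0;
    return res.end("ok");
  }
  if (req.url === "/stats") {
    res.setHeader("content-type", "application/json");
    return res.end(JSON.stringify({ maxInFlight }));
  }
  // Ordinary request: hold it open for delayMs so concurrency is observable.
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  setTimeout(() => {
    inFlight--;
    res.end("hello");
  }, delayMs);
}).listen(port);
```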
Run:

```sh
node echo-server.mjs 9192 1000 &   # echo server with a 1 s response delay
npx wrangler dev                   # Worker listens on localhost:9191
curl "http://localhost:9191/storm?n=100"
curl "http://localhost:9191/storm?n=500"
curl "http://localhost:9191/storm?n=2000"
```
Observed on wrangler 4.84.0:
```
{ "n": 100, "ok": 100, "maxInFlightObservedByTarget": 100, "elapsedMs": 1022 }
{ "n": 500, "ok": 500, "maxInFlightObservedByTarget": 500, "elapsedMs": 1108 }
{ "n": 2000, "ok": 2000, "maxInFlightObservedByTarget": 2000, "elapsedMs": 1454 }
```
maxInFlightObservedByTarget is exactly n for every run — all n fetches are in flight simultaneously.
Expected behaviour
Per the production contract, maxInFlightObservedByTarget should be 6, and connections 7+ should queue until one of the first six has received its response headers. (With the target's 1 s response delay, a capped run of n=100 would therefore take roughly ceil(100/6) ≈ 17 s rather than ~1 s.) There is some doc / implementation disagreement about the exact semantics beyond the cap — see cloudflare/workerd#4471 — but the cap itself is well defined and enforced in prod.
Bonus: dev server freeze
Point the storm at https://example.com with n=5000 and simultaneously hit the trivial /probe endpoint from another terminal. Probes time out (5s+) while the storm is in flight because workerd is saturated holding thousands of concurrent TCP/TLS connections open. On macOS the host network stack can also run out of ephemeral ports, at which point all outbound connections on the machine start failing with EADDRNOTAVAIL until workerd is killed.
Suggested fix
Enforce the 6‑cap in miniflare / workerd during local dev, matching prod, and ideally expose it as a limits.simultaneousConnections option in the Wrangler config (analogous to the limits.subrequests limit added in #11803), defaulting to 6 to match the prod default.
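Until something like that lands, the prod cap can be approximated in userland with a small gate around fetch. This is an illustrative sketch, not a wrangler or workerd API (makeCappedFetch is a hypothetical helper): a slot is held until the wrapped fetch's promise resolves, i.e. until response headers arrive, mirroring the documented queueing behaviour:

```javascript
// Illustrative only: cap concurrent fetches at `cap`, queueing extras
// (approximately FIFO). A slot is released when the wrapped fetch resolves,
// i.e. once response headers have been received.
function makeCappedFetch(cap = 6, baseFetch = globalThis.fetch) {
  let active = 0;
  const waiters = [];
  return async function cappedFetch(input, init) {
    while (active >= cap) {
      // Seventh-and-later callers park here until a slot frees up.
      await new Promise((resolve) => waiters.push(resolve));
    }
    active++;
    try {
      return await baseFetch(input, init);
    } finally {
      active--;
      const next = waiters.shift();
      if (next) next(); // hand the freed slot to the oldest waiter
    }
  };
}
```

Dropping such a wrapper into dev-only code paths brings local behaviour closer to prod, though it cannot reproduce prod's LRU-cancellation edge cases.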
Related
#11803 — Add a new subrequests limit to the limits field of the Wrangler configuration file (merged; covers the total subrequest count, not the concurrent in‑flight cap this issue is about)