What versions & operating system are you using?
```
System:
  OS: macOS 26.2
  CPU: (10) arm64 Apple M1 Max
  Shell: 5.9 - /bin/zsh
Binaries:
  Node: 24.1.0
  npm: 11.3.0
npmPackages:
  wrangler: ^4.83.0 => 4.84.0
```
Please provide a link to a minimal reproduction
https://gist.github.com/danieltroger/d8bbdafd3d92d93b4d82cb29f0774147
Describe the Bug
wrangler dev (via workerd) does not enforce the "six simultaneous open connections per invocation" limit that applies in production, as documented at Workers Platform Limits — Simultaneous open connections:
> Each Worker invocation can have up to six connections simultaneously waiting for response headers. ... If a seventh connection is attempted while six are already waiting for headers, it is queued until one of the existing connections receives its response headers.
Locally, a single Worker invocation can have arbitrarily many in‑flight fetches. This creates a silent works‑locally / breaks‑in‑prod gap that is hard to notice because nothing complains during development. Depending on the fetch target, it can also make the dev server unresponsive to unrelated concurrent requests (event loop / IO starvation), exhaust the host's ephemeral TCP ports on macOS, and drive up memory use as response bodies buffer.
The inverse mismatch matters too: code tuned to run happily in local dev can hit the "cancel LRU open connection to make room for the new one" behaviour in prod (see cloudflare/workerd#4471), which is hard to diagnose after the fact.
This is the same shape of gap as #11582 (closed + locked) and #11803 (which added limits.subrequests for the total subrequest count — a different limit from the concurrent in‑flight cap this issue is about).
Minimal, self‑contained repro
Full runnable repro: https://gist.github.com/danieltroger/d8bbdafd3d92d93b4d82cb29f0774147
The repro uses a small local Node HTTP server that reports the maximum number of in‑flight requests it observed, then asks the Worker to fire N parallel fetch()es at it.
```ts
// src/index.ts — excerpt
export default {
  async fetch(req: Request): Promise<Response> {
    const url = new URL(req.url);
    if (url.pathname !== "/storm") return Response.json({}, { status: 404 });

    const n = Number(url.searchParams.get("n") ?? "100");
    const target = url.searchParams.get("target") ?? "http://localhost:9192/";

    await fetch(new URL("/reset", target));
    await Promise.allSettled(
      Array.from({ length: n }, () => fetch(target).then((r) => r.body?.cancel())),
    );

    const stats = (await (await fetch(new URL("/stats", target))).json()) as {
      maxInFlight: number;
    };
    return Response.json({ n, maxInFlightObservedByTarget: stats.maxInFlight });
  },
};
```
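The echo server itself is not shown in the excerpt; the full echo-server.mjs is in the linked gist. For context, here is a minimal sketch of that kind of server — illustrative only, matching just the /reset and /stats paths the excerpt calls, not the gist's exact implementation:

```javascript
// echo-server.mjs (sketch) — usage: node echo-server.mjs [port] [delayMs]
// Tracks the peak number of simultaneously in-flight requests it observes.
import { createServer } from "node:http";

const port = Number(process.argv[2] ?? 9192);
const delayMs = Number(process.argv[3] ?? 1000);

let inFlight = 0;
let maxInFlight = 0;

createServer((req, res) => {
  if (req.url === "/reset") {
    maxInFlight = 0;
    return res.end("ok");
  }
  if (req.url === "/stats") {
    res.setHeader("content-type", "application/json");
    return res.end(JSON.stringify({ maxInFlight }));
  }
  // Ordinary request: hold it open for delayMs so concurrency is observable.
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  setTimeout(() => {
    inFlight--;
    res.end("hello");
  }, delayMs);
}).listen(port);
```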
Run:

```sh
node echo-server.mjs 9192 1000 &   # echo server with a 1 s response delay
npx wrangler dev                   # Worker listens on localhost:9191
curl "http://localhost:9191/storm?n=100"
curl "http://localhost:9191/storm?n=500"
curl "http://localhost:9191/storm?n=2000"
```
Observed on wrangler 4.84.0:
```
{ "n": 100, "ok": 100, "maxInFlightObservedByTarget": 100, "elapsedMs": 1022 }
{ "n": 500, "ok": 500, "maxInFlightObservedByTarget": 500, "elapsedMs": 1108 }
{ "n": 2000, "ok": 2000, "maxInFlightObservedByTarget": 2000, "elapsedMs": 1454 }
```
maxInFlightObservedByTarget is exactly n for every run — all n fetches are in flight simultaneously.
Expected behaviour
Per the production contract, maxInFlightObservedByTarget should be 6, and connections 7+ should queue until one of the first six has received its response headers. (With the target's 1 s response delay, a capped run of n=100 would therefore take roughly ceil(100/6) ≈ 17 s rather than ~1 s.) There is some doc / implementation disagreement about the exact semantics beyond the cap — see cloudflare/workerd#4471 — but the cap itself is well defined and enforced in prod.
Bonus: dev server freeze
Point the storm at https://example.com with n=5000 and simultaneously hit the trivial /probe endpoint from another terminal. Probes time out (5s+) while the storm is in flight because workerd is saturated holding thousands of concurrent TCP/TLS connections open. On macOS the host network stack can also run out of ephemeral ports, at which point all outbound connections on the machine start failing with EADDRNOTAVAIL until workerd is killed.
Suggested fix
Enforce the 6‑cap in miniflare / workerd during local dev, matching prod, and ideally expose it as a limits.simultaneousConnections option in the Wrangler config (analogous to the limits.subrequests limit added in #11803), defaulting to 6 to match the prod default.
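Until something like that lands, the prod cap can be approximated in userland with a small gate around fetch. This is an illustrative sketch, not a wrangler or workerd API (makeCappedFetch is a hypothetical helper): a slot is held until the wrapped fetch's promise resolves, i.e. until response headers arrive, mirroring the documented queueing behaviour:

```javascript
// Illustrative only: cap concurrent fetches at `cap`, queueing extras
// (approximately FIFO). A slot is released when the wrapped fetch resolves,
// i.e. once response headers have been received.
function makeCappedFetch(cap = 6, baseFetch = globalThis.fetch) {
  let active = 0;
  const waiters = [];
  return async function cappedFetch(input, init) {
    while (active >= cap) {
      // Seventh-and-later callers park here until a slot frees up.
      await new Promise((resolve) => waiters.push(resolve));
    }
    active++;
    try {
      return await baseFetch(input, init);
    } finally {
      active--;
      const next = waiters.shift();
      if (next) next(); // hand the freed slot to the oldest waiter
    }
  };
}
```

Dropping such a wrapper into dev-only code paths brings local behaviour closer to prod, though it cannot reproduce prod's LRU-cancellation edge cases.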
Related
#11803 — Add a new subrequests limit to the limits field of the Wrangler configuration file (merged; covers the total subrequest count, not the concurrent in‑flight cap this issue is about)