Commit 870bdf2
authored
feat(cli,sdk): qwen serve daemon (Stage 1) (#3889)
* feat(cli): scaffold `qwen serve` HTTP daemon (Stage 1, #3803)
Adds a `serve` subcommand that boots an Express 5 listener with bearer
auth, host allowlist, and CORS modeled on `vscode-ide-companion/src/
ide-server.ts`. Ships only `/health` and `/capabilities` to begin with;
session/prompt/event routes will land in follow-up PRs once the per-
session ACP child-process bridge in `httpAcpBridge.ts` is wired.
Defaults to 127.0.0.1 with auth disabled so local development needs no
configuration. Binding beyond loopback (e.g. `--hostname 0.0.0.0`)
refuses to start without a token (`--token` or `QWEN_SERVER_TOKEN`).
Capabilities envelope versioned at v=1 with a `features` array — clients
should gate UI off `features`, never off `mode`, so subsequent PRs can
add capability tags without breaking older clients.
Per design issue's Stage 1 scope (~700-1000 LOC). Adds ~430 LOC of
implementation + tests in this scaffold; the remaining budget belongs
to the route wiring + bridge implementation in follow-ups.
* feat(cli): wire HttpAcpBridge + POST /session for `qwen serve` (#3803)
Stage 1 follow-up to the scaffold. Implements the bridge between the
HTTP daemon and the existing ACP child agent, plus the first session
endpoint.
`HttpAcpBridge.spawnOrAttach`:
- Spawns `node $cliEntry --acp` per workspace via an injectable
`ChannelFactory` (default uses `process.argv[1]`; tests use an
in-memory `TransformStream` pair so they don't fork real processes).
- Drives the ACP `initialize` + `newSession` handshake via the SDK's
`ClientSideConnection`, with a 10s timeout that kills the channel.
- Under `sessionScope: 'single'` (default), reuses the live session
when the same canonical workspace cwd is requested again — backs
the `attached: true` flag.
- The `Client` impl on the bridge side proxies file reads/writes to
local fs (daemon and agent share the host) and buffers
`sessionUpdate` notifications for the SSE wiring in the next PR.
`requestPermission` returns `cancelled` until the
`/permission/:requestId` route lands.
`POST /session`:
- 400 on missing or relative `cwd`.
- 200 with `{sessionId, workspaceCwd, attached}` on success.
- 500 on bridge failure (the failing channel is killed, not leaked).
`runQwenServe` constructs the bridge and ties `bridge.shutdown()` into
the listener-close path so SIGINT/SIGTERM drain children before the
socket closes.
Tests (14 new, 0 regressions in the 4967-test baseline):
- 9 bridge cases over an in-memory channel — fresh spawn, single-scope
reuse, cross-workspace isolation, thread-scope independence, path
canonicalization, relative-path rejection, init failure cleanup,
init timeout, multi-channel shutdown.
- 4 route cases for /session (missing/relative/200/500).
- 1 lifecycle case asserting `runQwenServe.close()` calls
`bridge.shutdown()` before closing the listener.
Verified end-to-end: `qwen serve` boots, `POST /session` spawns a real
`qwen --acp` child and returns the SDK-assigned `sessionId`, repeat
calls under the same cwd return `attached: true`, `SIGTERM` reaps the
child along with the listener.
* feat(cli): wire POST /session/:id/prompt + /cancel for `qwen serve` (#3803)
Stage 1 follow-up after the bridge scaffold. Adds the two routes a client
needs to actually run a turn against the daemon.
Bridge:
- `sendPrompt(sessionId, req)` looks up the session, FIFO-queues the
call against the per-session prompt queue, and forwards through the
SDK `ClientSideConnection.prompt`. Concurrent calls observe ACP's
"one active prompt per session" invariant — second waits for first.
- A failed prompt does NOT poison the queue; the tail catches and
keeps draining so the next caller still runs (the original caller
still sees its own rejection).
- `cancelSession(sessionId, req?)` bypasses the queue and forwards
the ACP notification immediately. ACP semantics: the agent winds
down the *currently active* prompt; queued work is unaffected.
- Both methods throw `SessionNotFoundError` (a typed Error subclass)
when the id is unknown so route handlers can map cleanly to 404
without brittle message matching.
- Both methods overwrite the `sessionId` field in the request body
with the routing id — a stale or spoofed body would otherwise be
dispatched to the wrong agent process.
Routes:
- `POST /session/:id/prompt` → 200 with PromptResponse, 400 on
missing/non-array prompt, 404 on unknown session, 500 on agent
error.
- `POST /session/:id/cancel` → 204 always (cancel is a notification),
404 on unknown session.
Tests (14 new — 7 bridge + 7 route, 0 regressions in the 4981 baseline):
- sendPrompt: success forwards & returns response · routing-id
overrides body sessionId · concurrent prompts FIFO-serialize
(verified via per-prompt start/end ordering with a release latch) ·
failed prompt doesn't block subsequent prompts · 404 for unknown id.
- cancelSession: forwards with routing id · 404 for unknown id.
- Routes: 200/400/404/500 paths for prompt; 204 with body or empty +
404 for cancel.
Verified end-to-end against a real `qwen --acp` child:
- POST /session/:id/prompt with `[{type:'text',text:'hi'}]` → 200
`{"stopReason":"end_turn"}` in ~3.4s.
- POST /session/:id/cancel → 204.
- POST /session/does-not-exist/prompt → 404 with the unknown id
surfaced in the body.
* feat(cli): wire SSE streaming for `qwen serve` events (#3803)
Stage 1 follow-up that turns prompt into a real streaming experience.
Replaces the in-memory `notifications: SessionNotification[]` buffer
on each session with a per-session EventBus and exposes it through
`GET /session/:id/events` as an `text/event-stream` SSE feed.
EventBus (`packages/cli/src/serve/eventBus.ts`):
- Monotonic per-session ids (`v: 1` schema). Each `publish` chains an
id, returning the materialized BridgeEvent.
- Bounded ring (default 1000) backs `Last-Event-ID` reconnect — a
consumer that drops can resume from `lastEventId` and replay any
still-buffered events before live events flow.
- Per-subscriber bounded queue (default 256). When a slow consumer
overruns its queue, the bus appends a synthetic `client_evicted`
terminal frame and closes that subscription so it can't hold the
daemon hostage. Other subscribers are unaffected.
- `subscribe()` returns an AsyncIterable — registration is synchronous
so events `publish`ed immediately after the subscribe land in the
queue (a generator-style implementation deferred registration to
first `next()` and raced with publishes).
- AbortSignal-aware: aborting the signal closes the iterator promptly.
Bridge (`httpAcpBridge.ts`):
- `BridgeClient.sessionUpdate` now publishes onto the session's
EventBus instead of pushing to a plain array — every ACP
notification the agent emits becomes a stream event automatically.
- New `subscribeEvents(sessionId, opts?)` returns the bus's
AsyncIterable; throws `SessionNotFoundError` for unknown ids.
- Shutdown closes every live event bus before killing channels so
pending consumers unwind cleanly.
Route (`server.ts`):
- `GET /session/:id/events` sets the SSE content type, advertises a
3s reconnect hint, and writes a 15s heartbeat comment frame to
keep proxy/NAT connections alive.
- Forwards the `Last-Event-ID` header to the bus.
- `req.on('close')` triggers an AbortController that propagates into
the bridge subscription so disconnects don't leak subscribers.
- 404 when the bridge can't find the session.
Capabilities envelope: `STAGE1_FEATURES` now advertises
`session_create`, `session_prompt`, `session_cancel`, `session_events`
in addition to `health`/`capabilities` so clients can light up UI for
the routes that have actually shipped.
Tests (16 new, 0 regressions in the 4995 baseline):
- 9 EventBus unit cases — id sequencing, live delivery, replay,
replay+live splice, fan-out to N subscribers, eviction on
overflow, abort-signal unsubscribe, bus.close() drains
subscribers, ring-size eviction.
- 4 bridge subscribe cases — 404, sessionUpdate→event publishing
via real ACP fake-agent, shutdown closes live subscriptions.
- 4 SSE route cases against a live HTTP listener — frame format,
Last-Event-ID forwarding, 404, abort propagation on disconnect.
Verified end-to-end against a real `qwen --acp` child:
- Subscribed to `/session/$SID/events`, fired `POST /session/$SID/prompt`
with text content. Captured 13 distinct `event: session_update`
SSE frames in real time during the model's response — `available_
commands_update` metadata, 9 `agent_thought_chunk` frames carrying
the model's chain-of-thought, 3 `agent_message_chunk` frames with
the actual reply, and a final usage frame with token totals.
- Frames carry monotonic ids 1..13, the daemon-side counter, and
are valid SSE per the EventSource spec.
* feat(cli): wire POST /permission/:requestId for `qwen serve` (#3803)
Stage 1 follow-up that turns `BridgeClient.requestPermission` from a
hardcoded `cancelled` placeholder into a real first-responder vote
loop, and ships the HTTP route any attached client uses to cast the
deciding vote.
Bridge:
- `requestPermission` generates a UUID requestId, registers a
pending entry on a daemon-wide map (and the owning session's
`pendingPermissionIds` set), publishes a `permission_request`
event onto the session's EventBus (so SSE subscribers see it),
and awaits the resolution.
- New `respondToPermission(requestId, response)` resolves the
pending promise with the supplied outcome. First call wins —
subsequent calls return false. On success the bridge publishes a
`permission_resolved` event so other attached clients can update
their UI when the race is decided.
- `cancelSession` and `shutdown` both resolve every still-pending
permission for the affected session(s) as
`{ outcome: { outcome: 'cancelled' } }` per the ACP spec
requirement that a cancelled prompt MUST resolve outstanding
requestPermission calls with cancelled.
- New `pendingPermissionCount` getter exposes inflight count for
inspection / tests.
Route (`server.ts`):
- `POST /permission/:requestId` validates the body's `outcome` is
either `{ outcome: 'cancelled' }` or `{ outcome: 'selected',
optionId: string }`, then forwards to `bridge.respondToPermission`.
- 200 on accepted vote, 404 when the requestId is unknown or
already resolved (Stage 1 doesn't differentiate), 400 on a
malformed outcome.
Capabilities envelope: STAGE1_FEATURES gains `permission_vote`.
Tests (14 new — 9 bridge + 5 route, 0 regressions in the 5011 baseline):
- Bridge: publishes permission_request with a generated requestId
and waits; respondToPermission first-responder wins; publishes
permission_resolved on vote; respondToPermission false for
unknown requestId; cancelSession resolves outstanding as
cancelled; shutdown resolves outstanding as cancelled.
- Route: 200 on selected outcome; 200 on cancelled outcome; 404 on
unknown requestId; 400 on malformed outcome; 400 on missing
outcome.
Verified end-to-end against a real `qwen --acp` child:
- Subscribed to /session/$SID/events, sent a prompt asking the
agent to write a file at /tmp/qwen-serve-permission-e2e-test.txt.
- The agent triggered a permission_request via the bus, surfacing
the three options Qwen Code presents (Allow Always / Allow /
Reject) with their option ids.
- POSTed `{outcome:{outcome:"selected",optionId:"proceed_once"}}`
to /permission/$requestId — got HTTP 200.
- Bus published the matching permission_resolved event.
- Agent proceeded with the writeTextFile tool call; file was
actually created on disk with the expected content.
* feat(sdk): add DaemonClient for the qwen serve HTTP API (#3803)
Stage 1 follow-up that proves the cross-mode protocol-isomorphism design
assumption: an SDK client can drive the daemon's HTTP routes end-to-end
without going through ProcessTransport's stdio + stream-json path.
DaemonClient is a sibling of ProcessTransport, not a replacement. The two
speak different protocols (ACP NDJSON over HTTP vs stream-json over
stdio). Existing `query()` users keep getting subprocess-mode unchanged;
applications that want daemon-mode (cross-client attach, shared MCP
pool, network reachability, first-responder permissions) opt in by
constructing a DaemonClient against a running `qwen serve`.
API surface (`packages/sdk-typescript/src/daemon/`):
- `new DaemonClient({ baseUrl, token?, fetch? })`. The `fetch` override
is for tests; defaults to `globalThis.fetch`. Trailing slashes on
`baseUrl` are stripped.
- `health()`, `capabilities()` — discovery.
- `createOrAttachSession({ workspaceCwd, modelServiceId? })` — `attached:
true` on the response indicates a session was reused under
sessionScope:single.
- `prompt(sessionId, { prompt: ContentBlock[] })` — returns
PromptResult with stopReason.
- `cancel(sessionId)` — tolerates 204; throws on 404.
- `subscribeEvents(sessionId, { lastEventId?, signal? })` — async
iterator over parsed SSE frames; AbortSignal-aware. Native Node
AbortController only — jsdom polyfills are incompatible with undici.
- `respondToPermission(requestId, response)` — first-responder vote;
returns true on 200, false on 404 (lost the race or unknown id),
throws on 400/500.
`DaemonHttpError` is thrown for any non-2xx (besides the 404
"already-resolved" case on permission votes); carries `status` and
`body` so callers can branch on standard daemon HTTP semantics.
`parseSseStream(body)` is the underlying SSE parser; exported separately
so applications can consume daemon SSE outside the DaemonClient surface.
Handles split-chunk frames, comment/retry directives, malformed JSON
(skip), trailing frame without final newline.
Wire types live SDK-side (no SDK→CLI dep); the capabilities envelope's
`v` field signals breaking changes.
Tests (26 new, 0 regressions in the 201 baseline):
- 7 SSE parser cases — single frame, multiple frames, comments,
chunked-split frame, malformed JSON skip, trailing frame on close,
empty stream.
- 19 DaemonClient cases — health success/error, capabilities, bearer
auth presence/absence, createOrAttachSession success/400, prompt
body shape + sessionId url-encoding, cancel 204/404, permission
200/400/404, subscribeEvents header forwarding + 404, baseUrl
normalization.
Verified end-to-end against a real `qwen serve` daemon driving a real
`qwen --acp` child:
- `client.capabilities()` returned `{v:1, mode:"http-bridge", features:
[...7 tags]}`.
- First `createOrAttachSession` returned `attached:false`; second
returned `attached:true` with the same sessionId.
- `client.prompt(...)` with text content yielded `{stopReason:
"end_turn"}` while the parallel `subscribeEvents` iterator streamed
10 distinct frames during the same turn.
- AbortController on the events iterator cleanly severed the SSE
connection.
* feat(cli,sdk): list workspace sessions + set session model (#3803)
Closes the §04 Stage-1 routes table for `qwen serve` with the two
remaining endpoints, plus matching SDK methods.
`GET /workspace/:id/sessions`
- `:id` is the URL-encoded canonical absolute workspace path
(Express decodes path params automatically; clients pass
`encodeURIComponent(cwd)`).
- Returns `{ sessions: [{ sessionId, workspaceCwd }, ...] }` for live
sessions whose canonical workspace matches.
- Empty array (not 404) when the workspace is idle so picker UIs
don't have to special-case "no sessions yet".
- 400 when the decoded path isn't absolute.
`POST /session/:id/model`
- Body: `{ modelId: string, ... }`. The route's `:id` overrides any
spoofed sessionId in the body.
- Forwards to ACP's `unstable_setSessionModel` and publishes a
`model_switched` event onto the session bus so cross-client UIs
update.
- 200 with the agent's response on success, 400 on missing/empty
modelId, 404 on unknown session.
- The SDK method is currently unstable; documented in the bridge
comment in case the spec renames the method when it stabilizes.
Bridge:
- New `listWorkspaceSessions(workspaceCwd)` iterates `byId.values()`
and filters by canonical workspace path; works for both `single`
and `thread` session scopes.
- New `setSessionModel(sessionId, req)` forwards through
`connection.unstable_setSessionModel`, normalizes sessionId,
publishes `model_switched`, throws SessionNotFoundError on
unknown ids.
`STAGE1_FEATURES` capabilities envelope grows to 9 tags, adding
`session_list` and `session_set_model`.
SDK (`DaemonClient`):
- `listWorkspaceSessions(workspaceCwd)` URL-encodes the cwd and
returns the parsed `sessions` array directly.
- `setSessionModel(sessionId, modelId)` POSTs the body and returns
the agent response (currently opaque per ACP unstable spec).
- Wire types `DaemonSessionSummary` and `SetModelResult` exported
from the SDK barrel.
Tangential cleanup: `sendBridgeError` now extracts a useful message
from non-Error values via a small `errorMessage` helper. JSON-RPC
errors from the agent (`{code, message, data}`) used to surface as
`"[object Object]"` in the 500 response body; they now show the
inner `message` field. Caught while running the model-set e2e.
Tests (17 new — 9 bridge + 7 route + 4 SDK, 0 regressions in the
5022 + 227 baselines):
- Bridge listWorkspaceSessions: matching cwd returns the live
sessions; canonicalizes the lookup; empty for relative paths.
- Bridge setSessionModel: forwards modelId + overrides body
sessionId; publishes model_switched event; 404 unknown session.
- Route /workspace/:id/sessions: returns the bridge list; empty for
idle workspace; 400 for relative path.
- Route /session/:id/model: 200 success; 400 missing modelId; 400
empty modelId; 404 unknown session.
- SDK listWorkspaceSessions: URL-encodes the cwd; throws on 400.
- SDK setSessionModel: posts body; throws on 404.
Verified end-to-end against a real `qwen serve`:
- SDK reports 9 capability features, list returns the existing
session, attached:true on repeat create, and `setSessionModel`
rejects with HTTP 500 when the modelId isn't registered (with the
daemon now surfacing "Internal error" instead of "[object Object]").
- 404 path through SDK on unknown sessionId works.
* fix(cli,sdk): audit round 1 follow-ups for `qwen serve` (#3803)
Self-review pass on PR #3889. Two real correctness bugs and an
ergonomics gap, plus the test-coverage holes the audit surfaced. The
loudest finding ("host allowlist no-op when bind=localhost") was a
false positive — the conditional was misread; existing tests already
prove the validator is active on `localhost` binds.
Real fixes:
- Bearer-auth timing-attack: `parts[1] !== token` short-circuits per
byte, leaking which prefix is correct via response latency. Replace
with SHA-256 of both sides + `crypto.timingSafeEqual` so comparison
is constant-time regardless of token length.
- Concurrent `spawnOrAttach` race in single-scope: two parallel
callers for the same workspace both passed the `byWorkspace.get`
check, both spawned, and one entry ended up orphaned in `byId`
while the other won `byWorkspace`. Violates the
"at most one session per workspace" invariant. Coalesce via an
`inFlightSpawns` map: parallel callers attach to the in-flight
promise and report `attached: true`. The slot is cleared on both
success and rejection so a failed spawn doesn't poison the
workspace forever. New test asserts ONE channel spawns under
parallel calls and that retry works after rejection.
- `Number.parseInt('1.5e10z', 10)` returns 1, so a malformed
`Last-Event-ID` header silently passes through. Tighten
`parseLastEventId` to `^\d+$` so anything not a pure decimal
integer is dropped. New test exercises 'abc', '-1', '1.5e10z'.
Ergonomics:
- `LOOPBACK_BINDS` and `LOOPBACK_HOST_BINDS` now include `::1` and
`[::1]`. IPv6 loopback users no longer have to set a token.
Host-allowlist allows `[::1]:port` Host headers.
Documentation:
- `BridgeClient` doc-comment now states the Stage 1 trust model
explicitly: agent runs as the same UID, the file-proxy methods
are NOT a workspace-cwd sandbox, restricting them would be
theatre. The audit flagged this as a "design gap" but the
daemon-and-agent-on-same-host posture makes a sandbox here
redundant — Stage 4+ remote-sandbox swaps the Client for a
sandbox-aware variant.
SDK fix:
- `DaemonClient.failOnError` previously called `res.json()`, which
consumes the body even on parse-failure; the subsequent
`res.text()` returned empty. New impl reads once as text and
attempts JSON-parse; raw text is the fallback. New test asserts
a `text/plain` 502 surfaces the body verbatim.
Test gap fills (audit-flagged):
- Bridge: in-memory file-proxy tests for `BridgeClient.{read,write}
TextFile` including line/limit slicing.
- SSE route: `stream_error` synthetic frame on iterator throw
mid-stream; numeric Last-Event-ID forwarded; malformed
Last-Event-ID dropped.
- DaemonClient: text/plain error body coerced to `body` field;
`respondToPermission` 5xx throws; `subscribeEvents` null-body
throws; `cancel`/`respondToPermission` URL-encode session/request
ids that contain slashes.
Verified end-to-end with a token-required daemon: right token → 200,
wrong/missing/malformed → 401. All paths return uniform 401 messages
so a side-channel can't distinguish between "no header", "bad scheme",
and "wrong token".
Test counts: cli serve **89** (was 81, +8), sdk daemon **35** (was
30, +5). Full suites still green.
* fix(cli): audit round 2 follow-ups for `qwen serve` (#3803)
Second self-review pass on PR #3889. Three real bugs (one
correctness, one resource-cleanup, one cosmetic) plus consolidation
of the loopback bindings into a single source of truth.
Real fixes:
- Shutdown could hang forever on a long-lived SSE consumer:
`server.close` waits for every in-flight connection to drain,
and a paused EventSource client never disconnects. Added a
`SHUTDOWN_FORCE_CLOSE_MS` (5s) timer that calls
`server.closeAllConnections()` to force-destroy stuck sockets,
then resolves so `process.exit(0)` can run. New test asserts
close completes well under 5.5s even when an SSE GET is in
flight.
- Signal-handler race during shutdown: round 1 detached the
SIGINT/SIGTERM listeners *up front* in `handle.close()`. If a
second SIGTERM arrived during the drain, no handler existed and
Node's default termination ran, orphaning agent children. Switch
to detaching at the *end* of the close path (in `finish()`):
during the drain window the handler is still attached and the
`if (shuttingDown) return` guard makes a second signal a no-op;
after drain completes we can safely remove the listeners (this
also fixes a test-suite MaxListenersExceededWarning that fired
once we ran the runQwenServe tests >10 times in a single
process).
- SSE response had no `error` listener. When the underlying TCP
socket died (RST, kill -9 on the client), the next `res.write`
threw EPIPE and Express forwarded it to the default error
handler, logging noisily. Added `res.on('error', cleanup)` so
the failure is absorbed and triggers the same teardown path the
`req.on('close')` handler uses.
Validation:
- `createHttpAcpBridge` now throws on invalid `sessionScope` (anything
other than `'single'` or `'thread'`) and on `initializeTimeoutMs <= 0`.
Misconfigured callers used to silently degrade to thread behavior;
now they fail loudly.
Cleanup:
- The `LOOPBACK_BINDS` set was duplicated between `auth.ts` and
`runQwenServe.ts` (round 1 missed this). Extracted into
`packages/cli/src/serve/loopbackBinds.ts` with a single
`isLoopbackBind(hostname)` helper. Both files now import; drift is
impossible.
- `res.flushHeaders?.()` lost the optional chaining. The method is
on `http.ServerResponse` since Node 1.6; our `engines` floor is 20.
Tests added:
- bridge: `sessionScope` validation, `initializeTimeoutMs` validation.
- server: shutdown force-close timeout, SIGINT/SIGTERM listener
detach-after-drain.
False positives from the round 2 audit (verified and dismissed):
- "EventBus nextId overflow at 2^53" — theoretical only (would
require ~9 quadrillion publishes per session). No code change.
- "Subscribe-during-close race" — JS is single-threaded; the close()
flag is set synchronously before the loop touches state.
- "Queued prompts on shutdown" — by design; documented via the
promptQueue tail comment.
- "10MB body parser limit" — design choice for Stage 1's in-memory
buffering model; revisit if ACP streaming lands in Stage 2.
- "Unbounded body read in DaemonClient.failOnError" — daemon is
local in Stage 1; the threat surface for adversarial-large error
bodies is the same as the daemon's other unbounded buffers.
Test counts: cli serve **93** (was 89, +4), full cli **5047** (no
regressions), sdk **236** (no regressions).
* docs(cli): audit rounds 3 + 4 follow-ups for `qwen serve` (#3803)
Two more self-review passes on PR #3889. No correctness bugs surfaced
this time — round 3 found a HIGH-severity Windows-path claim that
turned out to be a false positive (`path.win32.isAbsolute('/foo/bar')`
returns true; verified against Node 20). Round 4 confirmed every
prior decision and surfaced one latent-but-not-currently-triggered
concurrency note.
Changes are pure documentation + a tiny optional-chain cleanup:
- Drop `?.` on `server.closeAllConnections()` in runQwenServe.ts —
the method exists since Node 18.2 and our `engines` floor is 20.
The optional chain dated from before round 2's force-close timer
landed; clean it up.
- Help text for `qwen serve --port` now documents that port 0 means
"OS-assigned ephemeral port" (which the implementation has always
supported but never advertised).
- `defaultSpawnChannelFactory` gains a comment near the spawn site
documenting the FD-budget implication (~3 FDs per session, bump
`ulimit -n` for many concurrent sessions) and the `stdio:
['pipe', 'pipe', 'inherit']` choice (child stderr lands in the
daemon's stderr, interleaved across sessions). Both are
Stage-1-accepted; Stage 2/4+ revisit each.
- Comment on the bridge's `byWorkspace`/`byId` Maps documenting the
known gap that a child crashing between requests leaves a garbage
SessionEntry until daemon shutdown — surfaced as a per-prompt
failure when the dead session is touched, not a hang. Stage 2's
in-process bridge eliminates the spawned-child failure mode
entirely so this gap goes away naturally.
- `EventBus.subscribe` doc-comment now states explicitly that the
returned iterator is NOT safe to drive from concurrent
`.next()` callers — the underlying queue isn't atomic. Daemon
usage is the sequential `for await ... of` inside the SSE route,
so this is safe in production. Documented so a future fan-out
consumer doesn't accidentally rely on undefined behavior.
False positives verified and dismissed (round 3 + 4 combined):
- `path.isAbsolute('/foo/bar')` Windows breakage — `path.win32.
isAbsolute('/foo/bar')` is true; verified empirically.
- "Windows drive divergence" causing duplicate sessions — different
drives are different on-disk paths; sessions intentionally
differ.
- "parseSseStream early-break leaks reader" — `for await ... break`
triggers `iterator.return()` which runs the generator's `finally`
that calls `releaseLock`. Standard JS semantics.
- "Promise executor sync-throw fragility in requestPermission" —
sync throws inside `new Promise(executor)` reject the outer
promise; functionally correct, just stylistic.
- "Force-close timeout test elapsed assertion flakiness" — assertion
is `< 5500ms` but the natural happy-path is sub-100ms. Generous
headroom; not flake-prone in practice.
- "fetch reference stale after polyfill" — `globalThis.fetch.bind`
captures at construction; tests inject `opts.fetch` instead of
polyfilling, which is the correct pattern.
Test counts unchanged (cli serve **93**, sdk **236**); typecheck +
lint clean. STAGE1_FEATURES still matches every implemented route
1:1, fakeBridge in tests implements every HttpAcpBridge method.
* fix(cli): PR #3889 review round 1 — critical correctness (#3803)
Addresses the four critical findings from the PR #3889 reviewer pass:
1. ACP `ReadTextFileRequest.line` is 1-based per spec, but the
bridge's `BridgeClient.readTextFile` was treating it as a
0-based slice index. A client asking for `{line:1, limit:2}`
("first two lines") was getting lines 2-3 — a sign-off-by-one
bug that breaks every editor / SDK client following the ACP
schema. Convert to 0-based via `Math.max(0, line - 1)`. The
existing slice test was asserting the wrong behavior; updated
to expect the spec-correct result and added a second `line:3,
limit:2` case to lock in the offset.
2. `modelServiceId` was accepted by the SDK + server `POST /session`
path, forwarded into `bridge.spawnOrAttach`, and then silently
dropped: `doSpawn` never wired it into the agent. Callers
requesting a specific model got the agent's default and no
indication anything was wrong. Now `doSpawn` issues
`unstable_setSessionModel` immediately after `newSession`. If
the agent rejects the model id, the half-initialized session is
torn down and the spawn rejects so the caller can retry cleanly
instead of inheriting silent drift. Three new bridge tests:
happy path, omit-when-undefined, agent-rejection cleanup.
3. The CORS middleware used `cors({ origin: (o, cb) =>
cb(new CORSError(...), false) })` for browser-Origin requests.
`cors` flows the Error into Express's error chain; without an
explicit error handler that produces a 500 + HTML body, which
is misleading for what is really a deterministic 403 denial.
Replace with a tiny `RequestHandler` that checks
`req.headers.origin` directly and returns
`403 { error: 'Request denied by CORS policy' }` JSON. Drops
the `cors` and `@types/cors` dependencies — there's no other
consumer in the cli package.
4. The SSE `stream_error` synthetic frame hard-coded `id: 0`,
which would regress the client's `Last-Event-ID` tracker and
trigger duplicate replays on reconnect. The frame is terminal
and daemon-emitted — it has no place in the per-session
monotonic sequence. Refactor `formatSseFrame` to omit the
`id:` line when the input event has no id field, and emit
`stream_error` without one. Test updated to assert
`frames[1].id === undefined` while the preceding
`session_update` still carries its monotonic id.
Tangential cleanup: `errorMessage` now formats the SSE error body
(was `err.message` only — would have shown `[object Object]` for
JSON-RPC errors mid-stream, mirroring the round-1 SDK fix).
Test counts: cli serve **96** (was 93, +3 modelServiceId cases);
existing readTextFile slice test rewritten in place. Full
typecheck + lint + suite green.
* fix(cli,sdk): PR #3889 review round 2 — SSE robustness + EventBus polish (#3803)
Second batch of reviewer-flagged fixes for PR #3889. Addresses 7
robustness issues across the daemon's SSE pipeline + the bus + the
SDK's stream parser.
Daemon SSE (`server.ts`):
- SSE writes now respect backpressure. `res.write` returns false when
the kernel send buffer is full; the previous code ignored that and
Node accumulated payloads in user-space memory unboundedly. A slow
consumer on a chatty session could balloon daemon RSS. New
`writeWithBackpressure` helper awaits `drain` (or `close`/`error`)
before scheduling the next write — for both per-frame writes and
heartbeats.
- `parseLastEventId` rejects values > `Number.MAX_SAFE_INTEGER`. With
the prior `^\d+$` regex a malicious 25-digit value would parse to
a number that loses precision and confuses replay comparisons.
EventBus (`eventBus.ts`):
- `Last-Event-ID` replay events now `forcePush` past `maxQueued`. A
client reconnecting with a 1000-event gap on a subscriber whose
cap is 256 was silently losing entries 257-1000 — a sign-off-by-
nothing breakage of the resume contract. Live publishes still go
through the normal cap (slow live consumer must be evictable);
historical replay is bypassed.
- `onAbort` now disposes the subscription immediately instead of
only closing the queue. An aborted-but-never-iterated subscriber
used to linger in `bus.subs` until the consumer drove `next()` /
`return()`. New tests cover both abort-after-subscribe and
already-aborted-at-subscribe paths.
- `BoundedAsyncQueue.next` now checks `buf.length > 0` before
shifting instead of `buf.shift() !== undefined`. The bus never
pushes `undefined` today but the queue is generic — the prior
pattern would mis-handle a queue whose element type legitimately
includes undefined.
SDK SSE parser (`sse.ts`):
- Now flushes the TextDecoder on stream close. Without the final
`decoder.decode()`, an incomplete multi-byte UTF-8 sequence at
the tail of the last chunk was silently dropped — corrupting any
frame whose JSON ended mid-character. New test feeds a stream
split mid-byte through "中" (3-byte UTF-8) and asserts the
character round-trips.
- Frame separators now accept both `\n\n` and `\r\n\r\n`. SSE spec
allows CRLF, and intermediaries (corporate proxies, some Node
http servers) sometimes normalize. Frame field splitter also
accepts `\r?\n`. Two new tests cover pure CRLF + mixed-LF/CRLF.
Test counts: cli serve **99** (was 96, +3 EventBus); sdk daemon-sse
**10** (was 7, +3). Full typecheck + lint + suite green.
* docs(cli,sdk): PR #3889 review round 3 — minor + docs (#3803)
Last batch from the PR #3889 reviewer pass: mostly docs + a
ReDoS-tooling-silencing rewrite + a yargs-key cleanup.
- `commands/serve.ts` ServeArgs interface dropped the camelCase
`httpBridge` mirror; the handler now reads `argv['http-bridge']`
matching the declared option name. The dual surface relied on
yargs's camelCase expansion behavior — fragile if yargs config
ever changes.
- `DaemonClient` constructor's `baseUrl.replace(/\/+$/, '')` (which
is end-anchored and linear, but CodeQL's polynomial-regex
detector flags any `\/+$` pattern on attacker-controlled input)
swapped for a hand-rolled `stripTrailingSlashes` loop. Same
behavior, no rule trigger.
- `defaultSpawnChannelFactory`'s `cwd: workspaceCwd` flow into
`spawn` is the second CodeQL finding ("uncontrolled data used in
path expression"). It IS user-controlled, by design — that's the
Stage 1 trust model. Added a `// lgtm[js/shell-command-
constructed-from-input]` suppression with a comment explaining
the model and pointing at issue #3803 §11 for the Stage 4+ remote-
sandbox replacement.
- Stale doc comment on `createServeApp` that still listed only
`/health`, `/capabilities`, `POST /session` as shipped — now
enumerates all 9 routes that match §04 of the design.
- Stale doc comment on `HttpAcpBridge` saying "Stage 1 buffers them
in-memory; SSE wiring lands in the next PR" — SSE wiring landed
in commit 41aa95094. Replaced with a description of the actual
flow through EventBus + SSE.
No behavior change; tests + lint + typecheck still green. cli serve
still **99**, sdk **38** (was 30 before this batch — daemon-sse +3,
DaemonClient +5 from rounds 1+2). Full e2e against built daemon
re-verified: CORS denial returns 403 JSON (was 500 HTML), bad
`modelServiceId` now causes spawn to fail with HTTP 500 (was: silent
default-model substitution), `POST /session` without modelServiceId
unaffected.
* fix(cli,sdk): self-audit round 5+ — close orphaned EventBus + DaemonEvent.id optional (#3803)
Two more fixes from a final post-review-comment audit pass on PR #3889.
Both are subtle correctness gaps that fell out of the round-1 critical
fixes (modelServiceId apply + SSE id-less stream_error).
- In `httpAcpBridge.ts:doSpawn`, when `unstable_setSessionModel`
rejects after `newSession` succeeded, we tear down the entry from
`byWorkspace` + `byId` (round 1 fix) but did NOT close the
EventBus we'd just constructed for that entry. The agent could
have published a session_update notification during init that
queued in the (now unreachable) bus's ring buffer; without an
explicit close the bus + buffer linger until the next GC cycle.
Bounded leak (1 bus per failed spawn × 1000-event ring) but
cleaner to close it. New regression test exercises the retry path
after a model-rejection failure to lock in that we don't reuse
the orphan and that subscribers on the fresh session see an empty
iterator on immediate abort.
- SDK `DaemonEvent.id` is now `id?: number` instead of `id: number`.
The round-1 SSE fix made the daemon emit `stream_error` frames
*without* an `id:` line so they don't pollute the per-session
monotonic sequence. The SDK parser correctly returns `undefined`
for the missing field, but the type still advertised `id: number`
— TypeScript consumers persisting `lastSeenId = event.id` would
accidentally store `undefined`. Made the field optional and added
a doc comment instructing consumers to skip frames without an id.
Plus one more false-positive verified and dismissed:
- "writeWithBackpressure Promise double-settle race": the auditor
flagged that `res.write(chunk, callback)` could fire its callback
after the synchronous `ok=true` resolve. Verified harmless —
Promise double-settle is a no-op, the callback only rejects on
error (caught separately by `res.on('error', cleanup)`), and
multiple parallel writes register independent listener sets that
each remove their own pair after firing.
Test counts: cli serve **100** (was 99, +1 retry-after-model-rejection
regression). SDK unchanged at 239. Full typecheck + lint + suites
green; flow re-verified end-to-end.
* fix(cli,sdk): PR #3889 review round 4 — child-crash recovery + SSE/permission/SSE polish (#3803)
Fourth and final batch of reviewer-flagged fixes for PR #3889. 14
inline threads addressed, plus 8 spam threads up for resolution.
Critical correctness:
- `eventBus.test.ts`'s ring-eviction test wrapped its assertion in a
`void (async () => { … })()` IIFE that returned synchronously to
vitest — the inner `expect` could fail without ever surfacing.
Hoisted to a top-level `await` so the harness actually waits and a
broken eviction would now fail loudly.
- `runQwenServe.ts handle.close()` is now idempotent. Concurrent
callers (test harness + signal handler firing simultaneously,
explicit caller + finally-block fallback) used to each construct a
new shutdown promise, arm a fresh force-close timer, and call
`bridge.shutdown` redundantly. Cache a single `closePromise`;
repeat calls return it. New test exercises 3 overlapping callers
+ a post-settle call → exactly one bridge.shutdown.
- `POST /permission/:requestId` now rejects `outcome.selected` with
an empty `optionId`. The string-typeof check passed `""` through;
bridge would forward an opaque "unknown option" error from the
agent. Tighten the validator + add a 400 test.
- `denyBrowserOriginCors` now has explicit unit tests (3 cases:
Origin-bearing GET → 403 JSON, no-Origin GET → 200, Origin-bearing
POST → 403 + bridge untouched). The CSRF defense was previously
implicit-only.
Channel-exit recovery:
- `AcpChannel` interface gains an `exited: Promise<void>` that
resolves on either planned `kill()` or unexpected child crash.
Bridge subscribes via `channel.exited.then(...)`: if the entry is
still in `byId` when exit fires (i.e. unexpected crash), it
cancels pending permissions, publishes a `session_died` event so
SSE subscribers get notified, closes the bus, and removes the
entry from `byWorkspace`/`byId`. Without this, a crashed child
used to leave its `SessionEntry` stuck — under
`sessionScope:'single'` (default) the whole workspace was
unreachable until daemon restart.
- `defaultSpawnChannelFactory` now wires `child.once('error', …)` in
addition to `'exit'`. Without an `error` listener Node treats an
async spawn failure (ENOMEM, EACCES, …) as an unhandled error and
crashes the daemon.
- Two new bridge tests: `crash()` simulates an unexpected exit →
asserts `session_died` event + entry removed + retry spawns a
fresh child; planned shutdown asserts the cleanup handler no-ops
when the entry is already gone (no double-publish).
SSE robustness:
- SDK `parseSseStream` now calls `reader.cancel()` (not just
`releaseLock`) in its `finally`. Early-break consumers were
leaving the underlying HTTP body stream open; cancel propagates
upstream so the connection drops promptly. New test asserts the
underlying ReadableStream's `cancel()` runs.
- SDK `parseSseStream` accepts `data:` (no space after colon) AND
multiple `data:` lines per frame (joined by `\n` per spec). Two
new tests cover both cases.
- SDK `DaemonClient.subscribeEvents` now validates response
Content-Type before delegating to the parser. A misconfigured
proxy returning 200 + JSON was silently producing zero events;
now throws `DaemonHttpError` with the actual mime type.
- Daemon SSE route's initial `retry: 3000` write now `.catch(()=>{})`s.
A socket that errors before the first write would have surfaced as
an unhandled rejection.
Documentation (deferred items now noted in code):
- `EventBus.publish` ring shift is O(n) when full. Comment notes
the deferral; circular-buffer refactor only if profiling flags it.
- SSE heartbeat doesn't detect dead connections without TCP RST.
Comment notes Stage 2 may add an explicit idle timeout.
- `defaultSpawnChannelFactory` won't run a `.ts` entry directly —
`npm run dev` users must build first. Comment in the spawn site.
Test counts: cli serve **107** (was 100, +7), SDK daemon **42**
(was 38, +4). Full typecheck + lint + suite green.
* test(integration): qwen serve daemon — routes + streaming + recovery (#3803)
Persists the e2e validation of every PR #3889 fix as vitest
integration tests under `integration-tests/cli/`. Two files split by
auth requirement:
`qwen-serve-routes.test.ts` (18 cases, no LLM credential needed)
- Bearer auth timing-safe compare: right token / wrong-same-length /
wrong-shorter / missing / Basic-scheme.
- CORS browser-Origin denial: GET-with-Origin → 403 JSON; no-Origin
→ 200.
- Capabilities envelope: all 9 Stage 1 features advertised in order.
- POST /session validation: relative cwd → 400; two parallel POSTs
same workspace coalesce; bad modelServiceId tears down half-init.
- POST /permission/:requestId validation: empty optionId → 400;
missing optionId → 400; valid vote on unknown id → 404.
- SDK SSE Content-Type guard: throws DaemonHttpError when upstream
returns 200 + JSON.
- Last-Event-ID strict parsing: malformed value accepted but
ignored (`'1abc'` doesn't get parsed as 1).
- Cancel idempotent + listWorkspaceSessions returns the live session.
`qwen-serve-streaming.test.ts` (3 cases, gated by SKIP_LLM_TESTS)
- Real `qwen --acp` child SIGKILL → daemon publishes
`session_died`, removes the entry from `byWorkspace`/`byId`,
next createOrAttachSession spawns fresh. Uses `pgrep -P` to
locate the daemon's direct child by PID.
- Two SSE subscribers + a tool requiring permission: both observe
the same `permission_request` requestId; two concurrent POST
votes resolve as exactly one 200 + one 404 (first-responder
wins).
- SSE reconnect with `Last-Event-ID: N` after consuming N frames
yields events with `id > N` from the bus's replay ring.
Both files spawn `node packages/cli/dist/index.js serve --port 0
--token …` per `beforeAll` and clean up in `afterAll`. Use the
existing `@qwen-code/sdk` alias the integration-tests vitest config
already wires to the built SDK bundle.
Run with the existing `npm run test:integration:cli:sandbox:none`
(or any of the integration-tests target). The streaming file is
skip-able via `SKIP_LLM_TESTS=1` for environments without auth.
Verified locally: 18/18 routes pass in ~6.8s; 3/3 streaming pass in
~23s against a real model.
* fix(cli): PR #3889 review round 5 — claude-opus-4-7 audit (#3803)
Seven new substantive findings from a `/qreview` pass on PR #3889.
Six real bugs + one type-safety gap; all addressed.
Critical correctness:
- **EventBus replay overflow + eviction race**. Round 4's
`forcePush` for `Last-Event-ID` replay bypassed the per-subscriber
cap, but `BoundedAsyncQueue.push`'s cap check was `buf.length >=
maxSize` — so the very next live publish saw the inflated buf,
rejected, and triggered the `client_evicted` terminal frame.
Concrete sequence the audit walked through: client reconnects
after 300+ events, replay force-pushes 300 entries, next live
event evicts them. Defeats the resume contract.
Fix: track force-pushed items separately (`forcedInBuf` counter).
`push()` cap is now on `(buf.length - forcedInBuf)`. `next()`
decrements `forcedInBuf` as the consumer drains (force-pushed
entries are FIFO at the front of `buf` since `forcePush` only
runs at subscribe time, before any live `push`). Two new
regression tests: (1) live publish after a >cap replay does
NOT evict; (2) eviction triggers only after the LIVE backlog
(excluding replay) hits the cap.
Performance + UX:
- **Eager express import on every `qwen` invocation**. The
`serve` subcommand statically imported `../serve/index.js`,
which transitively pulled express + body-parser + qs into
cold-start path of every CLI invocation (interactive, mcp,
channel, etc). ~50ms tax on the 99% of invocations that never
run `serve`. Defer to dynamic `import()` inside the handler;
types are still imported for the builder shape.
- **Middleware order**: `express.json({limit:'10mb'})` ran
BEFORE `bearerAuth`. Unauth POST got full JSON.parse before
401. Trivial DoS amp on non-loopback deployments. Reorder so
auth + Host allowlist + CORS run first; body parser runs
only for requests that pass the gate.
- **`sendPrompt` no AbortSignal**. A stuck/dead child poisons
the per-session FIFO; HTTP client disconnect didn't propagate
so daemon CPU stayed tied up. `HttpAcpBridge.sendPrompt` now
accepts `signal?: AbortSignal`. Route handler creates an
AbortController and wires `req.on('close')` to abort it. On
abort, bridge sends an ACP `cancel` notification; the agent
winds down → prompt resolves with `stopReason: 'cancelled'`
→ next queued prompt can run. New test exercises real
socket disconnect via `node:http` (jsdom AbortSignal isn't
compatible with undici).
Security:
- **`--token` on argv leaks via `/proc/<pid>/cmdline`**. Default
Linux permissions allow any local user to `ps auxww | grep
'qwen serve'` and read the bearer token. Daemon now warns to
stderr when `--token` is used and recommends
`QWEN_SERVER_TOKEN` (which uses `/proc/<pid>/environ`,
owner-only).
- **Token inherited by spawned `qwen --acp` child**. `env:
process.env` in `defaultSpawnChannelFactory` passed
`QWEN_SERVER_TOKEN` into the child. The agent runs
user-supplied prompts with shell-tool access — leaving the
token in env enables prompt-injection-into-self-call attacks.
Strip `QWEN_SERVER_TOKEN` from the child's env before spawn.
Robustness:
- **`BridgeClient` publishes lacked try/catch on closed bus**.
`BridgeClient.requestPermission` and `sessionUpdate` called
`entry.events.publish(...)` directly. Shutdown closes the bus
*before* killing the channel, so a late `sessionUpdate` from a
not-yet-dead agent throws. For `requestPermission` the throw
was particularly bad: `registerPending` had already mutated
the daemon-wide map, so the throw left the registry
inconsistent. Cleaner fix: make `EventBus.publish` a no-op on
closed bus (returns undefined) instead of throwing. Removes
the need for try/catch at every call site and keeps state
consistent.
Type safety:
- **`STAGE1_FEATURES: readonly string[]`** widened the inferred
tuple-of-literals back to `string[]`. A typo'd feature
(`'sesion_set_model'`) compiled silent. Drop the annotation +
add `as const`; export `Stage1Feature` literal-union for
SDK-side `features.includes(...)` checks to narrow against.
Test counts: cli serve **112** (was 105, +7); SDK unchanged at
243. Full typecheck + lint + suite green.
* fix(cli): PR #3889 review round 6 — gpt-5.5 audit (#3803)
Four new findings from a `/review` pass on PR #3889. Three real
correctness bugs + one Stage 1 design-gap documentation.
Critical:
- **`[::1]` bind ENOTFOUND**. `LOOPBACK_BINDS` accepts `[::1]` for
the auth gate, but `app.listen()` wants the unbracketed `::1`;
`qwen serve --hostname [::1]` passed the gate and then crashed
with ENOTFOUND. Strip brackets at bind-time, keep them for the
printed URL. New test asserts the listener actually binds when
the operator types `[::1]`.
- **`sendPrompt` no transport-close detection**. The chained
`entry.connection.prompt()` could hang indefinitely if the
`qwen --acp` child wedged or the underlying stream broke
mid-flight (the SDK's pending JSON-RPC promise never delivers
a response). Because the per-session FIFO tail derives from
that promise, a single stuck prompt poisoned every subsequent
caller for the same session. Round 4's `channel.exited` is
already wired to remove the entry, but the in-flight prompt
itself wasn't racing it.
Fix: race `entry.connection.prompt(...)` against
`entry.channel.exited` inside `sendPrompt`; when the transport
closes mid-flight, the prompt fast-fails with a descriptive
error rather than hanging the queue. New test exercises this
via a stuck fake agent + manual `crash()`.
Real correctness:
- **`spawnOrAttach` attach-path ignored modelServiceId**. Under
`sessionScope:'single'` (default) a client requesting a
specific model on attach got `attached:true` while continuing
to use whatever model the shared session already had — a
silent contract drift. Refactor the per-session
`unstable_setSessionModel` call into a shared
`applyModelServiceId(entry, modelId)` helper that runs both at
create-time (existing path) AND on attach-with-model. Same
helper publishes the `model_switched` event so cross-client
UIs see the change. New tests cover apply-on-attach and the
omit-modelServiceId-on-attach no-op case.
Stage 1 design:
- **`BridgeClient.{readTextFile, writeTextFile}` raw fs proxy**.
The audit flagged that the bridge reimplements file I/O with
`fs.{read,write}File` instead of delegating to core's
filesystem service — divergence on BOM handling, non-UTF-8
encodings, original line endings. Wiring core's
FileSystemService through the bridge is invasive (constructor
dep, reaches into core's runtime), and Stage 2's in-process
bridge eliminates the proxy entirely. Documented as a
known gap with the exact user-visible scenarios; no behavior
change in this PR.
Test counts: cli serve **116** (was 112, +4); full cli **5070**
(was 5066, +4); SDK unchanged at 243. Lint + typecheck green.
* fix(cli): PR #3889 review round 7 — match CodeQL suppression to fired query (#3803)
Single new CodeQL alert (#201) on `workspaceCwd → spawn({cwd})`. The
round-3 suppression I added (`lgtm[js/shell-command-constructed-from-
input]`) referenced the WRONG query id — the alert fires the
`js/path-injection` query, not the shell-command one. The misnamed
suppression also lived 30+ lines above the actual flagged spawn call,
out of CodeQL's annotation scope.
Move the suppression onto the line immediately preceding the spawn
call and use the matching query id `js/path-injection`. The
function-level comment block above still documents the Stage 1 trust
model rationale (operator-controlled cwd is intentional; agent runs
as same UID with shell-tool access; Stage 4+ remote sandbox replaces
this factory entirely).
Defense-in-depth note added: `workspaceCwd` is canonicalized via
`path.resolve()` in `spawnOrAttach` before reaching this factory, and
spawn's `cwd` doesn't pass through any shell.
No behavior change. Test counts unchanged (cli serve 116, full cli
5070).
* fix(cli): self-audit round 8 — concurrency + listener leak + IPv6 + CodeQL honesty (#3803)
Multi-round audit pass on PR #3889 commits 5/6/7. Four findings, one
real high-severity.
High:
- Attach-with-modelServiceId had no error recovery and no FIFO. If
the agent rejected the new model on attach, `applyModelServiceId`
threw, the route 500'd, and the existing session kept running the
OLD model — caller sees a 500 with no easy way to detect the
state. Worse, two simultaneous attaches with different
modelServiceIds would race the `unstable_setSessionModel` calls
with no serialization. Add a per-session `modelChangeQueue`
(parallel to `promptQueue`); `applyModelServiceId` now chains
through it. On failure publishes a `model_switch_failed` event to
the bus so OTHER attached clients can see what happened (the
failed-caller still gets the 500). Two new bridge tests cover
rejection observability + concurrent FIFO.
Medium:
- `sendPrompt` was adding a `.then` listener to
`entry.channel.exited` PER CALL, accumulating linearly with
prompt count over a session's lifetime. ~hundreds of bytes per
prompt; trivially observable on chatty long-running sessions.
Cache a single `transportClosedReject` lazy-init promise on
SessionEntry; every subsequent prompt's race uses the same
promise.
Low:
- `[host]:port` IPv6 syntax in `--hostname` was being naively
bracket-stripped to `host]:port`, which Node rejects with a
cryptic ENOTFOUND at startup. Tighten the strip to only
accept pure `[addr]` forms; reject the URL-with-port form
upfront with a useful error pointing at `--port`.
- `BoundedAsyncQueue.forcedInBuf` invariant comment was wrong: it
claimed force-pushed items were always at the front of `buf`,
but the eviction-frame path force-pushes at the BACK. The
miscount that follows is functionally inert (`close()` blocks
the next cap check), but the comment was actively misleading.
Rewrote it to honestly describe both call paths and explain
why the eviction-case miscount is harmless.
CodeQL honesty:
- Round 7's `// lgtm [js/path-injection]` comment doesn't actually
suppress alerts — GitHub Code Scanning ignores inline `lgtm`
annotations (LGTM.com retired 2021). Replaced the misleading
`// lgtm` line with a NOTE block stating the constraint
explicitly: suppression requires UI dismissal or
`.github/codeql/codeql-config.yml`, both out of scope for a
code-only PR. The function-level comment that explains the
Stage 1 trust model rationale stays.
Test counts: cli serve **119** (was 116, +3); full cli **5073**
(was 5070, +3, no regressions).
* fix(cli): self-audit round 9-10 — reject empty-bracket --hostname (#3803)
Final fix from rounds 9-10 of the audit chain. One real concern + three
nice-to-have test gaps that the code already handles correctly.
- `--hostname '[]'` (empty brackets) used to slip past the bracket
validator: `slice(1, -1)` produced `''`, which Node interprets as
"bind to all interfaces". An operator typing `[]` clearly meant
something specific, not wildcard. Reject the empty-inner case
upfront with the same useful error as the `[host]:port` case.
New test asserts the rejection.
Round 10 ran a clean convergence pass and signed off:
- Cross-cutting state invariants (byWorkspace, byId, inFlightSpawns,
pendingPermissions, plus all per-entry queues and caches) — all
mutations paired and async holes safe.
- All test names match assertions.
- Public type surface clean (DaemonEvent.id?, Stage1Feature
CLI-only, DaemonClientOptions.fetch shape correct).
- Production paths verified: non-executable child times out at 10s
init, multiple-daemon EADDRINUSE rejects cleanly via
`server.once('error', reject)`.
- Three "missing test" notes (transportClosedReject cache sharing,
full subscribe-publish-evict sequence, modelChangeQueue failure
isolation) are diagnostic gaps — the code paths are correct and
covered by adjacent tests.
Test counts: cli serve **120** (was 119, +1 empty-bracket); SDK
unchanged at 243.
* docs(cli): note SSE single-line data emit vs multi-line parser (#3803)
formatSseFrame emits the payload as a single `data:` line. The
EventSource spec also allows a frame to span multiple `data:` lines
(joined by `\n` on parse), and the SDK receive-side parser handles
that variant — but we never emit it because the JSON payload has no
embedded newlines after JSON.stringify. Document the in/out asymmetry
so future readers don't mistake the absence of newline splitting for
a bug. Closes review thread AMgP0.
* fix(cli,sdk): close 11 #3889 review threads — race + leak + IPv6 + SSE
Critical correctness:
- setSessionModel now serializes through `entry.modelChangeQueue` so
POST /session/:id/model can't race with the attach-with-different-
modelServiceId path that already chains on the same queue. Without
this two concurrent model changes interleave and the published
`model_switched` event may not match the agent's actual model.
- POST /session reaps the spawned child when the client disconnected
during the 1-3s spawn window (`req.aborted && !session.attached`).
Without this, every aborted request leaks one orphan child the
daemon can't address by sessionId. Attached sessions skip the kill
— another client legitimately owns them.
- spawnOrAttach refuses dispatch once shutdown has started
(`shuttingDown` flag set at the top of `shutdown()`). Late-arrivers
on already-established HTTP connections that pass `server.close`'s
rejection of NEW connections would otherwise spawn children the
shutdown snapshot already missed. Late re-check inside `doSpawn`
(after `connection.newSession` resolves) catches the in-flight case
and tears down the half-built channel.
- sendPrompt early-aborts pre-aborted callers before queuing — saves
a queue trip and gives a clean trace for retry-after-abort flows.
Defensive:
- parseSseStream caps the unread buffer at 16 MiB. Without this, an
upstream that returns non-SSE (misconfigured proxy, long-lived
non-streaming body) feeds `buf` until the consumer OOMs.
- parseSseStream now accepts an optional AbortSignal that is checked
at each iteration, and DaemonClient.subscribeEvents forwards
`opts.signal` into it. Post-200 aborts now actually stop iteration
instead of buffering frames until the upstream closes.
- DaemonClient.fetchTimeoutMs (30s default) wraps every short-poll
method (health/capabilities/createOrAttachSession/listWorkspaceSessions/
setSessionModel/cancel/respondToPermission) with `AbortSignal.timeout`.
Composes with caller-provided signals via `AbortSignal.any`. `prompt`
is intentionally exempt (long-lived: model + tool turns can take
minutes); `subscribeEvents` is exempt (long-lived SSE).
- New `bridge.killSession(sessionId)` API mirrors the shutdown teardown
for a single session — used by POST /session orphan-reap above and
exposed for future routes that need targeted cleanup.
Stale + cosmetic:
- Bridge map header comment said "no path that removes a session...
when its child process crashes between requests" — out of date since
the `channel.exited` cleanup landed in an earlier audit round.
Rewritten to describe the actual cleanup chain.
- runQwenServe now wraps IPv6 hostname literals in brackets when
building the URL (`http://[::1]:4170` not `http://::1:4170`). The
bracket-stripping logic on `listenHostname` already handled
`app.listen()` correctly; this fixes the printed/copy-paste URL.
- Dead `mode: ServeMode` variable in serve.ts removed (the runQwenServe
call hardcodes `mode: 'http-bridge'`); the warning condition is now
inlined.
Test plan:
- `vitest run` cli/serve: 120/120 + 49/49 (httpAcpBridge) pass
- `vitest run` sdk-typescript daemon: 42/42 pass
- tsc --build packages/cli packages/sdk-typescript: clean
- ESLint: clean
* chore(lint): allow mime/lite in import/no-internal-modules (#3803)
`packages/core/src/utils/fileUtils.ts` and its test import `mime/lite`,
which is mime@4's documented public sub-export (a smaller bundle that
omits the legacy mime DB) — not an internal module. The rule has been
flagging these on PR CI runs even though main's CI happens to pass
(likely stale-cache vs fresh-install timing). Add `mime/lite` to the
allowlist so lint is consistent across main and PR runs.
* fix(cli,sdk): close 14 review threads — env whitelist + races + Windows tests + structured errors (#3803)
Critical correctness:
- registerPending now resolves orphaned permissions as cancelled when
the entry has been torn down between the agent's `requestPermission`
decision and the bridge handler firing. Previously the permission
would hang the agent forever (killSession's pendingPermissionIds
iteration didn't include the just-orphaned id, shutdown's clear()
dropped it without resolving).
- Workspace key now goes through `realpathSync.native` (with a
resolved-but-uncanonicalized fallback for non-existent paths) so
case-insensitive filesystems (macOS APFS, Windows NTFS) don't
silently degrade `sessionScope: 'single'` into "one session per
spelling". Matches how `config.ts` / `settings.ts` / `sandbox.ts`
resolve workspace paths.
- killChild gets a hard 10s deadline after SIGKILL so a child stuck
in uninterruptible sleep (D-state, e.g. NFS read on a dead server)
can't block `bridge.shutdown()`'s `Promise.all` forever.
`SHUTDOWN_FORCE_CLOSE_MS` in `runQwenServe` only covers
`server.close()` — without this hard kill, daemon shutdown hangs.
- setSessionModel now races the agent call against
`transportClosedReject` and wraps in `withTimeout`, matching what
`sendPrompt` and `applyModelServiceId` already do. Without the
race, a wedged child blocks `POST /session/:id/model` forever.
Also publishes a `model_switch_failed` SSE event on rejection so
passive subscribers see the failure (matches `applyModelServiceId`).
- shutdown() now awaits `inFlightSpawns` so the late-shutdown re-check
inside `doSpawn` finishes its half-built channel teardown before
`bridge.shutdown()` resolves. Without the await, `runQwenServe.close()`
returns and `process.exit(0)` is queued before the orphan tears
itself down, surfacing a stderr error AFTER the daemon claimed
graceful shutdown.
- sendPrompt re-checks `signal.aborted` immediately after
`addEventListener` so a microsecond-window synchronous abort that
fires between the early-exit check and listener registration still
triggers the agent `cancel` notification.
Security:
- `defaultSpawnChannelFactory` now passes an *allowlisted* environment
to the spawned `qwen --acp` child instead of `{ ...process.env }`
with `QWEN_SERVER_TOKEN` deleted. The agent runs user-supplied
prompts with shell-tool access; anything in its env (OPENAI/
ANTHROPIC/DASHSCOPE keys, AWS/GCP credentials, DB passwords,
OAuth tokens) is reachable by prompt injection. Allowlist covers
HOME/PATH/USER/LOGNAME/LANG/LC_*/TMPDIR/TEMP/TMP/NODE_PATH plus
Windows essentials (SYSTEMROOT/USERPROFILE/APPDATA/...). The
explicit `delete childEnv['QWEN_SERVER_TOKEN']` stays as
defense-in-depth — anyone grepping for the token name finds the
scrub explicitly named.
Observability:
- 5xx responses now carry structured `code` and `data` fields when
the underlying error has them (JSON-RPC errors from the ACP SDK
forward as `{code, message, data}`). Without this, every distinct
failure (quota / rate-limit / auth / crash) collapses to the same
opaque "Internal error" string at the client.
- 5xx errors log to stderr (via `writeStderrLine`, not `console.error`,
to keep the no-console lint rule happy). Stop-gap until structured
access/error logging lands.
- Eviction frame on EventBus subscriber overflow no longer consumes
…1 parent fd53527 commit 870bdf2
32 files changed
Lines changed: 12784 additions & 5 deletions
File tree
- docs
- developers
- examples
- users
- integration-tests/cli
- packages
- cli
- src
- commands
- config
- serve
- sdk-typescript
- src
- daemon
- test/unit
- vscode-ide-companion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
428 | 428 | | |
429 | 429 | | |
430 | 430 | | |
431 | | - | |
| 431 | + | |
432 | 432 | | |
433 | 433 | | |
434 | 434 | | |
435 | 435 | | |
436 | 436 | | |
| 437 | + | |
437 | 438 | | |
438 | 439 | | |
439 | 440 | | |
| |||
461 | 462 | | |
462 | 463 | | |
463 | 464 | | |
| 465 | + | |
| 466 | + | |
| 467 | + | |
| 468 | + | |
| 469 | + | |
| 470 | + | |
| 471 | + | |
| 472 | + | |
| 473 | + | |
| 474 | + | |
| 475 | + | |
| 476 | + | |
| 477 | + | |
| 478 | + | |
464 | 479 | | |
465 | 480 | | |
466 | 481 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| 23 | + | |
23 | 24 | | |
24 | 25 | | |
25 | 26 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
0 commit comments