fix(producer): guard fileServer.close in all render cleanup paths #1406
Merged
jrusso1020 merged 2 commits on Jun 16, 2026
`plan()` and `renderChunk()` both close the probe/chunk file server with a bare `fileServer.close()` in their cleanup sequence. `FileServerHandle.close` tears down the underlying http.Server, whose `close()` throws `ERR_SERVER_NOT_RUNNING` if the server was already torn down (for example a cancellation path that closed it once already). An unguarded throw there escapes the cleanup and masks the original plan/render result, exactly the failure the adjacent probe-session close already guards against with a try/catch (its comment even spells this out). Add `closeFileServerSafely`, which wraps the close in a try/catch and logs, and route both cleanup sites through it so the two stay consistent and a throwing close can never mask the real result. Covered by unit tests for both the throwing and happy paths.
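A minimal sketch of what such a guard could look like, assuming the `closeFileServerSafely(fileServer, label, log)` parameter order given in the PR description. The `FileServerHandle` and `Logger` shapes, the log message format, and the fake handle are illustrative assumptions, not the code in `fileServer.ts`.

```typescript
// Hypothetical shapes for illustration; the real types live in the producer package.
interface FileServerHandle {
  close(): void;
}

interface Logger {
  warn(message: string): void;
}

// Close the file server without letting a failing close escape the caller's cleanup.
function closeFileServerSafely(
  fileServer: FileServerHandle,
  label: string,
  log: Logger,
): void {
  try {
    fileServer.close();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Assumed message format; the PR only says the label and error message are logged.
    log.warn(`[${label}] file server close failed: ${message}`);
  }
}

// Fake handle that throws on a second close, standing in for a double-closed server.
function makeFakeHandle(): FileServerHandle {
  let open = true;
  return {
    close() {
      if (!open) throw new Error("ERR_SERVER_NOT_RUNNING");
      open = false;
    },
  };
}

const warnings: string[] = [];
const log: Logger = { warn: (m) => { warnings.push(m); } };
const handle = makeFakeHandle();

closeFileServerSafely(handle, "renderChunk", log); // first close succeeds
closeFileServerSafely(handle, "renderChunk", log); // second close throws inside and is swallowed
```

Because the helper is synchronous and never rethrows, a call site can drop it into a `finally` block without changing what the surrounding function returns or throws.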
jrusso1020 approved these changes on Jun 16, 2026
jrusso1020 (Collaborator) left a comment
Reviewed end-to-end. Small, correct defensive-hardening fix — approving.
Strengths
- Correct root cause. `FileServerHandle.close()` calls `http.Server.close()` with no callback (`fileServer.ts:671`), and Node's `net.Server.close()` throws `ERR_SERVER_NOT_RUNNING` synchronously on a double-close, so a bare close in a `finally`/cleanup path genuinely can escape and mask the real result. A sync try/catch is exactly the right shape here.
- Highest-value sites are `renderChunk.ts:657` (bare close in a `finally` → classic finally-masks-the-original-exception) and the `renderOrchestrator.ts:2277` success path (a throw there skips the assemble stage and fails an otherwise-successful render). Both now guarded, consistent with the adjacent probe-session try/catch the PR description cites.
- Same-contract coverage is complete: all three direct producer close sites are routed through the guard; the cancel/error paths already go through `safeCleanup`/`cleanupRenderResources` (`render/cleanup.ts`); and `probeStage.ts` hands the handle back to the caller rather than closing it (header comment, lines 6–8), so there's no missed site.
- Tests pin both branches: throwing close → swallowed + logged with the `[label]` and error message; happy path → closes exactly once, no warning.
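The finally-masks-the-original-exception pattern mentioned above can be reproduced in isolation. The sketch below uses hypothetical stand-in functions (`renderThatFails`, `closeThatThrows`), not code from `renderChunk.ts`, and only illustrates the language behaviour: an exception thrown from a `finally` block replaces the one already in flight.

```typescript
// Hypothetical stand-ins; neither function is taken from the PR.
function closeThatThrows(): void {
  throw new Error("ERR_SERVER_NOT_RUNNING");
}

function renderThatFails(): string {
  throw new Error("render failed");
}

// Unguarded: the throw from the finally block replaces the in-flight render error.
function unguardedCleanup(): string {
  try {
    return renderThatFails();
  } finally {
    closeThatThrows();
  }
}

// Guarded: the close failure is caught locally, so the render error propagates.
function guardedCleanup(): string {
  try {
    return renderThatFails();
  } finally {
    try {
      closeThatThrows();
    } catch {
      // Swallowed in this sketch; the PR's helper also logs the failure.
    }
  }
}

// Run a function and report the message of whatever it throws.
function messageOf(fn: () => string): string {
  try {
    return fn();
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

const unguardedMessage = messageOf(unguardedCleanup); // "ERR_SERVER_NOT_RUNNING"
const guardedMessage = messageOf(guardedCleanup); // "render failed"
```

In the unguarded version the caller only ever sees the close error, which is why the original failure becomes hard to diagnose.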
Nits (non-blocking)
- Observability: the new helper logs at `log.warn` (`fileServer.ts`), but the pre-existing `safeCleanup` logs the same class of failure at `log.debug` (`render/cleanup.ts:29`). Since `ERR_SERVER_NOT_RUNNING` on a double-close is the expected, benign condition this guard exists for, the distributed/cancel paths will now emit WARN-level noise for a non-actionable event. Consider `debug` for consistency, or a short comment on why the success path warrants `warn`.
- DRY: there are now two "swallow + log a failed close" helpers (`safeCleanup` vs `closeFileServerSafely`). The split is defensible (module layering: `cleanup.ts` imports orchestrator types, so importing it into `fileServer.ts` risks a cycle; and the new helper is sync while `safeCleanup` is async), but a one-line cross-reference comment would help future readers pick the right one.
Note (out of scope): `packages/engine/src/services/fileServer.ts:106` has the same bare `close: () => server.close()` shape in a separate module, with the same latent throw if any engine consumer double-closes. Separate package; flagging as a possible follow-up, not a gap in this PR.
Verdict: APPROVE
Reasoning: Real synchronous-throw bug, confirmed against Node's net.Server.close() and the handle impl; all same-contract producer sites are covered; good tests; CI fully green. Only minor observability/DRY nits.
— Claude Code (pr-review)
Problem
`fileServer.close()` is called bare (no try/catch) in several render cleanup paths. `FileServerHandle.close` tears down the underlying http.Server, whose `close()` throws `ERR_SERVER_NOT_RUNNING` if the server was already torn down (for example a cancellation path that already closed it once). An unguarded throw escapes the cleanup and masks the original plan/render result.
The immediately adjacent probe-session close already guards against exactly this, and its comment spells it out. The file-server closes next to it were left unguarded. There were three:
- `distributed/plan.ts` (probe cleanup)
- `distributed/renderChunk.ts` (finally block)
- `renderOrchestrator.ts` (the in-process render success path, right before the assemble stage; a throw here would skip assembly and mask the result)
Fix
Add `closeFileServerSafely(fileServer, label, log)` in `fileServer.ts`, which wraps the close in a try/catch and logs, and route all three sites through it.
Tests
Unit tests for `closeFileServerSafely`: a throwing `close()` is swallowed and logged (does not propagate), and the happy path closes exactly once with no warning. The `plan` test suite stays green. (The `renderChunk` byte-identical determinism test is host-Chrome dependent and soft-skips outside `Dockerfile.test`.)
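The two test branches described above could be exercised roughly as follows. This is a framework-free sketch: the helper body, the `FileServerHandle` and `Logger` shapes, and the `makeRecorder` test double are assumptions made for illustration, and the actual unit tests in the PR may be structured differently.

```typescript
// Hypothetical shapes and helper body, restated so the sketch is self-contained.
interface FileServerHandle {
  close(): void;
}

interface Logger {
  warn(message: string): void;
}

function closeFileServerSafely(
  fileServer: FileServerHandle,
  label: string,
  log: Logger,
): void {
  try {
    fileServer.close();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    log.warn(`[${label}] file server close failed: ${message}`);
  }
}

// Recording logger double (hypothetical) that captures every warning.
function makeRecorder(): { warnings: string[]; log: Logger } {
  const warnings: string[] = [];
  return { warnings, log: { warn: (m: string) => { warnings.push(m); } } };
}

// Branch 1: a throwing close() must be swallowed and logged.
let propagated = false;
const throwing = makeRecorder();
try {
  closeFileServerSafely(
    { close() { throw new Error("boom"); } },
    "plan",
    throwing.log,
  );
} catch {
  propagated = true;
}

// Branch 2: the happy path closes exactly once and logs nothing.
let closeCalls = 0;
const happy = makeRecorder();
closeFileServerSafely({ close() { closeCalls += 1; } }, "plan", happy.log);
```

Checking `propagated`, the recorded warnings, and `closeCalls` after these two runs covers both behaviours the PR description lists.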