test(ci): drain event loop with setImmediate after every test (mitigation hypothesis) #7844
JohnMcLear wants to merge 1 commit into
Mitigation hypothesis for the Windows backend silent ELIFECYCLE flake.

Ten captured deaths so far on the merged diagnostic infrastructure (#7838, #7842) show a consistent shape: 200-400 ms after a test starts, the heartbeat (5 Hz setInterval) goes silent for the entire death window, which is clear evidence that the event loop has stopped servicing timers. The process is then externally terminated, bypassing all of Node's JS-level handlers, --report-on-fatalerror, and --report-uncaught-exception. Pre-kill state in the libuv handle trace is nominal (3-7 handles, no leak, no spike).

Dying tests span supertest+JWT HTTP, socket.io connect bursts, and DOCX export round-trips: different surface code, same fingerprint. The common substrate is rapid loopback TCP and queued I/O across test boundaries.

Insert a single setImmediate yield in the mocha root afterEach so the event loop has a deterministic drain point at every test boundary. If the kill rate drops materially on the Windows backend matrix after this lands, cumulative event-loop pressure is the trigger and we have a working mitigation; if it doesn't change, we rule that out and look at per-test pathologies (jose CNG, specific Express middleware paths, etc.).

Cost: ~600 tests × 1 setImmediate ≈ negligible compared to the multi-minute backend test phase.

Locally verified: a 3-test probe runs cleanly with the new async afterEach.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
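To make the heartbeat evidence concrete, here is a minimal sketch of how a 5 Hz setInterval heartbeat can expose an event-loop stall. It is not the code from #7838 or #7842; the names HEARTBEAT_MS, startHeartbeat, and longestGap are hypothetical and chosen only for illustration. The idea is that each tick records a timestamp, so a stretch with no ticks shows up as a large gap when the record is read back.

```typescript
// Hypothetical names throughout; the real diagnostics code is not shown in this PR.
const HEARTBEAT_MS = 200; // 5 Hz, matching the rate quoted in the commit message

// Records one timestamp per tick and returns a function that stops the heartbeat.
// If the event loop stops servicing timers, no further timestamps are recorded.
function startHeartbeat(record: (tMs: number) => void): () => void {
  const timer = setInterval(() => record(Date.now()), HEARTBEAT_MS);
  return () => clearInterval(timer);
}

// Returns the largest gap between consecutive ticks in milliseconds
// (0 if fewer than two ticks were recorded).
function longestGap(ticks: number[]): number {
  let worst = 0;
  for (let i = 1; i < ticks.length; i++) {
    worst = Math.max(worst, ticks[i] - ticks[i - 1]);
  }
  return worst;
}
```

A gap far larger than HEARTBEAT_MS, or a record that simply ends before the process exits, is the kind of silence the commit message describes.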
Review Summary by Qodo
Drain event loop with setImmediate after every test
Walkthroughs
Description
• Add setImmediate yield in mocha afterEach hook
• Drain event loop at every test boundary deterministically
• Mitigate Windows backend silent ELIFECYCLE flake hypothesis
• Cumulative I/O pressure across tests suspected root cause
Diagram
flowchart LR
A["Test execution"] -->|afterEach hook| B["setImmediate yield"]
B -->|event loop drain| C["Queued I/O callbacks processed"]
C -->|deterministic boundary| D["Next test starts clean"]
E["Hypothesis: cumulative I/O pressure"] -->|mitigation| B
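The flow in the diagram could be sketched roughly as follows. This is an assumption about the shape of the change, not the actual diff to src/tests/backend/diagnostics.ts: the helper name drainEventLoop is hypothetical, and the sketch registers the hook through Mocha's root hook plugin export (mochaHooks), which may differ from how the PR wires it up.

```typescript
// Hypothetical helper: resolves in the next check phase of the event loop,
// so I/O callbacks that are already ready get a poll phase to run first.
export function drainEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => setImmediate(resolve));
}

// Root hook plugin shape: Mocha picks up an exported mochaHooks object
// from a file loaded with --require and runs afterEach after every test.
export const mochaHooks = {
  async afterEach(): Promise<void> {
    await drainEventLoop();
  },
};
```

A single setImmediate yields for one loop iteration only, so this gives a deterministic boundary rather than a guarantee that all pending I/O has completed.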
File Changes
1. src/tests/backend/diagnostics.ts
Code Review by Qodo
1. Global drain not gated
Closing — the setImmediate drain in afterEach causes real plugin-test regressions (ep_subscript_and_superscript's |
Summary
Insert a single setImmediate yield in the mocha root afterEach so the event loop drains queued I/O callbacks at every test boundary. Pure mitigation hypothesis test: no other changes.

Why
Ten captured deaths on the merged diagnostic infrastructure (#7838, #7842) show a consistent shape:
• The heartbeat setInterval goes silent for the entire death window (no timer events at all).
• The process is then terminated externally, bypassing unhandledRejection, uncaughtException, --report-on-fatalerror, --report-uncaught-exception, and all signal handlers.

The common substrate across all 10 deaths is rapid loopback TCP and queued I/O across test boundaries. This experiment tests one specific hypothesis: cumulative event-loop pressure across tests is the trigger. A setImmediate yield in afterEach forces a deterministic drain at every boundary instead of letting work stack across tests.

Expected outcome
Cost
~600 tests × 1 setImmediate yield ≈ negligible compared to the multi-minute backend test phase. Locally verified: a 3-test probe runs cleanly with the new async afterEach.

🤖 Generated with Claude Code