You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
test(flake): scoped per-test event-loop yield in the 6 dying spec files
The Windows backend silent-ELIFECYCLE flake (12 captures so far across
PRs #7838 / #7842 / #7846) clusters tightly to six spec files whose
tests share a common shape: many short tests (50-100 in a single
describe) firing rapid sequential supertest HTTP or socket.io
connect/disconnect calls against the in-process Etherpad server.
Pre-kill node-reports show the V8 main isolate becomes event-loop-
starved 200-400 ms before each kill — the 5 Hz heartbeat falls silent
for the entire death window — then the process is externally
terminated bypassing every JS handler, --report-on-fatalerror,
--report-uncaught-exception, signal handlers. PR #7852 ruled out
TIME_WAIT accumulation as the proximate cause (the kill survived
connection reuse, even though TIME_WAIT churn dropped to near zero).
The actual trigger remains unidentified — but rapid sequential
loopback I/O across test boundaries is the only common substrate
across all six files.
This patch adds a single setImmediate yield in beforeEach of each
top-level describe in the six dying files. The yield forces the
event loop to drain its timer queue between every test in those
files, breaking the tight microtask chain that may be what triggers
whatever native-level kill is happening on Windows + Node 24.
Scope-critical: this is per-file beforeEach, NOT a root-level
mochaHooks.beforeEach. The latter was tried in #7844 and broke
ep_subscript_and_superscript's tests, which share state across
describe-block boundaries and don't tolerate a yield. Per-file scope
contains any timing perturbation to the affected files. Plugin tests
loaded from `../node_modules/ep_*/static/tests/backend/specs` are
completely untouched.
Files modified (with death counts in parens):
- src/tests/backend/specs/api/pad.ts (2)
- src/tests/backend/specs/api/importexportGetPost.ts (4)
- src/tests/backend/specs/socketio.ts (3)
- src/tests/backend/specs/messages.ts (3)
- src/tests/backend/specs/import.ts (1)
- src/tests/backend/specs/clientvar_rev_consistency.ts (1)
Test plan:
- Linux ± plugins must pass. If they regress, the yield is breaking
the affected tests' own state-sharing assumptions — we'd revert.
- Windows ± plugins: pre-fix flake rate is ~22% per #7663 (13/60
runs). If the post-fix rate drops materially over 5-10 reruns,
rapid-sequential-cadence is the trigger and this is a working
mitigation. If unchanged, cadence is ruled out and we look at
per-test pathologies (jose CNG on Windows, specific Express
middleware paths, libuv IOCP edge cases unrelated to load).
Either outcome is a strict improvement on the current state of
"we have no idea what's causing it."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0 commit comments