fix(eslint-plugin-runner): await stdio pipe close before terminating workers#1029
fix(eslint-plugin-runner): await stdio pipe close before terminating workers#1029fansenze wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to safely terminate worker threads by waiting for their piped stdout and stderr streams to fully close before calling terminate(). This is implemented via a new closePipe helper function with a 1-second grace timeout to prevent teardown from stalling. Test files are updated to use the exported terminateWorker utility. Feedback was provided to improve closePipe by ensuring that the 'close' event listener is cleaned up if the timeout fires, preventing potential memory leaks, and wrapping stream.destroy() in a try-catch block to handle synchronous errors.
…workers On windows-latest, terminating a worker thread while its stdio pipes are still live lets libuv's concurrent pipe teardown fault below the JS layer (a native abort). rstest's forked test child intercepts uncaughtException/unhandledRejection/process.exit, so this surfaced as "Worker exited unexpectedly" in the high-terminate-churn worker-pool-e2e suite. terminateWorker already destroyed the pipes first, but destroy() only initiates the close; a terminate() right after still races the in-flight teardown. Await each pipe's 'close' (bounded by a 1s grace) so the two are serialized — by the time we terminate, the pipes are gone and there is no concurrent teardown left to fault. Also route the tests' raw worker.terminate() calls through terminateWorker so they exercise the same safe teardown as the production timeout/crash/shutdown paths.
9044bf3 to
c7e1d78
Compare
|
Superseded by #1031. The pipe-teardown approach here didn't address the root cause: the windows abort comes from tearing down a worker thread that has oxc (a napi addon) loaded (nodejs/node#34567), not from stdio pipe teardown. #1031 skips the windows-flaky worker-pool e2e suites instead — a known worker_threads + napi limitation; a real fix would be process isolation (child_process), tracked separately. |
Summary
On
windows-latest, terminating a worker thread while its stdio pipes are still live lets libuv's concurrent pipe teardown fault below the JS layer (a native abort). Because rstest's forked test child interceptsuncaughtException/unhandledRejection/process.exit, the only way that child can "exit unexpectedly" is a native fault — so this surfaced asWorker exited unexpectedlyin the high-terminate-churnworker-pool-e2esuite (e.g.multiconfig,resilience), a flaky that has been failingmainCI.terminateWorkeralready destroyed the pipes first, butdestroy()only initiates the close (it completes on a later tick), so aterminate()called right after it still races the in-flight teardown. This PR awaits each pipe's'close'— bounded by a 1s grace so a wedged worker can't stall — before terminating. The two are now serialized: by the time we terminate, the pipes are already gone and there is no concurrent teardown left to fault.terminateWorkeris only ever called from cold paths (init/task timeout, crash respawn, shutdown grace), so normal lint throughput is unaffected.It also routes the tests' raw
worker.terminate()calls throughterminateWorker, so they exercise the same safe teardown as the production timeout / crash / shutdown paths — production never bare-terminates a worker.Verification note: the native abort only reproduces on
windows-latest. Locally (macOS) all 554 tests pass but cannot exercise the abort, so effectiveness needs to be confirmed across windows CI runs.Related Links
Checklist