test: fix flaky CI tests (multipart tmpdir teardown + egg-scripts start readiness)#6015
Conversation
The afterAll cleanup `fs.rm(tmpdir, { force: true, recursive: true })` defaults to
maxRetries: 0, so a transient ENOTEMPTY (a file appearing in a date-bucket dir
mid-removal — async per-request file cleanup, or APFS directory operations on
macOS) throws and fails the whole suite. This flaked `file-mode.test.ts` on
macOS Node 24 in CI ("ENOTEMPTY: directory not empty, rmdir .../2026/06/28/05")
while passing on every other matrix job.
Add `maxRetries: 3` to the recursive removal in all 7 multipart test teardowns so
Node retries with linear backoff on the transient ENOTEMPTY/EBUSY/EPERM family.
The production clean_tmpdir schedule already swallows these errors, so no
src change is needed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
📝 WalkthroughWalkthroughMultipart test teardown now retries tmpdir removal, and stop-script tests now wait for startup and shutdown log output using a shared polling helper. ChangesMultipart test cleanup retry hardening
Stop test log polling
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## next #6015 +/- ##
=======================================
Coverage 81.95% 81.95%
=======================================
Files 677 677
Lines 20651 20651
Branches 4099 4099
=======================================
+ Hits 16924 16925 +1
+ Misses 3214 3213 -1
Partials 513 513 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Code Review
This pull request adds maxRetries: 3 to fs.rm calls in several test files to prevent failures when deleting temporary directories. The reviewer suggests reordering the afterAll hooks so that server.close() and app.close() are executed before fs.rm. This ensures that the server and application are fully shut down and all file handles are released before attempting to delete the temporary directory, addressing the root cause of flaky ENOTEMPTY or EBUSY errors.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| afterAll(async () => { | ||
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | ||
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | ||
| }); | ||
| afterAll(() => app.close()); | ||
| afterAll(() => server.close()); |
There was a problem hiding this comment.
In Vitest, afterAll hooks are executed sequentially in the order they are registered. Currently, the temporary directory removal (fs.rm) is registered before app.close() and server.close(). This means the test suite attempts to delete the temporary directory while the server and application are still active and potentially holding open file handles or writing files, which is the primary cause of the flaky ENOTEMPTY / EBUSY errors.
Reordering the hooks so that server.close() and app.close() run first ensures that all network activity has ceased and all file handles are released before fs.rm is executed.
| afterAll(async () => { | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); | |
| afterAll(() => app.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => app.close()); | |
| afterAll(async () => { | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); |
There was a problem hiding this comment.
Thanks, but this rests on an incorrect premise about Vitest. Vitest's default sequence.hooks is 'stack', so afterAll hooks run in reverse registration order — verified empirically on v4.1.9 (3rd-registered runs first, 1st-registered last). The current order therefore already executes server.close() → app.close() → fs.rm(...): the tmpdir is removed last, after the server and app are closed. Applying the suggested reorder would move fs.rm to run first, reintroducing exactly the race described. The residual ENOTEMPTY is a transient FS-level race during the recursive removal (macOS APFS / async per-request file cleanup draining), which maxRetries: 3 handles. Keeping the current order + maxRetries.
| afterAll(async () => { | ||
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | ||
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | ||
| }); | ||
| afterAll(() => app.close()); | ||
| afterAll(() => server.close()); |
There was a problem hiding this comment.
In Vitest, afterAll hooks are executed sequentially in the order they are registered. Currently, the temporary directory removal (fs.rm) is registered before app.close() and server.close(). This means the test suite attempts to delete the temporary directory while the server and application are still active and potentially holding open file handles or writing files, which is the primary cause of the flaky ENOTEMPTY / EBUSY errors.
Reordering the hooks so that server.close() and app.close() run first ensures that all network activity has ceased and all file handles are released before fs.rm is executed.
| afterAll(async () => { | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); | |
| afterAll(() => app.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => app.close()); | |
| afterAll(async () => { | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); |
There was a problem hiding this comment.
Thanks, but this rests on an incorrect premise about Vitest. Vitest's default sequence.hooks is 'stack', so afterAll hooks run in reverse registration order — verified empirically on v4.1.9 (3rd-registered runs first, 1st-registered last). The current order therefore already executes server.close() → app.close() → fs.rm(...): the tmpdir is removed last, after the server and app are closed. Applying the suggested reorder would move fs.rm to run first, reintroducing exactly the race described. The residual ENOTEMPTY is a transient FS-level race during the recursive removal (macOS APFS / async per-request file cleanup draining), which maxRetries: 3 handles. Keeping the current order + maxRetries.
| afterAll(async () => { | ||
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | ||
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | ||
| }); | ||
| afterAll(() => app.close()); | ||
| afterAll(() => server.close()); |
There was a problem hiding this comment.
In Vitest, afterAll hooks are executed sequentially in the order they are registered. Currently, the temporary directory removal (fs.rm) is registered before app.close() and server.close(). This means the test suite attempts to delete the temporary directory while the server and application are still active and potentially holding open file handles or writing files, which is the primary cause of the flaky ENOTEMPTY / EBUSY errors.
Reordering the hooks so that server.close() and app.close() run first ensures that all network activity has ceased and all file handles are released before fs.rm is executed.
| afterAll(async () => { | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); | |
| afterAll(() => app.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => app.close()); | |
| afterAll(async () => { | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); |
There was a problem hiding this comment.
Thanks, but this rests on an incorrect premise about Vitest. Vitest's default sequence.hooks is 'stack', so afterAll hooks run in reverse registration order — verified empirically on v4.1.9 (3rd-registered runs first, 1st-registered last). The current order therefore already executes server.close() → app.close() → fs.rm(...): the tmpdir is removed last, after the server and app are closed. Applying the suggested reorder would move fs.rm to run first, reintroducing exactly the race described. The residual ENOTEMPTY is a transient FS-level race during the recursive removal (macOS APFS / async per-request file cleanup draining), which maxRetries: 3 handles. Keeping the current order + maxRetries.
| afterAll(async () => { | ||
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | ||
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | ||
| }); | ||
| afterAll(() => app.close()); | ||
| afterAll(() => server.close()); |
There was a problem hiding this comment.
In Vitest, afterAll hooks are executed sequentially in the order they are registered. Currently, the temporary directory removal (fs.rm) is registered before app.close() and server.close(). This means the test suite attempts to delete the temporary directory while the server and application are still active and potentially holding open file handles or writing files, which is the primary cause of the flaky ENOTEMPTY / EBUSY errors.
Reordering the hooks so that server.close() and app.close() run first ensures that all network activity has ceased and all file handles are released before fs.rm is executed.
| afterAll(async () => { | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); | |
| afterAll(() => app.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => app.close()); | |
| afterAll(async () => { | |
| await fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); |
There was a problem hiding this comment.
Thanks, but this rests on an incorrect premise about Vitest. Vitest's default sequence.hooks is 'stack', so afterAll hooks run in reverse registration order — verified empirically on v4.1.9 (3rd-registered runs first, 1st-registered last). The current order therefore already executes server.close() → app.close() → fs.rm(...): the tmpdir is removed last, after the server and app are closed. Applying the suggested reorder would move fs.rm to run first, reintroducing exactly the race described. The residual ENOTEMPTY is a transient FS-level race during the recursive removal (macOS APFS / async per-request file cleanup draining), which maxRetries: 3 handles. Keeping the current order + maxRetries.
| afterAll(() => { | ||
| return fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | ||
| return fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | ||
| }); | ||
| afterAll(() => app.close()); | ||
| afterAll(() => server.close()); |
There was a problem hiding this comment.
In Vitest, afterAll hooks are executed sequentially in the order they are registered. Currently, the temporary directory removal (fs.rm) is registered before app.close() and server.close(). This means the test suite attempts to delete the temporary directory while the server and application are still active and potentially holding open file handles or writing files, which is the primary cause of the flaky ENOTEMPTY / EBUSY errors.
Reordering the hooks so that server.close() and app.close() run first ensures that all network activity has ceased and all file handles are released before fs.rm is executed.
| afterAll(() => { | |
| return fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true }); | |
| return fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); | |
| afterAll(() => app.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => server.close()); | |
| afterAll(() => app.close()); | |
| afterAll(() => { | |
| return fs.rm(app.config.multipart.tmpdir, { force: true, recursive: true, maxRetries: 3 }); | |
| }); |
There was a problem hiding this comment.
Thanks, but this rests on an incorrect premise about Vitest. Vitest's default sequence.hooks is 'stack', so afterAll hooks run in reverse registration order — verified empirically on v4.1.9 (3rd-registered runs first, 1st-registered last). The current order therefore already executes server.close() → app.close() → fs.rm(...): the tmpdir is removed last, after the server and app are closed. Applying the suggested reorder would move fs.rm to run first, reintroducing exactly the race described. The residual ENOTEMPTY is a transient FS-level race during the recursive removal (macOS APFS / async per-request file cleanup draining), which maxRetries: 3 handles. Keeping the current order + maxRetries.
`stop.test.ts` waited a fixed 1s for a 2-worker egg app to boot before asserting its "started on http://..." log — while every sibling start-without-demon test waits 10s. On a loaded CI runner (seen on ubuntu Node 22) 1s is not enough, so the app had only logged "Starting custom-framework application…" and the assertion failed. Replace the fixed `scheduler.wait(waitTime)` before each startup/shutdown assertion with a `waitFor(getText, pattern, timeout)` poll (in test/utils.ts) that returns as soon as the expected output appears (up to 10s) — race-free and faster than a fixed delay. Windows keeps the fixed settle for the signal-driven shutdown checks (SIGTERM is not handled there). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/scripts/test/utils.ts`:
- Around line 67-69: Reset the regex state in waitFor before each poll:
RegExp.test() on a stateful pattern with /g or /y can advance lastIndex and
cause missed matches, so update waitFor to use a fresh RegExp instance or
explicitly set pattern.lastIndex = 0 before calling pattern.test(getText()).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1e672d1b-6d9c-4b7e-83bc-85d96e582a25
📒 Files selected for processing (2)
tools/scripts/test/stop.test.tstools/scripts/test/utils.ts
| export async function waitFor(getText: () => string, pattern: RegExp, timeout = 10000): Promise<void> { | ||
| const deadline = Date.now() + timeout; | ||
| while (!pattern.test(getText())) { |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -nP --type=ts 'waitFor\s*\([^,]+,\s*/(?:\\.|[^/])+/[a-z]*[gy][a-z]*' tools/scripts/testRepository: eggjs/egg
Length of output: 147
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== utils.ts ==\n'
sed -n '1,180p' tools/scripts/test/utils.ts
printf '\n== waitFor call sites ==\n'
rg -n --type=ts 'waitFor\s*\(' tools/scripts/testRepository: eggjs/egg
Length of output: 3423
Reset regex state before polling. RegExp.test() advances lastIndex for /g and /y, so a stateful regex can make this loop miss a match or behave inconsistently. Clone the regex or set lastIndex = 0 before each test.
🧰 Tools
🪛 ast-grep (0.44.0)
[warning] Importing child_process exposes a command-execution surface; ensure any command/argument built from input is validated, and prefer execFile/spawn with an argument array over exec.
Context: import { ChildProcess } from 'node:child_process';
Note: [CWE-78] Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection').
(detect-child-process-typescript)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/scripts/test/utils.ts` around lines 67 - 69, Reset the regex state in
waitFor before each poll: RegExp.test() on a stateful pattern with /g or /y can
advance lastIndex and cause missed matches, so update waitFor to use a fresh
RegExp instance or explicitly set pattern.lastIndex = 0 before calling
pattern.test(getText()).
Two unrelated flaky test fixes surfaced while stabilizing CI for the snapshot work (#6011). Test-only; no
srcchanges.1.
@eggjs/multipart—ENOTEMPTYteardown racefile-mode.test.tsflaked on macOS Node 24 (passing on every other matrix job):fs.rm(tmpdir, { force: true, recursive: true })defaults tomaxRetries: 0, so a transientENOTEMPTY(a file in a date-bucket dir mid recursive-removal — async per-request cleanup draining / APFS latency) throws and fails the suite. The productionclean_tmpdirschedule already swallows these; only the 7 test teardowns are fragile. Fix: addmaxRetries: 3to all 7.2.
@eggjs/scripts— flakystop.test.tsstart readinessTest scripts (ubuntu-22)failed instop.test.ts:stop.test.tswaited a fixed 1s for a 2-worker egg app to boot before asserting itsstarted on http://…log — while every siblingstart-without-demon-*.test.tswaits 10s. On a loaded runner 1s isn't enough. Fix: replace the fixedscheduler.wait(waitTime)before each startup/shutdown assertion with awaitFor(getText, pattern, timeout)poll (test/utils.ts) that returns as soon as the expected output appears (up to 10s) — race-free and faster than a fixed delay. Windows keeps the fixed settle for the signal-driven shutdown checks (SIGTERM isn't handled there).Verification
stop.test.tspasses locally after build (8 passed, 1 skipped).Re Gemini's review on the multipart hunks: it claimed Vitest runs
afterAllin registration order — that's incorrect (defaultsequence.hooks: 'stack'→ reverse; verified empirically on v4.1.9), so the current teardown already removes the tmpdir last (afterserver.close()/app.close()); the suggested reorder would movefs.rmfirst and reintroduce the race. Replied inline on each thread.🤖 Generated with Claude Code
Summary by CodeRabbit