Skip to content

[workflows-shared] Fix test flakiness on Windows CI#13600

Merged
petebacondarwin merged 3 commits intomainfrom
fix/workflows-shared-test-flakiness
Apr 20, 2026
Merged

[workflows-shared] Fix test flakiness on Windows CI#13600
petebacondarwin merged 3 commits intomainfrom
fix/workflows-shared-test-flakiness

Conversation

@petebacondarwin
Copy link
Copy Markdown
Contributor

@petebacondarwin petebacondarwin commented Apr 19, 2026

Fixes consistent @cloudflare/workflows-shared test failures on Windows CI runners (e.g. https://github.com/cloudflare/workers-sdk/actions/runs/24634528457/job/72027500951).

Root cause: The workflows-shared vitest config was not extending the monorepo-wide vitest.shared.ts. This meant tests ran with vitest's default 5s test timeout (vs the shared 50s used everywhere else) and zero retries. Combined with tight vi.waitUntil polling timeouts (500–1000ms), the first tests in each file consistently timed out on slow Windows runners due to cold workerd/DO startup costs.

Six tests across 3 files were failing:

  • binding.test.ts: 2 failures (test timeout 5000ms, waitUntil timeout 1000ms)
  • context.test.ts: 2 failures (test timeout 5000ms, waitUntil timeout 1000ms)
  • engine.test.ts: 2 failures (test timeout 5000ms, waitUntil timeout 1000ms)

Changes:

  • Merge vitest.shared.ts into workflows-shared/vitest.config.mts — picks up testTimeout: 50_000, retry: 2, restoreMocks: true
  • Raise waitUntilLogEvent helper default from 1000ms → 5000ms (affects all 13 call sites in binding.test.ts)
  • Fix 4 bare-number vi.waitUntil(cb, 500) calls in engine.test.ts to proper { timeout: 5000 } format
  • Bump all vi.waitUntil polling timeouts from 500–3000ms → 5000ms across all 3 test files

Deliberately not changed: step.waitForEvent timeouts, step.sleep durations, and retries.delay values — those test real workflow timeout semantics.


  • Tests
    • Tests included/updated
    • Automated tests not possible - manual testing has been completed as follows:
    • Additional testing not necessary because:
  • Public documentation
    • Cloudflare docs PR(s):
    • Documentation not necessary because: test-only changes to internal package

Open in Devin Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Apr 19, 2026

⚠️ No Changeset found

Latest commit: 004efb4

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-project-automation github-project-automation bot moved this to Untriaged in workers-sdk Apr 19, 2026
@workers-devprod workers-devprod requested review from a team and dario-piotrowicz and removed request for a team April 19, 2026 21:15
@workers-devprod
Copy link
Copy Markdown
Contributor

workers-devprod commented Apr 19, 2026

Codeowners approval required for this PR:

  • @cloudflare/workflows
  • ✅ @cloudflare/wrangler
Show detailed file reviewers
  • packages/workflows-shared/tests/binding.test.ts: [@cloudflare/workflows]
  • packages/workflows-shared/tests/context.test.ts: [@cloudflare/workflows]
  • packages/workflows-shared/tests/engine.test.ts: [@cloudflare/workflows]
  • packages/workflows-shared/vitest.config.mts: [@cloudflare/workflows]

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk bot commented Apr 19, 2026

UnknownError: ProviderInitError

github run

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk bot commented Apr 19, 2026

@petebacondarwin Bonk workflow failed. Check the logs for details.

View workflow run · To retry, trigger Bonk again.

devin-ai-integration[bot]

This comment was marked as resolved.

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new bot commented Apr 19, 2026

create-cloudflare

npm i https://pkg.pr.new/create-cloudflare@13600

@cloudflare/kv-asset-handler

npm i https://pkg.pr.new/@cloudflare/kv-asset-handler@13600

miniflare

npm i https://pkg.pr.new/miniflare@13600

@cloudflare/pages-shared

npm i https://pkg.pr.new/@cloudflare/pages-shared@13600

@cloudflare/unenv-preset

npm i https://pkg.pr.new/@cloudflare/unenv-preset@13600

@cloudflare/vite-plugin

npm i https://pkg.pr.new/@cloudflare/vite-plugin@13600

@cloudflare/vitest-pool-workers

npm i https://pkg.pr.new/@cloudflare/vitest-pool-workers@13600

@cloudflare/workers-editor-shared

npm i https://pkg.pr.new/@cloudflare/workers-editor-shared@13600

wrangler

npm i https://pkg.pr.new/wrangler@13600

commit: 004efb4

Copy link
Copy Markdown
Member

@dario-piotrowicz dario-piotrowicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The devin's comments about filter returning always truthy values seem valid to me 🤔

@petebacondarwin petebacondarwin force-pushed the fix/workflows-shared-test-flakiness branch from 90d0d73 to d80d336 Compare April 20, 2026 06:35
@petebacondarwin petebacondarwin changed the base branch from main to pbacondarwin/fix-parallel-ci-timeouts April 20, 2026 08:21
@petebacondarwin petebacondarwin force-pushed the fix/workflows-shared-test-flakiness branch 2 times, most recently from 52b2515 to 65241fa Compare April 20, 2026 08:59
Base automatically changed from pbacondarwin/fix-parallel-ci-timeouts to main April 20, 2026 09:38
The workflows-shared vitest config was not extending the monorepo-wide
vitest.shared.ts, so tests ran with vitest's default 5s test timeout
(vs the shared 50s) and zero retries. Combined with tight vi.waitUntil
polling timeouts (500-1000ms), this caused consistent failures on slow
Windows CI runners.

- Merge vitest.shared.ts into workflows-shared vitest config (adds
  testTimeout: 50s, retry: 2, restoreMocks: true)
- Raise waitUntilLogEvent helper default timeout from 1s to 5s
- Fix bare-number vi.waitUntil(cb, 500) calls to use { timeout: 5000 }
- Bump all vi.waitUntil polling timeouts from 500-3000ms to 5000ms
  (preserving deliberate step.waitForEvent timeouts that test workflow
  timeout semantics)
The 'should increment step count for steps with the same name' test
runs 3 sequential step.do calls, which needs more headroom than
single-step tests on slow Windows CI runners. Bump its vi.waitUntil
timeout from 5s to 10s.
Two waitForEvent tests used .filter() (returns Array, always truthy)
inside vi.waitUntil callbacks instead of .some() (returns boolean).
This made the waits resolve immediately on the first poll without
actually checking whether the expected events had been logged.

- Replace .filter() with .some() so vi.waitUntil actually polls
- Use === instead of == for strict equality
- Add { expect } from test context and real assertions verifying
  WAIT_START and WORKFLOW_SUCCESS events are present in the logs
- Remove the eslint-disable jest/expect-expect comments since the
  tests now contain proper assertions
@petebacondarwin petebacondarwin force-pushed the fix/workflows-shared-test-flakiness branch from 65241fa to 004efb4 Compare April 20, 2026 09:51
Copy link
Copy Markdown
Member

@dario-piotrowicz dario-piotrowicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codeowners Bypass

Copy link
Copy Markdown
Contributor

@workers-devprod workers-devprod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codeowners reviews satisfied

@github-project-automation github-project-automation bot moved this from Untriaged to Approved in workers-sdk Apr 20, 2026
@petebacondarwin petebacondarwin merged commit b3e6877 into main Apr 20, 2026
63 of 65 checks passed
@petebacondarwin petebacondarwin deleted the fix/workflows-shared-test-flakiness branch April 20, 2026 11:47
@github-project-automation github-project-automation bot moved this from Approved to Done in workers-sdk Apr 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants