
fix(test): stabilize lifecycle, sleep, queue, and run edge cases #4660

Open
NathanFlurry wants to merge 1 commit into break-up/reenable-gateway-url-coverage from break-up/stabilize-driver-lifecycle-tests

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

NathanFlurry commented Apr 14, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

claude Bot commented Apr 14, 2026

PR #4660 Review: fix(test): stabilize lifecycle, sleep, queue, and run edge cases

Overview

This PR stabilizes driver test flakes across lifecycle, sleep, queue, and run suites. Changes fall into four categories: fixture correctness fixes, test timing/sequencing hardening, API usage fixes, and a sandbox-specific workaround in the kitchen-sink example.


Issues

1. isDynamicSandboxRuntime() is a fragile heuristic

File: examples/kitchen-sink/src/actors/testing/inline-client.ts

Detecting the runtime via cwd === "/root" is brittle. Any container or CI environment running from /root silently skips the WebSocket connection path. The two branches also behave differently: the sandbox path collects no events but returns an empty events array as if it did. A dedicated env var or an explicit flag from the test harness would be more reliable than a cwd check.
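
A minimal sketch of the explicit-flag alternative, assuming the test harness can set an environment variable when it launches the sandbox. The variable name `RIVETKIT_DYNAMIC_SANDBOX` is hypothetical and does not appear in the PR; the environment is passed in as a parameter so the check stays easy to test.

```typescript
// Hypothetical variable name; the harness would have to set it explicitly.
const SANDBOX_ENV_VAR = "RIVETKIT_DYNAMIC_SANDBOX";

// Returns true only when the harness opted in, regardless of the working
// directory the process happens to run from.
function isDynamicSandboxRuntime(
	env: Record<string, string | undefined>,
): boolean {
	const value = env[SANDBOX_ENV_VAR];
	return value === "1" || value === "true";
}
```

In the fixture this would be called as `isDynamicSandboxRuntime(process.env)`, so a container that merely runs from /root no longer skips the WebSocket path.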

2. Race window in waitForConnectionOpen

File: examples/kitchen-sink/src/actors/testing/inline-client.ts

There is a TOCTOU race: the connection can transition to "connected" between the connStatus === "connected" guard and the onOpen subscription. If that happens, the open event is missed and the promise never settles. The guard and the subscription need to be atomic, or onOpen must fire synchronously for already-connected handles.
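
One way to close this window is to subscribe first and read the status afterwards, settling at most once. The sketch below assumes a handle with a `connStatus` field and an `onOpen` method that returns an unsubscribe function; the `ConnLike` interface is a hypothetical stand-in, and the real handle type may differ.

```typescript
// Hypothetical minimal shape of a connection handle.
interface ConnLike {
	connStatus: string;
	onOpen(callback: () => void): () => void; // returns an unsubscribe function
}

function waitForConnectionOpen(conn: ConnLike): Promise<void> {
	return new Promise<void>((resolve) => {
		let unsubscribe: () => void = () => {};
		let settled = false;
		const settle = (): void => {
			if (settled) return;
			settled = true;
			unsubscribe();
			resolve();
		};
		// Subscribe before checking the status so no open event is missed.
		unsubscribe = conn.onOpen(settle);
		if (settled) {
			// onOpen fired synchronously during subscription; drop the listener.
			unsubscribe();
		} else if (conn.connStatus === "connected") {
			// The handle was already open, so no event will arrive.
			settle();
		}
	});
}
```

Because the listener is registered before the status is read, a transition at any point is observed by either the callback or the status check.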

3. Polling loop without a justification comment

File: rivetkit-typescript/packages/rivetkit/tests/driver/actor-queue.test.ts (new loop around line 310)

CLAUDE.md requires a comment immediately before every polling loop explaining why direct awaiting is insufficient. The new for loop polling parent.getSpawned() lacks that comment. Add one explaining why the run-handler spawn must be polled (e.g. "spawn is queued asynchronously in the run handler so we must poll until it appears").
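
For illustration, a polling loop carrying the kind of justification comment requested above might look like the following. `pollUntil` and its parameters are hypothetical; the PR's loop polls parent.getSpawned() inline, and its actual bounds and delay are not shown here.

```typescript
// Hypothetical helper: repeatedly reads a value until a predicate accepts it.
async function pollUntil<T>(
	read: () => Promise<T>,
	done: (value: T) => boolean,
	attempts = 50,
	delayMs = 20,
): Promise<T> {
	// Poll instead of awaiting directly: the spawn is queued asynchronously in
	// the run handler, so the test has no promise to await for its completion.
	for (let i = 0; i < attempts; i++) {
		const value = await read();
		if (done(value)) return value;
		await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
	}
	throw new Error(`condition not met after ${attempts} attempts`);
}
```

The bounded attempt count makes a missing spawn fail with a clear error rather than hanging until the test runner's timeout.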

4. Unsafe type cast on abort error

File: rivetkit-typescript/packages/rivetkit/fixtures/driver-test-suite/queue.ts

error as { group?: string; code?: string } is applied to unknown without a null/object check. Prefer instanceof ActorError or a proper type guard. ActorError is already used in adjacent test files.
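
Where `instanceof ActorError` is not available, a structural type guard is one option. This is a sketch: `CodedError` and `isCodedError` are hypothetical names, and the field types are taken from the cast quoted above.

```typescript
// Hypothetical shape matching the cast in the fixture.
interface CodedError {
	group?: string;
	code?: string;
}

// Narrows an unknown thrown value without assuming it is a non-null object.
function isCodedError(error: unknown): error is CodedError {
	if (typeof error !== "object" || error === null) return false;
	const candidate = error as Record<string, unknown>;
	const group = candidate["group"];
	const code = candidate["code"];
	const groupOk = group === undefined || typeof group === "string";
	const codeOk = code === undefined || typeof code === "string";
	return groupOk && codeOk;
}
```

After `if (isCodedError(error))`, reads of `error.group` and `error.code` are type-checked, and a thrown `null` or string falls through to the caller's default handling.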

5. triggerSleepTwice fixture added but the test remains skipped

File: rivetkit-typescript/packages/rivetkit/fixtures/driver-test-suite/sleep-db.ts

The triggerSleepTwice action is added to all three sleep-db actors, but actor-sleep-db.test.ts still has test.skip("double sleep call is a no-op") pointing at TODO #4705. These fixture additions are dead code until that test is enabled. Either enable the test here or defer the fixture changes to the PR that resolves #4705.


Suggestions

onDestroy in start-stop-race.ts - missing error handling. If recordEvent throws during actor shutdown (e.g. observer unavailable), the destroy hook surfaces an unexpected error that can mask the lifecycle event under test. Wrapping the observer call in try/catch and logging on failure is safer than propagating in a test fixture.
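
A sketch of that wrapping, assuming recordEvent is an async function taking an event name. The `RecordEvent` type and the `recordEventSafely` helper are hypothetical and only illustrate the try/catch shape.

```typescript
// Hypothetical signature for the observer call used by the fixture.
type RecordEvent = (event: string) => Promise<void>;

// Records an event but never throws; returns whether recording succeeded.
async function recordEventSafely(
	recordEvent: RecordEvent,
	event: string,
	log: (message: string, error: unknown) => void = console.warn,
): Promise<boolean> {
	try {
		await recordEvent(event);
		return true;
	} catch (error) {
		// Log and continue so a failing observer cannot mask the lifecycle event.
		log(`failed to record ${event}`, error);
		return false;
	}
}
```

Inside the destroy hook, the fixture would await `recordEventSafely(...)` in place of the bare observer call.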

describe.sequential in actor-run.test.ts - add a comment. Switching from describe.skipIf to describe.sequential changes behavior: the former skips the suite conditionally, while the latter runs its tests sequentially. If sequential execution is required to avoid shared actor state collisions, a brief comment on describeRunTests would clarify the non-obvious constraint per CLAUDE.md.

Good changes: sendMessage simplified to c.queue.send() removes unnecessary self-client indirection. Response.json() replaced with explicit new Response(JSON.stringify(...)) is more portable. Moving beforeConnect params from connect() to getOrCreate(undefined, { params: ... }) matches the actual API signature.
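
For reference, the explicit Response construction mentioned above can be sketched as below. The `jsonResponse` helper is hypothetical, and setting the content-type header is an assumption about what the fixture needs, since the static Response.json() helper sets it automatically.

```typescript
// Hypothetical helper: builds a JSON response without the static
// Response.json() method, which some runtimes do not provide.
function jsonResponse(body: unknown, status = 200): Response {
	return new Response(JSON.stringify(body), {
		status,
		headers: { "content-type": "application/json" },
	});
}
```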


Summary

The stabilization direction is sound. Actionable items before merge:

  1. Replace cwd-based sandbox detection with an explicit mechanism.
  2. Fix the race window in waitForConnectionOpen.
  3. Add a justification comment before the new polling loop.
  4. Enable the double-sleep test or defer triggerSleepTwice fixture additions to the PR resolving #4705 (Track skipped actor-sleep-db shutdown invariants in driver tests).

NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from ecf268c to 4c4e267 on April 15, 2026 02:40
NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from dcb82e8 to 1d9c571 on April 15, 2026 02:40
NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from 4c4e267 to e823f78 on April 15, 2026 02:50
NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from 1d9c571 to 8e50d57 on April 15, 2026 02:50
NathanFlurry marked this pull request as ready for review on April 15, 2026 02:57
NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from 8e50d57 to f2052c1 on April 15, 2026 03:07
NathanFlurry marked this pull request as draft on April 15, 2026 03:17
NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from e823f78 to 83bf5c0 on April 15, 2026 06:55
NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from f2052c1 to 5aeb5f4 on April 15, 2026 06:55
NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from 83bf5c0 to c25da72 on April 27, 2026 05:57
NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from 5aeb5f4 to 4d25019 on April 27, 2026 05:57