
fix(rivetkit-core): error on sleep/destroy before startup or already-requested #4743

Merged
NathanFlurry merged 1 commit into main from sleep-cleanup/03-sleep-destroy-guards on Apr 27, 2026

Conversation

@NathanFlurry
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Member Author

NathanFlurry commented Apr 24, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@NathanFlurry NathanFlurry force-pushed the sleep-cleanup/03-sleep-destroy-guards branch from 671145c to a8c2068 Compare April 24, 2026 10:32
@NathanFlurry NathanFlurry force-pushed the sleep-cleanup/02-serialize-state-timeout-15s branch from f32030d to 96f22d0 Compare April 24, 2026 10:32
@NathanFlurry NathanFlurry force-pushed the sleep-cleanup/02-serialize-state-timeout-15s branch from 96f22d0 to cfc540f Compare April 24, 2026 11:48
@NathanFlurry NathanFlurry force-pushed the sleep-cleanup/03-sleep-destroy-guards branch 3 times, most recently from 734fc88 to efecd06 Compare April 24, 2026 12:32
@NathanFlurry NathanFlurry force-pushed the sleep-cleanup/02-serialize-state-timeout-15s branch from 0c5605c to ed50b2a Compare April 24, 2026 12:32
@NathanFlurry NathanFlurry force-pushed the sleep-cleanup/02-serialize-state-timeout-15s branch from ed50b2a to 9ba012c Compare April 24, 2026 13:16
@NathanFlurry NathanFlurry force-pushed the sleep-cleanup/03-sleep-destroy-guards branch from efecd06 to 3f13316 Compare April 24, 2026 13:16
@claude

claude Bot commented Apr 24, 2026

Review: fix(rivetkit-core): error on sleep/destroy before startup or already-requested

The PR adds startup and duplicate-request guards to sleep() and destroy(). The intent is correct - preventing silent no-ops when these are called too early or redundantly. Here are the key observations:


Bug: mark_destroy_requested() + user destroy() is now a behavioral regression

Most significant concern.

The engine calls mark_destroy_requested() internally (registry/mod.rs:616 and :822) to force-stop a running actor. This sets destroy_requested = true via store(true). If user code subsequently calls ctx.destroy() in an abort-signal handler or cleanup path, the new swap-based guard sees the flag already set and returns an error.

Before this PR, destroy() was fully idempotent: it delegated to mark_destroy_requested(), which used store. Now a forced stop from the engine followed by user code calling ctx.destroy() surfaces an error in the actor run handler. This is a behavioral regression for actors that call ctx.destroy() in response to the abort signal.

Suggested fix: keep destroy() silently idempotent when already set, since calling destroy on an already-being-destroyed actor is not a logic error the same way a double-sleep() call is. The error-on-duplicate behavior is most valuable for sleep() where a double-sleep in one generation is almost certainly a bug.
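The interaction described above can be sketched with a small standalone model. This is not the rivetkit-core code: `ActorFlags`, `LifecycleError`, `set_started`, `destroy_strict`, and `destroy_idempotent` are hypothetical names chosen for illustration, and only the `store` versus `swap` distinction and the `Starting`/`Stopping` variant names come from the review.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for the lifecycle error type.
#[derive(Debug, PartialEq, Eq)]
pub enum LifecycleError {
    Starting,
    Stopping,
}

/// Simplified model of the two flags involved (assumed layout).
#[derive(Default)]
pub struct ActorFlags {
    started: AtomicBool,
    destroy_requested: AtomicBool,
}

impl ActorFlags {
    pub fn set_started(&self, value: bool) {
        self.started.store(value, Ordering::SeqCst);
    }

    /// Engine path: an unconditional store that never fails.
    pub fn mark_destroy_requested(&self) {
        self.destroy_requested.store(true, Ordering::SeqCst);
    }

    /// Strict guard: a request that finds the flag already set is an error.
    pub fn destroy_strict(&self) -> Result<(), LifecycleError> {
        if !self.started.load(Ordering::SeqCst) {
            return Err(LifecycleError::Starting);
        }
        if self.destroy_requested.swap(true, Ordering::SeqCst) {
            return Err(LifecycleError::Stopping);
        }
        Ok(())
    }

    /// Suggested alternative: keep the startup guard, but treat an
    /// already-set flag as success.
    pub fn destroy_idempotent(&self) -> Result<(), LifecycleError> {
        if !self.started.load(Ordering::SeqCst) {
            return Err(LifecycleError::Starting);
        }
        self.destroy_requested.store(true, Ordering::SeqCst);
        Ok(())
    }
}
```

In this model, calling `mark_destroy_requested()` and then `destroy_strict()` returns `Err(LifecycleError::Stopping)`, while `destroy_idempotent()` returns `Ok(())` in the same situation.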


Code duplication between destroy() and mark_destroy_requested()

The new destroy() inlines the same four side effects already present in mark_destroy_requested(): cancel_sleep_timer(), flush_on_shutdown(), destroy_completed.store(false), abort_signal.cancel().

The old code had this right: destroy() delegated to mark_destroy_requested(). Consider extracting a private do_mark_destroy() helper and calling it from both to keep the logic in one place.
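One possible shape for that helper, as a minimal sketch. `DestroyCtx` and its fields are assumptions, the timer and flush methods are stubs that only record that they ran, an `AtomicBool` stands in for the abort signal, and the order of the four side effects is a guess rather than the order used in the PR.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical context; the real one holds timers, state, and an abort signal.
#[derive(Default)]
pub struct DestroyCtx {
    destroy_requested: AtomicBool,
    destroy_completed: AtomicBool,
    aborted: AtomicBool,
    log: Mutex<Vec<&'static str>>,
}

impl DestroyCtx {
    // Stubs that record the call instead of doing real work.
    fn cancel_sleep_timer(&self) {
        self.log.lock().unwrap().push("cancel_sleep_timer");
    }

    fn flush_on_shutdown(&self) {
        self.log.lock().unwrap().push("flush_on_shutdown");
    }

    /// Single home for the four side effects named in the review.
    fn do_mark_destroy(&self) {
        self.cancel_sleep_timer();
        self.flush_on_shutdown();
        self.destroy_completed.store(false, Ordering::SeqCst);
        // Stands in for abort_signal.cancel().
        self.aborted.store(true, Ordering::SeqCst);
    }

    /// Engine path: unconditional.
    pub fn mark_destroy_requested(&self) {
        self.destroy_requested.store(true, Ordering::SeqCst);
        self.do_mark_destroy();
    }

    /// User path: guarded, then the same helper.
    pub fn destroy(&self) -> Result<(), &'static str> {
        if self.destroy_requested.swap(true, Ordering::SeqCst) {
            return Err("stopping");
        }
        self.do_mark_destroy();
        Ok(())
    }

    pub fn side_effects(&self) -> Vec<&'static str> {
        self.log.lock().unwrap().clone()
    }

    pub fn is_aborted(&self) -> bool {
        self.aborted.load(Ordering::SeqCst)
    }

    pub fn is_destroy_completed(&self) -> bool {
        self.destroy_completed.load(Ordering::SeqCst)
    }
}
```

With this layout the guard logic differs between the two entry points, but the side effects cannot drift apart because both paths go through `do_mark_destroy()`.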


Error code mismatch for sleep already requested

ActorLifecycleError::Stopping (wire code: stopping) is used when sleep is already requested. "Actor is stopping" is accurate for destroy but potentially confusing for sleep, since the actor is transitioning to sleep. Consider a dedicated Sleeping variant or, at minimum, a more precise context message.
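A dedicated variant could look roughly like the sketch below. The enum name `LifecycleErrorSketch`, the `Sleeping` variant, the `starting` and `sleeping` wire codes, the message strings, and `check_sleep_request` are all assumptions made for illustration; only the `stopping` wire code is taken from the review.

```rust
use std::fmt;

/// Hypothetical error enum with a separate variant for a duplicate sleep().
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleErrorSketch {
    Starting,
    Stopping,
    /// Assumed new variant: sleep was already requested.
    Sleeping,
}

impl LifecycleErrorSketch {
    /// Wire code sent to clients; "starting" and "sleeping" are assumed values.
    pub fn code(self) -> &'static str {
        match self {
            Self::Starting => "starting",
            Self::Stopping => "stopping",
            Self::Sleeping => "sleeping",
        }
    }
}

impl fmt::Display for LifecycleErrorSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let msg = match self {
            Self::Starting => "actor is starting",
            Self::Stopping => "actor is stopping",
            Self::Sleeping => "actor is already going to sleep",
        };
        f.write_str(msg)
    }
}

/// Guard for a duplicate sleep() call, reporting the dedicated variant.
pub fn check_sleep_request(already_requested: bool) -> Result<(), LifecycleErrorSketch> {
    if already_requested {
        Err(LifecycleErrorSketch::Sleeping)
    } else {
        Ok(())
    }
}
```

A separate wire code would let callers tell a duplicate sleep apart from a destroy in progress without parsing the message text.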


Missing test coverage for the new error paths

Existing tests are correctly updated to call set_sleep_started(true) first. However, there are no tests for the guard conditions themselves:

  • sleep() before startup should return ActorLifecycle::Starting
  • destroy() before startup should return ActorLifecycle::Starting
  • sleep() called twice should return ActorLifecycle::Stopping on the second call
  • destroy() called twice should return ActorLifecycle::Stopping on the second call

Since this PR specifically converts silent no-ops into explicit errors, test coverage of those error paths would be valuable.
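The four missing cases could be covered along these lines. The sketch runs against a self-contained model rather than the real actor context: `GuardModel`, `GuardError`, and `new_started` are hypothetical, and real tests would build the crate's own context and call set_sleep_started(true) as the existing tests do.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical error type mirroring the two outcomes listed above.
#[derive(Debug, PartialEq, Eq)]
pub enum GuardError {
    Starting,
    Stopping,
}

/// Minimal model of the startup and duplicate-request guards.
#[derive(Default)]
pub struct GuardModel {
    started: AtomicBool,
    sleep_requested: AtomicBool,
    destroy_requested: AtomicBool,
}

impl GuardModel {
    /// Builds a model that has already finished startup.
    pub fn new_started() -> Self {
        let model = Self::default();
        model.started.store(true, Ordering::SeqCst);
        model
    }

    /// Shared guard: reject before startup, then reject a repeated request.
    fn request(&self, flag: &AtomicBool) -> Result<(), GuardError> {
        if !self.started.load(Ordering::SeqCst) {
            return Err(GuardError::Starting);
        }
        if flag.swap(true, Ordering::SeqCst) {
            return Err(GuardError::Stopping);
        }
        Ok(())
    }

    pub fn sleep(&self) -> Result<(), GuardError> {
        self.request(&self.sleep_requested)
    }

    pub fn destroy(&self) -> Result<(), GuardError> {
        self.request(&self.destroy_requested)
    }
}

#[cfg(test)]
mod guard_tests {
    use super::*;

    #[test]
    fn sleep_before_startup_returns_starting() {
        assert_eq!(GuardModel::default().sleep(), Err(GuardError::Starting));
    }

    #[test]
    fn destroy_before_startup_returns_starting() {
        assert_eq!(GuardModel::default().destroy(), Err(GuardError::Starting));
    }

    #[test]
    fn second_sleep_returns_stopping() {
        let model = GuardModel::new_started();
        assert_eq!(model.sleep(), Ok(()));
        assert_eq!(model.sleep(), Err(GuardError::Stopping));
    }

    #[test]
    fn second_destroy_returns_stopping() {
        let model = GuardModel::new_started();
        assert_eq!(model.destroy(), Ok(()));
        assert_eq!(model.destroy(), Err(GuardError::Stopping));
    }
}
```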


Positive notes

  • Using swap instead of load + store for the duplicate-request check is the correct lock-free pattern.
  • Propagating errors through the NAPI boundary (.map_err(napi_anyhow_error)) is correct and will surface as RivetError on the TypeScript side.
  • The idle-timer error suppression at debug! severity is the right level - benign race, not a real fault.
  • The set_sleep_started(true) additions in tests correctly reflect the new startup requirement.
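
To illustrate the first point above: `AtomicBool::swap` sets the flag and returns the previous value in one atomic step, so among any number of concurrent callers exactly one observes `false`. With a separate `load` followed by `store`, two callers could both read `false` before either writes. The helper name `count_winners` below is hypothetical and exists only for this demonstration.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawns `threads` workers that race to claim a shared flag with `swap`
/// and returns how many of them saw the flag as previously unset.
pub fn count_winners(threads: usize) -> usize {
    let flag = Arc::new(AtomicBool::new(false));
    let winners = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let flag = Arc::clone(&flag);
            let winners = Arc::clone(&winners);
            thread::spawn(move || {
                // swap returns the old value: only the first caller gets `false`.
                if !flag.swap(true, Ordering::AcqRel) {
                    winners.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    winners.load(Ordering::Relaxed)
}
```

For any thread count of at least one, `count_winners` returns 1, which is the property the duplicate-request guard relies on.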

@github-actions
Contributor

github-actions Bot commented Apr 24, 2026

Preview packages published to npm

Install with:

npm install rivetkit@pr-4743

All packages published as 0.0.0-pr.4743.a0504f4 with tag pr-4743.

Engine binary is shipped via @rivetkit/engine-cli on linux-x64-musl, linux-arm64-musl, darwin-x64, and darwin-arm64. Windows users should use the release installer or set RIVET_ENGINE_BINARY.

Docker images:

docker pull rivetdev/engine:slim-a0504f4
docker pull rivetdev/engine:full-a0504f4

Individual packages

npm install rivetkit@pr-4743
npm install @rivetkit/react@pr-4743
npm install @rivetkit/rivetkit-napi@pr-4743
npm install @rivetkit/workflow-engine@pr-4743

Base automatically changed from sleep-cleanup/02-serialize-state-timeout-15s to main April 27, 2026 07:13
@NathanFlurry NathanFlurry merged commit 5234a53 into main Apr 27, 2026
20 of 27 checks passed
@NathanFlurry NathanFlurry deleted the sleep-cleanup/03-sleep-destroy-guards branch April 27, 2026 07:14
