fix(rivetkit-core): error on sleep/destroy before startup or already-requested#4743
Conversation
671145c to
a8c2068
Compare
f32030d to
96f22d0
Compare
96f22d0 to
cfc540f
Compare
734fc88 to
efecd06
Compare
0c5605c to
ed50b2a
Compare
ed50b2a to
9ba012c
Compare
efecd06 to
3f13316
Compare
Review: fix(rivetkit-core): error on sleep/destroy before startup or already-requestedThe PR adds startup and duplicate-request guards to sleep() and destroy(). The intent is correct - preventing silent no-ops when these are called too early or redundantly. Here are the key observations: Bug: mark_destroy_requested() + user destroy() is now a behavioral regressionMost significant concern. The engine calls mark_destroy_requested() internally (registry/mod.rs:616 and :822) to force-stop a running actor. This sets destroy_requested = true via store(true). If user code subsequently calls ctx.destroy() in an abort-signal handler or cleanup path, the new swap-based guard sees the flag already set and returns an error. Before this PR, destroy() was fully idempotent (it delegated to mark_destroy_requested() which used store). Now a forced stop from the engine followed by user code calling ctx.destroy() surfaces an error in the actor run handler. This is a silent behavioral regression for actors that call ctx.destroy() in response to the abort signal. Suggested fix: keep destroy() silently idempotent when already set, since calling destroy on an already-being-destroyed actor is not a logic error the same way a double-sleep() call is. The error-on-duplicate behavior is most valuable for sleep() where a double-sleep in one generation is almost certainly a bug. Code duplication between destroy() and mark_destroy_requested()The new destroy() inlines the same four side effects already present in mark_destroy_requested(): cancel_sleep_timer(), flush_on_shutdown(), destroy_completed.store(false), abort_signal.cancel(). The old code had this right: destroy() delegated to mark_destroy_requested(). Consider extracting a private do_mark_destroy() helper and calling it from both to keep the logic in one place. Error code mismatch for sleep already requestedActorLifecycleError::Stopping (wire code: stopping) is used when sleep is already requested. Actor is stopping is accurate for destroy but potentially confusing for sleep since the actor is transitioning to sleep. Consider a dedicated Sleeping variant or at minimum a more precise context message. Missing test coverage for the new error pathsExisting tests are correctly updated to call set_sleep_started(true) first. However there are no tests for the guard conditions themselves:
Since this PR specifically converts silent no-ops into explicit errors, test coverage of those error paths would be valuable. Positive notes
|
Preview packages published to npmInstall with: npm install rivetkit@pr-4743All packages published as Engine binary is shipped via Docker images: docker pull rivetdev/engine:slim-a0504f4
docker pull rivetdev/engine:full-a0504f4Individual packagesnpm install rivetkit@pr-4743
npm install @rivetkit/react@pr-4743
npm install @rivetkit/rivetkit-napi@pr-4743
npm install @rivetkit/workflow-engine@pr-4743 |

Description
Please include a summary of the changes and the related issue. Please also include relevant motivation and context.
Type of change
How Has This Been Tested?
Please describe the tests that you ran to verify your changes.
Checklist: