fix(rivetkit-core): gate startup until runtime is ready #4736
Review of PR #4736: gate startup until runtime is ready

This is a solid and necessary fix. Gating active_actor on ctx.ready() closes a real race where a client request could arrive at an Active instance that has not completed its startup preamble, causing silent misbehavior. The TypeScript client changes are symmetrical and correctly extend the stale-identity retry logic to cover starting. A few things worth addressing:

ISSUES

1. In rivetkit-rust/packages/rivetkit-core/src/registry/mod.rs (around the not-yet-ready Active instance branch), the code calls warn_work_sent_to_stopping_instance(active_actor). This warning fires when an Active instance is not yet ready() -- meaning it is still starting, not stopping. The metric name inc_direct_subsystem_shutdown_warning and the log message "work sent to stopping actor instance" will mislead operators into thinking an actor is shutting down. This should use a distinct diagnostic path (e.g. warn_work_sent_to_starting_instance) or at minimum emit a different log message. The actual stopping path below uses the same helper correctly.

2. In rivetkit-rust/packages/rivetkit-core/src/registry/http.rs (framework_error_status): the function has no arm for (actor, starting), so it falls through to StatusCode::INTERNAL_SERVER_ERROR. The correct status for "actor is still starting up, retry soon" is 503 Service Unavailable. Consider adding (actor, starting) (and by symmetry stopping, destroying) mapping to StatusCode::SERVICE_UNAVAILABLE in framework_error_status.

3. In rivetkit-typescript/packages/rivetkit/src/client/actor-handle.ts (the starting/stopping retry branch inside sendActionNow): the inline 100ms delay is inconsistent with all other retry paths, which call waitForRetryWindow(). Use the shared helper instead.

4. In rivetkit-typescript/packages/rivetkit/src/client/actor-query.ts: the isStaleResolvedActorError function includes starting and stopping but not destroying. Meanwhile, shouldRetryDynamicLifecycleError in actor-handle.ts includes destroying in its retry set. This means actor.destroying skips resolved-actor-ID invalidation. Since the PR explicitly updated both files to add starting, it would be consistent to add destroying to isStaleResolvedActorError in the same change.

5. rivetkit-rust/engine/artifacts/errors/actor.starting.json is missing a trailing newline. All sibling files (actor.stopping.json, etc.) end with a newline.

MINOR OBSERVATIONS
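For the framework_error_status item under ISSUES, the suggested mapping could look roughly like the sketch below. It is not the actual function: the name framework_error_status_sketch, the (group, code) string parameters, and the plain u16 return type are assumptions made to keep the example std-only, whereas the real code returns a StatusCode.

```rust
/// Hypothetical stand-in for `framework_error_status` (signature assumed).
/// Maps an error's (group, code) pair to a numeric HTTP status.
fn framework_error_status_sketch(group: &str, code: &str) -> u16 {
    match (group, code) {
        // Transient lifecycle states: the actor is not serving requests
        // right now, so tell the client to retry (503 Service Unavailable).
        ("actor", "starting") | ("actor", "stopping") | ("actor", "destroying") => 503,
        // Anything unmapped keeps the fallback described in the review
        // (500 Internal Server Error).
        _ => 500,
    }
}
```

With a mapping like this, a request that reaches an actor mid-startup would surface as a retryable 503 rather than a 500, which matches the client-side retry handling for starting.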
Preview packages published to npm

Install with: npm install rivetkit@pr-4736

All packages are published under the pr-4736 tag.

Engine binary is shipped via Docker images:
docker pull rivetdev/engine:slim-ed96c00
docker pull rivetdev/engine:full-ed96c00

Individual packages:
npm install rivetkit@pr-4736
npm install @rivetkit/react@pr-4736
npm install @rivetkit/rivetkit-napi@pr-4736
npm install @rivetkit/workflow-engine@pr-4736
Landed in main via stack-merge fast-forward push. Commits are in main; closing to match.
