Design constraints and invariants for the RivetKit actor sleep / destroy lifecycle. Pair with `actor-task-dispatch.md` and `rivetkit-core-internals.md` for surrounding context.

## Authority

- The engine owns lifecycle authority. `ctx.sleep()` and `ctx.destroy()` send fire-and-forget `ActorIntent` events; they do not transition lifecycle state locally. The local `SleepGrace` / `DestroyGrace` transition runs when the engine replies with `StopActor`.
- `envoy-client` retries intent delivery across reconnects via checkpoint-based event replay (`engine/sdks/rust/envoy-client/src/events.rs`). Core does not need its own retry path.
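
The authority split can be modelled in a few lines of TypeScript. This is a minimal sketch, not RivetKit code: `ActorLifecycleModel`, `sentIntents`, and `onStopActor` are hypothetical names, and the array only stands in for the `envoy-client` event channel.

```typescript
// Hypothetical model of the authority split; none of these names are RivetKit APIs.
type LifecycleState = "Started" | "SleepGrace" | "DestroyGrace";
type ActorIntent = "sleep" | "destroy";

class ActorLifecycleModel {
  state: LifecycleState = "Started";
  // Stand-in for the fire-and-forget intent channel to the engine.
  readonly sentIntents: ActorIntent[] = [];

  // Requesting sleep or destroy only records an intent; local state is untouched.
  sleep(): void {
    this.sentIntents.push("sleep");
  }

  destroy(): void {
    this.sentIntents.push("destroy");
  }

  // Only the engine's StopActor reply moves the local state machine into grace.
  onStopActor(reason: ActorIntent): void {
    this.state = reason === "sleep" ? "SleepGrace" : "DestroyGrace";
  }
}
```

In this model a call to `sleep()` leaves `state` at `"Started"` until `onStopActor("sleep")` arrives, which mirrors the rule that the engine, not the actor, decides when grace begins.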

## Public surface: keep-awake primitives

Two user-facing primitives in TypeScript. Both accept a `Promise`, never a closure.

| Primitive | Blocks idle sleep | Blocks grace finalize | Notes |
| --- | --- | --- | --- |
| `c.keepAwake(promise)` | Yes | Yes | Returns the same promise. Use for work the actor must stay up for. |
| `c.waitUntil(promise)` | No | Yes | Returns void. Use for best-effort flush/cleanup work that is allowed to complete inside the grace window. |

`c.setPreventSleep(b)` and `c.preventSleep` are deprecated no-ops retained for binary / call-site compatibility. They will be removed in 2.2.0.

### Why two primitives and not one

`keepAwake` is scoped, non-leaky, and symmetric with `waitUntil`. `setPreventSleep` was a flag that had to be paired by hand; forgetting to clear it wedged the actor awake. A promise-scoped counter cannot leak: when the promise settles (resolve or reject), the counter decrements.
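
To make the no-leak argument concrete, here is a minimal sketch of a promise-scoped counter. It is an illustration under assumptions, not the RivetKit implementation: `KeepAwakeCounter` and its `count` getter are hypothetical names.

```typescript
// Hypothetical promise-scoped counter; not RivetKit's actual implementation.
class KeepAwakeCounter {
  private active = 0;

  // Number of promises currently holding the actor awake.
  get count(): number {
    return this.active;
  }

  // Increment immediately, decrement once the promise settles either way.
  // The same promise is returned so callers can still await its value.
  keepAwake<T>(promise: Promise<T>): Promise<T> {
    this.active += 1;
    const release = (): void => {
      this.active -= 1;
    };
    // One handler per outcome: resolve and reject both release the counter.
    // Attaching a rejection handler here also means the runtime will not
    // report the original promise as an unhandled rejection.
    promise.then(release, release);
    return promise;
  }
}
```

Because the decrement is tied to settlement, there is no second call for the caller to forget. Passing a promise rather than a closure also means the work has already started by the time the counter is incremented: `counter.keepAwake(doWork())`, where `doWork` is any function returning a promise.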

### Why separate `keep_awake` and `internal_keep_awake` in core

Kept separate for debug visibility. Grace deadline warn logs report each counter independently so diagnostics distinguish user keep-awake sites from framework-owned keep-awake sites (schedule alarms, queue receives).

## Sleep readiness predicates

Two predicates govern the sleep state machine. Both live on `ActorContext` / `SleepState`.

- `can_arm_sleep_timer()`: the idle predicate. Returns `CanSleep::Yes` only when every sleep-affecting counter is zero and the run handler is inactive (or waiting on a queue). Used to start the sleep idle timer.
- `can_finalize_sleep()`: the grace predicate. Returns `true` only when every shutdown-affecting counter is zero: `core_dispatched_hooks`, `shutdown_task_count`, `sleep_keep_awake`, `sleep_internal_keep_awake`, `active_http_requests`, `websocket_callbacks`, `pending_disconnects`. Used to advance from `SleepGrace` to `SleepFinalize` (or finalize destroy).

Removing `preventSleep` deleted both predicate branches. Any future sleep-affecting counter must add an entry in each predicate and must call `ActorContext::reset_sleep_timer()` on transitions that change the result.
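
The grace predicate reduces to "every listed counter is zero". The sketch below models that in TypeScript over a plain snapshot object. The counter names come from the list above, but the snapshot shape and the `nonDrainedCounters` helper are assumptions made for illustration; the real predicate lives in Rust on `ActorContext` / `SleepState`.

```typescript
// Counter names are taken from the text; the snapshot shape is an assumption.
const SHUTDOWN_COUNTERS = [
  "core_dispatched_hooks",
  "shutdown_task_count",
  "sleep_keep_awake",
  "sleep_internal_keep_awake",
  "active_http_requests",
  "websocket_callbacks",
  "pending_disconnects",
] as const;

type ShutdownCounter = (typeof SHUTDOWN_COUNTERS)[number];
type CounterSnapshot = Record<ShutdownCounter, number>;

// Names of the counters that are still non-zero, in declaration order.
function nonDrainedCounters(snapshot: CounterSnapshot): ShutdownCounter[] {
  return SHUTDOWN_COUNTERS.filter((name) => snapshot[name] !== 0);
}

// Grace predicate: finalize only when nothing is left to drain.
function canFinalizeSleep(snapshot: CounterSnapshot): boolean {
  return nonDrainedCounters(snapshot).length === 0;
}
```

Keeping the counter list in one place is the point of the sketch: a new counter added to `SHUTDOWN_COUNTERS` is picked up by both the predicate and any diagnostic that lists non-drained counters.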
37
+
38
+
## Grace period and abort signals
39
+
40
+
-`start_grace(reason)` fires at the start of `SleepGrace` / `DestroyGrace`. It cancels the sleep idle timer, cancels the actor abort signal (`actor_abort_signal`), installs a `SleepGraceState` with the effective grace deadline, and resets the sleep timer to arm the grace tick.
41
+
- The actor abort signal is a soft signal: "shutdown has started, please wrap up." User code observes it via `c.abortSignal`. It does not force-stop work.
42
+
- For destroy, the abort signal may fire earlier than grace entry because `ctx.destroy()` cancels the abort token immediately via `mark_destroy_requested(...)`.
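
Because the abort signal is soft, user code has to check it cooperatively. A minimal sketch of such a loop follows, written against the standard `AbortSignal` interface. `runUntilAborted` and `step` are hypothetical names, and the signal is passed in directly rather than read from an actor context.

```typescript
// Hypothetical cooperative loop: the signal asks the loop to stop, nothing forces it.
async function runUntilAborted(
  signal: AbortSignal,
  step: () => Promise<void>,
): Promise<number> {
  let completedSteps = 0;
  // The signal is checked between steps; a step in flight always finishes.
  while (!signal.aborted) {
    await step();
    completedSteps += 1;
  }
  return completedSteps;
}
```

A step that is already running when the signal fires still completes, which is what "does not force-stop work" means in practice. Long steps should pass the signal on to whatever they await.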

## Grace deadline enforcement

When the grace deadline elapses before `can_finalize_sleep()` returns true:

- `on_sleep_grace_deadline` aborts the user `run` handle (`run_handle.abort()`), cancels the shutdown deadline token (`cancel_shutdown_deadline()`), records the timeout, and emits a structured warn log enumerating every non-drained counter.
- The NAPI `RunGracefulCleanup` task observes `shutdown_deadline_token()` via `tokio::select!` and aborts its in-flight `onSleep` / `onDestroy` call so SQLite and KV cleanup in `teardown_sleep_state` do not race against mid-commit user work.
- Foreign-runtime adapters that run user cleanup callbacks must observe the shutdown deadline token the same way.
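
For an adapter written in TypeScript, the closest analogue to selecting on the deadline token is racing the cleanup callback against an `AbortSignal`. The sketch below assumes the deadline is exposed as such a signal. `runCleanupWithDeadline` and `CleanupOutcome` are hypothetical names, not part of any RivetKit API.

```typescript
type CleanupOutcome = "completed" | "deadline";

// Hypothetical adapter helper: wait for cleanup, but never past the deadline signal.
async function runCleanupWithDeadline(
  cleanup: () => Promise<void>,
  deadline: AbortSignal,
): Promise<CleanupOutcome> {
  // Deadline already passed: do not start user cleanup at all.
  if (deadline.aborted) return "deadline";

  let onAbort: () => void = () => {};
  const deadlineHit = new Promise<CleanupOutcome>((resolve) => {
    onAbort = () => resolve("deadline");
    deadline.addEventListener("abort", onAbort, { once: true });
  });

  try {
    // Whichever settles first decides the outcome.
    return await Promise.race([
      cleanup().then((): CleanupOutcome => "completed"),
      deadlineHit,
    ]);
  } finally {
    // Remove the listener so a long-lived signal does not retain this call.
    deadline.removeEventListener("abort", onAbort);
  }
}
```

One difference from `tokio::select!` is worth keeping in mind: `Promise.race` only stops waiting, it does not cancel the losing promise. A cleanup callback that should really stop must observe the signal itself.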

## Guarding lifecycle requests

- `ctx.sleep()` and `ctx.destroy()` return `Result<()>`. They fail with `actor/starting` if called before startup completes and `actor/stopping` if the request flag has already been swapped to true for this generation. An atomic `swap(true, ...)` on `sleep_requested` / `destroy_requested` enforces single-shot request semantics per generation.
- The idle sleep timer request path (`spawn_sleep_timer_task`) and the `ActorTask` sleep-tick path both suppress the already-requested error: idle-driven requests may race user-driven requests and the warning is informational.
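
The single-shot rule can be sketched as a swap on a boolean flag. The TypeScript below is a model under assumptions: `SleepRequestGuard` and its methods are hypothetical, the order of the two checks is a guess, and a single-threaded read-then-write stands in for the atomic `swap(true, ...)` used in core.

```typescript
type LifecycleErrorCode = "actor/starting" | "actor/stopping";
type RequestResult = { ok: true } | { ok: false; error: LifecycleErrorCode };

// Hypothetical guard for one actor generation.
class SleepRequestGuard {
  private started = false;
  private sleepRequested = false;

  markStarted(): void {
    this.started = true;
  }

  // User-driven path: the first request after startup wins, later ones are rejected.
  requestSleep(): RequestResult {
    if (!this.started) return { ok: false, error: "actor/starting" };
    // Emulates swap(true): remember the previous value, then set the flag.
    const alreadyRequested = this.sleepRequested;
    this.sleepRequested = true;
    if (alreadyRequested) return { ok: false, error: "actor/stopping" };
    return { ok: true };
  }

  // Idle-timer path: losing the race to a user request is expected, so the
  // already-requested error is swallowed. Other errors pass through unchanged.
  requestSleepFromIdleTimer(): RequestResult {
    const result = this.requestSleep();
    if (!result.ok && result.error === "actor/stopping") return { ok: true };
    return result;
  }
}
```

The swap makes the outcome independent of which path gets there first: exactly one caller sees the flag flip from false to true.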

## Serialize-state shutdown cap

`SERIALIZE_STATE_SHUTDOWN_SANITY_CAP = 15s` is the upper bound on how long the shutdown `SerializeState` reply wait is allowed to pend before `save_final_state` falls back to empty deltas (preserving prior state). This is a sanity cap, not a deadline anyone should ever hit; the normal drain finishes in milliseconds.
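
The fallback behaviour is an ordinary "reply or default after a cap" pattern. A minimal TypeScript sketch follows. `withSanityCap` is a hypothetical helper, the fallback value stands in for the empty deltas, and propagating a rejected reply is an assumption that the text does not cover.

```typescript
// Hypothetical helper: resolve with the reply, or with `fallback` once `capMs` elapses.
function withSanityCap<T>(reply: Promise<T>, capMs: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // If the cap fires first, the caller proceeds with the fallback value.
    const timer = setTimeout(() => resolve(fallback), capMs);
    reply.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        // Assumption: a failed reply is surfaced rather than replaced by the fallback.
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

With a 15 second cap and an empty array as the fallback, a stuck reply yields no deltas, so the previously persisted state is left in place.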

## Test harness parity

- Rust integration tests live in `rivetkit-core/tests/modules/sleep.rs` and pin predicate behavior, grace period selection, and `save_final_state` cap.
- TypeScript driver tests in `rivetkit-typescript/packages/rivetkit/tests/driver/actor-sleep*.test.ts` cover abort-signal-at-grace-entry, `keepAwake` holding shutdown, `c.db` writes surviving `onSleep`, and regression coverage for `setPreventSleep` being a no-op.
rivetkit-rust/packages/rivetkit-core/CLAUDE.md
Lines changed: 4 additions & 1 deletion
@@ -6,7 +6,10 @@
 
 ## Sleep state invariants
 
-- Any mutation that changes a `can_sleep` input must call `ActorContext::reset_sleep_timer()` so the `ActorTask` sleep deadline is re-evaluated. Inputs are: `ready`/`started`, `prevent_sleep`, `no_sleep`, `active_http_request_count`, `sleep_keep_awake_count`, `sleep_internal_keep_awake_count`, `pending_disconnect_count`, `conns()`, and `websocket_callback_count`. Missing this call leaves the sleep timer armed against stale state and triggers the `"sleep idle deadline elapsed but actor stayed awake"` warning on the next tick.
+- Any mutation that changes a `can_sleep` input must call `ActorContext::reset_sleep_timer()` so the `ActorTask` sleep deadline is re-evaluated. Inputs are: `ready`/`started`, `no_sleep`, `active_http_request_count`, `sleep_keep_awake_count`, `sleep_internal_keep_awake_count`, `pending_disconnect_count`, `conns()`, and `websocket_callback_count`. Missing this call leaves the sleep timer armed against stale state and triggers the `"sleep idle deadline elapsed but actor stayed awake"` warning on the next tick.
+- `ActorContext::set_prevent_sleep(...)` / `prevent_sleep()` are deprecated no-ops kept for NAPI bridge compatibility. Use `keep_awake(future)` (holds counter while awaited) or `wait_until(future)` (tracked shutdown task) instead. Do not reintroduce a `prevent_sleep` field, a `CanSleep::PreventSleep` variant, or branches that read it.
+- `ctx.sleep()` and `ctx.destroy()` return `Result<()>`. They error with `ActorLifecycleError::Starting` when called before startup completes and `ActorLifecycleError::Stopping` if the requested flag has already been set this generation (atomic `swap(true, ...)`). Internal idle-timer paths log and suppress the already-requested error.
+- The grace deadline path (`on_sleep_grace_deadline`) aborts the user `run` handle and cancels `shutdown_deadline_token()`. Foreign-runtime adapters running `onSleep` / `onDestroy` must observe that token via `tokio::select!` so SQLite teardown does not race user cleanup work.
 - Counter `register_zero_notify(&idle_notify)` hooks only drive shutdown drain waits. They are not a substitute for the activity-dirty notification, so any new sleep-affecting counter must also notify on transitions that change `can_sleep`.
 - A clean `run` exit while `Started` is not terminal. Keep the generation alive until the guaranteed `Stop` drives `SleepGrace` or `DestroyGrace`, and only treat `Terminated` as "grace hooks already completed."
 - Do not reply to actor startup until the runtime adapter has acknowledged its startup preamble. Otherwise `getOrCreate` can race the first action against `onWake` or `run` startup.
website/src/content/docs/actors/lifecycle.mdx
Lines changed: 26 additions & 14 deletions
@@ -285,7 +285,7 @@ The `run` hook is called after the actor starts and runs in the background witho
 The handler exposes `c.aborted` for loop checks and `c.abortSignal` for canceling operations when the actor is stopping. You should always check or listen for shutdown to exit gracefully.
 
 **Important behavior:**
-- The actor may go to sleep at any time during the `run` handler. Use `c.setPreventSleep(true)` while work is active, then clear it with `c.setPreventSleep(false)` once the actor can sleep again.
+- The actor may go to sleep at any time during the `run` handler. Wrap work that must keep the actor awake with `c.keepAwake(promise)` to block idle sleep until the promise settles.
 - If the `run` handler exits (returns), the actor follows its normal idle sleep timeout once it becomes idle
 - If the `run` handler throws an error, the actor logs the error and then follows its normal idle sleep timeout once it becomes idle
 - On shutdown, `c.abortSignal` fires so the `run` handler can exit within the graceful shutdown window.
@@ -725,7 +725,7 @@ An actor is considered idle and eligible to sleep when **all** of the following
 - No active HTTP requests
 - No active connections (unless they are hibernatable WebSockets)
 - No active `run` handler (unless it is waiting on a queue)
-- `setPreventSleep` is not enabled
+- No outstanding `c.keepAwake(promise)` promises
 - No pending disconnect callbacks
 - No async `onWebSocket` event handlers (eg `open`, `message`, `close`) still running
 
@@ -737,13 +737,18 @@ Outbound requests (e.g. `fetch` calls) do not count as activity and will not kee
 
 The platform may force an actor to migrate to a new machine during version upgrades or when a serverless request is about to timeout. The same [shutdown sequence](#shutdown-sequence) runs, then the actor is rescheduled on a new machine and wakes up with its persisted state.
 
-Use `onSleep`, `waitUntil`, or `setPreventSleep` to control the length of the grace period before the actor moves to another machine.
+Use `onSleep`, `waitUntil`, or `keepAwake` to control the length of the grace period before the actor moves to another machine.
 
-### Preventing Sleep
+### Keeping the Actor Awake
 
-If actor state says the actor should stay awake, call `c.setPreventSleep(true)` and clear it once the actor can sleep again. You can read `c.preventSleep` to inspect the current flag.
+RivetKit gives you two primitives for holding the actor awake across background work. Both take a `Promise` and differ in how they interact with idle sleep and the grace period.
 
-`setPreventSleep` blocks normal idle sleep until you clear it. It is not a platform-wide stop blocker though. If shutdown has already started, RivetKit waits for `preventSleep` to clear within the same `sleepGracePeriod` shutdown budget used by `onSleep` and `waitUntil`.
+| Method | Accepts | Blocks idle sleep | Blocks grace finalize | Use case |
+| --- | --- | --- | --- | --- |
+| `c.keepAwake(promise)` | `Promise<T>` (returns same promise) | Yes | Yes | Critical work that must keep the actor running end to end (for example a turn in a game, an ongoing tool call). |
+| `c.waitUntil(promise)` | `Promise<unknown>` (returns void) | No | Yes | Best-effort finalization work that may complete during the grace window (for example analytics flushes, cleanup writes). |
+
+`c.keepAwake(promise)` is the preferred primitive for long-running work the actor should not sleep through. It holds a keep-awake counter until the promise settles, which blocks both idle sleep and the grace finalize step. The promise is returned unchanged, so you can `await` it if you need the value.
+`setPreventSleep(enabled)` is deprecated and now a no-op. Wrap the work you want to keep alive with `c.keepAwake(promise)` instead.
+</Note>
+
 ### On Sleep Hook
 
 The [`onSleep`](#onsleep) hook runs during shutdown for cleanup like clearing intervals or closing connections. It is best-effort and will not run if the actor crashes.
-The actor waits up to `sleepGracePeriod` for graceful sleep work during the [shutdown sequence](#shutdown-sequence). That single budget covers `onSleep`, `waitUntil`, async raw WebSocket handlers such as `message` and `close`, and waiting for `preventSleep` to clear after shutdown has started. By default, this graceful sleep window is 15 seconds total. If the timeout is exceeded, the actor proceeds with sleep anyway.
+The actor waits up to `sleepGracePeriod` for graceful sleep work during the [shutdown sequence](#shutdown-sequence). That single budget covers `onSleep`, `waitUntil`, `keepAwake`, and async raw WebSocket handlers such as `message` and `close`. By default, this graceful sleep window is 15 seconds total. If the timeout is exceeded, the actor proceeds with sleep anyway.
 
 ### Sleep Timeouts
 
 | Option | Default | Description |
 |--------|---------|-------------|
 | `sleepTimeout` | 30 seconds | Time of inactivity before the actor begins sleeping. |
-| `sleepGracePeriod` | 15 seconds | Total graceful shutdown window for hooks, `waitUntil`, async raw WebSocket handlers, disconnects, and waiting for `preventSleep` to clear. |
+| `sleepGracePeriod` | 15 seconds | Total graceful shutdown window for hooks, `waitUntil`, `keepAwake`, async raw WebSocket handlers, and disconnects. |
 
 Rivet enforces a hard limit of **30 minutes** for the entire stop process. These can be configured in the [options](#options).