Skip to content

fix(rivetkit): drain native sleep side tasks reliably#4782

Merged
NathanFlurry merged 1 commit intomainfrom
driver-fixes/native-sleep-side-task-drain
Apr 27, 2026
Merged

fix(rivetkit): drain native sleep side tasks reliably#4782
NathanFlurry merged 1 commit intomainfrom
driver-fixes/native-sleep-side-task-drain

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Copy link
Copy Markdown
Member Author

NathanFlurry commented Apr 26, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

PR Review: fix(rivetkit): drain native sleep side tasks reliably

Summary

The core correctness fix — making waitUntil registration synchronous — is solid and closes a real race where the WaitUntil shutdown counter could be evaluated before tasks were registered, letting the actor sleep with in-flight work undrained. Several bundled changes (database lifecycle simplification, setPreventSleep deprecation) deserve individual scrutiny.


Core Fix: waitUntil Made Synchronous ✅

The change from pub async fn wait_until to pub fn wait_until (and callNativecallNativeSync in JS) is the right fix. track_shutdown_task is now called immediately, incrementing shutdown_task_count before waitUntil returns, which correctly gates can_finalize_sleep. The old code acknowledged this race: "increment of the shutdown_counter happens on first poll of the Rust future."

The .d.ts change (Promise<void> to void) correctly matches the new Rust signature. Rethrowing non-task-registration errors instead of silently logging them aligns with the fail-by-default rule.


Issues

1. setPreventSleep/preventSleep warnings will spam logs for agent-os

The two previously-silent no-ops now emit logger().warn(...) on every call. The agent-os actor calls syncPreventSleep across at least 10 call sites in agent-os/actor/{process,shell,session,index}.ts — on every process start/stop, session event, hook run, and shell operation. This will generate per-call warning spam in production.

Options:

  • Emit the warning only once per actor instance via a flag in runtime state
  • Rely on the @deprecated JSDoc annotation alone and remove the runtime warn calls
  • Update agent-os in this same PR to use waitUntil/keepAwake so no warnings fire at all

2. Narrowed onSleep condition may leave JS database wrapper un-cleared between sleep/wake

Old: typeof config.onSleep === 'function' || databaseProvider !== undefined
New: typeof config.onSleep === 'function'

Actors with a databaseProvider but no explicit onSleep no longer get the JS-side closeDatabase(false) cleanup on sleep. The underlying Rust VFS handle is cleaned up by core, and onMigrate resets the client on wake — so this is functionally safe. However, runtimeState.databaseClient is not cleared between sleep and the next wake's onMigrate, leaving a stale wrapper pointing at a closed native handle for that window.

3. saveState placement in onSleep callback

If config.onSleep throws, saveState is skipped (it is after the onSleep call inside try, not in finally). Core runs save_final_state() unconditionally so state is not lost, but this looks like an unintentional double-write in the normal path and a silent skip in the error path. Either move saveState to finally, or remove it entirely and rely on core's save_final_state().

4. JS runtime state unref removed — leak grows with sleep cycles

The old code called callNativeSync(() => ctx.clearRuntimeState()) in both sleep and destroy paths. The new code removes these, relying on reset_runtime_shared_state() at wake time via mem::forget. For long-running hosts with many sleep cycles the leak grows linearly. If this tradeoff is intentional, document it at the mem::forget site.


Positive Changes

  • Core waitUntil race fix is correct and well-targeted
  • Error rethrowing in wait_until matches fail-by-default
  • DatabaseProvider.onDestroy removal simplifies lifecycle (@experimental makes the breaking change acceptable)
  • provider field removal from NativeDatabaseClientState is a clean simplification
  • Test cleanup is consistent with removing DatabaseProvider.onDestroy

Minor

PR description is empty and checklist items are unchecked. The behavioral changes (setPreventSleep warnings, narrowed onSleep condition, new saveState call) should be described so reviewers can assess intent vs. implementation.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 26, 2026

Preview packages published to npm

Install with:

npm install rivetkit@pr-4782

All packages published as 0.0.0-pr.4782.2c31448 with tag pr-4782.

Engine binary is shipped via @rivetkit/engine-cli on linux-x64-musl, linux-arm64-musl, darwin-x64, and darwin-arm64. Windows users should use the release installer or set RIVET_ENGINE_BINARY.

Docker images:

docker pull rivetdev/engine:slim-2c31448
docker pull rivetdev/engine:full-2c31448
Individual packages
npm install rivetkit@pr-4782
npm install @rivetkit/react@pr-4782
npm install @rivetkit/rivetkit-napi@pr-4782
npm install @rivetkit/workflow-engine@pr-4782

@NathanFlurry NathanFlurry force-pushed the driver-fixes/native-sleep-side-task-drain branch from efe8a6e to af3430b Compare April 26, 2026 20:54
@NathanFlurry NathanFlurry force-pushed the driver-fixes/raw-request-routing branch from a4b1187 to 0d560ef Compare April 26, 2026 20:54
@NathanFlurry NathanFlurry force-pushed the driver-fixes/native-sleep-side-task-drain branch from af3430b to 61a843e Compare April 27, 2026 01:06
@NathanFlurry NathanFlurry force-pushed the driver-fixes/raw-request-routing branch from 0d560ef to 2a7294b Compare April 27, 2026 01:06
@NathanFlurry NathanFlurry force-pushed the driver-fixes/native-sleep-side-task-drain branch from 61a843e to ca7b637 Compare April 27, 2026 01:17
@NathanFlurry NathanFlurry force-pushed the driver-fixes/native-sleep-side-task-drain branch from ca7b637 to cb5e5d4 Compare April 27, 2026 01:47
@NathanFlurry NathanFlurry force-pushed the driver-fixes/raw-request-routing branch from 158a1f1 to 25dd07c Compare April 27, 2026 01:47
@NathanFlurry NathanFlurry force-pushed the driver-fixes/native-sleep-side-task-drain branch from cb5e5d4 to 9138a29 Compare April 27, 2026 02:19
Base automatically changed from driver-fixes/raw-request-routing to main April 27, 2026 07:13
@NathanFlurry NathanFlurry merged commit c5f8855 into main Apr 27, 2026
9 of 28 checks passed
@NathanFlurry NathanFlurry deleted the driver-fixes/native-sleep-side-task-drain branch April 27, 2026 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant