Skip to content

fix(rivetkit-wasm): fix mem leaks#4880

Draft
NathanFlurry wants to merge 15 commits into04-29-chore_rivetkit_wasm_supportfrom
05-02-fix_rivetkit-wasm_fix_mem_leaks
Draft

fix(rivetkit-wasm): fix mem leaks#4880
NathanFlurry wants to merge 15 commits into04-29-chore_rivetkit_wasm_supportfrom
05-02-fix_rivetkit-wasm_fix_mem_leaks

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Copy link
Copy Markdown
Member Author

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@NathanFlurry NathanFlurry mentioned this pull request May 2, 2026
11 tasks
@claude
Copy link
Copy Markdown

claude Bot commented May 2, 2026

Code Review: PR #4880 — fix(rivetkit-wasm): fix mem leaks

This PR addresses correctness, durability, and memory-leak issues in the wasm and NAPI runtime adapters. The scope is well-bounded, organized into 14 named user stories. Overall quality is high, but there are a few issues worth addressing before merge.


What the PR does

  1. US-001WasmActorContext::request_save_and_wait now actually awaits the save instead of fire-and-forgetting it.
  2. US-002 / US-011BRIDGE_RIVET_ERROR_SCHEMAS global cache interns RivetErrorSchema by (group, code) so Box::leak is amortized to at most once per unique error code, not once per error instance.
  3. US-003engine_binary_path is rejected with a structured wasm.invalid_config error before any core side-effects run.
  4. US-004WasmActorContext::register_task now calls rivetkit_core::ActorContext::register_task (shutdown-deadline-aware) instead of wait_until.
  5. US-005 — HTTP status promotion (auth.forbidden → 403, etc.) moves from duplicated TS promotion tables into rivetkit_core::error::public_error_status_code.
  6. US-006wasm-pack is pinned as a devDependency instead of fetched via npx -y wasm-pack.
  7. US-007 — New runtime-parity.test.ts drives both NapiCoreRuntime and WasmCoreRuntime with fake bindings through the shared buildNativeFactory glue.
  8. US-008 / US-009 — Per-call Box::leak in action_not_found and structured_timeout_schema replaced with a static constant and an scc::HashMap-backed intern cache.
  9. US-010websocket_callback_regions converted from a Vec<Option<...>> that never shrinks to a HashMap<u32, ...> with entries removed on end_websocket_callback.
  10. US-012register_task shutdown drain is bounded against shutdown_deadline_token.
  11. US-013begin_websocket_callback wraps the u32 ID counter instead of panicking on overflow.
  12. US-014 — CLAUDE.md / napi-bridge.md updated with BRIDGE_RIVET_ERROR_SCHEMAS cache rules and mem::forget bounded-leak notes.

Issues

1. wasm.invalid_config.json is missing a trailing newline

engine/artifacts/errors/wasm.invalid_config.json ends without a newline. Every other artifact JSON in that directory ends with one. This will cause git diff noise on any future touch of the file.

2. WasmServeConfig gained serde::Serialize only to support a wasm32 test; the derive leaks into the production struct

WasmServeConfig now derives Serialize so the #[cfg(target_arch = "wasm32")] test serve_rejects_engine_binary_path_before_core_setup can call serde_wasm_bindgen::to_value. This is a test-only need and should be #[cfg(test)] gated, or the test should construct the JsValue through a #[cfg(test)] wrapper that derives Serialize on a parallel struct.

3. websocket_callback_regions_are_removed_after_end assertion will false-pass at wrapping

The test asserts region_id > previous_id on every iteration. The loop runs only 1000 times so the counter never wraps in practice, but the assertion would silently hold false at u32::MAX → 1. The wrap test (websocket_callback_region_ids_wrap_without_collision) already covers wrap-without-collision — consider replacing the > assertion with != 0 (the sentinel check) or dropping it.

4. anyhow_to_bridge_rivet_error_payload in rivetkit-wasm has no diagnostic logging

The NAPI equivalent emits tracing::error! with group, code, message, metadata, error_chain, public_, and status_code every time a bridge error is encoded. The wasm version has no equivalent. Because tracing is unavailable in wasm, at minimum a console.error call via web_sys::console::error_1 (or gloo_console::error!) should be added here for parity, so Cloudflare Workers operators have the same diagnostic output as NAPI operators. This is the single chokepoint through which all actor errors flow on the wasm path.

5. STRUCTURED_TIMEOUT_SCHEMAS intern cache default_message is first-write-wins; this should be documented

structured_timeout_schema interns schemas by (group, code) using entry_sync. The default_message is captured from the first call for a given pair; all subsequent calls with the same pair reuse that schema even if they pass a different message. The per-error message is correctly set on RivetTransportError::message so runtime behavior is correct, but a comment should call out this "first-wins default_message, per-call message" split to prevent future confusion.

6. Pinned wasm-pack@0.14.0 brings in deprecated transitive dependencies

wasm-pack@0.14.0 depends on binary-install@1.1.2 (deprecated) which pulls in axios@0.26.1 and a tar version with publicized security vulnerabilities. Since this is a devDependency it never ships in production, but it will surface in pnpm audit gates. Consider a newer wasm-pack pin if one exists without these transitive deps, or add an explicit pnpm audit exemption with a scope comment.

7. should_promote guard behavior when bridge context has a non-500 status code should be documented

In both rivetkit-napi/src/lib.rs and rivetkit-wasm/src/lib.rs, when bridge_context.public_ = Some(true) and bridge_context.status_code is a non-500 value, should_promote is false and the bridge's code wins over the promotion table. This is probably intentional (trust an explicitly-set bridge status over the core table), but a comment adjacent to the should_promote block would prevent someone from "fixing" it in the future.

8. Minor: WasmActorContext clones share the Rc<Cell<u32>> ID counter — worth a comment

WasmActorContext derives Clone. next_websocket_callback_region_id is Rc<Cell<u32>>, so two clones share both the websocket_callback_regions map and the ID counter. This is intentional and correct for wasm's single-threaded model, but a brief comment would help future readers who might question why Rc is used here instead of a value type.


Positive observations

  • Consolidating the status-promotion table into a single public_error_status_code function in rivetkit-core removes a class of drift bugs.
  • The scc::HashMap-backed intern caches for BRIDGE_RIVET_ERROR_SCHEMAS and STRUCTURED_TIMEOUT_SCHEMAS are idiomatic with the project's conventions.
  • run_registered_task correctly races the user future against shutdown_deadline_token() and logs a structured warning with actor_id and reason fields on deadline.
  • The allocate_websocket_callback_region_id wrap + skip-sentinel-0 handling is correct and covered by a matching wrap test.
  • Using UserTaskKind::RegisteredTask as a distinct variant rather than reusing WaitUntil respects the project's "never use _ =>" rule.
  • The wasm-pack bin resolution via createRequire + require.resolve("wasm-pack/package.json") is a clean offline-friendly pattern with an explicit error on a missing dependency.
  • The parity tests use deterministic Gate/Promise sequencing with no polling, consistent with the vi.waitFor prohibition.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 2, 2026

Preview packages published to npm

Install with:

npm install rivetkit@pr-4880

All packages published as 0.0.0-pr.4880.34d3f72 with tag pr-4880.

Engine binary is shipped via @rivetkit/engine-cli on linux-x64-musl, linux-arm64-musl, darwin-x64, and darwin-arm64. Windows users should use the release installer or set RIVET_ENGINE_BINARY.

Docker images:

docker pull rivetdev/engine:slim-34d3f72
docker pull rivetdev/engine:full-34d3f72
Individual packages
npm install rivetkit@pr-4880
npm install @rivetkit/react@pr-4880
npm install @rivetkit/rivetkit-napi@pr-4880
npm install @rivetkit/workflow-engine@pr-4880

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant