fix(rivetkit): decode legacy v4 actor schedule args#4976

Merged
NathanFlurry merged 1 commit into main from actor-persist/legacy-v4-args-fallback
May 5, 2026

Conversation

NathanFlurry commented May 5, 2026

Stack Context

This PR mitigates a startup failure for persisted Rivet Actors, caused by two different actor-persist layouts having been written under the same version number (v4).

What?

Adds a v4 decode fallback for the short-lived legacy layout in which a scheduled event's args field was encoded as raw BARE data, then normalizes the decoded value to the current optional<Cbor> shape.

The current-v4 path is only accepted when it round-trips to canonical bytes. This avoids accidentally accepting legacy raw-v4 payloads that serde_bare can decode because nonzero bool bytes are treated as true.
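
The round-trip guard can be illustrated in isolation. The following is a minimal stand-alone sketch, not the code from this PR: it replaces serde_bare with a hand-rolled lenient decoder for a single optional byte-string field, handles only one-byte lengths, and every name in it (`Args`, `decode_lenient`, `encode`, `decode_canonical`) is hypothetical.

```rust
/// Hypothetical stand-in for the current `args` shape: an optional byte string.
type Args = Option<Vec<u8>>;

/// Lenient decoder: any nonzero tag byte is treated as "present", which is the
/// behaviour the round-trip check is meant to guard against.
/// Assumption: lengths fit in one byte (< 0x80); real BARE lengths are
/// variable-length integers.
fn decode_lenient(input: &[u8]) -> Result<Args, String> {
    let (&tag, rest) = input.split_first().ok_or("unexpected end of input")?;
    if tag == 0 {
        return if rest.is_empty() {
            Ok(None)
        } else {
            Err("trailing bytes".to_string())
        };
    }
    let (&len, body) = rest.split_first().ok_or("unexpected end of input")?;
    if len >= 0x80 {
        return Err("multi-byte lengths are outside this sketch".to_string());
    }
    if body.len() != usize::from(len) {
        return Err("length mismatch".to_string());
    }
    Ok(Some(body.to_vec()))
}

/// Canonical encoder: always writes the tag as exactly 0 or 1.
fn encode(args: &Args) -> Vec<u8> {
    match args {
        None => vec![0],
        Some(bytes) => {
            let mut out = vec![1, bytes.len() as u8];
            out.extend_from_slice(bytes);
            out
        }
    }
}

/// Accepts the lenient decode only if re-encoding reproduces the input bytes.
fn decode_canonical(input: &[u8]) -> Option<Args> {
    let decoded = decode_lenient(input).ok()?;
    (encode(&decoded).as_slice() == input).then_some(decoded)
}
```

In this sketch the bytes `02 01 AA` decode leniently to `Some([AA])` but re-encode as `01 01 AA`, so `decode_canonical` rejects them and a caller would move on to the legacy decoder.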

Adds raw-byte regression tests for:

  • The bad 01 80 schedule-args payload found in production.
  • A legacy payload that the current v4 decoder can accept noncanonically.
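
One plausible reading of why `01 80` breaks the current decoder (an interpretation, since the thread does not spell it out): in the legacy layout `01` is the data length and `80` is the single payload byte (0x80 is the CBOR encoding of an empty array), whereas in the current layout `01` is the optional tag and `80` starts a variable-length integer whose continuation bit demands more bytes. The sketch below uses hand-rolled readers rather than serde_bare, looks at the two bytes in isolation rather than inside a full actor record, and all function names are hypothetical.

```rust
/// Reads an unsigned LEB128 integer, the variable-length form BARE uses for
/// lengths. Returns the value and the remaining input.
fn read_uint(input: &[u8]) -> Result<(u64, &[u8]), &'static str> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Ok((value, &input[i + 1..]));
        }
    }
    if input.len() >= 10 {
        Err("uint too long")
    } else {
        Err("unexpected end of input")
    }
}

/// Reads a length-prefixed byte string.
fn read_data(input: &[u8]) -> Result<(Vec<u8>, &[u8]), &'static str> {
    let (len, rest) = read_uint(input)?;
    if (rest.len() as u64) < len {
        return Err("unexpected end of input");
    }
    let (body, rest) = rest.split_at(len as usize);
    Ok((body.to_vec(), rest))
}

/// Legacy layout (assumed): `args` is a plain length-prefixed byte string.
fn decode_legacy_args(input: &[u8]) -> Result<Vec<u8>, &'static str> {
    read_data(input).map(|(bytes, _rest)| bytes)
}

/// Current layout (assumed): a tag byte, then a byte string if the tag is nonzero.
fn decode_current_args(input: &[u8]) -> Result<Option<Vec<u8>>, &'static str> {
    match input.split_first() {
        None => Err("unexpected end of input"),
        Some((&0, _)) => Ok(None),
        Some((_, rest)) => read_data(rest).map(|(bytes, _rest)| Some(bytes)),
    }
}
```

Under these assumptions, `decode_legacy_args(&[0x01, 0x80])` yields the one-byte payload `[0x80]`, while `decode_current_args` on the same bytes runs out of input while reading the length.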

Why?

Some production actors were written by an older v4 writer and later decoded by the fixed v4 schema. Because the persisted version stayed 4, startup failed with unexpected end of input when decoding scheduled event args.

Verification

  • cargo test -p rivetkit-actor-persist passed.

railway-app Bot commented May 5, 2026

🚅 Deployed to the rivet-pr-4976 environment in rivet-frontend

| Service | Status | Web | Updated (UTC) |
|---|---|---|---|
| kitchen-sink | ❌ Build Failed (View Logs) | Web | May 5, 2026 at 1:40 pm |
| website | 😴 Sleeping (View Logs) | Web | May 5, 2026 at 1:14 pm |
| frontend-inspector | ❌ Build Failed (View Logs) | Web | May 5, 2026 at 1:06 pm |
| mcp-hub | ✅ Success (View Logs) | Web | May 5, 2026 at 1:06 pm |
| ladle | ❌ Build Failed (View Logs) | Web | May 5, 2026 at 1:05 pm |
| frontend-cloud | ❌ Build Failed (View Logs) | Web | May 5, 2026 at 1:05 pm |

@railway-app railway-app Bot temporarily deployed to rivet-frontend / rivet-pr-4976 May 5, 2026 13:04 Destroyed

NathanFlurry commented May 5, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

claude Bot commented May 5, 2026

Code Review: fix(rivetkit): decode legacy v4 actor schedule args

Overview

Targeted production hotfix adding a fallback decoder for v4 actor persist payloads written by a short-lived buggy writer that stored ScheduleEvent.args as raw data instead of optional Cbor. The fix is well-scoped and the approach is sound.


What Works Well

  • Canonicality round-trip check is the right way to guard against serde_bare accepting nonzero bools as true. The logic is clean and correct.
  • Local struct definitions (LegacyRawV4Actor, LegacyRawV4ScheduleEvent) inside the function are appropriate here - single-use legacy types that should not pollute the module namespace.
  • Field mapping is correct: BARE is positionally encoded, so renaming timestamp_ms in the legacy struct is cosmetic-only and does not affect decoding.
  • Empty args handling: (!event.args.is_empty()).then_some(event.args) correctly normalizes an empty legacy raw payload to None.
  • Tests use raw bytes from actual production payloads, which is the right approach for format regression tests.
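
The empty-args normalization mentioned above relies on `bool::then_some` from the standard library. A minimal sketch of the idiom on its own, with a hypothetical function name and a plain `Vec<u8>` standing in for the legacy args type:

```rust
/// Maps a legacy raw byte string onto the optional shape:
/// an empty payload becomes `None`, anything else is wrapped in `Some`.
fn normalize_legacy_args(args: Vec<u8>) -> Option<Vec<u8>> {
    (!args.is_empty()).then_some(args)
}
```

Because `then_some` takes its argument by value, the vector is moved into the `Option` without copying.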

Issues / Suggestions

1. Ambiguous error when v4 decodes non-canonically and legacy also fails

When current v4 decodes successfully but is non-canonical (current_error = None) and the legacy decoder also fails, the returned error is legacy_error. This gives no signal that v4 did decode but failed the round-trip check. A tracing::warn! when the non-canonical case is hit would make the path visible in production logs.
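
One way to make that case distinguishable, sketched here with a dedicated error variant rather than the suggested log line. This is an illustration under assumptions, not the PR's implementation: `DecodeFailure`, `decode_with_fallback`, and the `toy_*` codec (a single lenient bool byte) are all hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum DecodeFailure {
    /// The current decoder failed outright, and so did the legacy decoder.
    BothFailed { current: String, legacy: String },
    /// The current decoder succeeded but its output did not round-trip,
    /// and the legacy decoder then failed.
    NonCanonicalThenLegacyFailed { legacy: String },
}

/// Tries the current decoder with a round-trip check, then the legacy decoder,
/// and records which of the two failure shapes occurred.
fn decode_with_fallback<T>(
    input: &[u8],
    decode_current: impl Fn(&[u8]) -> Result<T, String>,
    encode_current: impl Fn(&T) -> Vec<u8>,
    decode_legacy: impl Fn(&[u8]) -> Result<T, String>,
) -> Result<T, DecodeFailure> {
    let current_error = match decode_current(input) {
        Ok(value) if encode_current(&value).as_slice() == input => return Ok(value),
        Ok(_) => None, // decoded, but not canonical
        Err(err) => Some(err),
    };
    decode_legacy(input).map_err(|legacy| match current_error {
        Some(current) => DecodeFailure::BothFailed { current, legacy },
        None => DecodeFailure::NonCanonicalThenLegacyFailed { legacy },
    })
}

// Toy codec used only to exercise the function above.
fn toy_decode_current(input: &[u8]) -> Result<bool, String> {
    match input {
        [byte] => Ok(*byte != 0), // lenient: any nonzero byte is `true`
        _ => Err("expected exactly one byte".to_string()),
    }
}

fn toy_encode_current(value: &bool) -> Vec<u8> {
    vec![u8::from(*value)]
}

fn toy_decode_legacy(_input: &[u8]) -> Result<bool, String> {
    Err("not a legacy payload".to_string())
}
```

With the toy codec, the input `02` decodes as `true` but re-encodes as `01`, so the caller receives `NonCanonicalThenLegacyFailed` instead of a bare legacy error.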

2. Missing happy-path and error-path test coverage

The two tests cover the production legacy payload and the non-canonical v4 acceptance case. Missing:

  • A canonical current-v4 payload that should decode via the fast path without hitting the legacy fallback. This ensures the round-trip check correctly admits valid current actors without regressing them.
  • An invalid payload that should fail both decoders, to verify the error path returns something sensible.
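
The two missing cases might look roughly like the assertions below. This is a self-contained sketch against a hypothetical miniature codec, not the crate's real types: it models only the args field, handles one-byte lengths, and uses a strict tag check in place of the round-trip comparison to keep the example short.

```rust
#[derive(Debug, PartialEq)]
enum Path {
    Current,
    Legacy,
}

/// Splits a length-prefixed byte string off the front of `input`.
/// Assumption: lengths fit in one byte (< 0x80).
fn read_data(input: &[u8]) -> Option<(&[u8], &[u8])> {
    let (&len, rest) = input.split_first()?;
    if len >= 0x80 || rest.len() < usize::from(len) {
        return None;
    }
    Some(rest.split_at(usize::from(len)))
}

/// Current layout (assumed): tag byte 0 or 1, then data when the tag is 1.
fn decode_current(input: &[u8]) -> Option<Option<Vec<u8>>> {
    match input.split_first()? {
        (&0, rest) if rest.is_empty() => Some(None),
        (&1, rest) => match read_data(rest)? {
            (body, tail) if tail.is_empty() => Some(Some(body.to_vec())),
            _ => None,
        },
        _ => None,
    }
}

/// Legacy layout (assumed): bare data, with an empty payload normalized to `None`.
fn decode_legacy(input: &[u8]) -> Option<Option<Vec<u8>>> {
    let (body, tail) = read_data(input)?;
    if !tail.is_empty() {
        return None;
    }
    Some((!body.is_empty()).then(|| body.to_vec()))
}

/// Tries the current layout first and reports which path produced the value.
fn decode_args(input: &[u8]) -> Option<(Path, Option<Vec<u8>>)> {
    decode_current(input)
        .map(|args| (Path::Current, args))
        .or_else(|| decode_legacy(input).map(|args| (Path::Legacy, args)))
}

fn check_missing_cases() {
    // Canonical current payload: must be admitted by the fast path.
    assert_eq!(
        decode_args(&[0x01, 0x01, 0xAA]),
        Some((Path::Current, Some(vec![0xAA])))
    );
    // Invalid under both layouts: claims five bytes but carries only one.
    assert_eq!(decode_args(&[0x05, 0xAA]), None);
}
```

Returning the path alongside the value is a choice made for this sketch so that a test can assert the fast path was actually taken; the real decoder may expose no such signal.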

3. serde_bare::to_vec failure is silently treated as non-canonical

If serialization fails (e.g. OOM), is_ok_and returns false and decoding falls through to the legacy decoder. This is safe but silently discards the serialization error. The tracing::warn! suggested in point 1 would cover this case too.
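
The effect of `Result::is_ok_and` here can be shown with the standard library alone. In this sketch the re-encoding result is passed in as a plain `Result<Vec<u8>, String>`, and both function names and the `Canonicality` enum are hypothetical:

```rust
/// Same shape as the check under discussion: a serialization error collapses
/// to `false` and is indistinguishable from a byte mismatch.
fn is_canonical_lossy(reencoded: Result<Vec<u8>, String>, input: &[u8]) -> bool {
    reencoded.is_ok_and(|bytes| bytes.as_slice() == input)
}

#[derive(Debug, PartialEq)]
enum Canonicality {
    Canonical,
    NonCanonical,
    EncodeFailed(String),
}

/// Alternative that keeps the three outcomes apart, so a caller could log or
/// report the serialization error before falling back.
fn classify(reencoded: Result<Vec<u8>, String>, input: &[u8]) -> Canonicality {
    match reencoded {
        Ok(bytes) if bytes.as_slice() == input => Canonicality::Canonical,
        Ok(_) => Canonicality::NonCanonical,
        Err(err) => Canonicality::EncodeFailed(err),
    }
}
```

Both functions agree on when the fast path is taken; only `classify` preserves the reason it was not.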


Summary

The fix is correct and safe for the stated production issue. The three points above are minor: the logging suggestion would improve debuggability, and the additional test cases would harden the regression suite. None are blockers for a production mitigation. LGTM with the logging suggestion as a nice-to-have.

github-actions Bot commented May 5, 2026

Preview packages published to npm

Install with:

npm install rivetkit@pr-4976

All packages published as 0.0.0-pr.4976.554eccf with tag pr-4976.

Engine binary is shipped via @rivetkit/engine-cli on linux-x64-musl, linux-arm64-musl, darwin-x64, and darwin-arm64. Windows users should use the release installer or set RIVET_ENGINE_BINARY.

Docker images:

docker pull rivetdev/engine:slim-554eccf
docker pull rivetdev/engine:full-554eccf
Individual packages
npm install rivetkit@pr-4976
npm install @rivetkit/react@pr-4976
npm install @rivetkit/rivetkit-napi@pr-4976
npm install @rivetkit/workflow-engine@pr-4976

@NathanFlurry NathanFlurry force-pushed the engine-pools/install-rustls-provider branch from 25f115a to e745719 Compare May 5, 2026 13:39
@NathanFlurry NathanFlurry force-pushed the actor-persist/legacy-v4-args-fallback branch from 33cc9e7 to 1f35f57 Compare May 5, 2026 13:40
@railway-app railway-app Bot temporarily deployed to rivet-frontend / rivet-pr-4976 May 5, 2026 13:40 Destroyed
Base automatically changed from engine-pools/install-rustls-provider to main May 5, 2026 14:58
@NathanFlurry NathanFlurry merged commit 1f35f57 into main May 5, 2026
9 of 13 checks passed
@NathanFlurry NathanFlurry deleted the actor-persist/legacy-v4-args-fallback branch May 5, 2026 14:58