Skip to content

fix(ocap-kernel): regenerate incarnationId on resetStorage=true#950

Merged
FUDCo merged 2 commits into
mainfrom
chip/fix-incarnation-survives-resetStorage
May 12, 2026
Merged

fix(ocap-kernel): regenerate incarnationId on resetStorage=true#950
FUDCo merged 2 commits into
mainfrom
chip/fix-incarnation-survives-resetStorage

Conversation

@FUDCo
Copy link
Copy Markdown
Contributor

@FUDCo FUDCo commented May 12, 2026

Summary

Why

The peer-restart detection added in #948 only works if the sender signals a new incarnation when its state has been wiped. The except-list in `#resetKernelState` was preserving `incarnationId` across `resetStorage=true` resets, so the sender came back signalling the same incarnation, the receiver's handshake said "no restart", `highestReceivedSeq` stayed put, and the sender's fresh `seq=1` messages were dropped as duplicates.

The browser/extension kernel-worker URL carries `?reset-storage=true` and hits this branch on every offscreen-page reload, so in production this meant every plugin reload that re-registered a service silently lost its first message. Repro on metamask-extension chip/agentmask: full reset → first reload registers cleanly → second reload's PMS `registerServiceByRef` arrives at the matcher with seq=1, matcher logs `12D3KooW:: ignoring duplicate message seq=1 (highestReceived=4)` and the message disappears.

Confirmed empirically by dumping the matcher's KV after the second-reload symptom: `peerIncarnation.` was equal to the incarnation the extension was currently sending — exactly the case `#handleIncarnationChange` exits early on. The `provideIncarnationId` doc comment already advertised the post-fix behaviour ("regenerated when storage is cleared (when state is actually lost)"); the code just wasn't doing it.

Test plan

  • `yarn workspace @metamask/ocap-kernel test` (2341 passing locally)
  • Production repro on chip/agentmask: full reset, first reload registers, second reload registers again instead of being dropped
  • CI green, including the new e2e `detects restart when sender keeps DB but uses resetStorage=true`

🤖 Generated with Claude Code


Note

Medium Risk
Changes how resetStorage=true resets persistent kernel state, which affects remote-communication handshake/dedup behavior across restarts; mistakes could cause message loss or unexpected peer identity changes.

Overview
Fixes the resetStorage=true reset path to regenerate incarnationId while still preserving network identity keys (keySeed, peerId, ocapURLKey), so peers can reliably detect a state wipe and clear stale seq-dedup/c-list state.

Adds an e2e regression test covering the browser/extension-style restart where the sender keeps the same DB file but starts with resetStorage=true, ensuring seq=1 messages are no longer silently dropped after reloads, and documents the fix in the CHANGELOG.

Reviewed by Cursor Bugbot for commit c5cb8ea. Bugbot is set up for automated code reviews on this repo. Configure here.

The peer-restart detection added in #948 only works if the sender
signals a new incarnation after a state wipe. The previous except-list
in Kernel.#resetKernelState kept `incarnationId` alive across
`resetStorage=true`, so a kernel-worker rebooted with the same
incarnation as before — the receiver's persisted comparison decided
"no restart," `highestReceivedSeq` stayed put, and fresh `seq=1`
messages were dropped as duplicates.

The browser kernel-worker hits this path on every offscreen-page
reload (the URL carries `?reset-storage=true`), so every extension
reload that re-registered a service silently dropped its first
message. Dropping `incarnationId` from the except-list lets
`provideIncarnationId` regenerate it, which is what the doc comment
on that function already advertised.

Adds an e2e regression covering the resetStorage path specifically
(the existing #944 test goes through fresh-DB, which exercises a
different code branch and didn't catch the bug).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@FUDCo FUDCo requested a review from a team as a code owner May 12, 2026 21:30
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 12, 2026

Coverage Report

Status Category Percentage Covered / Total
🔵 Lines 71.43%
🟰 ±0%
8259 / 11561
🔵 Statements 71.26%
🟰 ±0%
8395 / 11780
🔵 Functions 72.25%
🟰 ±0%
1992 / 2757
🔵 Branches 65.09%
🟰 ±0%
3342 / 5134
File Coverage
File Stmts Branches Functions Lines Uncovered Lines
Changed Files
packages/ocap-kernel/src/Kernel.ts 88.39%
🟰 ±0%
77.77%
🟰 ±0%
82.6%
🟰 ±0%
88.39%
🟰 ±0%
294-297, 314, 338, 413-423, 511, 579, 645-648, 661, 671-672, 725, 744
Generated in workflow #4412 for commit c5cb8ea by the Vitest Coverage Report Action

Copy link
Copy Markdown
Contributor

@sirtimid sirtimid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice and easy!

@FUDCo FUDCo added this pull request to the merge queue May 12, 2026
Merged via the queue into main with commit 052f4d4 May 12, 2026
33 checks passed
@FUDCo FUDCo deleted the chip/fix-incarnation-survives-resetStorage branch May 12, 2026 23:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants