Skip to content

Commit e3c96e7

Browse files
Lodekeeperlodekeepernflaig
authored
fix: prevent seed state cache poisoning in loadState clone (#9246)
## Summary Fixes the v1.42.0 "Withdrawal mismatch at index=0" regression by changing the two `clone()` calls in `loadState.ts` to `clone(true)` (i.e. `dontTransferCache=true`). The default `clone()` in `@chainsafe/ssz` transfers the source sub-view's cache to the new instance, which means the `migratedState.validators` sub-view and the seed container's cached child-view snapshot share the **same** internal `nodes[]` and `caches[]` arrays. A subsequent `migratedState.commit()` writes modified validator / inactivity-score nodes into those shared arrays, silently corrupting the seed state's cache snapshot at the modified indices. The corruption stays latent until the seed state is later cloned with transfer-cache enabled — the path `verifyBlock` takes via `preState.clone({dontTransferCache: false})`. On that next-block clone, reads at the modified index return the migrated state's validator instead of the seed's, which surfaces as `Withdrawal mismatch at index=0` divergences between Lodestar and EL. ### Timeline of the corruption ``` // Inside loadState() with seedState = head state: migratedState.validators = seedState.validators.clone(); // ^ migrated.validators.nodes === seedState.caches[validatorsIndex].nodes (shared) for (const i of modifiedValidators) { migratedState.validators.set(i, loadValidator(...)); // staged in viewsChanged } migratedState.commit(); // ^ arrayComposite.js commit(): `this.nodes[index] = node` // writes newValidator into the SHARED nodes[] array. // seedState.caches[validatorsIndex].nodes[modifiedIndex] is now poisoned. ``` On the next block: ``` const preState = headState.clone(); // default dontTransferCache=false // ^ transfers the poisoned caches[] snapshot to preState preState.validators.getReadonly(i); // returns migrated validator, not seed's // -> getExpectedWithdrawals reads wrong validator at index 0 // -> "Withdrawal mismatch at index=0" ``` ### Root cause Introduced by #8857 (`chore: consume BeaconStateView`) which added the `loadOtherState` / shared-head seed path that exercises this clone in production. ## Test plan - [x] New regression test `loadState does not poison seed state's cache` in `packages/state-transition/test/unit/util/loadState.test.ts` - [x] Verified the test FAILS without the fix (reads `0xaa`-filled validator on `postState.clone()`) and PASSES with the fix - [x] All 176 existing `state-transition` util tests pass - [x] `check-types` and `lint` clean ## Relation to #9245 PR #9245 (`fix: gate loadOtherState validators/balances preload behind opt-in`) addresses a different regression from the same #8857-era changes — the eager `getAllReadonlyValues()` preload causing memory spikes on the API path. The two fixes are **independent and both needed**: - #9245 fixes the API-path memory spike. - This PR fixes the correctness bug (withdrawal mismatch). Notably, `persistentCheckpointsCache` also exercises `loadState()` with the problematic `clone()`, so #9245's preload gating alone doesn't cover the full surface. 🤖 Generated with AI assistance --------- Co-authored-by: lodekeeper <lodekeeper@users.noreply.github.com> Co-authored-by: Nico Flaig <nflaig@protonmail.com>
1 parent b05ea98 commit e3c96e7

2 files changed

Lines changed: 74 additions & 6 deletions

File tree

packages/state-transition/src/util/loadState/loadState.ts

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -110,8 +110,8 @@ function loadInactivityScores(
110110
seedState: BeaconStateAltair,
111111
inactivityScoresBytes: Uint8Array
112112
): void {
113-
// migratedState starts with the same inactivityScores to seed state
114-
migratedState.inactivityScores = seedState.inactivityScores.clone();
113+
// true = do not transfer cache
114+
migratedState.inactivityScores = seedState.inactivityScores.clone(true);
115115
const oldValidator = migratedState.inactivityScores.length;
116116
// UintNum64 = 8 bytes
117117
const newValidator = inactivityScoresBytes.length / 8;
@@ -187,8 +187,8 @@ function loadValidators(
187187
const newValidatorCount = Math.floor(newValidatorsBytes.length / VALIDATOR_BYTES_SIZE);
188188
const isMoreValidator = newValidatorCount >= seedValidatorCount;
189189
const minValidatorCount = Math.min(seedValidatorCount, newValidatorCount);
190-
// migrated state starts with the same validators to seed state
191-
migratedState.validators = seedState.validators.clone();
190+
// true = do not transfer cache
191+
migratedState.validators = seedState.validators.clone(true);
192192
// 80% of validators serialization time comes from memory allocation
193193
// seedStateValidatorsBytes is an optimization at beacon-node side to avoid memory allocation here
194194
const seedValidatorsBytes = seedStateValidatorsBytes ?? seedState.validators.serialize();

packages/state-transition/test/unit/util/loadState.test.ts

Lines changed: 70 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,8 @@ import {createChainForkConfig} from "@lodestar/config";
33
import {mainnetChainConfig} from "@lodestar/config/networks";
44
import {ForkName, SLOTS_PER_EPOCH} from "@lodestar/params";
55
import {ssz} from "@lodestar/types";
6-
import {loadStateAndValidators} from "../../../src/util/loadState/loadState.js";
6+
import {BeaconStateAltair} from "../../../src/types.js";
7+
import {loadState, loadStateAndValidators} from "../../../src/util/loadState/loadState.js";
78

89
describe("loadStateAndValidators", () => {
910
const numValidator = 10;
@@ -22,7 +23,7 @@ describe("loadStateAndValidators", () => {
2223
state.slot = slot;
2324
for (let i = 0; i < numValidator; i++) {
2425
const validator = ssz.phase0.Validator.defaultViewDU();
25-
validator.pubkey = Buffer.alloc(48, i);
26+
validator.pubkey = new Uint8Array(48).fill(i);
2627
state.validators.push(validator);
2728
state.balances.push(32 * 1e9);
2829
}
@@ -38,3 +39,70 @@ describe("loadStateAndValidators", () => {
3839
});
3940
}
4041
});
42+
43+
describe("loadState does not poison seed state's cache", () => {
44+
// Regression test for a subtle SSZ TreeViewDU cache-transfer bug. loadState() used to clone the
45+
// seed's validators / inactivityScores subviews with the default transfer-cache semantics.
46+
// That does transfer cache away from the source subview, but the transferred `nodes[]` /
47+
// `caches[]` arrays are still the same ones referenced by the seed container's cached child
48+
// snapshot. A subsequent `migratedState.commit()` writes modified validator nodes into those
49+
// shared arrays, silently corrupting the seed container cache. The corruption is only
50+
// observed when the seed is later cloned with transfer-cache (the path verifyBlock takes via
51+
// the default `preState.clone()`) and the new clone reads a modified index, at which point
52+
// it returns the migrated state's validator instead of the seed's.
53+
//
54+
// Fix: clone the seed's validators/inactivityScores subviews with `clone(true)` so the
55+
// migrated subview gets a fresh (empty) cache and its commit cannot reach into the seed's
56+
// cache arrays.
57+
const numValidator = 10;
58+
const config = createChainForkConfig(mainnetChainConfig);
59+
// altair+ so both validators and inactivityScores paths are exercised
60+
const slot = mainnetChainConfig.ALTAIR_FORK_EPOCH * SLOTS_PER_EPOCH + 100;
61+
const modifiedIndex = 0;
62+
63+
function buildState(mutateFn?: (state: BeaconStateAltair) => void): BeaconStateAltair {
64+
const state = config.getForkTypes(slot).BeaconState.defaultViewDU() as BeaconStateAltair;
65+
state.slot = slot;
66+
for (let i = 0; i < numValidator; i++) {
67+
const validator = ssz.phase0.Validator.defaultViewDU();
68+
validator.pubkey = new Uint8Array(48).fill(i);
69+
validator.withdrawalCredentials = new Uint8Array(32).fill(i);
70+
validator.effectiveBalance = 32 * 1e9;
71+
state.validators.push(validator);
72+
state.balances.push(32 * 1e9);
73+
state.inactivityScores.push(i);
74+
}
75+
mutateFn?.(state);
76+
state.commit();
77+
return state;
78+
}
79+
80+
it("seed state's transfer-cache clone does not surface the migrated validator at a modified index", () => {
81+
const seedState = buildState();
82+
const originalRoot = seedState.hashTreeRoot();
83+
const originalWC = new Uint8Array(32).fill(modifiedIndex);
84+
85+
const modifiedState = buildState((state) => {
86+
const v = state.validators.get(modifiedIndex);
87+
v.withdrawalCredentials = new Uint8Array(32).fill(0xaa);
88+
state.validators.set(modifiedIndex, v);
89+
state.inactivityScores.set(modifiedIndex, 999);
90+
});
91+
const modifiedBytes = modifiedState.serialize();
92+
93+
const {state: migrated} = loadState(config, seedState, modifiedBytes);
94+
expect(migrated.validators.getReadonly(modifiedIndex).withdrawalCredentials).toEqual(new Uint8Array(32).fill(0xaa));
95+
expect((migrated as BeaconStateAltair).inactivityScores.get(modifiedIndex)).toEqual(999);
96+
97+
// IMPORTANT: do not read seedState.validators / call seedState.commit() between
98+
// loadState() and the clone. In production the head state keeps its cache snapshot
99+
// untouched across loadState, and the poisoning only surfaces on the NEXT block where
100+
// getPreState() clones with transfer-cache. Calling hashTreeRoot() / commit() here
101+
// would rewrite `seedState.caches[validatorsFieldIndex]` with the cleared subview's
102+
// cache and hide the bug.
103+
const postState = seedState.clone() as BeaconStateAltair;
104+
expect(postState.validators.getReadonly(modifiedIndex).withdrawalCredentials).toEqual(originalWC);
105+
expect(postState.inactivityScores.get(modifiedIndex)).toEqual(modifiedIndex);
106+
expect(postState.hashTreeRoot()).toEqual(originalRoot);
107+
});
108+
});

0 commit comments

Comments
 (0)