
test(e2e): fix race in 'proposer invalidates multiple checkpoints' #23259

Merged
PhilWindle merged 1 commit into merge-train/spartan from spl/fix-invalidate-block-race
May 14, 2026


@spalladino
Contributor

Addresses a config-timing race in epochs_invalidate_block.parallel.test.ts > "proposer invalidates multiple checkpoints" that caused intermittent CI failures with expect(validCount).toBeLessThan(quorum) (e.g. 5/6 attestations when quorum=5).

The race

The test reads currentSlot via monitor.run() right after waiting for the first checkpoint to land — that read can land anywhere within the current L2 slot, including near its end. It then computes badSlot1 = currentSlot + 2 and races to push malicious config (skipCollectingAttestations: true, …) to that slot's proposer via await node.setConfig({...}).

CheckpointProposalJob is constructed with this.config passed by reference (sequencer-client/src/sequencer/sequencer.ts:559), and Sequencer.updateConfig reassigns this.config = merge(...) rather than mutating, so a job built before setConfig lands keeps the old config object. Under proposer pipelining (PROPOSER_PIPELINING_SLOT_OFFSET = 1, epoch-cache/src/epoch_cache.ts:26), the job for badSlot1 is built during the last L1 slot of L2 slot badSlot1 - 1. With 32s L2 slots and 8s L1 slots (four L1 slots per L2 slot), that last L1 slot starts ~24s into the previous L2 slot, so if currentSlot was read late, badSlot1's proposer can snapshot the old config before our setConfig round-trip completes.
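To make the reassign-vs-mutate point concrete, here is a minimal sketch of the stale-reference behaviour. SequencerSketch, ProposalJob, and buildJob are hypothetical stand-ins written for illustration, not the real Sequencer or CheckpointProposalJob code, and the object spread stands in for merge(...).

```typescript
// Hypothetical stand-ins: these only mirror the behaviour described above,
// not the real Sequencer or CheckpointProposalJob classes.
type SequencerConfig = { skipCollectingAttestations: boolean };

class ProposalJob {
  readonly config: SequencerConfig;

  // Keeps a reference to whatever config object it is handed at construction.
  constructor(config: SequencerConfig) {
    this.config = config;
  }
}

class SequencerSketch {
  private config: SequencerConfig = { skipCollectingAttestations: false };

  // Reassigns this.config to a fresh object instead of mutating the old one
  // (the spread is an assumed stand-in for merge(...)).
  updateConfig(update: Partial<SequencerConfig>): void {
    this.config = { ...this.config, ...update };
  }

  // Passes the current config object by reference.
  buildJob(): ProposalJob {
    return new ProposalJob(this.config);
  }
}

const sequencer = new SequencerSketch();
const earlyJob = sequencer.buildJob(); // built before the config update lands
sequencer.updateConfig({ skipCollectingAttestations: true });
const lateJob = sequencer.buildJob(); // built after the config update

console.log(earlyJob.config.skipCollectingAttestations); // false
console.log(lateJob.config.skipCollectingAttestations); // true
```

The early job never sees the update because it still points at the object that updateConfig replaced, which is why the test has to get its setConfig call in before the job for badSlot1 is constructed.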

Fix

  • Wait for an L2 slot boundary (monitor.waitUntilNextL2Slot()) before reading currentSlot, so we start from the beginning of a slot rather than wherever we happened to land.
  • Bump the gap from +2/+3 to +3/+4 for a second slot of margin.

Cost is up to one additional L2 slot of test runtime in the worst case; the existing 8-slot wait window for both checkpoints still fits.
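The ordering of the fix can be sketched roughly as below. The method names waitUntilNextL2Slot and run come from the description above, but the SlotMonitor interface, the return shape of run(), the badSlot2 name, and the FakeMonitor are all assumptions made for this illustration, not the actual test code.

```typescript
// Hypothetical monitor interface: method names come from the PR description,
// but the signatures and the return shape are assumptions for this sketch.
interface SlotMonitor {
  waitUntilNextL2Slot(): Promise<void>;
  run(): Promise<{ l2SlotNumber: number }>;
}

// Gaps after the fix (+3/+4); the test previously used +2/+3.
const BAD_SLOT_GAPS = [3, 4] as const;

async function pickBadSlots(monitor: SlotMonitor): Promise<[number, number]> {
  // Align to a slot boundary first, so the read below happens at the start
  // of a slot rather than wherever the test happened to land.
  await monitor.waitUntilNextL2Slot();
  const { l2SlotNumber: currentSlot } = await monitor.run();
  return [currentSlot + BAD_SLOT_GAPS[0], currentSlot + BAD_SLOT_GAPS[1]];
}

// Fake monitor for illustration only: starts in slot 10 and advances to
// slot 11 when asked to wait for the next slot boundary.
class FakeMonitor implements SlotMonitor {
  private slot = 10;

  async waitUntilNextL2Slot(): Promise<void> {
    this.slot += 1;
  }

  async run(): Promise<{ l2SlotNumber: number }> {
    return { l2SlotNumber: this.slot };
  }
}

pickBadSlots(new FakeMonitor()).then(([badSlot1, badSlot2]) => {
  console.log(badSlot1, badSlot2); // 14 15
});
```

Waiting first costs up to one extra slot, but it means the full gap is available for the setConfig round-trip.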

Wait for an L2 slot boundary before computing bad slots, and add an extra
slot of margin, so the malicious config is applied to badSlot1's proposer
well before it constructs its CheckpointProposalJob (which snapshots
this.config). Previously, monitor.run() could return a slot that was
already near its end, leaving only milliseconds before the next pipelined
work() loop captured the old config.
@PhilWindle merged commit 498b961 into merge-train/spartan on May 14, 2026
14 checks passed
@PhilWindle deleted the spl/fix-invalidate-block-race branch on May 14, 2026 08:00