Commit 206eb0f

test(e2e): fix proposer invalidates multiple checkpoints timeout (#23608)
Fixes a flake in the `proposer invalidates multiple checkpoints` test in `e2e_epochs/epochs_invalidate_block.parallel.test.ts` that caused a timeout (see [this run](http://ci.aztec-labs.com/8b1c0f4ec6031f2b)). See below for the Codex analysis and fix.

---

**Test Summary**

`proposer invalidates multiple checkpoints` verifies that two intended bad checkpoints land with insufficient attestations, a later good proposer invalidates the first bad checkpoint, and the chain then progresses.

**Failed Run Error**

CI run `8b1c0f4ec6031f2b` timed out at Jest’s 600s limit. The failure was not the shutdown L1 send error; that happened after the timeout, while teardown was interrupting pending work.

**Failed vs Successful Divergence**

First meaningful divergence: checkpoint 4 at slot 23.

Failed log: slot 23 published checkpoint 4 with only 1 attestation, then archivers reported `Insufficient attestations ... actualAttestations:1`.

Successful log: slot 23 collected all 5 attestations before publishing checkpoint 4, so the first intentionally bad checkpoints came later.

**Timeline**

Failed:
- `15:59:11` selected intended bad slots 25/26, applied bad config to proposer `0x15...`
- `15:59:35` slot 23 job prepared by that same proposer
- `16:00:15` checkpoint 4 at slot 23 landed with 1 attestation
- repeated rollback/retry consumed enough time to hit the Jest timeout

Successful:
- slot 23 checkpoint landed cleanly with 5 attestations
- intended bad checkpoints at slots 24/25 landed with 1 attestation
- checkpoint 5 was invalidated
- test completed successfully

**Hypothesis**

High confidence: the test’s bad-slot selection only excluded `candidateSlot1 - 1` as a pre-bad pipelined target. In the failed run, `candidateSlot1 - 2` was still unsnapshotted and owned by a bad proposer, so applying the malicious config leaked into slot 23.

**Evidence**

- Logs: the failed run selected slots 25/26, but slot 23 later published with 1 attestation from the newly bad proposer.
- Source: pipelined checkpoint jobs snapshot the sequencer config when the target-slot job is created, so applying config while sequencers are running can affect any not-yet-created pre-bad job.
- Skeptic check: no contradiction found; it also caught a broken local timeout race.

**Proposed Fix**

Implemented in [epochs_invalidate_block.parallel.test.ts](/home/santiago/Projects/aztec-1/yarn-project/end-to-end/src/e2e_epochs/epochs_invalidate_block.parallel.test.ts:393): the selector now excludes bad proposers from every pre-bad target slot from `currentSlot + 2` through `candidateSlot1 - 1`, not just the immediately prior slot. Also fixed the broken timeout race at [line 475](/home/santiago/Projects/aztec-1/yarn-project/end-to-end/src/e2e_epochs/epochs_invalidate_block.parallel.test.ts:475) by removing the accidental inner `await`.
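
The selection rule described under **Proposed Fix** can be sketched in isolation. This is a simplified illustration, not the test's actual code: slots are plain numbers, proposers are strings, and `pickBadSlots`, `slotsBetween`, and the `schedule` table are hypothetical names and data made up for the example. The schedule is loosely modeled on the failed run (one proposer owns both slot 23 and slot 25), but the starting slot of 21 is an assumption.

```typescript
// Hypothetical stand-ins: slots are plain numbers and proposers are strings.
type Proposer = string;

// Slots from `from` (inclusive) to `to` (exclusive).
function slotsBetween(from: number, to: number): number[] {
  const slots: number[] = [];
  for (let s = from; s < to; s++) {
    slots.push(s);
  }
  return slots;
}

// Picks two consecutive bad slots whose proposers do not own any pre-bad
// target slot in [currentSlot + 2, candidateSlot1 - 1].
function pickBadSlots(
  currentSlot: number,
  proposerAt: (slot: number) => Proposer | undefined,
  maxAttempts = 20,
): { badSlot1: number; badSlot2: number } | undefined {
  const firstCandidateSlot = currentSlot + 3;
  const firstUnsnapshottedTargetSlot = currentSlot + 2;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const candidateSlot1 = firstCandidateSlot + attempt;
    const candidateSlot2 = candidateSlot1 + 1;
    const p1 = proposerAt(candidateSlot1);
    const p2 = proposerAt(candidateSlot2);
    if (p1 === undefined || p2 === undefined) {
      continue;
    }

    // A bad proposer that also owns an earlier, not-yet-snapshotted slot
    // would pick up the bad config too early, so reject the pair.
    const leaks = slotsBetween(firstUnsnapshottedTargetSlot, candidateSlot1)
      .map(slot => proposerAt(slot))
      .some(p => p !== undefined && (p === p1 || p === p2));

    if (!leaks) {
      return { badSlot1: candidateSlot1, badSlot2: candidateSlot2 };
    }
  }
  return undefined;
}

// Made-up proposer schedule: proposer A owns both slot 23 and slot 25.
const schedule: Record<number, Proposer> = {
  23: 'A',
  24: 'B',
  25: 'A',
  26: 'C',
  27: 'D',
  28: 'E',
};

const picked = pickBadSlots(21, slot => schedule[slot]);
console.log(picked);
```

With this schedule the sketch skips the pairs 24/25 and 25/26, because proposer A owns pre-bad slot 23, and settles on 26/27. A check limited to `candidateSlot1 - 1` would have accepted 25/26, since slot 24 belongs to B.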
1 parent c335146 commit 206eb0f

1 file changed

Lines changed: 18 additions & 9 deletions

yarn-project/end-to-end/src/e2e_epochs/epochs_invalidate_block.parallel.test.ts

@@ -390,31 +390,40 @@ describe('e2e_epochs/epochs_invalidate_block', () => {
     const { l2SlotNumber: currentSlot } = await test.monitor.run();
     logger.warn(`First checkpoint mined, current slot is ${currentSlot}`);
 
-    // The bad config is applied while sequencers are already running; skip pairs where the prior pipelined
-    // target slot could snapshot that config before the intended bad slots.
+    // The bad config is applied while sequencers are already running; skip pairs where a pipelined
+    // pre-bad target slot could snapshot that config before the intended bad slots.
     let badSlot1: SlotNumber | undefined;
     let badSlot2: SlotNumber | undefined;
     let badProposers: EthAddress[] = [];
     const firstCandidateSlot = Number(currentSlot) + 3;
+    const firstUnsnapshottedTargetSlot = SlotNumber.add(currentSlot, 2);
     const maxBadSlotSearchAttempts = 20;
     for (let attempt = 0; attempt < maxBadSlotSearchAttempts && badSlot1 === undefined; attempt++) {
       const candidateSlot1 = SlotNumber(firstCandidateSlot + attempt);
       const candidateSlot2 = SlotNumber.add(candidateSlot1, 1);
-      const priorPipelinedTargetSlot = SlotNumber.add(candidateSlot1, -1);
-      const [priorProposer, p1, p2] = await Promise.all([
-        test.epochCache.getProposerAttesterAddressInSlot(priorPipelinedTargetSlot),
+      const preBadTargetSlots = range(
+        Math.max(0, Number(candidateSlot1) - Number(firstUnsnapshottedTargetSlot)),
+        Number(firstUnsnapshottedTargetSlot),
+      ).map(SlotNumber);
+      const [preBadProposers, p1, p2] = await Promise.all([
+        Promise.all(preBadTargetSlots.map(slot => test.epochCache.getProposerAttesterAddressInSlot(slot))),
         test.epochCache.getProposerAttesterAddressInSlot(candidateSlot1),
         test.epochCache.getProposerAttesterAddressInSlot(candidateSlot2),
       ]);
 
       logger.warn(`Checking bad checkpoint slots ${candidateSlot1} and ${candidateSlot2}`, {
-        priorPipelinedTargetSlot,
-        priorProposer: priorProposer?.toString(),
+        preBadTargetSlots,
+        preBadProposers: preBadProposers.map(proposer => proposer?.toString()),
         p1: p1?.toString(),
         p2: p2?.toString(),
       });
 
-      if (p1 && p2 && !priorProposer?.equals(p1) && !priorProposer?.equals(p2)) {
+      const badProposerHasUnsnapshottedPreBadSlot =
+        p1 !== undefined &&
+        p2 !== undefined &&
+        preBadProposers.some(proposer => proposer !== undefined && (proposer.equals(p1) || proposer.equals(p2)));
+
+      if (p1 && p2 && !badProposerHasUnsnapshottedPreBadSlot) {
         badSlot1 = candidateSlot1;
         badSlot2 = candidateSlot2;
         badProposers = [p1, p2];
@@ -464,7 +473,7 @@ describe('e2e_epochs/epochs_invalidate_block', () => {
     // Wait for both checkpoints to be mined
     logger.warn(`Waiting for two checkpoints to be mined on slots ${expectedFirstSlot} and ${expectedSecondSlot}`);
     const [firstCheckpoint, secondCheckpoint] = await Promise.race([
-      await Promise.all([firstCheckpointPromise.promise, secondCheckpointPromise.promise]),
+      Promise.all([firstCheckpointPromise.promise, secondCheckpointPromise.promise]),
       timeoutPromise(test.L2_SLOT_DURATION_IN_S * 8 * 1000).then(() => [CheckpointNumber(0), CheckpointNumber(0)]),
     ]);
 
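
Why the inner `await` in the second hunk defeats the timeout can be shown with a small standalone sketch. It is not taken from the test: `sleep`, `raceWithInnerAwait`, and `raceFixed` are hypothetical helpers, and the durations are arbitrary. Array elements are evaluated in order, so an `await` on the first element suspends the function until that work finishes, and the timeout promise is not even created until afterwards.

```typescript
// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

// Buggy shape: the first array element is awaited before Promise.race is
// called, so the timeout can never win. If the work never settles, this
// function never returns at all.
async function raceWithInnerAwait(workMs: number, timeoutMs: number): Promise<string> {
  return Promise.race([
    await sleep(workMs).then(() => 'work'),
    sleep(timeoutMs).then(() => 'timeout'),
  ]);
}

// Fixed shape: both promises are created up front and genuinely race.
async function raceFixed(workMs: number, timeoutMs: number): Promise<string> {
  return Promise.race([
    sleep(workMs).then(() => 'work'),
    sleep(timeoutMs).then(() => 'timeout'),
  ]);
}

async function demo(): Promise<void> {
  // The work takes 50 ms and the timeout is 10 ms in both calls.
  console.log(await raceWithInnerAwait(50, 10)); // 'work': the timeout is ignored
  console.log(await raceFixed(50, 10)); // 'timeout'
}

void demo();
```

In the test, the equivalent of `workMs` was unbounded whenever the two checkpoints never landed, which is consistent with the run only stopping at Jest's global limit instead of at the local timeout.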
