Commit 206eb0f
test(e2e): fix proposer invalidates multiple checkpoints timeout (#23608)
Fixes a flake in the `proposer invalidates multiple checkpoints` test in
`e2e_epochs/epochs_invalidate_block.parallel.test.ts` that caused a
timeout (see [this run](http://ci.aztec-labs.com/8b1c0f4ec6031f2b)). See
below for the Codex analysis and fix.
---
**Test Summary**
`proposer invalidates multiple checkpoints` verifies that two intended
bad checkpoints land with insufficient attestations, a later good
proposer invalidates the first bad checkpoint, and the chain then
progresses.
**Failed Run Error**
CI run `8b1c0f4ec6031f2b` timed out at Jest’s 600s limit. The L1 send
error logged at shutdown was not the cause; it occurred after the
timeout, while teardown was interrupting pending work.
**Failed vs Successful Divergence**
First meaningful divergence: checkpoint 4 at slot 23.
Failed log: slot 23 published checkpoint 4 with only 1 attestation, then
archivers reported `Insufficient attestations ... actualAttestations:1`.
Successful log: slot 23 collected all 5 attestations before publishing
checkpoint 4, so the first intentionally bad checkpoint landed in a later
slot.
**Timeline**
Failed:
- `15:59:11` selected intended bad slots 25/26, applied bad config to
proposer `0x15...`
- `15:59:35` slot 23 job prepared by that same proposer
- `16:00:15` checkpoint 4 at slot 23 landed with 1 attestation
- repeated rollback/retry consumed enough time to hit Jest timeout
Successful:
- slot 23 checkpoint landed cleanly with 5 attestations
- intended bad checkpoints at slots 24/25 landed with 1 attestation
- checkpoint 5 was invalidated
- test completed successfully
**Hypothesis**
High confidence: the test’s bad-slot selection only excluded
`candidateSlot1 - 1` as a pre-bad pipelined target. In the failed run,
the job for `candidateSlot1 - 2` (slot 23) had not yet snapshotted its
config and that slot belonged to one of the proposers being made bad, so
the malicious config leaked into slot 23.
**Evidence**
- Logs: failed run selected slots 25/26 but slot 23 later published with
1 attestation from the newly bad proposer.
- Source: pipelined checkpoint jobs snapshot sequencer config when the
target-slot job is created, so applying config while sequencers are
running can affect any not-yet-created pre-bad job.
- Skeptic check: no contradiction found; it also caught a broken local
timeout race.
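
The snapshot behaviour described in the Source bullet can be illustrated with a toy model. This is only a sketch: `ToySequencer`, `ToyConfig`, `ToyJob`, and the `malicious` flag are hypothetical names invented for illustration and are not the real sequencer API. It assumes only what the bullet states, namely that a job copies the config at the moment it is created.

```typescript
// Toy model (hypothetical names): a job copies the sequencer config when it is created.
interface ToyConfig {
  malicious: boolean; // stand-in for the "bad proposer" settings applied by the test
}

interface ToyJob {
  targetSlot: number;
  config: ToyConfig; // snapshot taken at job creation time
}

class ToySequencer {
  private config: ToyConfig = { malicious: false };

  // Applying config while running only affects jobs created afterwards.
  updateConfig(next: ToyConfig): void {
    this.config = next;
  }

  createJob(targetSlot: number): ToyJob {
    return { targetSlot, config: { ...this.config } };
  }
}

const sequencer = new ToySequencer();
const jobBefore = sequencer.createJob(22); // created before the bad config is applied
sequencer.updateConfig({ malicious: true });
const jobAfter = sequencer.createJob(23); // not yet created when config changed, so it inherits it

console.log(jobBefore.config.malicious, jobAfter.config.malicious);
```

In this model the slot 23 job picks up the bad config even though the test only intended slots 25/26 to be bad, which matches the leak seen in the failed run.
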
**Proposed Fix**
Implemented in
[epochs_invalidate_block.parallel.test.ts](/home/santiago/Projects/aztec-1/yarn-project/end-to-end/src/e2e_epochs/epochs_invalidate_block.parallel.test.ts:393):
the selector now excludes bad proposers from every pre-bad target slot
from `currentSlot + 2` through `candidateSlot1 - 1`, not just the
immediately prior slot.
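
A minimal sketch of the widened exclusion rule, under stated assumptions: `pickBadSlots` and `proposerAt` are hypothetical names, the two bad slots are assumed to be consecutive, and the search is assumed to start at `currentSlot + 3`. Only the excluded range (`currentSlot + 2` through `candidateSlot1 - 1`) comes from the description above; the real selector in the test may differ.

```typescript
type Address = string;

// Picks two consecutive bad slots whose proposers own none of the pre-bad
// target slots in [currentSlot + 2, candidateSlot1 - 1]. Hypothetical helper.
function pickBadSlots(
  currentSlot: number,
  lastSlot: number,
  proposerAt: (slot: number) => Address,
): { candidateSlot1: number; candidateSlot2: number } | undefined {
  for (let candidateSlot1 = currentSlot + 3; candidateSlot1 + 1 <= lastSlot; candidateSlot1++) {
    const candidateSlot2 = candidateSlot1 + 1;
    const badProposers = new Set<Address>([proposerAt(candidateSlot1), proposerAt(candidateSlot2)]);

    // Check every pre-bad target slot, not just candidateSlot1 - 1.
    let leaks = false;
    for (let slot = currentSlot + 2; slot <= candidateSlot1 - 1; slot++) {
      if (badProposers.has(proposerAt(slot))) {
        leaks = true;
        break;
      }
    }
    if (!leaks) {
      return { candidateSlot1, candidateSlot2 };
    }
  }
  return undefined; // no safe pair of slots in range
}
```

With a schedule like the failed run's, where the proposer of slot 25 also owns slot 23, this rule rejects slots 25/26 and moves on to a later pair.
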
Also fixed the broken timeout race at [line
475](/home/santiago/Projects/aztec-1/yarn-project/end-to-end/src/e2e_epochs/epochs_invalidate_block.parallel.test.ts:475)
by removing the accidental inner `await`.

1 parent c335146 (PR #23608)
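
To show why an inner `await` breaks a timeout race of the kind fixed at line 475, here is a generic sketch. It assumes the race is built on `Promise.race`; `withTimeout` and `sleep` are hypothetical helpers and are not taken from the test file.

```typescript
const sleep = (ms: number): Promise<void> => new Promise(resolve => setTimeout(resolve, ms));

// Broken shape (for comparison only):
//   await Promise.race([await work, timeout]);
// The inner `await` suspends until `work` settles, so Promise.race only runs
// afterwards and the timeout can never win.

// Working shape: pass the pending promise itself into the race.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer); // avoid leaving a timer pending after the race settles
  }
}
```

A call such as `await withTimeout(sleep(5000), 100)` rejects after roughly 100ms, whereas the broken shape would wait the full five seconds and then resolve.
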
1 file changed
Lines changed: 18 additions & 9 deletions
Changed hunks: original lines 390-420 (now 390-429) and 464-470 (now 473-479).