
feat: merge-train/spartan-v5#24331

Open

AztecBot wants to merge 5 commits into `v5-next` from `merge-train/spartan-v5`

@AztecBot AztecBot commented Jun 26, 2026

BEGIN_COMMIT_OVERRIDE
refactor(e2e): consolidate the multi-node and single-node test categories (#24201)
fix(e2e): keep timing_env testEnvironment through yarn prepare (#24328)
test(e2e): speed up setup by automining L1 contract deployment (#24313)
test(e2e): increase slashing proposer search attempts (#24339)
test(sequencer-client): fix checkpoint proposal job timetable setup (#24346)
END_COMMIT_OVERRIDE

…ries (#24201)

## Motivation

The e2e suite under `end-to-end/src/` had accumulated ad-hoc base
classes with overlapping
responsibilities and large multi-`describe` files that were hard to
navigate. The A-1175 survey settled on
consolidating the suite into a few setup/lifecycle categories, each
backed by one base class that owns the
*environment*, with domain behavior composed on top. This PR delivers
the **`single-node`** and
**`multi-node`** categories, organizes each into behavior-named
subfolders, and breaks the suite into small,
one-`describe` files.

## Approach

`single-node` and `multi-node` are sibling categories.
`SingleNodeTestContext` owns the environment (one
sequencer + optional fake prover, prover lifecycle, epoch/proof/reorg
waiters); `MultiNodeTestContext extends`
it, adding a validator committee, gossip helpers, and
validator-registration sugar. Every file with more than
one top-level `describe` was exploded into a subfolder of one-`describe`
files, with shared describe-level
setup pulled into a co-located `setup.ts`.

### Hierarchy: before → after

**Before** — one flat directory plus two strays:

```
e2e_epochs/                 ~30 epochs_*.test.ts  +  epochs_test.ts (base context)
e2e_p2p/duplicate_attestation_slash.test.ts
e2e_p2p/duplicate_proposal_slash.test.ts
```

**After**:

```
single-node/                          # one sequencer (+ optional fake prover)
  single_node_test_context.ts         # base context + shared timing profiles
  proving/                            # epoch/proof lifecycle is the subject
    world_state_pruning  
    empty_blocks  
    long_proving_time  
    multi_proof
    optimistic.parallel  
    proof_fails.parallel  
    cross_chain_public_message  
    upload_failed_proof
  partial-proofs/                     # mid-epoch / multi-root / Outbox semantics
    single_root  
    multi_root
  l1-reorgs/                          # split along its describes
    blocks.parallel  
    messages.parallel
  recovery/                           # reorg + pending-chain recovery
    manual_rollback  
    sync_after_reorg  
    prune_when_cannot_build
  misc/
    missed_l1_slot                    # single-node sequencer sync/timetable regression

multi-node/                           # N validators on mock gossip
  multi_node_test_context.ts          # also owns validator registration (harness folded in)
  block-production/                   # happy-path committee production
    simple  
    high_tps  
    first_slot  
    proof_boundary.parallel  
    proposed_chain.parallel
    cross_chain_messages.parallel  
    deploy_and_call_ordering  
    blob_promotion  
    redistribution.parallel
  recovery/                           # detect a bad/withheld/conflicting proposal → recover
    proposal_failure_recovery.parallel  
    pipeline_prune  
    equivocation_recovery
  invalid-attestations/
    invalidate_block.parallel         # invalid checkpoints detected/invalidated, chain progresses
  high-availability/
    ha_sync  
    ha_checkpoint_handoff
  slashing/                           # pure offense detection
    duplicate_proposal  
    duplicate_attestation
```

### Criteria for the hierarchy

- **Top level = topology / setup model**: `single-node` (one sequencer)
vs `multi-node` (validator committee
+ gossip) — it names the environment a test needs, never the feature it
happens to touch.
- **Second level = the primary system behavior under test** — not the
shared setup, and not a flag. "Prover"
is a flag, so a multi-validator test isn't a "proving" test just because
a prover participates (hence
  multi-node has no `proving/` folder, single-node does).
- **Third level only for behavior that is still genuinely exceptional.**
MBPS and pipelining are now default
production traits, not toggleable features, so they no longer earn a
folder or a filename.
- **Folder homogeneity**: every test in a subfolder uses that
subfolder's context — placement never follows
helper/context ownership (e.g. `prune_when_cannot_build` stands up one
solo sequencer, so it's a single-node
  test even though it came from the multi-node setup).
- One top-level `describe` per file; shared setup in a co-located
`setup.ts`; the `.parallel` suffix iff a
file has more than one `it` (preserving CI's per-`it` job splitting
without extra anvils).

### Tests removed / replaced

- **One test removed** (net `it` count 64 → 63): the MBPS case *"builds
multiple blocks per slot with
transactions anchored to checkpointed block"*, dropped as redundant with
the retained *"...anchored to
  proposed blocks"* variant (now `block-production/proposed_chain`).
- **No other test removed.** The two `e2e_p2p` duplicate-slash tests
were **moved**, not deleted, into
`multi-node/slashing/` (and switched to mock gossip). Several files were
**merged** (combining files while
keeping every distinct `it`): `mbps/l2_to_l1`+`l1_to_l2` →
`cross_chain_messages.parallel`;
`mbps/proposed_anchor`+`non_validator_sync` → `proposed_chain.parallel`;
`prune/missed_l1_publish`+`orphan_block_prune` →
`proposal_failure_recovery.parallel`. `l1_reorgs.parallel`
was **split** along its `describe`s into `l1-reorgs/blocks.parallel` +
`messages.parallel`.
- A few `it` titles changed with their files: the two `block_building`
*"builds blocks without any errors"*
tests became *"builds simple/high-tps blocks..."*; *"manually rolls
back"* → *"...to an unfinalized block"*;
and the `pipelining` test was renamed `blob_promotion`
(*"promotion-disabled node fetches blobs while peers
skip them, and the checkpoint proves"*), trimmed to its unique
blob-promotion + proving assertions (the
redundant MBPS/pipelining-offset re-checks are still asserted by
`recovery/pipeline_prune`).

### Helpers introduced

- **`SingleNodeTestContext`** — base context (environment, prover
lifecycle, epoch/proof/reorg waiters); owns
the shared timing profiles `REORG_TIMING_BASE`, `FAST_REORG_TIMING`,
`MULTI_VALIDATOR_REORG_TIMING`,
`MULTI_VALIDATOR_BLOCK_PRODUCTION_TIMING`, and `WIDE_SLOT_TIMING` (the
72s wide-slot cadence for
  prover-backed multi-block-per-slot tests).
- **`MultiNodeTestContext`** — extends the above with the validator set,
committee/gossip helpers, and the
validator-registration API
(`validatorAt`/`addressAt`/`privateKeyAt`/`createValidatorNodeAt`,
  `getSlashingContracts()`, and a `slasherEnabled` setup preset).
- **`fixtures/wait_helpers.ts`** — intent-revealing waiters:
`waitForBlockNumber`, `waitForProvenBlock`,
  `waitForNodeCheckpoint`, `waitForNodeProvenCheckpoint`, `waitForTxs`.
- **Co-located `setup.ts`** per multi-file subfolder (`proving`,
`partial-proofs`, `l1-reorgs`, `recovery`
under single-node; `block-production`, `slashing` under multi-node). The
`block-production` setup factors a
shared `buildValidatorCluster` spine into two presets:
`setupSimpleBlockProduction` (lean, prover-less
liveness/throughput canary) and `setupBlockProductionWithProver`
(prover-backed, with a wallet + contract +
  fail-event tracking wired up for content/proving assertions).

A follow-up pass pulled the most-repeated hand-rolled steps up into
higher-level helpers so test bodies
read as a sequence of named steps (the long tail of smaller helpers is
omitted here):

- **`warpToBuildWindowForSlot` / `waitForBuildWindowForSlot`** (on the
context) — the pipelining "build
window" warp/wait (one L1 block before an L2 slot), replacing the
`getTimestampForSlot(slot) -
L1_BLOCK_TIME` + warp arithmetic copied across the consensus/recovery/HA
tests.
- **`proveAndSendTxs`** (`test-wallet/utils.ts`) — pre-prove a batch of
interactions and send them
`NO_WAIT`, collapsing the `timesAsync(proveInteraction)` +
`Promise.all(send)` pair duplicated in ~13 tests.
- **`startSequencers` / `watchNodeSequencerEvents`** (on the context) —
start and watch a committee of node
  sequencers without the `nodes.map(getSequencer)` boilerplate.
- **`waitForAllNodesToReachCheckpoint` / `waitForOffenseOnNodes`** (on
`MultiNodeTestContext`) — the
multi-node fan-out polls for a checkpointed tip and for slash-offense
convergence.
- **`assertMultipleBlocksPerSlot`** consolidated onto one context method
that owns both wait modes
(some-checkpoint-has-N-blocks and checkpointed-tip-reaches-block-X),
replacing three divergent copies.
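
The build-window arithmetic that `warpToBuildWindowForSlot` replaces can be sketched as follows. This is a minimal illustration, not the context's actual implementation: the `TimingConfig` shape, the linear slot-to-timestamp formula, and the function bodies are assumptions. Only the `getTimestampForSlot(slot) - L1_BLOCK_TIME` expression comes from the description above.

```typescript
// Hypothetical timing configuration; field names are assumptions for this sketch.
interface TimingConfig {
  genesisTime: number; // L1 timestamp of L2 slot 0, in seconds
  slotDuration: number; // L2 slot length, in seconds
  l1BlockTime: number; // L1 block interval, in seconds
}

// Assumed linear mapping from an L2 slot number to its starting L1 timestamp.
function getTimestampForSlot(slot: number, cfg: TimingConfig): number {
  return cfg.genesisTime + slot * cfg.slotDuration;
}

// The pipelining "build window" for a slot opens one L1 block before the slot starts.
function buildWindowTimestampForSlot(slot: number, cfg: TimingConfig): number {
  return getTimestampForSlot(slot, cfg) - cfg.l1BlockTime;
}
```

Naming the computation once keeps the "one L1 block early" offset in a single place instead of repeating the subtraction in each test.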

### Semantic changes (setups that differ)

- **Slashing/equivocation tests run on the in-memory mock gossip bus**
instead of real libp2p (they moved out
  of `e2e_p2p`) — deterministic and faster.
- **`prune_when_cannot_build` migrated from `MultiNodeTestContext` to
`SingleNodeTestContext`** — it always
stood up a single solo sequencer (zero extra validators), so it now uses
only the base-class environment.
- **Hand-rolled `retryUntil` checkpoint / block-number polls replaced**
by the intent-revealing wait-helpers
above. (`retryUntil` remains the primitive for tx-status and other
custom predicates.)
- **Incidental per-test timing values converged** onto the shared timing
profiles owned by
  `SingleNodeTestContext`.

### Test fixes introduced

- **Wait-helper falsy-`0` fix** (`bbfbe69`): `waitForNodeCheckpoint` /
`waitForBlockNumber` now wrap their
matched value so a correct answer of `0` is no longer read as falsy by
`retryUntil` (which otherwise polls to
timeout). This was the actual cause behind the `l1-reorgs` proof-reorg
cases that were temporarily skipped
  during development (A-1266); they now run.
- **`proving/optimistic.parallel` deflake**: the top-tree-proving gate
now captures the live prover session
directly instead of polling for an `awaiting-checkpoints` job — the
session flips to `awaiting-root` before
the hook fires, so the old predicate never matched and the gate never
engaged, letting proving race the
  prune. Test-only.
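
The falsy-`0` problem can be reproduced in a few lines. The sketch below uses a simplified, synchronous stand-in for a `retryUntil`-style poller; `pollUntilTruthy` and both `waitForCheckpoint*` functions are hypothetical names, and the real helpers are asynchronous and time-based. It only illustrates why wrapping the matched value makes a correct answer of `0` survive a truthiness check.

```typescript
// Simplified stand-in for a retryUntil-style poller (hypothetical): it polls
// until `fn` returns a truthy value and gives up after `maxAttempts` calls.
function pollUntilTruthy<T>(fn: () => T | undefined, maxAttempts: number): T {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = fn();
    if (result) {
      return result;
    }
  }
  throw new Error(`Timed out after ${maxAttempts} attempts`);
}

// Buggy shape: a matched checkpoint number of 0 is falsy, so the poller
// treats it as "not ready yet" and runs until it times out.
function waitForCheckpointBuggy(getCheckpoint: () => number, target: number): number {
  return pollUntilTruthy(() => {
    const current = getCheckpoint();
    return current >= target ? current : undefined;
  }, 5);
}

// Fixed shape: wrap the match in an object, which is truthy even when the
// wrapped value is 0, then unwrap it after the poll completes.
function waitForCheckpointFixed(getCheckpoint: () => number, target: number): number {
  const match = pollUntilTruthy(() => {
    const current = getCheckpoint();
    return current >= target ? { value: current } : undefined;
  }, 5);
  return match.value;
}
```

With a node sitting at checkpoint `0` and a target of `0`, the buggy variant times out while the fixed variant returns `0` on the first poll.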

## API changes

Test-infrastructure only (no product/public API). `EpochsTestContext` →
`MultiNodeTestContext` (extends the new
`SingleNodeTestContext`); the standalone `ValidatorRegistrationHarness`
is gone (folded into
`MultiNodeTestContext`); `e2e_epochs/` removed and the `epochs_` prefix
dropped; all importers, the CI
test-discovery globs in `bootstrap.sh`, and the `.test_patterns.yml`
path entries were updated to match.

Part of A-1176 (consolidation roadmap) / A-1064.

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

## Problem

PR #24281 added `end-to-end/src/shared/timing_env.mjs` and set the jest
`testEnvironment` to `./shared/timing_env.mjs` directly in
`end-to-end/package.json`. But `testEnvironment` is an **inherited**
field: the package.json generator (`scripts/update_package_jsons.mjs`)
shallow-merges each package's `jest` block with the parent winning, and
`package.common.json` sets `testEnvironment` to
`../../foundation/src/jest/env.mjs`.

As a result, `yarn prepare` reverts the override back to the foundation
env, and `yarn prepare:check` (run by the pre-commit hook) fails. The
inconsistency reached `merge-train/spartan-v5` because GitHub
squash-merge doesn't run the local pre-commit hook, so anyone merging
this base and committing locally now hits the failure.

## Fix

Move the `testEnvironment` override into
`end-to-end/package.local.json`, which the generator applies **last**,
so the timing test environment survives `prepare` and the generated
`end-to-end/package.json` stays consistent with the inherits sources.

Verified: `node scripts/update_package_jsons.mjs --check` passes with
this change and leaves `end-to-end/package.json` pointing at
`./shared/timing_env.mjs`.
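
The merge order described above can be modeled with object spreads. This is a sketch under stated assumptions, not the code in `scripts/update_package_jsons.mjs`: the function name and signature are hypothetical, and only the ordering (package block first, the inherited common block winning over it, the local block applied last) is taken from the description.

```typescript
type JestBlock = Record<string, unknown>;

// Shallow merge where later sources win key by key:
// package < inherited common < package.local.json.
function generateJestBlock(
  packageJest: JestBlock,
  commonJest: JestBlock,
  localJest: JestBlock = {},
): JestBlock {
  return { ...packageJest, ...commonJest, ...localJest };
}

const commonJest = { testEnvironment: '../../foundation/src/jest/env.mjs' };
const timingEnv = './shared/timing_env.mjs';

// Override placed directly in package.json: the inherited value wins on regeneration.
const overriddenInPackage = generateJestBlock({ testEnvironment: timingEnv }, commonJest);

// Override placed in package.local.json: applied last, so it survives.
const overriddenInLocal = generateJestBlock({}, commonJest, { testEnvironment: timingEnv });
```

In this model `overriddenInPackage.testEnvironment` ends up as the foundation env and `overriddenInLocal.testEnvironment` as the timing env, which matches the before and after behavior of `yarn prepare` described here.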

@ludamad ludamad left a comment

🤖 Auto-approved

@AztecBot AztecBot added this pull request to the merge queue Jun 26, 2026

🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass.

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 26, 2026

## Summary

e2e `setup()` spent ~24s of its ~30s waiting on anvil's block interval
while deploying the L1
rollup contracts (Multicall3 + the forge broadcast). Mining those setup
txs under anvil automine
instead cuts setup from **~30s to ~11s (~65%)** with no change to the
test conditions seen by the
test body.

## What changed

- **`automineL1Setup` is now a global default** (`setupInner` does
`opts.automineL1Setup ??= true`),
so *every* e2e suite mines its L1 setup txs immediately rather than
stalling on the anvil block
interval, then restores interval mining at `ethereumSlotDuration` before
the node starts.
Previously this only happened for suites spreading `AUTOMINE_E2E_OPTS`;
that preset no longer
  needs to set the flag.
- **Permanent per-step trace instrumentation.** Replaced the temporary
timing machinery with
`logger.trace()` statements at each setup-step boundary. Per-step
durations are recovered from log
timestamps at `LOG_LEVEL=trace`, so this is cheap enough to keep in the
tree:
  ```bash
  EXIT_E2E_AFTER_SETUP=1 LOG_LEVEL="warn; trace: e2e" \
    yarn workspace @aztec/end-to-end test:e2e e2e_nft.test.ts
  ```
- **Added `EXIT_E2E_AFTER_SETUP`**, a dev knob that throws right after
setup so
  you can measure setup in isolation without running the test body.

## Scope / risk

- Suites that don't set `ethereumSlotDuration` already start anvil in
automine (`start_anvil.ts`
only passes `--block-time` when `l1BlockTime` is set), so the default is
a **no-op** for them. The
behavioral change applies to **interval-mining** suites (those that set
`ethereumSlotDuration`,
e.g. the pipelining/automine presets), which now deploy under automine
and then return to interval
  mining before the node starts.
- `e2e_genesis_timestamp` opts out (`automineL1Setup: false`): it pins
the proven tip at genesis and
asserts on genesis-relative L1 timing, which mining the setup txs
instantly would break.
- Any other genesis-timing-sensitive suite would need the same opt-out;
CI is the gate.
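
The default and its opt-out can be sketched as below. Only `opts.automineL1Setup ??= true`, the `automineL1Setup` and `ethereumSlotDuration` option names, and the ordering of the steps come from the description. The `SetupOptions` shape, both function names, and the step strings are assumptions made for illustration; this is a model of the behavior, not `setupInner` itself.

```typescript
// Hypothetical subset of the e2e setup options.
interface SetupOptions {
  automineL1Setup?: boolean;
  ethereumSlotDuration?: number; // seconds; when unset, anvil already runs in automine
}

// `??=` assigns only when the value is null or undefined, so an explicit
// `automineL1Setup: false` opt-out is preserved.
function applySetupDefaults(opts: SetupOptions): SetupOptions {
  const resolved = { ...opts };
  resolved.automineL1Setup ??= true;
  return resolved;
}

// Returns the ordered setup steps implied by the options.
function planL1Setup(opts: SetupOptions): string[] {
  const { automineL1Setup, ethereumSlotDuration } = applySetupDefaults(opts);
  const switchesMiningMode = automineL1Setup === true && ethereumSlotDuration !== undefined;
  const steps: string[] = [];
  if (switchesMiningMode) {
    steps.push('enable automine');
  }
  steps.push('deploy L1 contracts');
  if (switchesMiningMode) {
    steps.push(`restore interval mining at ${ethereumSlotDuration}s`);
  }
  steps.push('start node');
  return steps;
}
```

Using `??=` rather than `||=` is what lets a suite such as `e2e_genesis_timestamp` pass `automineL1Setup: false` and keep it.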

## Measured (`e2e_nft`, setup-only)

| step | before | after |
|---|---:|---:|
| deploy_multicall3 | 8040ms | ~36ms |
| deploy_l1_contracts | 15785ms | ~4285ms |
| total setup | ~30s | ~11s |

`e2e_nft` also passes end-to-end (6/6) with the change.

Increase the slashing invalid-proposal test helper's attempt window from
30 to 100 epochs to fix a flake.

@AztecBot AztecBot added this pull request to the merge queue Jun 26, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 27, 2026
…24346)

## Summary
- Ensure fresh jobs created in the single-block checkpoint proposal
tests use the same one-block timetable as the describe setup.
- Make the pipelined parent helper explicitly use that one-block
timetable and start at the pipelined build-frame opening.

## Investigation
PR #24331 was dequeued after the merge-queue run for synthetic commit
`08e8db39e998b8a5ad95b09039de2befd48e144f` failed `ci/x3-full`. The
failed test log for `yarn-project/scripts/run_test.sh
sequencer-client/src/sequencer/checkpoint_proposal_job.test.ts` showed
`TypeError: Cannot read properties of undefined (reading 'getStats')` at
`block.getStats()` in `checkpoint_proposal_job.ts`.

The test helpers seeded a single mock block, but some freshly-created
pipelined jobs kept the default multi-block timetable instead of the
single-block timetable. With no-op waits in the test subclass, the job
attempted a second block build, exhausted `MockCheckpointBuilder`, and
returned an undefined block.
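
The failure mode can be reproduced with a small model. Everything here is hypothetical (`Block`, `SeededBlockBuilder`, `runCheckpointJob`, and the `maxBlocks` parameter standing in for the timetable); the sketch only mirrors the reported shape, where a builder seeded with one block is asked for a second one and the job then calls `getStats()` on `undefined`.

```typescript
interface Block {
  getStats(): { txCount: number };
}

// Hypothetical mock builder seeded with a fixed list of blocks.
class SeededBlockBuilder {
  constructor(private readonly blocks: Block[]) {}

  // Returns the next seeded block; once exhausted this yields undefined,
  // which the cast hides from the type checker (as a loosely typed mock would).
  buildBlock(): Block {
    return this.blocks.shift() as Block;
  }
}

// Hypothetical job loop: builds up to `maxBlocks` blocks and sums their tx counts.
function runCheckpointJob(builder: SeededBlockBuilder, maxBlocks: number): number {
  let totalTxs = 0;
  for (let i = 0; i < maxBlocks; i++) {
    const block = builder.buildBlock();
    totalTxs += block.getStats().txCount; // throws TypeError if block is undefined
  }
  return totalTxs;
}

const makeBlock = (txCount: number): Block => ({ getStats: () => ({ txCount }) });
```

With one seeded block, `runCheckpointJob(builder, 1)` succeeds, while `runCheckpointJob(builder, 2)` throws a `TypeError` on the second iteration, which is why the fix pins fresh jobs to the one-block timetable.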

## Verification
- `MAKEFILE_TARGET=yarn-project yarn-project/scripts/run_test.sh
sequencer-client/src/sequencer/checkpoint_proposal_job.test.ts` (46
passed)
- `./bootstrap.sh ci` exits immediately with `Unknown command: ci` on
this branch; the root bootstrap defines `ci-fast`/`ci-full`, not plain
`ci`.

---
*Created by
[claudebox](https://claudebox.work/v2/sessions/c9e66148e5907541) ·
group: `slackbot`*
@AztecBot AztecBot enabled auto-merge June 27, 2026 08:52