feat: merge-train/spartan-v5#24331
Open
AztecBot wants to merge 5 commits into
…ries (#24201)

## Motivation

The e2e suite under `end-to-end/src/` had accumulated ad-hoc base classes with overlapping responsibilities and large multi-`describe` files that were hard to navigate. The A-1175 survey settled on consolidating the suite into a few setup/lifecycle categories, each backed by one base class that owns the *environment*, with domain behavior composed on top. This PR delivers the **`single-node`** and **`multi-node`** categories, organizes each into behavior-named subfolders, and breaks the suite into small, one-`describe` files.

## Approach

`single-node` and `multi-node` are sibling categories. `SingleNodeTestContext` owns the environment (one sequencer + optional fake prover, prover lifecycle, epoch/proof/reorg waiters); `MultiNodeTestContext extends` it, adding a validator committee, gossip helpers, and validator-registration sugar. Every file with more than one top-level `describe` was exploded into a subfolder of one-`describe` files, with shared describe-level setup pulled into a co-located `setup.ts`.
### Hierarchy: before → after

**Before**: one flat directory plus two strays:

```
e2e_epochs/    ~30 epochs_*.test.ts + epochs_test.ts (base context)
e2e_p2p/duplicate_attestation_slash.test.ts
e2e_p2p/duplicate_proposal_slash.test.ts
```

**After**:

```
single-node/                          # one sequencer (+ optional fake prover)
  single_node_test_context.ts         # base context + shared timing profiles
  proving/                            # epoch/proof lifecycle is the subject
    world_state_pruning
    empty_blocks
    long_proving_time
    multi_proof
    optimistic.parallel
    proof_fails.parallel
    cross_chain_public_message
    upload_failed_proof
  partial-proofs/                     # mid-epoch / multi-root / Outbox semantics
    single_root
    multi_root
  l1-reorgs/                          # split along its describes
    blocks.parallel
    messages.parallel
  recovery/                           # reorg + pending-chain recovery
    manual_rollback
    sync_after_reorg
    prune_when_cannot_build
  misc/
    missed_l1_slot                    # single-node sequencer sync/timetable regression
multi-node/                           # N validators on mock gossip
  multi_node_test_context.ts          # also owns validator registration (harness folded in)
  block-production/                   # happy-path committee production
    simple
    high_tps
    first_slot
    proof_boundary.parallel
    proposed_chain.parallel
    cross_chain_messages.parallel
    deploy_and_call_ordering
    blob_promotion
    redistribution.parallel
  recovery/                           # detect a bad/withheld/conflicting proposal → recover
    proposal_failure_recovery.parallel
    pipeline_prune
    equivocation_recovery
  invalid-attestations/
    invalidate_block.parallel         # invalid checkpoints detected/invalidated, chain progresses
  high-availability/
    ha_sync
    ha_checkpoint_handoff
  slashing/                           # pure offense detection
    duplicate_proposal
    duplicate_attestation
```

### Criteria for the hierarchy

- **Top level = topology / setup model**: `single-node` (one sequencer) vs `multi-node` (validator committee + gossip); it names the environment a test needs, never the feature it happens to touch.
- **Second level = the primary system behavior under test**, not the shared setup, and not a flag. "Prover" is a flag, so a multi-validator test isn't a "proving" test just because a prover participates (hence multi-node has no `proving/` folder, single-node does).
- **Third level only for behavior that is still genuinely exceptional.** MBPS and pipelining are now default production traits, not toggleable features, so they no longer earn a folder or a filename.
- **Folder homogeneity**: every test in a subfolder uses that subfolder's context; placement never follows helper/context ownership (e.g. `prune_when_cannot_build` stands up one solo sequencer, so it's a single-node test even though it came from the multi-node setup).
- One top-level `describe` per file; shared setup in a co-located `setup.ts`; the `.parallel` suffix iff a file has more than one `it` (preserving CI's per-`it` job splitting without extra anvils).

### Tests removed / replaced

- **One test removed** (net `it` count 64 → 63): the MBPS case *"builds multiple blocks per slot with transactions anchored to checkpointed block"*, dropped as redundant with the retained *"...anchored to proposed blocks"* variant (now `block-production/proposed_chain`).
- **No other test removed.** The two `e2e_p2p` duplicate-slash tests were **moved**, not deleted, into `multi-node/slashing/` (and switched to mock gossip). Several files were **merged** (combining files while keeping every distinct `it`): `mbps/l2_to_l1`+`l1_to_l2` → `cross_chain_messages.parallel`; `mbps/proposed_anchor`+`non_validator_sync` → `proposed_chain.parallel`; `prune/missed_l1_publish`+`orphan_block_prune` → `proposal_failure_recovery.parallel`. `l1_reorgs.parallel` was **split** along its `describe`s into `l1-reorgs/blocks.parallel` + `messages.parallel`.
- A few `it` titles changed with their files: the two `block_building` *"builds blocks without any errors"* tests became *"builds simple/high-tps blocks..."*; *"manually rolls back"* → *"...to an unfinalized block"*; and the `pipelining` test was renamed `blob_promotion` (*"promotion-disabled node fetches blobs while peers skip them, and the checkpoint proves"*), trimmed to its unique blob-promotion + proving assertions (the redundant MBPS/pipelining-offset re-checks are still asserted by `recovery/pipeline_prune`).

### Helpers introduced

- **`SingleNodeTestContext`**: base context (environment, prover lifecycle, epoch/proof/reorg waiters); owns the shared timing profiles `REORG_TIMING_BASE`, `FAST_REORG_TIMING`, `MULTI_VALIDATOR_REORG_TIMING`, `MULTI_VALIDATOR_BLOCK_PRODUCTION_TIMING`, and `WIDE_SLOT_TIMING` (the 72s wide-slot cadence for prover-backed multi-block-per-slot tests).
- **`MultiNodeTestContext`**: extends the above with the validator set, committee/gossip helpers, and the validator-registration API (`validatorAt`/`addressAt`/`privateKeyAt`/`createValidatorNodeAt`, `getSlashingContracts()`, and a `slasherEnabled` setup preset).
- **`fixtures/wait_helpers.ts`**: intent-revealing waiters, namely `waitForBlockNumber`, `waitForProvenBlock`, `waitForNodeCheckpoint`, `waitForNodeProvenCheckpoint`, `waitForTxs`.
- **Co-located `setup.ts`** per multi-file subfolder (`proving`, `partial-proofs`, `l1-reorgs`, `recovery` under single-node; `block-production`, `slashing` under multi-node). The `block-production` setup factors a shared `buildValidatorCluster` spine into two presets: `setupSimpleBlockProduction` (lean, prover-less liveness/throughput canary) and `setupBlockProductionWithProver` (prover-backed, with a wallet + contract + fail-event tracking wired up for content/proving assertions).

A follow-up pass pulled the most-repeated hand-rolled steps up into higher-level helpers so test bodies read as a sequence of named steps (the long tail of smaller helpers is omitted here):

- **`warpToBuildWindowForSlot` / `waitForBuildWindowForSlot`** (on the context): the pipelining "build window" warp/wait (one L1 block before an L2 slot), replacing the `getTimestampForSlot(slot) - L1_BLOCK_TIME` + warp arithmetic copied across the consensus/recovery/HA tests.
- **`proveAndSendTxs`** (`test-wallet/utils.ts`): pre-prove a batch of interactions and send them `NO_WAIT`, collapsing the `timesAsync(proveInteraction)` + `Promise.all(send)` pair duplicated in ~13 tests.
- **`startSequencers` / `watchNodeSequencerEvents`** (on the context): start and watch a committee of node sequencers without the `nodes.map(getSequencer)` boilerplate.
- **`waitForAllNodesToReachCheckpoint` / `waitForOffenseOnNodes`** (on `MultiNodeTestContext`): the multi-node fan-out polls for a checkpointed tip and for slash-offense convergence.
- **`assertMultipleBlocksPerSlot`** consolidated onto one context method that owns both wait modes (some-checkpoint-has-N-blocks and checkpointed-tip-reaches-block-X), replacing three divergent copies.

### Semantic changes (setups that differ)

- **Slashing/equivocation tests run on the in-memory mock gossip bus** instead of real libp2p (they moved out of `e2e_p2p`), which is deterministic and faster.
- **`prune_when_cannot_build` migrated from `MultiNodeTestContext` to `SingleNodeTestContext`**: it always stood up a single solo sequencer (zero extra validators), so it now uses only the base-class environment.
- **Hand-rolled `retryUntil` checkpoint / block-number polls replaced** by the intent-revealing wait-helpers above. (`retryUntil` remains the primitive for tx-status and other custom predicates.)
- **Incidental per-test timing values converged** onto the shared timing profiles owned by `SingleNodeTestContext`.
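
The build-window arithmetic that `warpToBuildWindowForSlot` and `waitForBuildWindowForSlot` replace can be sketched as follows. This is a minimal illustration, not the context's implementation: `TimingSketch`, `timestampForSlot`, and `buildWindowTimestampForSlot` are hypothetical names, the linear slot-to-timestamp mapping is an assumption, and the numbers are made up. Only the `getTimestampForSlot(slot) - L1_BLOCK_TIME` shape comes from the description above.

```typescript
// Hypothetical timing parameters; names and values are illustrative only.
interface TimingSketch {
  genesisTimestamp: number; // L1 timestamp at which slot 0 starts, in seconds
  slotDuration: number; // L2 slot length, in seconds
  l1BlockTime: number; // L1 block interval, in seconds
}

// Assumed linear mapping from a slot number to the L1 timestamp where it starts.
function timestampForSlot(slot: number, timing: TimingSketch): number {
  return timing.genesisTimestamp + slot * timing.slotDuration;
}

// The build window opens one L1 block before the slot starts.
function buildWindowTimestampForSlot(slot: number, timing: TimingSketch): number {
  return timestampForSlot(slot, timing) - timing.l1BlockTime;
}

const exampleTiming: TimingSketch = { genesisTimestamp: 1_000, slotDuration: 36, l1BlockTime: 12 };
const windowForSlot3 = buildWindowTimestampForSlot(3, exampleTiming);
console.log(windowForSlot3); // 1000 + 3 * 36 - 12 = 1096
```

Keeping the subtraction in one named function is what lets a test say "warp to the build window for slot N" instead of repeating the offset at every call site.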

### Test fixes introduced

- **Wait-helper falsy-`0` fix** (`bbfbe69`): `waitForNodeCheckpoint` / `waitForBlockNumber` now wrap their matched value so a correct answer of `0` is no longer read as falsy by `retryUntil` (which otherwise polls to timeout). This was the actual cause behind the `l1-reorgs` proof-reorg cases that were temporarily skipped during development (A-1266); they now run.
- **`proving/optimistic.parallel` deflake**: the top-tree-proving gate now captures the live prover session directly instead of polling for an `awaiting-checkpoints` job. The session flips to `awaiting-root` before the hook fires, so the old predicate never matched and the gate never engaged, letting proving race the prune. Test-only.

## API changes

Test-infrastructure only (no product/public API). `EpochsTestContext` → `MultiNodeTestContext` (extends the new `SingleNodeTestContext`); the standalone `ValidatorRegistrationHarness` is gone (folded into `MultiNodeTestContext`); `e2e_epochs/` removed and the `epochs_` prefix dropped; all importers, the CI test-discovery globs in `bootstrap.sh`, and the `.test_patterns.yml` path entries were updated to match.

Part of A-1176 (consolidation roadmap) / A-1064.

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
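
The falsy-`0` failure mode behind the wait-helper fix can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the repository's code: `retryUntilSketch`, `waitForNumberBuggy`, and `waitForNumberFixed` are hypothetical names, and the retry loop is assumed to stop on the first truthy result, as the description of `retryUntil` implies.

```typescript
// Hypothetical stand-in for a retry primitive that stops on the first truthy result.
async function retryUntilSketch<T>(fn: () => Promise<T | undefined>, attempts: number): Promise<T> {
  for (let i = 0; i < attempts; i++) {
    const result = await fn();
    if (result) {
      return result;
    }
  }
  throw new Error(`Timed out after ${attempts} attempts`);
}

// Buggy shape: a matched value of 0 is falsy, so the loop keeps polling until it times out.
function waitForNumberBuggy(read: () => Promise<number>, target: number): Promise<number> {
  return retryUntilSketch(async () => {
    const current = await read();
    return current >= target ? current : undefined;
  }, 3);
}

// Fixed shape: wrap the match in an object (always truthy), then unwrap it for the caller.
async function waitForNumberFixed(read: () => Promise<number>, target: number): Promise<number> {
  const wrapped = await retryUntilSketch(async () => {
    const current = await read();
    return current >= target ? { value: current } : undefined;
  }, 3);
  return wrapped.value;
}
```

With a reader that always returns `0` and a target of `0`, the buggy variant exhausts its attempts while the fixed variant resolves to `0` on the first poll.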
## Problem

PR #24281 added `end-to-end/src/shared/timing_env.mjs` and set the jest `testEnvironment` to `./shared/timing_env.mjs` directly in `end-to-end/package.json`. But `testEnvironment` is an **inherited** field: the package.json generator (`scripts/update_package_jsons.mjs`) shallow-merges each package's `jest` block with the parent winning, and `package.common.json` sets `testEnvironment` to `../../foundation/src/jest/env.mjs`. As a result, `yarn prepare` reverts the override back to the foundation env, and `yarn prepare:check` (run by the pre-commit hook) fails. The inconsistency reached `merge-train/spartan-v5` because GitHub squash-merge doesn't run the local pre-commit hook, so anyone merging this base and committing locally now hits the failure.

## Fix

Move the `testEnvironment` override into `end-to-end/package.local.json`, which the generator applies **last**, so the timing test environment survives `prepare` and the generated `end-to-end/package.json` stays consistent with the inherits sources.

Verified: `node scripts/update_package_jsons.mjs --check` passes with this change and leaves `end-to-end/package.json` pointing at `./shared/timing_env.mjs`.
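
The merge order described above can be illustrated with a small sketch. It is not the code in `scripts/update_package_jsons.mjs`: `generateJestBlock` and its three-argument shape are hypothetical, and the only behavior it models is the one stated in the description (a shallow merge where the inherited block wins over the package's own block, with the package-local overrides applied last).

```typescript
type JestBlock = Record<string, unknown>;

// Assumed merge order: own block first, inherited (common) block on top of it,
// then the package-local overrides, which are applied last and therefore win.
function generateJestBlock(own: JestBlock, inherited: JestBlock, local: JestBlock = {}): JestBlock {
  return { ...own, ...inherited, ...local };
}

const commonBlock: JestBlock = { testEnvironment: "../../foundation/src/jest/env.mjs" };
const timingOverride: JestBlock = { testEnvironment: "./shared/timing_env.mjs" };

// Override placed directly in the package's own block: the inherited value wins.
const overrideInPackageJson = generateJestBlock(timingOverride, commonBlock);

// Override placed in the local file: it is applied last and survives regeneration.
const overrideInLocalJson = generateJestBlock({}, commonBlock, timingOverride);

console.log(overrideInPackageJson.testEnvironment); // ../../foundation/src/jest/env.mjs
console.log(overrideInLocalJson.testEnvironment); // ./shared/timing_env.mjs
```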
🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass. |
## Summary
e2e `setup()` spent ~24s of its ~30s waiting on anvil's block interval
while deploying the L1
rollup contracts (Multicall3 + the forge broadcast). Mining those setup
txs under anvil automine
instead cuts setup from **~30s to ~11s (~65%)** with no change to the
test conditions seen by the
test body.
## What changed
- **`automineL1Setup` is now a global default** (`setupInner` does
`opts.automineL1Setup ??= true`),
so *every* e2e suite mines its L1 setup txs immediately rather than
stalling on the anvil block
interval, then restores interval mining at `ethereumSlotDuration` before
the node starts.
Previously this only happened for suites spreading `AUTOMINE_E2E_OPTS`;
that preset no longer
needs to set the flag.
- **Permanent per-step trace instrumentation.** Replaced the temporary
timing machinery with
`logger.trace()` statements at each setup-step boundary. Per-step
durations are recovered from log
timestamps at `LOG_LEVEL=trace`, so this is cheap enough to keep in the
tree:
```bash
EXIT_E2E_AFTER_SETUP=1 LOG_LEVEL="warn; trace: e2e" \
yarn workspace @aztec/end-to-end test:e2e e2e_nft.test.ts
```
- **Added `EXIT_E2E_AFTER_SETUP`**, a dev knob that throws right after
setup so
you can measure setup in isolation without running the test body.
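
Recovering per-step durations from trace timestamps amounts to differencing consecutive boundary entries. The sketch below assumes the trace lines have already been parsed into `{ timestampMs, step }` records, that each record marks the start of the named step, and that a final marker closes the last one. `TraceEntry`, `stepDurations`, and the sample values are hypothetical; the actual log line format is not shown in this PR.

```typescript
// Hypothetical parsed form of one trace line at a setup-step boundary.
interface TraceEntry {
  timestampMs: number; // log timestamp, in milliseconds
  step: string; // name of the step that starts at this boundary
}

// Duration of each step = timestamp of the next boundary minus its own timestamp.
// The final entry only closes the previous step and gets no duration of its own.
function stepDurations(entries: TraceEntry[]): Map<string, number> {
  const durations = new Map<string, number>();
  for (let i = 1; i < entries.length; i++) {
    const previous = entries[i - 1];
    durations.set(previous.step, entries[i].timestampMs - previous.timestampMs);
  }
  return durations;
}

// Illustrative input only; not taken from a real log.
const sampleTrace: TraceEntry[] = [
  { timestampMs: 0, step: "deploy_multicall3" },
  { timestampMs: 36, step: "deploy_l1_contracts" },
  { timestampMs: 4321, step: "setup_done" },
];
const sampleDurations = stepDurations(sampleTrace);
```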
## Scope / risk
- Suites that don't set `ethereumSlotDuration` already start anvil in
automine (`start_anvil.ts`
only passes `--block-time` when `l1BlockTime` is set), so the default is
a **no-op** for them. The
behavioral change applies to **interval-mining** suites (those that set
`ethereumSlotDuration`,
e.g. the pipelining/automine presets), which now deploy under automine
and then return to interval
mining before the node starts.
- `e2e_genesis_timestamp` opts out (`automineL1Setup: false`): it pins
the proven tip at genesis and
asserts on genesis-relative L1 timing, which mining the setup txs
instantly would break.
- Any other genesis-timing-sensitive suite would need the same opt-out;
CI is the gate.
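
The scope rules above can be summarized as a small decision function. This is a sketch under stated assumptions, not the logic in `setupInner`: `SetupOptionsSketch` and `planL1Mining` are hypothetical, and the returned strings are only labels for the mining mode in effect during setup and afterwards. The one line taken from the description is the `??= true` default, which leaves an explicit `false` untouched.

```typescript
// Hypothetical subset of the setup options discussed above.
interface SetupOptionsSketch {
  automineL1Setup?: boolean;
  ethereumSlotDuration?: number; // seconds; when unset, anvil is assumed to automine already
}

// Returns the sequence of L1 mining modes a suite would go through.
function planL1Mining(opts: SetupOptionsSketch): string[] {
  const resolved = { ...opts };
  // Default to automining the setup txs; an explicit `false` opt-out is preserved.
  resolved.automineL1Setup ??= true;

  if (resolved.ethereumSlotDuration === undefined) {
    // No block time configured: automine throughout, so the default is a no-op.
    return ["automine"];
  }
  const interval = `interval(${resolved.ethereumSlotDuration}s)`;
  // Interval-mining suites deploy under automine, then return to interval mining.
  return resolved.automineL1Setup ? ["automine", interval] : [interval];
}

console.log(planL1Mining({ ethereumSlotDuration: 12 })); // [ 'automine', 'interval(12s)' ]
console.log(planL1Mining({ ethereumSlotDuration: 12, automineL1Setup: false })); // [ 'interval(12s)' ]
```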
## Measured (`e2e_nft`, setup-only)
| step | before | after |
|---|---:|---:|
| deploy_multicall3 | 8040ms | ~36ms |
| deploy_l1_contracts | 15785ms | ~4285ms |
| total setup | ~30s | ~11s |
`e2e_nft` also passes end-to-end (6/6) with the change.
Increase the attempt window of the slashing invalid-proposal test helper from 30 to 100 epochs to fix a flake.
…24346)

## Summary

- Ensure fresh jobs created in the single-block checkpoint proposal tests use the same one-block timetable as the describe setup.
- Make the pipelined parent helper explicitly use that one-block timetable and start at the pipelined build-frame opening.

## Investigation

PR #24331 was dequeued after the merge-queue run for synthetic commit `08e8db39e998b8a5ad95b09039de2befd48e144f` failed `ci/x3-full`. The failed test log for `yarn-project/scripts/run_test.sh sequencer-client/src/sequencer/checkpoint_proposal_job.test.ts` showed `TypeError: Cannot read properties of undefined (reading 'getStats')` at `block.getStats()` in `checkpoint_proposal_job.ts`. The test helpers seeded a single mock block, but some freshly-created pipelined jobs kept the default multi-block timetable instead of the single-block timetable. With no-op waits in the test subclass, the job attempted a second block build, exhausted `MockCheckpointBuilder`, and returned an undefined block.

## Verification

- `MAKEFILE_TARGET=yarn-project yarn-project/scripts/run_test.sh sequencer-client/src/sequencer/checkpoint_proposal_job.test.ts` (46 passed)
- `./bootstrap.sh ci` exits immediately with `Unknown command: ci` on this branch; the root bootstrap defines `ci-fast`/`ci-full`, not plain `ci`.

---

*Created by [claudebox](https://claudebox.work/v2/sessions/c9e66148e5907541) · group: `slackbot`*
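
The failure described in the investigation can be reproduced with a toy model. The sketch below does not use the real `MockCheckpointBuilder` or job classes: `BlockSketch`, `MockBuilderSketch`, and `runJobSketch` are hypothetical, and the timetable is reduced to a single "blocks per slot" number. It only shows how a mock seeded with one block yields `undefined` on a second build, and how an unchecked `getStats()` call then throws.

```typescript
// Hypothetical minimal block shape.
interface BlockSketch {
  getStats(): { txCount: number };
}

// Hypothetical mock that hands out pre-seeded blocks and returns undefined once exhausted.
class MockBuilderSketch {
  constructor(private readonly seeded: BlockSketch[]) {}

  buildBlock(): BlockSketch | undefined {
    return this.seeded.shift();
  }
}

// Builds as many blocks as the timetable allows and sums their tx counts.
function runJobSketch(builder: MockBuilderSketch, blocksPerSlot: number): number {
  let totalTxs = 0;
  for (let i = 0; i < blocksPerSlot; i++) {
    // The cast mirrors an unchecked access: nothing guards against an exhausted mock.
    const block = builder.buildBlock() as BlockSketch;
    totalTxs += block.getStats().txCount;
  }
  return totalTxs;
}

const seededBlock: BlockSketch = { getStats: () => ({ txCount: 3 }) };

// One-block timetable with one seeded block: succeeds.
const singleBlockTotal = runJobSketch(new MockBuilderSketch([seededBlock]), 1);

// Multi-block timetable with the same single seeded block: the second build is undefined.
let multiBlockError: unknown;
try {
  runJobSketch(new MockBuilderSketch([seededBlock]), 2);
} catch (err) {
  multiBlockError = err;
}
```

In this model, aligning the job's timetable with the number of seeded blocks removes the second build attempt, which matches the direction of the fix in the summary.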
BEGIN_COMMIT_OVERRIDE
refactor(e2e): consolidate the multi-node and single-node test categories (#24201)
fix(e2e): keep timing_env testEnvironment through yarn prepare (#24328)
test(e2e): speed up setup by automining L1 contract deployment (#24313)
test(e2e): increase slashing proposer search attempts (#24339)
test(sequencer-client): fix checkpoint proposal job timetable setup (#24346)
END_COMMIT_OVERRIDE