
feat(sequencer): AutomineSequencer for single-sequencer e2e tests #23354

Draft
spalladino wants to merge 13 commits into merge-train/spartan from palla/automine-sequencer

@spalladino
Contributor

Motivation

E2e tests outside of e2e_p2p, e2e_epochs, e2e_slashing, and e2e_block_building don't exercise block-building or consensus — they just need their tx to land. Running them on the production Sequencer (~1100 LOC) + CheckpointProposalJob (~850 LOC) with 12s slot cadence is pure overhead: every test pays for proposer-turn checks, pipelining bookkeeping, validator attestations, and slot-aligned waits that aren't being tested. This PR explores giving those tests a minimal, deterministic, queue-driven alternative that runs txs effectively as fast as the block builder allows.

Approach

Adds an AutomineSequencer alongside the production one. It reuses SequencerPublisher, FullNodeCheckpointsBuilder, and GlobalVariableBuilder; skips proposer-turn checks, validator orchestration, attestations, pipelining, P2P gossip, timetable enforcement, and event emission. Anvil runs in automine mode with no interval mining; the sequencer pre-sets next L1 block timestamps at slot boundaries only when needed. All test time control (warps, empty-block requests) shares a single serial queue with mempool-driven builds — the three never interleave. Requires aztecTargetCommitteeSize == 0 (the e2e default), so an empty CommitteeAttestationsAndSigners is accepted by L1 via the verifyProposer / verifyAttestations early-return at ValidatorSelectionLib.sol:244-249.
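The single-queue guarantee described above can be illustrated with a small sketch. `SerialJobQueue` and `demoSerialQueue` are hypothetical names used only for illustration, not the PR's actual classes; the sketch assumes each request (mempool build, warp, empty block) is wrapped as an async job and chained onto the previous one.

```typescript
// Hypothetical sketch: names are illustrative, not the PR's actual API.
type Job<T> = () => Promise<T>;

class SerialJobQueue {
  // Tail of the chain; each new job starts only after the previous one settles.
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(job: Job<T>): Promise<T> {
    const result = this.tail.then(job);
    // Swallow errors on the chain so one failed job does not block later ones.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

// Enqueues three jobs of different durations and records start/end order.
async function demoSerialQueue(): Promise<string[]> {
  const log: string[] = [];
  const queue = new SerialJobQueue();
  const step = (name: string, ms: number): Job<void> => async () => {
    log.push(`${name}:start`);
    await new Promise<void>(resolve => setTimeout(resolve, ms));
    log.push(`${name}:end`);
  };
  await Promise.all([
    queue.enqueue(step('build', 20)),
    queue.enqueue(step('warp', 5)),
    queue.enqueue(step('emptyBlock', 1)),
  ]);
  return log;
}
```

Even though the later jobs are shorter, each one starts only after the previous job has ended, which is the property that keeps warps and builds from interleaving.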

Changes

  • sequencer-client: new AutomineSequencer (~370 LOC) with a serial queue and mempool poller, exposing buildIfPending/buildEmptyBlock/warpTo/warpBy. Each build waits for the archiver to surface the published checkpoint before returning, which avoids Rollup__InvalidArchive on the next build.
  • aztec-node: new useAutomineSequencer config flag. AztecNodeService.createAndSync constructs the AutomineSequencer inside the existing validator-enabled branch from the same L1 deps (l1TxUtils, publisher manager, publisher factory, checkpoints builder) instead of going through SequencerClient.new. mineBlock routes to AutomineSequencer.buildEmptyBlock when wired. Adds getters for WorldStateSynchronizer, L1ToL2MessageSource, EpochCache, GlobalVariableBuilder, and AutomineSequencer.
  • aztec/testing: CheatCodes.warpL2TimeAtLeastTo delegates to the queue when an AutomineSequencer is wired, so existing test helpers (warpL2TimeAtLeastBy, etc.) work unchanged.
  • end-to-end: new AUTOMINE_E2E_OPTS preset, useAutomineSequencer flag on SetupOptions, and e2e_automine_smoke.test.ts exercising sequential txs, parallel txs, warp, and mineBlock.

Plan: /home/santiago/.claude/plans/i-had-anotehr-agent-snazzy-forest.md (codex-reviewed). Smoke test passes locally (4/4, ~78s wall time). Next steps: migrate the first batch of single-sequencer tests to AUTOMINE_E2E_OPTS.

A minimal, deterministic, queue-driven sequencer for e2e tests that don't
exercise block-building or consensus. Reuses SequencerPublisher,
FullNodeCheckpointsBuilder, and GlobalVariableBuilder; skips proposer-turn
checks, validator orchestration, attestations, pipelining, P2P gossip,
timetable enforcement, and event emission.

Uses anvil setAutomine(true) with no interval mining; pre-sets next L1
block timestamp at slot boundaries when needed. Mempool-driven builds and
explicit warp/buildEmptyBlock requests share a single serial queue and
never interleave. Requires aztecTargetCommitteeSize=0 on the deployed
rollup (the e2e default) so empty CommitteeAttestationsAndSigners is
accepted by L1.
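The slot-boundary arithmetic behind the timestamp pre-setting can be sketched as follows. This is a minimal illustration under stated assumptions: slots are fixed-length windows counted from a genesis timestamp, and "when needed" is taken to mean "the current L1 time still falls in a slot that already has a block". Neither assumption is confirmed by the PR text, and `SlotConfig`, `slotAt`, `slotStart`, and `timestampToPreset` are hypothetical names.

```typescript
// Hypothetical sketch; timestamps are in seconds.
interface SlotConfig {
  genesisSec: number;      // timestamp of the start of slot 0 (assumed)
  slotDurationSec: number; // e.g. 12, matching the 12s slot cadence in the PR
}

// Slot index that contains the given timestamp.
function slotAt(tsSec: number, cfg: SlotConfig): number {
  return Math.floor((tsSec - cfg.genesisSec) / cfg.slotDurationSec);
}

// First timestamp belonging to the given slot.
function slotStart(slot: number, cfg: SlotConfig): number {
  return cfg.genesisSec + slot * cfg.slotDurationSec;
}

// Returns the timestamp to pre-set for the next L1 block, or undefined when
// the current time is already in a slot later than the last built one.
function timestampToPreset(
  nowSec: number,
  lastBuiltSlot: number | undefined,
  cfg: SlotConfig,
): number | undefined {
  if (lastBuiltSlot === undefined) return undefined;
  if (slotAt(nowSec, cfg) > lastBuiltSlot) return undefined;
  return slotStart(lastBuiltSlot + 1, cfg);
}
```

With a 12s slot and genesis at 1000, a build requested at time 1005 after slot 0 was already built would pre-set the next block to 1012, the start of slot 1.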
…chestration

Adds the AUTOMINE_E2E_OPTS preset that opts a single-sequencer non-block-building
test into the AutomineSequencer path. Adds a useAutomineSequencer flag to
SetupOptions for the fixture to branch on.

Adds four getters to AztecNodeService (getWorldStateSynchronizer,
getL1ToL2MessageSource, getEpochCache, getGlobalVariableBuilder) so the test
fixture can construct an AutomineSequencer alongside an otherwise-headless node
(disableValidator + dontStartSequencer).
…ture

Adds a useAutomineSequencer config flag. When set, AztecNodeService.createAndSync
constructs an AutomineSequencer inside the existing validator-enabled branch,
reusing the same L1 deps (l1TxUtils, publisher manager, publisher factory,
checkpoints builder) instead of going through SequencerClient.new.

AztecNodeService.mineBlock routes to AutomineSequencer.buildEmptyBlock when the
automine path is wired. CheatCodes.warpL2TimeAtLeastTo delegates to the queue
when an AutomineSequencer is wired, so existing test helpers work unchanged.

Adds the AutomineSequencer + AutomineSequencerDeps + AutomineSequencerConstants
exports from sequencer-client, exposes getPublisherConfigFromSequencerConfig,
and adds a USE_AUTOMINE_SEQUENCER env var.

Adds e2e_automine_smoke.test.ts exercising sequential txs, parallel txs, warp,
and mineBlock under AUTOMINE_E2E_OPTS.
Without waiting for the archiver to surface the just-published checkpoint,
the next mempool-driven build picks up a stale tip and L1 rejects the
propose with Rollup__InvalidArchive: the freshly built header points at the
pre-publish lastArchive, but L1's lastArchive has already advanced to the
just-published checkpoint.
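A minimal sketch of the kind of wait this commit describes: poll a tip getter until it reaches the checkpoint that was just published, and fail loudly on timeout. `waitForTip` and its parameters are hypothetical; the real code waits on the archiver through its own interfaces.

```typescript
// Hypothetical sketch: polls until the observed tip reaches the target.
async function waitForTip(
  getTip: () => Promise<number>,
  target: number,
  opts: { timeoutMs: number; intervalMs: number } = { timeoutMs: 5000, intervalMs: 50 },
): Promise<number> {
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    const tip = await getTip();
    // Done as soon as the source reports the published checkpoint (or later).
    if (tip >= target) return tip;
    if (Date.now() >= deadline) {
      throw new Error(`timed out waiting for checkpoint ${target}; tip is ${tip}`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, opts.intervalMs));
  }
}
```

Returning only after the tip has caught up means the next queued build reads a lastArchive that matches L1.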

Also cap the smoke-test warp at 24s (2 slots) so it doesn't cross the L1
proof-submission window and trigger a separate prune-related code path.
@spalladino added the ci-draft (Run CI on draft PRs) and ci-no-fail-fast (Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure) labels on May 18, 2026
Mechanical swap from PIPELINING_SETUP_OPTS to AUTOMINE_E2E_OPTS for tests
that don't exercise block-building or consensus:

- e2e_authwit
- e2e_keys
- e2e_partial_notes
- e2e_orderbook
- e2e_double_spend

Each tx now lands as soon as the AutomineSequencer picks it up from the
mempool rather than waiting for the next 12s slot boundary.
- Drop duplicate re-export of getPublisherConfigFromSequencerConfig from
  publisher/index.ts; it's already available via sequencer-client's top-level
  re-export through ./config.js -> ./publisher/config.js.
- Drop redundant async keyword from the buildIfPending queue callback
  (returns runBuild's promise directly, no await needed).
…OPTS

Mechanical swap from PIPELINING_SETUP_OPTS to AUTOMINE_E2E_OPTS for 28 more
single-sequencer non-block-building tests. Tests with hand-rolled warps
(e2e_lending_contract), multi-PXE setups (e2e_2_pxes), or block-building
semantics (e2e_block_building, e2e_pruned_blocks, e2e_sequencer_config,
e2e_l1_with_wall_time, e2e_expiration_timestamp, e2e_genesis_timestamp) are
intentionally held for follow-up.
Pulls the ~50 LOC of inline AutomineSequencer construction out of
AztecNodeService.createAndSync into a dedicated factory under the
sequencer-client package. Server-side code is back to a single function
call; the factory owns publisher-manager wiring, attestor lookup, and
EthCheatCodes construction.

Pure refactor — no behavior change.
…encer compat

Test-side helper relied on setConfig({ minTxsPerBlock: 0 }) and a polling
wait, expecting the production Sequencer's polling loop to fire an empty
block when no txs were pending. AutomineSequencer never fires unprompted —
it builds on tx arrival or explicit request only, so the wait timed out.

aztecNode.mineBlock() already routes to AutomineSequencer.buildEmptyBlock
when wired, or temporarily drops minTxsPerBlock and triggers the production
sequencer otherwise — works under both presets.
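The routing described above can be sketched as follows. The interfaces and `mineBlockSketch` are hypothetical stand-ins, not the actual AztecNodeService code; the sketch only assumes that the production path lowers minTxsPerBlock for a single build and then restores it.

```typescript
// Hypothetical interfaces standing in for the two sequencer flavours.
interface AutomineLike {
  buildEmptyBlock(): Promise<void>;
}

interface ProductionSequencerLike {
  getMinTxsPerBlock(): number;
  setMinTxsPerBlock(value: number): void;
  triggerBuild(): Promise<void>;
}

// Routes to the automine sequencer when wired; otherwise temporarily allows
// empty blocks on the production sequencer and restores the setting afterwards.
async function mineBlockSketch(
  automine: AutomineLike | undefined,
  production: ProductionSequencerLike,
): Promise<'automine' | 'production'> {
  if (automine) {
    await automine.buildEmptyBlock();
    return 'automine';
  }
  const previous = production.getMinTxsPerBlock();
  production.setMinTxsPerBlock(0);
  try {
    await production.triggerBuild();
  } finally {
    // Restore even if the build throws, so later tests see the old config.
    production.setMinTxsPerBlock(previous);
  }
  return 'production';
}
```

A test helper that calls one entry point like this does not need to know which preset it is running under.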
The test does a full L1 reorg via rollbackTo + resumeSync + forceEmptyBlock.
AutomineSequencer has no reorg awareness — after a rollback, its archiver-
sync wait races against the archiver re-ingesting the original checkpoint
from L1, causing either a Rollup__InvalidArchive on re-publish or a deadlock
on the sync wait.

Reorg semantics are out of scope for AUTOMINE_E2E_OPTS; this test stays on
PIPELINING_SETUP_OPTS where it works cleanly.
Implements revertToCheckpoint(targetCheckpoint) on AutomineSequencer,
running inside the serial queue so it never interleaves with builds.

The sequence:
1. Fetch the target checkpoint's L1 block number from the archiver.
2. Call archiverRollback() to reset the archiver to the target
   checkpoint boundary (must happen before the L1 reorg so the
   archiver can still read the target checkpoint's L1 block hash).
3. Call worldState.syncImmediate() to propagate the archiver prune
   event to world-state before the next build runs.
4. Reorg L1 with reorg(depth) to drop blocks strictly after the
   target checkpoint publish block, keeping that block as the new tip.
5. Drop all pending L1 txs (anvil_rollback re-queues them) and reset
   the publisher's cached nonce so the next propose tx uses the correct
   post-reorg nonce.
6. Reset lastBuiltSlot and sync the date provider.

Also adds resetNonce() to L1TxUtils to support step 5, and wires
archiverRollback / resetPublisherNonces callbacks through the factory.

Smoke test: adds "revertToCheckpoint rolls back L1+L2 state" to
e2e_automine_smoke.test.ts, verifying the L2 tip drops and a new
tx lands cleanly after the revert. All 5 scenarios pass.
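The ordering of the six steps above can be sketched against a hypothetical dependency surface. `RevertDeps` and `revertToCheckpointSketch` are illustrative names; only archiverRollback and resetPublisherNonces are callback names taken from the commit message, and the depth calculation assumes that reorg(depth) drops exactly the blocks after the target checkpoint's publish block. Date-provider handling from step 6 is left out of the sketch.

```typescript
// Hypothetical dependency surface; the real factory wires richer types.
interface RevertDeps {
  getCheckpointL1Block(checkpoint: number): Promise<number>;
  getL1BlockNumber(): Promise<number>;
  archiverRollback(checkpoint: number): Promise<void>;
  worldStateSyncImmediate(): Promise<void>;
  reorgL1(depth: number): Promise<void>;
  dropPendingL1Txs(): Promise<void>;
  resetPublisherNonces(): Promise<void>;
  resetLastBuiltSlot(): void;
}

// Intended to run as a single job on the serial queue. Returns the reorg depth.
async function revertToCheckpointSketch(deps: RevertDeps, target: number): Promise<number> {
  // 1. Find the L1 block in which the target checkpoint was published.
  const targetL1Block = await deps.getCheckpointL1Block(target);
  // 2. Roll the archiver back first, while L1 still has the original blocks.
  await deps.archiverRollback(target);
  // 3. Propagate the prune to world state before any later build.
  await deps.worldStateSyncImmediate();
  // 4. Drop L1 blocks strictly after the target checkpoint's publish block.
  const depth = (await deps.getL1BlockNumber()) - targetL1Block;
  if (depth > 0) await deps.reorgL1(depth);
  // 5. Clear re-queued L1 txs and stale publisher nonces.
  await deps.dropPendingL1Txs();
  await deps.resetPublisherNonces();
  // 6. Forget the last built slot so the next build starts clean.
  deps.resetLastBuiltSlot();
  return depth;
}
```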
… revertToCheckpoint

Replaces the PIPELINING_SETUP_OPTS + pauseSync/rollbackTo/resumeSync
pattern with AUTOMINE_E2E_OPTS + AutomineSequencer.revertToCheckpoint().

Changes:
- Switch setup preset from PIPELINING_SETUP_OPTS to AUTOMINE_E2E_OPTS.
- forceReorg now captures checkpointBeforeTx (the checkpointed tip
  before the transfer tx lands) and calls revertToCheckpoint() with
  that number. This atomically rolls back L1, the archiver, and
  world-state in one call.
- forceEmptyBlock uses aztecNodeService.mineBlock() instead of
  setConfig({ minTxsPerBlock: 0 }), which is the correct path under
  AutomineSequencer.
- Remove pauseSync/resumeSync calls (no longer needed; the archiver
  never pauses under AutomineSequencer).
- After the reorg, the p2p tx pool restores the rolled-back transfer tx
  to pending, so the AutomineSequencer re-mines it automatically.
Three bugs caused the reorg test to fail:

1. After `archiverRollback`, the P2P block stream's `chain-pruned` event was
   only processed on the next poll cycle (~50ms later), so rolled-back txs
   weren't restored to pending before the next build ran. Fix: call `syncP2P()`
   in `runRevert` to force one immediate P2P block stream work cycle.

2. `runRevert` was calling `dateProvider.setTime(newL1Ts * 1000)` to roll the
   date provider back to the target checkpoint's L1 timestamp. Since the
   restored tx's `receivedAt` was recorded after that timestamp,
   `getEligiblePendingTxHashes` filtered it out as "too new". Fix: remove the
   `setTime` call from `runRevert` so the date provider keeps its current time.

3. `runBuild` called `worldState.fork(syncedToBlockNumber)` using the archiver's
   tip, but world state syncs from the archiver asynchronously. When the
   mempool-driven build for the re-queued tx completed and the follow-up empty
   block build ran immediately after, world state could still be one block behind,
   causing "Unable to initialize from future block". Fix: call
   `worldState.syncImmediate(syncedToBlockNumber)` before forking.

Also adds a `retryUntil` loop in the test to wait for PXE to process the
re-mined block's notes, since PXE syncs asynchronously from the archiver.
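The second bug above is easy to reproduce in isolation. The sketch below uses a hypothetical `eligibleHashes` filter as a stand-in for the real eligibility check (whose exact rule is not shown in this PR) and assumes a tx is eligible only when its `receivedAt` is not later than the current time.

```typescript
// Hypothetical stand-in for the pool's eligibility filter.
interface PendingTx {
  hash: string;
  receivedAtMs: number;
}

// Keeps only txs that were received at or before the current clock reading.
function eligibleHashes(txs: PendingTx[], nowMs: number): string[] {
  return txs.filter(tx => tx.receivedAtMs <= nowMs).map(tx => tx.hash);
}

// A tx that landed at t=10s and is restored to pending after the revert.
const restored: PendingTx[] = [{ hash: '0xabc', receivedAtMs: 10_000 }];

// Clock rolled back to the target checkpoint's L1 time (t=4s): tx looks "too new".
const afterRollback = eligibleHashes(restored, 4_000);

// Clock left untouched (t=12s): tx is eligible and gets re-mined.
const clockKept = eligibleHashes(restored, 12_000);
```

Here `afterRollback` is empty while `clockKept` contains the restored tx, which is why dropping the `setTime` call lets the re-queued tx land.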
@spalladino
Contributor Author

Codex review (gpt-5.5, high reasoning)

Verdict: needs fixes before merge; the shape is defensible for e2e-only automining, but there are a few concurrency/lifecycle holes that can make the fast path flaky.

Findings

  • yarn-project/sequencer-client/src/sequencer/automine_sequencer.ts:172-175 clears buildQueued before runBuild() starts. That means the poller can enqueue another mempool build while the current build is still publishing. Since runBuild() only waits for the archiver tip at :367-375 and does not force P2P to process the mined block, that second job can observe the old tx pool and try to build already-mined txs. Keep the coalescing flag set until the build finishes, or add a separate buildInProgress flag, and consider await this.deps.syncP2P() after the archiver-sync wait so the tx pool has consumed the new blocks-added event before the next queue item runs.

  • yarn-project/sequencer-client/src/sequencer/automine_sequencer.ts:328-337 does not handle failed tx execution. Production code drops failed txs from P2P after buildBlock returns/throws; here an invalid pending tx can cause the poller to retry and log forever. Catch InsufficientValidTxsError/failed tx results and call p2pClient.handleFailedExecution(failedTxHashes) before returning undefined or rethrowing. This matters for migrated tests that intentionally create invalid/double-spend conditions.

  • yarn-project/sequencer-client/src/sequencer/automine_factory.ts:108-109 makes syncP2P an optional duck-typed method and silently turns it into a no-op. But revertToCheckpoint() relies on this call for correctness at automine_sequencer.ts:434-438. If a future P2P implementation satisfies the exported P2P type but lacks the concrete sync() method, reorg tests become racy without any construction-time failure. Please make sync() part of the relevant interface, or fail fast in the factory instead of no-oping.

  • yarn-project/sequencer-client/src/sequencer/automine_factory.ts:74-77 creates a PublisherManager and starts it at :138, but AutomineSequencer.stop() only stops the poller and queue at automine_sequencer.ts:150-157. The manager's funding loop and L1 tx utils are never stopped/interrupted in the automine path. Store enough ownership to call publisherFactory.stopAll() (or pass a stop hook) from AutomineSequencer.stop().

  • yarn-project/sequencer-client/src/sequencer/automine_factory.ts:107 resets nonces for l1TxUtils but not funderL1TxUtils. If funding is enabled and a funding tx is rolled back by revertToCheckpoint(), the funder can retain the same stale nonce state this PR is fixing for publishers. Either include the funder in the reset hook or explicitly disallow publisher funding under automine.

  • yarn-project/aztec-node/src/aztec-node/server.ts:1018-1038 adds public test-only accessors for world-state, L1-to-L2 messages, epoch cache, and globals builder, but they are not used anywhere in this diff. These widen the production node API surface for no current benefit. Please remove them unless a caller is added in this PR.
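The first finding suggests keeping the coalescing flag set until the build finishes. A minimal sketch of that idea, using a hypothetical `CoalescingBuildTrigger` class rather than the PR's actual poller code:

```typescript
// Hypothetical sketch: at most one mempool-driven build is in flight at a time.
class CoalescingBuildTrigger {
  // True from the moment a build is requested until it has fully finished.
  private inFlight = false;
  private runBuild: () => Promise<void>;

  constructor(runBuild: () => Promise<void>) {
    this.runBuild = runBuild;
  }

  // Returns the build promise if a build was started, or undefined when the
  // request was coalesced into the build that is still running.
  request(): Promise<void> | undefined {
    if (this.inFlight) return undefined;
    this.inFlight = true;
    const run = async (): Promise<void> => {
      try {
        await this.runBuild();
      } finally {
        // Cleared only after publish and sync waits inside runBuild complete.
        this.inFlight = false;
      }
    };
    return run();
  }
}
```

Because the flag is cleared in `finally`, a poll that fires mid-publish is ignored, and a failed build does not leave the trigger stuck.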

Things I checked that look OK

  • The main useAutomineSequencer branch in server.ts:860-887 is acceptable as a contained test-mode branch. I would not move this into aztec/src/testing if it needs real sequencer-client publisher/checkpoint-builder internals.

  • mineBlock() routing to buildEmptyBlock() at server.ts:1931-1935 is the right abstraction: callers want "advance L2 by one block," not a separate automine-only debug API.

  • The reorg ordering mostly makes sense: rollbackTo() needs the target L1 block hash before the L1 reorg (archiver.ts:545-553), anvil_dropAllTransactions after reorg(depth) is the right anvil-specific cleanup, and keeping dateProvider from moving backward avoids making pending txs ineligible.

  • EpochTestSettler does not depend on sequencer events. It starts an EpochMonitor over the archiver/L2 block source at epoch_test_settler.ts:23-27 and marks epochs proven from fetched checkpointed blocks at :33-64, so the lack of automine event emission is fine.

  • closeDelayMs: 0 looks acceptable for this path. Unlike the production job, automine is not keeping a proposal fork alive for validator-side work; after completeCheckpoint() the publisher uses serialized checkpoint data, and downstream consumers sync from L1/archiver.
