Skip to content

feat(sequencer): add waitForBuildDeadlineOnFinalBlock flag#24101

Closed
PhilWindle wants to merge 2 commits into
merge-train/spartan-v5from
pw/last-block-wait-for-txs-v5
Closed

feat(sequencer): add waitForBuildDeadlineOnFinalBlock flag#24101
PhilWindle wants to merge 2 commits into
merge-train/spartan-v5from
pw/last-block-wait-for-txs-v5

Conversation

@PhilWindle

Copy link
Copy Markdown
Collaborator

What

Adds an opt-in sequencer config flag, waitForBuildDeadlineOnFinalBlock (default false), that makes the final block of a checkpoint keep collecting txs until its build deadline instead of sealing as soon as minTxsPerBlock are available. Enables it in e2e_epochs/epochs_mbps_redistribution to make that test deterministic.

Why

epochs_mbps_redistribution dumps a burst of late txs and asserts they all land in the checkpoint's final block (new Set(lateBlockNumbers).size === 1). The proposer releases the final block as soon as getPendingTxCount() >= 1 and then iterateEligiblePendingTxs takes a single mempool snapshot. The burst is submitted with Promise.all(...send()) and reaches the proposer's pool over gossip one tx at a time, so the snapshot captures only a subset and the rest spill into the next checkpoint — an intermittent failure, not a redistribution bug.

Rather than change production block-sealing (which would idle the final block or eat into execution time), this gates the wait behind a flag that only tests enable:

  • Default (false): unchanged — every block, including the final one, seals as soon as it reaches minTxsPerBlock.
  • true: the final block of a checkpoint waits until its build deadline (block_build_deadline − min_block_duration) before sealing, so a late burst is fully present at the snapshot. The build deadline already reserves the post-build window for attestation collection, so checkpoint timing is unaffected.

Changes

  • stdlib/src/interfaces/configs.ts: new optional SequencerConfig field + schema entry.
  • sequencer-client/src/config.ts: default (false) and CLI mapping (no env var — testing-only, like skipInvalidateBlockAsProposer, so it does not appear in the operator CLI reference).
  • sequencer-client/src/sequencer/checkpoint_proposal_job.ts: thread isLastBlock into waitForMinTxs; when the flag is set, the final block collects until its build deadline.
  • end-to-end/src/e2e_epochs/epochs_mbps_redistribution.test.ts: enable the flag.

Testing

  • New checkpoint_proposal_job.timing.test.ts cases for both behaviours (seal-early by default; wait-to-deadline when set).
  • sequencer-client unit suites pass (timing 23/23, job 46/46) and the package typechecks. The e2e itself runs in CI (no docker locally).

Follow-up

Supersedes the flake marking in #24098 for this test. Once both land, the epochs_mbps_redistribution entry added to .test_patterns.yml by #24098 can be removed.

The proposer seals the final block of a checkpoint as soon as minTxsPerBlock are
available and then snapshots the mempool. When txs are submitted as a burst late
in the slot, they propagate over gossip one at a time, so the snapshot captures
only a subset and the remainder spill into the next checkpoint.

Add an opt-in sequencer flag (default off, so production behaviour is unchanged)
that makes the final block of a checkpoint keep collecting txs until its build
deadline before sealing. The build deadline already reserves the post-build
window for attestation collection, so this does not change checkpoint timing.

Enable the flag in e2e_epochs/epochs_mbps_redistribution, which dumps a burst of
late txs and asserts they all land in the final block; without this it raced the
proposer's snapshot and flaked. Covered by new checkpoint_proposal_job timing
unit tests for both the default (seal early) and flag-on (wait) behaviour.
The waitForBuildDeadlineOnFinalBlock flag enabled in this PR makes the test
deterministic, so it no longer needs to be tolerated as a flake (added in #24098).
@PhilWindle PhilWindle force-pushed the pw/last-block-wait-for-txs-v5 branch from 5f27923 to 12afdb7 Compare June 15, 2026 16:28
@AztecBot

Copy link
Copy Markdown
Collaborator

Flakey Tests

🤖 says: This CI run detected 1 tests that failed, but were tolerated due to a .test_patterns.yml entry.

\033FLAKED\033 (8;;http://ci.aztec-labs.com/49af6f91d30d1de9�49af6f91d30d1de98;;�): yarn-project/end-to-end/scripts/run_test.sh ha src/composed/ha/e2e_ha_full.parallel.test.ts "should distribute work across multiple HA nodes" (228s) (code: 0)

PhilWindle pushed a commit that referenced this pull request Jun 18, 2026
## Motivation

`e2e_epochs/epochs_mbps_redistribution` asserted that a burst of late
txs all land in a single (last) block of a checkpoint. But the proposer
snapshots the mempool once per block as soon as `minTxsPerBlock` are
available, and the burst (sent via `Promise.all(send())`) reaches the
proposer over gossip one tx at a time — so the last block's snapshot
routinely captured only a subset and the rest spilled into the next
checkpoint, flaking the assertion. Empirically the one-before-last block
grabs its full redistributed allotment, leaving the last block with
fewer than its static share, so "all in the last block" was never a
reliable invariant in the first place.

## Approach

Check the redistribution claim against a race-robust invariant rather
than an exact block placement:

- Use a 4-block-per-checkpoint profile (`blockDurationMs` 8000 → 6000,
slot unchanged at 36s) with a tight `maxTxsPerCheckpoint = 9`, giving a
static per-block cap `S = ceil(9 * 1.2 / 4) = 3` under the 1.2
multiplier floor.
- Two early blocks take one tx each; the 7-tx burst is dumped
immediately after the second early block is in, then splits arbitrarily
(`x` / `7 - x`) across the last two blocks as it propagates.
- Assert that, within the checkpoint holding the early txs, the **last
two block indices jointly hold all 7 late txs** (`> 2 * S = 6`). Without
redistribution each block is capped at `S`, so the two could hold at
most 6 and the 7th spills into the next checkpoint. The count is by
checkpoint-relative block index (not global block number) so a spilled
tx can't masquerade as redistributed, and the checkpoint block count is
asserted to be 4 so a build collapse fails diagnostically rather than
misleadingly.

This removes the dependence on which of the last two blocks wins the
race. Verified robust by temporarily forcing the one-before-last block
to both extremes (1 and 5 txs) — the joint count stays 7 in both cases.

## Changes

- **end-to-end (tests)**: rework the first `epochs_mbps_redistribution`
test (C=4, immediate burst dump, joint last-two-blocks assertion) and
refresh its comments; update the second test's comments to reflect the
shared C=4 timing (its logic and assertions are unchanged).

Related: this investigation surfaced a separate production bug (A-1251)
where `waitForMinTxs` gates on a non-age-filtered pending count while
the block-building iterator filters by `minTxPoolAgeMs`; that is tracked
separately and not addressed here.

This PR is an alternative to
#24101
@PhilWindle PhilWindle closed this Jun 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants