feat(sequencer): add waitForBuildDeadlineOnFinalBlock flag#24101
Closed
PhilWindle wants to merge 2 commits into
Closed
feat(sequencer): add waitForBuildDeadlineOnFinalBlock flag#24101PhilWindle wants to merge 2 commits into
PhilWindle wants to merge 2 commits into
Conversation
The proposer seals the final block of a checkpoint as soon as minTxsPerBlock are available and then snapshots the mempool. When txs are submitted as a burst late in the slot, they propagate over gossip one at a time, so the snapshot captures only a subset and the remainder spill into the next checkpoint. Add an opt-in sequencer flag (default off, so production behaviour is unchanged) that makes the final block of a checkpoint keep collecting txs until its build deadline before sealing. The build deadline already reserves the post-build window for attestation collection, so this does not change checkpoint timing. Enable the flag in e2e_epochs/epochs_mbps_redistribution, which dumps a burst of late txs and asserts they all land in the final block; without this it raced the proposer's snapshot and flaked. Covered by new checkpoint_proposal_job timing unit tests for both the default (seal early) and flag-on (wait) behaviour.
The waitForBuildDeadlineOnFinalBlock flag enabled in this PR makes the test deterministic, so it no longer needs to be tolerated as a flake (added in #24098).
5f27923 to
12afdb7
Compare
Collaborator
Flakey Tests🤖 says: This CI run detected 1 tests that failed, but were tolerated due to a .test_patterns.yml entry. |
PhilWindle
pushed a commit
that referenced
this pull request
Jun 18, 2026
## Motivation `e2e_epochs/epochs_mbps_redistribution` asserted that a burst of late txs all land in a single (last) block of a checkpoint. But the proposer snapshots the mempool once per block as soon as `minTxsPerBlock` are available, and the burst (sent via `Promise.all(send())`) reaches the proposer over gossip one tx at a time — so the last block's snapshot routinely captured only a subset and the rest spilled into the next checkpoint, flaking the assertion. Empirically the one-before-last block grabs its full redistributed allotment, leaving the last block with fewer than its static share, so "all in the last block" was never a reliable invariant in the first place. ## Approach Check the redistribution claim against a race-robust invariant rather than an exact block placement: - Use a 4-block-per-checkpoint profile (`blockDurationMs` 8000 → 6000, slot unchanged at 36s) with a tight `maxTxsPerCheckpoint = 9`, giving a static per-block cap `S = ceil(9 * 1.2 / 4) = 3` under the 1.2 multiplier floor. - Two early blocks take one tx each; the 7-tx burst is dumped immediately after the second early block is in, then splits arbitrarily (`x` / `7 - x`) across the last two blocks as it propagates. - Assert that, within the checkpoint holding the early txs, the **last two block indices jointly hold all 7 late txs** (`> 2 * S = 6`). Without redistribution each block is capped at `S`, so the two could hold at most 6 and the 7th spills into the next checkpoint. The count is by checkpoint-relative block index (not global block number) so a spilled tx can't masquerade as redistributed, and the checkpoint block count is asserted to be 4 so a build collapse fails diagnostically rather than misleadingly. This removes the dependence on which of the last two blocks wins the race. Verified robust by temporarily forcing the one-before-last block to both extremes (1 and 5 txs) — the joint count stays 7 in both cases. ## Changes - **end-to-end (tests)**: rework the first `epochs_mbps_redistribution` test (C=4, immediate burst dump, joint last-two-blocks assertion) and refresh its comments; update the second test's comments to reflect the shared C=4 timing (its logic and assertions are unchanged). Related: this investigation surfaced a separate production bug (A-1251) where `waitForMinTxs` gates on a non-age-filtered pending count while the block-building iterator filters by `minTxPoolAgeMs`; that is tracked separately and not addressed here. This PR is an alternative to #24101
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Adds an opt-in sequencer config flag,
waitForBuildDeadlineOnFinalBlock(defaultfalse), that makes the final block of a checkpoint keep collecting txs until its build deadline instead of sealing as soon asminTxsPerBlockare available. Enables it ine2e_epochs/epochs_mbps_redistributionto make that test deterministic.Why
epochs_mbps_redistributiondumps a burst of late txs and asserts they all land in the checkpoint's final block (new Set(lateBlockNumbers).size === 1). The proposer releases the final block as soon asgetPendingTxCount() >= 1and theniterateEligiblePendingTxstakes a single mempool snapshot. The burst is submitted withPromise.all(...send())and reaches the proposer's pool over gossip one tx at a time, so the snapshot captures only a subset and the rest spill into the next checkpoint — an intermittent failure, not a redistribution bug.Rather than change production block-sealing (which would idle the final block or eat into execution time), this gates the wait behind a flag that only tests enable:
false): unchanged — every block, including the final one, seals as soon as it reachesminTxsPerBlock.true: the final block of a checkpoint waits until its build deadline (block_build_deadline − min_block_duration) before sealing, so a late burst is fully present at the snapshot. The build deadline already reserves the post-build window for attestation collection, so checkpoint timing is unaffected.Changes
stdlib/src/interfaces/configs.ts: new optionalSequencerConfigfield + schema entry.sequencer-client/src/config.ts: default (false) and CLI mapping (no env var — testing-only, likeskipInvalidateBlockAsProposer, so it does not appear in the operator CLI reference).sequencer-client/src/sequencer/checkpoint_proposal_job.ts: threadisLastBlockintowaitForMinTxs; when the flag is set, the final block collects until its build deadline.end-to-end/src/e2e_epochs/epochs_mbps_redistribution.test.ts: enable the flag.Testing
checkpoint_proposal_job.timing.test.tscases for both behaviours (seal-early by default; wait-to-deadline when set).sequencer-clientunit suites pass (timing 23/23, job 46/46) and the package typechecks. The e2e itself runs in CI (no docker locally).Follow-up
Supersedes the flake marking in #24098 for this test. Once both land, the
epochs_mbps_redistributionentry added to.test_patterns.ymlby #24098 can be removed.