Skip to content

apollo_staking: bound next-epoch resolution window by the epoch length#14689

Open
matanl-starkware wants to merge 1 commit into
mainfrom
matanl/staking-bound-next-epoch-window
Open

apollo_staking: bound next-epoch resolution window by the epoch length#14689
matanl-starkware wants to merge 1 commit into
mainfrom
matanl/staking-bound-next-epoch-window

Conversation

@matanl-starkware

Copy link
Copy Markdown
Collaborator

Problem

EpochCache::try_resolve_epoch_id resolves heights up to MIN_EPOCH_LENGTH (150) blocks past the current epoch's end as "current epoch + 1". When epochs are shorter than MIN_EPOCH_LENGTH (e.g. MockStakingContract's 30-block epochs), heights belonging to later epochs are attributed to a stale epoch ID for up to 150 heights, and the epoch cache re-syncs at a node-specific phase. As a result, override_committee.start_epoch is evaluated against the stale label and a committee change takes effect at a different height on each node, diverging the round-robin proposer schedules ((height + round) % len with different len) for several minutes.

Fix

Bound the next-epoch resolution window by min(MIN_EPOCH_LENGTH, epoch_length):

  • Short epochs: the window is exactly one epoch, so resolved epoch IDs are always exact and an override activates at the same height on all nodes. Heights further ahead force an epoch re-sync; if the contract view lags, resolution fails (per-height sync fallback) rather than mislabeling.
  • Epochs ≥ MIN_EPOCH_LENGTH: behavior unchanged.

Testing

  • New regression test get_committee_short_epochs_override_activates_at_start_epoch: with 30-block epochs and the epoch cache synced two epochs before the override's start_epoch, the last pre-override height gets the default committee and the first override height gets the override committee. Fails without the fix (stale default committee served past the activation epoch).
  • Tightened the existing next-epoch bounds coverage to the exact boundary: E2_H2 is now start + MIN_EPOCH_LENGTH (first invalid height) and get_committee_for_next_epoch also asserts the last height within bounds resolves.

🤖 Generated with Claude Code

Heights were resolved to "current epoch + 1" for up to MIN_EPOCH_LENGTH blocks
past the current epoch's end. When epochs are shorter than MIN_EPOCH_LENGTH,
this attributed heights of later epochs to a stale epoch ID, so epoch-gated
committee changes took effect at a different height on each node. Bound the
window by min(MIN_EPOCH_LENGTH, epoch_length) so the resolved epoch is always
exact; behavior is unchanged for epochs of at least MIN_EPOCH_LENGTH.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@cursor

cursor Bot commented Jul 2, 2026

Copy link
Copy Markdown

PR Summary

Medium Risk
Touches epoch-to-height resolution used for committee and proposer selection; incorrect bounds could still cause sync failures or schedule skew, but the change narrows mis-labeling for short epochs and is covered by regression tests.

Overview
Fixes cross-node consensus divergence when epochs are shorter than MIN_EPOCH_LENGTH (e.g. 30-block mock epochs): the epoch cache could label heights up to 150 blocks past an epoch end as “current + 1”, so override_committee.start_epoch fired at different heights and round-robin proposer schedules disagreed.

Epoch::within_next_epoch_min_bounds now uses min(MIN_EPOCH_LENGTH, self.epoch_length) for the lookahead window instead of always MIN_EPOCH_LENGTH. Short epochs only resolve the next epoch for heights guaranteed inside one epoch; longer epochs behave as before.

Tests add get_committee_short_epochs_override_activates_at_start_epoch and tighten next-epoch boundary checks (E2_H2 = first invalid height; assert last in-bounds height succeeds).

Reviewed by Cursor Bugbot for commit d675c18. Bugbot is set up for automated code reviews on this repo. Configure here.

@reviewable-StarkWare

Copy link
Copy Markdown

This change is Reviewable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants