fix(fault-proof): robustness follow-ups to #865 + cost-estimator parity by seolaoh · Pull Request #894 · succinctlabs/op-succinct

seolaoh · 2026-04-29T15:56:36Z

Summary

Three independent follow-ups to #865 in the proposer, plus a cost-estimator flag that builds on #869 to align the estimator's split behavior with how the proposer actually proves.

1. `fault-proof: warn-log L1 head regression in sync_state`

Distinguish the two skip cases: WARN when confirmed_number moves backwards (load-balanced backend regression or deep reorg) so operators can detect unhealthy backends, DEBUG for the normal "L1 hasn't ticked" path. Pure observability; no behavior change.
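
The distinction between the two skip cases can be sketched as a small classifier. This is a minimal illustration, not the code in `sync_state`: `SkipLog`, `classify_l1_head`, and the `last_confirmed` parameter are hypothetical names, and the real proposer presumably emits the log through its logging framework rather than returning a level.

```rust
use std::cmp::Ordering;

/// Log level chosen for a skipped sync iteration (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkipLog {
    /// Confirmed head moved backwards: backend regression or deep reorg.
    Warn,
    /// Confirmed head unchanged: L1 simply has not ticked yet.
    Debug,
}

/// Compares the newly observed confirmed L1 block number with the last one
/// that was synced. Returns `None` when the head advanced (sync proceeds),
/// otherwise the level at which the skip should be logged.
pub fn classify_l1_head(last_confirmed: u64, confirmed_number: u64) -> Option<SkipLog> {
    match confirmed_number.cmp(&last_confirmed) {
        Ordering::Less => Some(SkipLog::Warn),
        Ordering::Equal => Some(SkipLog::Debug),
        Ordering::Greater => None,
    }
}
```

In both skip cases the caller still skips the sync, which matches the "no behavior change" note above; only the log level differs.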

2. `fault-proof: reset creation guard when tracked game is pruned`

When sync_games prunes future games due to abnormal cache states (backup restore into a shorter chain, deep L1 reorg, factory reset), the duplicate-creation guard could point at a game that no longer exists. Without resetting, should_create_game blocks indefinitely. Reset is gated on the guarded address being among the entries this prune actually removes — checking "absent from post-prune cache" instead would over-clear when a just-created game hasn't been added to the cache yet, allowing duplicate submission.
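
The gating rule can be illustrated with a simplified model. Everything here is an assumption made for the sketch: the `State` struct, the `creation_guard` field, the `prune_future_games` method, and the 20-byte array standing in for a game address are hypothetical and do not mirror the proposer's actual types.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical stand-in for a game address.
pub type Address = [u8; 20];

/// Hypothetical, simplified proposer state: cached games keyed by factory
/// index, plus the duplicate-creation guard.
pub struct State {
    pub games: BTreeMap<u64, Address>,
    pub creation_guard: Option<Address>,
}

impl State {
    /// Removes every cached game above `pinned_latest_index` and returns how
    /// many were removed. The guard is cleared only when the guarded address
    /// is among the entries this prune removes.
    pub fn prune_future_games(&mut self, pinned_latest_index: u64) -> usize {
        let to_remove: Vec<u64> = self
            .games
            .range((Bound::Excluded(pinned_latest_index), Bound::Unbounded))
            .map(|(index, _)| *index)
            .collect();

        // Evaluated before the removal loop, while the entries still exist.
        let guard_is_pruned = self
            .creation_guard
            .map_or(false, |guarded| to_remove.iter().any(|i| self.games[i] == guarded));

        for index in &to_remove {
            self.games.remove(index);
        }
        if guard_is_pruned {
            self.creation_guard = None;
        }
        to_remove.len()
    }
}
```

If the guard points at a just-created game that is not yet in the cache, an unrelated prune leaves the guard in place, which is the over-clearing case the "absent from post-prune cache" check would get wrong.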

3. `fault-proof: pre-flight on-chain status check before prove/resolve/claim`

With sync_l1_confirmations > 0, the cache lags by ~sync_l1_confirmations × block_time. During that window, recently confirmed prove/resolve/claimCredit txs are invisible to the cache, so should_* flags re-fire and the proposer re-submits — wasting prover-network spend (full range + agg proof regeneration) and gas (reverted txs).

Each path now does one eth_call at latest before submission:

resolve_games: skip if GameStatus != IN_PROGRESS
claim_bonds: skip if credit(signer) == 0
should_skip_proving: skip if ProposalStatus is *ValidProofProvided or Resolved (covers both already-proven and timeout default-loss; Resolved is set whenever GameStatus moves out of IN_PROGRESS)

On RPC failure the check logs warn and proceeds, so transient backend issues don't block legitimate work.
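
The three skip rules and the fail-open behavior can be sketched as pure functions over the result of the `eth_call`. This is an illustration under stated assumptions: the enum variant lists are simplified guesses at the contract enums, `credit` is modeled as `u128` rather than a 256-bit integer, errors are plain `String`s, and `preflight`, `skip_resolve`, `skip_claim`, and `skip_proving` are hypothetical names.

```rust
use std::fmt::Display;

/// Simplified mirror of the on-chain game status (variant set is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GameStatus {
    InProgress,
    ChallengerWins,
    DefenderWins,
}

/// Simplified mirror of the on-chain proposal status (variant set is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProposalStatus {
    Unchallenged,
    Challenged,
    UnchallengedAndValidProofProvided,
    ChallengedAndValidProofProvided,
    Resolved,
}

/// Returns true when the action should be skipped. On a read error it logs a
/// warning and returns false, so the caller proceeds with the submission.
fn preflight<T, E: Display>(
    what: &str,
    read: Result<T, E>,
    already_done: impl FnOnce(&T) -> bool,
) -> bool {
    match read {
        Ok(value) => already_done(&value),
        Err(err) => {
            eprintln!("warn: {what} pre-flight check failed, proceeding: {err}");
            false
        }
    }
}

pub fn skip_resolve(status: Result<GameStatus, String>) -> bool {
    preflight("resolve", status, |s| *s != GameStatus::InProgress)
}

pub fn skip_claim(credit: Result<u128, String>) -> bool {
    preflight("claimCredit", credit, |c| *c == 0)
}

pub fn skip_proving(status: Result<ProposalStatus, String>) -> bool {
    preflight("prove", status, |s| {
        matches!(
            s,
            ProposalStatus::UnchallengedAndValidProofProvided
                | ProposalStatus::ChallengedAndValidProofProvided
                | ProposalStatus::Resolved
        )
    })
}
```

Failing open trades an occasional duplicate submission during an RPC outage for never blocking legitimate work on a flaky backend.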

4. `scripts: add --no-safe-head-split to cost-estimator` (follow-up to #869)

#869 fixed --batch-size precedence so the flag is honored when explicit --start/--end are given. But with SafeDB active, the splitter still cuts at span batch boundaries via split_range_based_on_safe_heads, regardless of the now-correct effective_batch_size. The proposer with RANGE_SPLIT_COUNT=1 (default) does not split that way — it produces one range proof per proposal interval. --no-safe-head-split forces split_range_basic so the estimator partitions only by --batch-size, giving a closer estimate of the per-segment cost the proposer actually incurs on the prover network.
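
To make "partitions only by --batch-size" concrete, here is a minimal sketch of a purely arithmetic split. It is not the repository's `split_range_basic`: `BlockRange` and `split_by_batch_size` are hypothetical names, and treating ranges as half-open `[start, end)` is an assumption of this example.

```rust
/// Hypothetical half-open L2 block range `[start, end)`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockRange {
    pub start: u64,
    pub end: u64,
}

/// Splits `[start, end)` into consecutive chunks of at most `batch_size`
/// blocks, ignoring span batch boundaries entirely.
pub fn split_by_batch_size(start: u64, end: u64, batch_size: u64) -> Vec<BlockRange> {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut ranges = Vec::new();
    let mut current = start;
    while current < end {
        // The last chunk is clamped to `end` and may be shorter.
        let next = current.saturating_add(batch_size).min(end);
        ranges.push(BlockRange { start: current, end: next });
        current = next;
    }
    ranges
}
```

When the batch size is at least the length of the requested range, this yields exactly one chunk, which corresponds to the single execution per proposal interval described above.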

Test plan

cargo check --all-targets --all-features --tests && cargo fmt --all -- --check && cargo clippy --all-features --all-targets -- -D warnings -A incomplete-features
Manual: cost-estimator with --no-safe-head-split produces a single batch instead of N span-batch-aligned chunks

Distinguish the two skip cases: log at WARN when confirmed_number moves backwards (load-balanced RPC backend regression or deep L1 reorg) so operators can detect unhealthy backends, and keep DEBUG for the normal equal case where L1 simply hasn't ticked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When sync_games prunes future games (above pinned_latest_index) due to abnormal cache states like backup restore into a shorter chain or deep L1 reorg, the duplicate-creation guard could point at a game that no longer exists on chain. Without resetting, should_create_game blocks indefinitely because canonical_head_l2_block cannot advance through an orphaned game.

Reset is gated on the guarded address being among the entries this prune actually removes (evaluated before the removal loop). Checking "absent from post-prune cache" would over-clear in the case where the just-created game has not yet been added to the cache and an unrelated prune fires, allowing should_create_game to re-submit a duplicate at the same L2 block before the cache catches up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ve/claim

With sync_l1_confirmations > 0, the pinned cache lags behind the chain tip by sync_l1_confirmations × block_time, so a recently confirmed prove(), resolve(), or claimCredit() tx may not yet be reflected in should_attempt_* flags. Without a pre-flight check, the proposer would re-submit duplicate transactions that revert on chain, wasting gas for resolve/claim and re-running expensive proof generation for prove.

Each path now does one eth_call at `latest` before submission:

- resolve_games: skip if GameStatus != IN_PROGRESS
- claim_bonds: skip if credit(signer) == 0
- should_skip_proving: skip if ProposalStatus is *ValidProofProvided or Resolved (single check covers both already-proven and timeout default-loss cases since Resolved is set whenever GameStatus moves out of IN_PROGRESS)

On RPC failure the check logs a warn and proceeds, so transient backend issues don't block legitimate work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When SafeDB is active, cost-estimator splits the requested range at every span batch boundary via split_range_based_on_safe_heads, producing one zkVM execution per span batch regardless of --batch-size. That mirrors a hypothetical "split each proposal at span batch boundaries" workload, not what the proposer actually does (RANGE_SPLIT_COUNT-driven arithmetic split, default 1 = single execution per proposal interval). The new --no-safe-head-split flag forces split_range_basic so the range is partitioned solely by --batch-size, giving a closer estimate of the per-segment cost the proposer sees on the prover network.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

seolaoh changed the title ~~fault-proof: robustness follow-ups to #865 + cost-estimator parity~~ fix(fault-proof): robustness follow-ups to #865 + cost-estimator parity Apr 29, 2026

seolaoh and others added 4 commits May 4, 2026 17:54

seolaoh force-pushed the seolaoh/fault-proof-robustness-followups branch from ac3dad4 to 03924a1 Compare May 4, 2026 08:54

fakedev9999 mentioned this pull request May 7, 2026: fix(fault-proof): robustness follow-ups to #865 + cost-estimator parity #901 (Open, 7 tasks)

seolaoh wants to merge 4 commits into succinctlabs:main from celo-org:seolaoh/fault-proof-robustness-followups
