Skip to content

fix(consensus): return Result from IncrementalProofState::take instead of panicking#726

Open
keanji-x wants to merge 1 commit into
mainfrom
kj/fix-67-proof-coordinator
Open

fix(consensus): return Result from IncrementalProofState::take instead of panicking#726
keanji-x wants to merge 1 commit into
mainfrom
kj/fix-67-proof-coordinator

Conversation

@keanji-x
Copy link
Copy Markdown
Contributor

Summary

  • Minimal surgical fix for gravity-audit#67 (Critical).
  • Replaces unreachable!() in IncrementalProofState::take with a real Result return so the ProofCoordinator task survives a BLS aggregation failure (e.g. after validator-set rotation) instead of crashing and halting consensus.
  • Adds SignedBatchInfoError::UnableToAggregate and propagates ? at the single caller.

Why minimal (not a full backport of upstream #14644)

Upstream aptos-core PR #14644 fixes the same panic but as part of a much larger refactor (~466 lines of conflicts against our vendored consensus). This PR ports only the panic→Result slice; we can decide later whether to take the rest of the refactor.

Test plan

  • cargo check -p aptos-consensus --tests passes.
  • Reviewer: confirm no other in-tree callers of IncrementalProofState::take (there is only one, in aggregate_and_verify).

🤖 Generated with Claude Code

…d of panicking

Closes #67 (gravity-audit).

When BLS signature aggregation fails (e.g. after a validator-set rotation
that invalidates some collected per-signer keys), the old code hit
`unreachable!()` and crashed the `ProofCoordinator` tokio task, stalling
consensus on the affected node. This is a minimal surgical fix that
mirrors the relevant slice of upstream aptos-core PR #14644:

- Change `IncrementalProofState::take` to return
  `Result<ProofOfStore, SignedBatchInfoError>`.
- On aggregation failure, log via `error!` and return
  `SignedBatchInfoError::UnableToAggregate` (new variant) instead of
  panicking. Leave `completed = false` so a future attempt can retry.
- Propagate `?` at the single caller in `aggregate_and_verify`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant