Skip to content

feat(frost): route coarse ROAST evidence into the transition bundle (7.3 PR2b-2 step 2)#4088

Merged
mswilkison merged 1 commit into
feat/frost-schnorr-migration-scaffoldfrom
feat/frost-7.3-blame-bridge
Jun 19, 2026
Merged

feat(frost): route coarse ROAST evidence into the transition bundle (7.3 PR2b-2 step 2)#4088
mswilkison merged 1 commit into
feat/frost-schnorr-migration-scaffoldfrom
feat/frost-7.3-blame-bridge

Conversation

@mswilkison

Copy link
Copy Markdown
Contributor

RFC-21 Phase 7.3 PR2b-2 step 2 — the blame bridge

Makes ROAST exclusion/blame actually fire by routing the coarse receive loop's real evidence into the bundle the elected coordinator aggregates.

The gap

Per attempt a ROAST seat mints two coordinator attempt records on the same per-seat coordinator: an OBSERVE handle (what RoastTransitionExchange aggregates) and a DRIVE handle (where submitSnapshotIfActive recorded the receive loop's real rejects/overflows/conflicts). The drive handle is never aggregated — the exchange is the sole bundle producer — so that evidence was a write-only dead end, and the aggregated bundle carried only forced-EMPTY proof-of-attendance snapshots. computeNextAttempt's f+1 accuser tally therefore never excluded anyone.

The bridge

A member-keyed pending-evidence stash keyed by (RoastSessionID, member, attemptHash) — the namespace-independent coordinate shared by the drive and observe worlds (both derive from ctx.SessionID / ctx.Hash()) — carries the raw recorder snapshot:

  • submitSnapshotIfActive stashes the evidence (dropping the dead drive-handle RecordEvidence).
  • BroadcastForcedSnapshot consumes the stash and builds the ONE signed snapshot it broadcasts, so the elected coordinator's AggregateBundle includes the real evidence and computeNextAttempt can establish + exclude an f+1-accused member.
  • Signing stays in one place (the exchange), so VerifyBundle's censorship check (which compares the receiver's own OperatorSignature) is unaffected.

Lifecycle mirrors the observe registry: consume on broadcast, clear on local success + session end, TTL sweep backstop; Evidence is deep-copied on store.

Scope

Coarse f+1 reject/overflow/conflict path only. Coordinator-equivocation proofs (CoordinatorPackageProofs, sourced from the interactive Round2Collector) are step 2b. The ROAST selector is already live under frost_roast_retry; the stale "once AggregateBundle production is wired upstream" comment is retired.

Verification

  • build + vet across 5 tag combos (default / frost_native / frost_roast_retry / frost_native frost_roast_retry / frost_native frost_tbtc_signer w/ CGO)
  • tests green incl. an e2e (stashed evidence → bundle → NextAttempt excludes the f+1-accused member), -race on pkg/frost/signing, gofmt clean
  • design locked via Codex + Gemini consult (both PROCEED, no holes)

🤖 Generated with Claude Code

…7.3 PR2b-2 step 2)

The transition exchange aggregated FORCED-EMPTY proof-of-attendance snapshots, so
NextAttempt's f+1 accuser tally never saw the real rejects/overflows/conflicts the
coarse receive loop collects -- blame/exclusion could not fire. submitSnapshotIfActive
recorded that evidence to the DRIVE handle, which is never aggregated (the exchange is
the sole bundle producer, keyed by the OBSERVE handle), so it was a write-only dead end.

Bridge the two: a member-keyed pending-evidence stash keyed by (RoastSessionID, member,
attemptHash) -- the namespace-independent coordinate shared by the drive and observe
worlds -- carries the raw recorder snapshot. submitSnapshotIfActive stashes it (dropping
the dead drive-handle RecordEvidence); BroadcastForcedSnapshot consumes it and builds the
ONE signed snapshot it broadcasts, so the elected coordinator's AggregateBundle includes
the real evidence and computeNextAttempt can establish + exclude an f+1-accused member.

Lifecycle mirrors the observe registry: consume on broadcast, clear on local success and
session end, TTL sweep as a backstop; the Evidence is deep-copied on store. The stale
selector comment is retired (the ROAST selector is already wired under frost_roast_retry).

Design locked via Codex + Gemini consult (both PROCEED, no holes). Scoped to the coarse
f+1 path; coordinator-equivocation proofs (interactive Round2Collector) are step 2b.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a7d89b56-4b79-4874-826b-026935a6e529

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/frost-7.3-blame-bridge

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mswilkison mswilkison merged commit c3aa36c into feat/frost-schnorr-migration-scaffold Jun 19, 2026
15 of 16 checks passed
@mswilkison mswilkison deleted the feat/frost-7.3-blame-bridge branch June 19, 2026 16:57
mswilkison added a commit that referenced this pull request Jun 19, 2026
….3 PR2b-2 step 2b) (#4089)

## RFC-21 Phase 7.3 PR2b-2 step 2b — coordinator-equivocation exclusion
wiring

Follows step 2a (#4088). 2a routed the coarse f+1 evidence
(rejects/overflows/conflicts) into the transition bundle; 2b wires the
**other** exclusion path — proof-carrying coordinator equivocation.

### The gap
`verifiedCoordinatorEquivocations` (`next_attempt.go`) scans every
bundle snapshot's `CoordinatorPackageProofs`, authenticates each
(coordinator operator-sig over body + attempt), dedupes by body hash,
and **instant-excludes** a coordinator that signed ≥2 distinct authentic
bodies — no f+1 gate, unforgeable. But nothing populated those proofs in
the live path: the interactive runner's `Round2Collector` retains the
coordinator-signed packages (and is **not** pruned on failure), yet they
never reached the broadcast snapshot.

### The bridge
Reuse 2a's pending-evidence stash, extended to a union `{evidence,
coordinatorProofs}`:

- On an interactive `runner.Run` failure,
`driveInteractiveRoastSigningIfEnabled` surfaces
`collector.CoordinatorPackageProofs(hash)` and **stashes** them by
`(RoastSessionID, member, attemptHash)`.
- `BroadcastForcedSnapshot` **consumes** them and passes them to the
variadic `NewLocalEvidenceSnapshot(member, hash, evidence, proofs...)` —
the seat's **one** signed snapshot carries them (the constructor sorts +
owns them; `VerifyBundle`'s censorship check is over the body already
compared).

### Why publish-always
Publishing the authoritative package on **every** failed interactive
attempt (1 envelope when honest, 2 when the coordinator equivocated to
this seat) is what lets `NextAttempt` aggregate bodies **across
observers** and catch **targeted** equivocation — a coordinator that
sends different packages to different members, where no single observer
sees both. Identical authoritative proofs dedupe to one body → no false
positive. Wire cost is bounded (`MaxCoordinatorPackageProofs = 2`,
failure-path only).

### Safety
- Each writer **upserts only its own field** — coarse and interactive
are mutually exclusive per attempt, but the union never assumes XOR and
never drops the sibling's data.
- Proofs deep-copied on store; lifecycle (consume-on-broadcast,
clear-on-success, session-end, TTL) unchanged.
- **No publish on success** (the runner prunes the collector and the
failure block is skipped).
- A Byzantine observer cannot frame an honest coordinator
(`AuthenticateSigningPackage` requires the coordinator's own signature).
- Default-build no-op stub added (the drive path is `frost_native`).
Stale "deferred cleanup stashes the transition bundle" comments retired.

### Scope / gating
Prod-dormant until the cgo interactive engine is registered (gated on
the `frost-secp256k1-tr` audit). **2c** (fault injection) follows.

### Verification
- build + vet across 5 tag combos (default / `frost_native` /
`frost_roast_retry` / `frost_native frost_roast_retry` / `frost_native
frost_tbtc_signer` w/ CGO)
- tests green incl. an e2e (two observers stash **distinct** coordinator
proofs → bundle → `NextAttempt` instant-excludes the coordinator — the
targeted-split case) + union/deep-copy/consume unit tests; `-race` on
`pkg/frost/signing`; gofmt clean
- design locked via Codex + Gemini consult (both PROCEED, no holes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
mswilkison added a commit that referenced this pull request Jun 19, 2026
… PR2b-2 step 2c) (#4090)

## RFC-21 Phase 7.3 PR2b-2 step 2c — fault-injection tests

Closes Step 2's test coverage (after 2a #4088 and 2b #4089). Two
**integration fault-injection** tests that the 2a/2b e2e tests left to
direct stash-seeding:

### `TestDriveInteractiveRoastSigning_StashesCoordinatorProofsOnFailure`
Exercises the **2b extraction seam end-to-end through the real drive +
runner + collector**: a committed interactive attempt that fails at
aggregation (after the runner records the coordinator package) makes
`driveInteractiveRoastSigningIfEnabled` surface
`collector.CoordinatorPackageProofs` and stash them under the attempt
key. A 1-of-1 attempt retains exactly its own authoritative package, so
one proof is stashed — proving the seam fires through real code, not a
directly-seeded stash.

### `TestRoastTransitionExchange_CombinedFaultsExcludeBoth`
The **2a+2b coexistence capstone**: one transition bundle carrying
**both** f+1 reject evidence (against a bad signer) **and**
coordinator-equivocation proofs (two distinct authentic bodies) —
including snapshots that carry evidence *and* a proof at once — excludes
**both** the rejected member and the equivocating coordinator in a
single `NextAttempt`, leaving a feasible threshold-sized included set.

### Scope
**Test-only; no production change.** The policy layer (categories /
soak: overflow-park / silence / reinstatement / infeasibility /
equivocation) and runner capture
(`RetainsCoordinatorPackageEquivocation`,
`PreservesEvidenceOnAggregateFailure`, …) were already covered, so this
fills only the remaining `capture → stash → bundle → exclusion`
integration gap.

### Verification
- build across all 5 tag combos (default / `frost_native` /
`frost_roast_retry` / `frost_native frost_roast_retry` / `frost_native
frost_tbtc_signer` w/ CGO) — no prod change, trivially green
- new tests pass; `-race` on `pkg/frost/signing`; vet + gofmt clean

🤖 Generated with [Claude Code](https://claude.com/claude-code)
mswilkison added a commit that referenced this pull request Jun 19, 2026
…e (7.3 share-blame) (#4091)

## RFC-21 Phase 7.3 — interactive share-verification blame (the third
fault source)

Follows PR2b-2 (the blame layer: 2a #4088 coarse f+1 evidence, 2b #4089
coordinator-equivocation proofs, 2c #4090 tests). This closes the
**last** blame gap.

### The gap
The blame layer excluded members for coarse faults and coordinator
equivocation, but an **interactive bad-share submitter escaped blame
entirely**. The engine's `InteractiveAggregate` surfaces candidate
culprits in a typed
`InteractiveAggregateShareVerificationError{CandidateCulprits
[]uint16}`, yet the runner just wrapped and returned the error,
**dropping them**. `EngineRound2ShareVerifier` and
`Round2Collector.ClassifyCandidateCulprits` were both complete but
constructed/called **nowhere** — so a member that submitted an invalid
FROST signature share was never excluded and the retry could loop.

### The wiring
On a `runner.Run` share-verification failure,
`driveInteractiveRoastSigningIfEnabled` (where 2b already extracts
proofs):
1. `errors.As` the typed `*InteractiveAggregateShareVerificationError`
(through the runner's `%w` wrap);
2. converts the engine's `uint16` candidates → `MemberIndex`, **dropping
`0` / out-of-range** so a malformed candidate can't truncate into an
honest seat;
3. type-asserts the engine to `Round2ShareVerifyingEngine` (skip if
absent — still stash 2b proofs);
4. builds an `EngineRound2ShareVerifier` bound to the attempt
(`SessionID == active.SessionID()`, `AttemptContextHash`,
`TaprootMerkleRoot`);
5. `ClassifyCandidateCulprits` re-verifies each candidate's **retained**
share (frozen-Q1 boundary: only an **ACCEPTED** retained share that
re-verifies **INVALID** is blamed; every not-the-member's-fault
condition fails closed);
6. stashes the reject accusations in the **same union pending-evidence
entry** as the 2b proofs.

`BroadcastForcedSnapshot` carries them; `computeNextAttempt`'s f+1
reject gate excludes an f+1-corroborated bad-share member.

### Why f+1 (not instant)
A share is self-incriminating only against the package **this observer
accepted** — a byzantine coordinator's targeted split could otherwise
make one honest observer instant-exclude an honest peer. f+1
corroboration (honest observers re-verify identical retained bytes
deterministically) closes that. Only unforgeable
coordinator-equivocation proofs (2b) are instant.

### Safety
Best-effort + fail-safe: a non-share error, a verifier-incapable engine,
malformed candidates, or an empty classification all stash nothing.
`EngineRound2ShareVerifier` already fails closed against false blame
(bound-attempt / root / submitter / package-body checks). Compile-time
assertion that the cgo engine satisfies `Round2ShareVerifyingEngine`.
Stale "evidence/proofs mutually exclusive" comments corrected (one
interactive failure now carries both).

### Scope / gating
Prod-dormant until the cgo interactive engine is registered (gated on
the `frost-secp256k1-tr` audit).

### Verification
- build + vet across 5 tag combos (default / `frost_native` /
`frost_roast_retry` / `frost_native frost_roast_retry` / `frost_native
frost_tbtc_signer` w/ CGO)
- tests: blame-fires (engine-verified-invalid share → reject through the
real drive+runner+collector+verifier), skip-without-verifier,
malformed-candidates-dropped; `-race`; gofmt clean
- design locked via Codex + Gemini consult (both PROCEED, no holes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant