feat(frost): conclude the signing done check on the t-subset (7.3 t-of-included PR3/3) (#4094)
## What
Completes RFC-21 Phase 7.3 t-of-included finalize. With PR2 (#4093), an
interactive signing attempt finalizes over the first `t` responsive
committers of an (optionally) oversized included set. Members outside
that subset, and offline members, never broadcast a signing done check.
The outer `signingDoneCheck` required a confirmation from **every**
attempt member, so it would stall an otherwise successful attempt until
its timeout and force a needless retry.
This concludes the done check on a **deterministic threshold quorum**,
keeping the legacy path byte-for-byte unchanged.
## Design (locked via a Codex + Gemini consult)
My first cut completed on "the first `t` arrivals," which is
**network-order-dependent**: the resulting `latestEndBlock` (a per-node
max over a network-order-dependent subset) can differ across honest
nodes. That value feeds batch scheduling (`signBatch`:
`signingStartBlock = prev endBlock + interlude`), so the batch desyncs.
The consult locked this design:
- **Non-oversized (`included == honestThreshold`; today's selector
output and the whole coarse path)**: the legacy all-members rule is
**UNCHANGED** — wait for every attempt member, require all signatures
equal, return `max(endBlock)`.
- **Oversized (`included > honestThreshold`)**: bucket done checks **by
signature**, conclude once a bucket holds `>= honestThreshold` distinct
senders (the minimum that proves a valid threshold signature).
**Minority buckets (divergent / adversarial signatures) are ignored,
never fatal** — one bad done message can't fracture the group. The end
block is the **deterministic `attemptTimeoutBlock`** (every honest node
computes it identically), not a max over done messages.
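
The two rules above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the `doneCheck` type, the `conclude*` function names, and the map keyed by sender are all hypothetical, and the sketch assumes each sender contributes at most one done check and that messages from non-members were filtered out earlier.

```go
package main

import (
	"errors"
	"fmt"
)

// doneCheck is a hypothetical stand-in for a signing done message.
type doneCheck struct {
	sender    int    // member index of the sender
	signature string // signature the sender reports (serialized)
	endBlock  uint64 // block at which the sender finished
}

var errSignatureMismatch = errors.New("done checks carry different signatures")

// concludeLegacy waits for every attempt member, requires all signatures
// to be equal, and returns the maximum end block.
func concludeLegacy(checks map[int]doneCheck, members []int) (string, uint64, bool, error) {
	var sig string
	var latest uint64
	for i, m := range members {
		c, ok := checks[m]
		if !ok {
			return "", 0, false, nil // still waiting for this member
		}
		if i > 0 && c.signature != sig {
			return "", 0, false, errSignatureMismatch
		}
		sig = c.signature
		if c.endBlock > latest {
			latest = c.endBlock
		}
	}
	return sig, latest, true, nil
}

// concludeOversized buckets done checks by signature and concludes once a
// bucket holds at least honestThreshold distinct senders. Minority buckets
// are ignored. The end block is the attempt timeout block, not a max over
// the received messages. The result is only order-independent if at most
// one bucket can reach the threshold (2t > n).
func concludeOversized(checks map[int]doneCheck, honestThreshold int, attemptTimeoutBlock uint64) (string, uint64, bool) {
	counts := make(map[string]int)
	for _, c := range checks { // one entry per sender, so senders are distinct
		counts[c.signature]++
	}
	for sig, n := range counts {
		if n >= honestThreshold {
			return sig, attemptTimeoutBlock, true
		}
	}
	return "", 0, false
}

// conclude picks the rule based on whether the included set is oversized.
func conclude(checks map[int]doneCheck, members []int, honestThreshold int, attemptTimeoutBlock uint64) (string, uint64, bool, error) {
	if len(members) > honestThreshold {
		sig, end, done := concludeOversized(checks, honestThreshold, attemptTimeoutBlock)
		return sig, end, done, nil
	}
	return concludeLegacy(checks, members)
}

func main() {
	// Four included members, threshold 3: three agree, one diverges.
	checks := map[int]doneCheck{
		1: {sender: 1, signature: "sigA", endBlock: 110},
		2: {sender: 2, signature: "sigA", endBlock: 112},
		3: {sender: 3, signature: "sigA", endBlock: 111},
		4: {sender: 4, signature: "sigB", endBlock: 109},
	}
	sig, end, done, err := conclude(checks, []int{1, 2, 3, 4}, 3, 150)
	fmt.Println(sig, end, done, err) // prints "sigA 150 true <nil>"
}
```

In the oversized case the reporters' own end blocks (110 to 112) play no role; every node that sees the quorum returns the same timeout block regardless of arrival order.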
With honest majority (`t > groupSize/2`) at most one signature bucket
can reach `t` — even under coordinator equivocation (two disjoint
`t`-subsets can't coexist when `2t > n`) — so the quorum is unique. That
means **no body-hash / proto change** is needed (bucket by the existing
signature field), and **no `signBatch` change** (the oversized path
feeds `attemptTimeoutBlock` through the existing return). The branch
that handles more than one quorum is unreachable under the
honest-majority assumption and is intentionally non-fatal, so it cannot
produce a noisy failure after a successful signing.
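
The uniqueness claim can be read as a short counting argument. The sketch below assumes each sender is counted in at most one signature bucket, which the description implies but does not state. Here $B_1$ and $B_2$ are the sender sets of two buckets with different signatures, and $n$ is the group size:

$$
|B_1| \ge t,\quad |B_2| \ge t,\quad B_1 \cap B_2 = \emptyset
\;\Longrightarrow\;
|B_1 \cup B_2| = |B_1| + |B_2| \ge 2t > n
$$

This contradicts $|B_1 \cup B_2| \le n$, so at most one bucket can reach $t$ senders.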
## Inert until oversizing
`included == honestThreshold` today, so the legacy branch is taken and
behavior is identical to before; the quorum path only activates once
selection oversizes the set (MacLane's policy knob).
## Tests (`signing_done_test.go`)
- Legacy `TestSigningDoneCheck{,_MissingConfirmation,_AnotherSignature}`
— unchanged, still green (they use an attempt-member set of size
`honestThreshold`).
- `TestSigningDoneCheck_ThresholdSubsetConcludes` — oversized; reporters
carry different end blocks but the result is the deterministic
`attemptTimeoutBlock`.
- `TestSigningDoneCheck_OversizedIgnoresMinorityDivergentSignature` —
`t` correct + 1 divergent → concludes on the quorum, ignores the
minority (no error).
- `TestSigningDoneCheck_OversizedSplitBelowQuorumTimesOut` — 2+2 split,
no bucket reaches `t` → times out.
Validated: build+vet across default / `frost_roast_retry` /
`frost_native frost_roast_retry` / CGO; the done-check + signing-loop +
roast-transition tests with `-race`; gofmt clean. The
`signingDoneCheckStrategy` interface + `mockSigningDoneCheck` are
untouched (only the constructor signature changed).
🤖 Generated with [Claude Code](https://claude.com/claude-code)