Skip to content

Commit c3aa36c

Browse files
authored
feat(frost): route coarse ROAST evidence into the transition bundle (7.3 PR2b-2 step 2) (#4088)
## RFC-21 Phase 7.3 PR2b-2 step 2 — the blame bridge Makes ROAST exclusion/blame actually fire by routing the coarse receive loop's real evidence into the bundle the elected coordinator aggregates. ### The gap Per attempt a ROAST seat mints two coordinator attempt records on the same per-seat coordinator: an **OBSERVE** handle (what `RoastTransitionExchange` aggregates) and a **DRIVE** handle (where `submitSnapshotIfActive` recorded the receive loop's real rejects/overflows/conflicts). The drive handle is **never aggregated** — the exchange is the sole bundle producer — so that evidence was a write-only dead end, and the aggregated bundle carried only forced-EMPTY proof-of-attendance snapshots. `computeNextAttempt`'s f+1 accuser tally therefore never excluded anyone. ### The bridge A member-keyed pending-evidence stash keyed by `(RoastSessionID, member, attemptHash)` — the namespace-independent coordinate shared by the drive and observe worlds (both derive from `ctx.SessionID` / `ctx.Hash()`) — carries the raw recorder snapshot: - `submitSnapshotIfActive` **stashes** the evidence (dropping the dead drive-handle `RecordEvidence`). - `BroadcastForcedSnapshot` **consumes** the stash and builds the ONE signed snapshot it broadcasts, so the elected coordinator's `AggregateBundle` includes the real evidence and `computeNextAttempt` can establish + exclude an f+1-accused member. - Signing stays in one place (the exchange), so `VerifyBundle`'s censorship check (which compares the receiver's own `OperatorSignature`) is unaffected. Lifecycle mirrors the observe registry: consume on broadcast, clear on local success + session end, TTL sweep backstop; `Evidence` is deep-copied on store. ### Scope Coarse f+1 reject/overflow/conflict path only. Coordinator-equivocation proofs (`CoordinatorPackageProofs`, sourced from the interactive `Round2Collector`) are **step 2b**. The ROAST selector is already live under `frost_roast_retry`; the stale "once AggregateBundle production is wired upstream" comment is retired. ### Verification - build + vet across 5 tag combos (default / `frost_native` / `frost_roast_retry` / `frost_native frost_roast_retry` / `frost_native frost_tbtc_signer` w/ CGO) - tests green incl. an e2e (stashed evidence → bundle → `NextAttempt` excludes the f+1-accused member), `-race` on `pkg/frost/signing`, gofmt clean - design locked via Codex + Gemini consult (both PROCEED, no holes) 🤖 Generated with [Claude Code](https://claude.com/claude-code)
2 parents 8c369c3 + b3ff29a commit c3aa36c

10 files changed

Lines changed: 666 additions & 397 deletions

pkg/frost/signing/roast_retry_attempt_handle_frost_roast_retry.go

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -161,6 +161,10 @@ func sessionHandleSweepLoop(stop <-chan struct{}) {
161161
// anything past the TTL.
162162
evictStaleObservedAttempts(ObservedAttemptRegistryTTL)
163163
evictStaleRoastTransitions(RoastTransitionRegistryTTL)
164+
// Stashed coarse-path evidence is normally consumed by
165+
// BroadcastForcedSnapshot or cleared on success/session-end (RFC-21
166+
// Phase 7.3 PR2b-2 step 2); sweep any orphaned by an abnormal end.
167+
evictStalePendingEvidence(PendingEvidenceRegistryTTL)
164168
}
165169
}
166170
}
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
//go:build !frost_roast_retry
2+
3+
package signing
4+
5+
import (
6+
"github.com/keep-network/keep-core/pkg/frost/roast/attempt"
7+
"github.com/keep-network/keep-core/pkg/protocol/group"
8+
)
9+
10+
// stashPendingEvidence is a no-op in the default build: with no ROAST-retry
11+
// orchestration there is never a session-handle binding, so submitSnapshotIfActive
12+
// returns before reaching this call. The build-tagged implementation does the real
13+
// stashing the transition exchange's BroadcastForcedSnapshot consumes.
14+
func stashPendingEvidence(
15+
_ string,
16+
_ group.MemberIndex,
17+
_ [attempt.MessageDigestLength]byte,
18+
_ attempt.Evidence,
19+
) {
20+
}
Lines changed: 204 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,204 @@
1+
//go:build frost_roast_retry
2+
3+
package signing
4+
5+
import (
6+
"sync"
7+
"time"
8+
9+
"github.com/keep-network/keep-core/pkg/frost/roast/attempt"
10+
"github.com/keep-network/keep-core/pkg/protocol/group"
11+
)
12+
13+
// PendingEvidenceRegistryTTL is how long a stashed evidence entry is retained
14+
// before the background sweeper evicts it. Matches the observe-binding TTL: a
15+
// stashed snapshot is useless once the session it tracks is archived.
16+
const PendingEvidenceRegistryTTL = SessionHandleBindingTTL
17+
18+
// pendingEvidenceKey scopes one local seat's captured coarse-path evidence to a
19+
// specific attempt of a ROAST session.
20+
//
21+
// RFC-21 Phase 7.3 PR2b-2 step 2 (the blame bridge): the coarse receive loop
22+
// captures real rejects/overflows/conflicts into an EvidenceRecorder and, at
23+
// end-of-collect, stashes the raw snapshot here (submitSnapshotIfActive). The
24+
// transition exchange's BroadcastForcedSnapshot then reads it so the seat's
25+
// broadcast snapshot -- and therefore the elected coordinator's aggregated bundle
26+
// -- carries real evidence instead of a forced-empty proof-of-attendance one, so
27+
// NextAttempt's f+1 accuser tally can finally fire.
28+
//
29+
// The key is (RoastSessionID, member, attemptHash): the same coordinate space the
30+
// observe binding uses (attemptHash == ctx.Hash() is namespace-independent, and
31+
// the stashing ctx.SessionID is the STABLE RoastSessionID == the exchange's
32+
// e.roastSessionID). The member component isolates a multi-seat operator's sibling
33+
// seats: they share a RoastSessionID and attemptHash but each stashes -- and
34+
// broadcasts -- its OWN evidence, so neither overwrites the other.
35+
type pendingEvidenceKey struct {
36+
sessionID string
37+
member group.MemberIndex
38+
attemptHash [attempt.MessageDigestLength]byte
39+
}
40+
41+
type pendingEvidenceEntry struct {
42+
evidence attempt.Evidence
43+
createdAt time.Time
44+
}
45+
46+
var (
47+
pendingEvidenceMu sync.Mutex
48+
pendingEvidenceRegistry = map[pendingEvidenceKey]pendingEvidenceEntry{}
49+
)
50+
51+
// stashPendingEvidence stores a deep COPY of the captured evidence for
52+
// (sessionID, member, attemptHash). A later call for the same key overwrites the
53+
// earlier one (a re-driven attempt re-stashes). The deep copy means a later
54+
// mutation of the caller's recorder snapshot cannot race the exchange's read.
55+
func stashPendingEvidence(
56+
sessionID string,
57+
member group.MemberIndex,
58+
attemptHash [attempt.MessageDigestLength]byte,
59+
evidence attempt.Evidence,
60+
) {
61+
key := pendingEvidenceKey{sessionID, member, attemptHash}
62+
pendingEvidenceMu.Lock()
63+
defer pendingEvidenceMu.Unlock()
64+
pendingEvidenceRegistry[key] = pendingEvidenceEntry{
65+
evidence: copyEvidence(evidence),
66+
createdAt: time.Now(),
67+
}
68+
}
69+
70+
// takePendingEvidence returns the stashed evidence for (sessionID, member,
71+
// attemptHash) and a presence flag, REMOVING it (consume-on-read). The returned
72+
// value is the sole reference -- the entry is deleted from the registry -- so the
73+
// caller owns it exclusively without a further copy. BroadcastForcedSnapshot calls
74+
// it: a present entry means the coarse receive loop observed real evidence for the
75+
// attempt; absent means none was captured (the broadcast then carries an empty
76+
// proof-of-attendance snapshot).
77+
func takePendingEvidence(
78+
sessionID string,
79+
member group.MemberIndex,
80+
attemptHash [attempt.MessageDigestLength]byte,
81+
) (attempt.Evidence, bool) {
82+
key := pendingEvidenceKey{sessionID, member, attemptHash}
83+
pendingEvidenceMu.Lock()
84+
defer pendingEvidenceMu.Unlock()
85+
entry, ok := pendingEvidenceRegistry[key]
86+
if !ok {
87+
return attempt.Evidence{}, false
88+
}
89+
delete(pendingEvidenceRegistry, key)
90+
return entry.evidence, true
91+
}
92+
93+
// clearPendingEvidence removes any stashed evidence for (sessionID, member,
94+
// attemptHash).
95+
func clearPendingEvidence(
96+
sessionID string,
97+
member group.MemberIndex,
98+
attemptHash [attempt.MessageDigestLength]byte,
99+
) {
100+
key := pendingEvidenceKey{sessionID, member, attemptHash}
101+
pendingEvidenceMu.Lock()
102+
defer pendingEvidenceMu.Unlock()
103+
delete(pendingEvidenceRegistry, key)
104+
}
105+
106+
// ClearPendingEvidenceOnLocalSuccess removes the stashed evidence for an attempt
107+
// this seat completed successfully. A succeeded attempt produces no transition
108+
// bundle, so BroadcastForcedSnapshot never consumes the stash; clearing here
109+
// prevents a leak until the TTL backstop. Exported so the pkg/tbtc transition
110+
// controller can call it alongside ClearObservedAttemptOnLocalSuccess (RFC-21
111+
// Phase 7.3 PR2b-2).
112+
func ClearPendingEvidenceOnLocalSuccess(
113+
sessionID string,
114+
member group.MemberIndex,
115+
attemptHash [attempt.MessageDigestLength]byte,
116+
) {
117+
clearPendingEvidence(sessionID, member, attemptHash)
118+
}
119+
120+
// clearPendingEvidenceForSession removes every stashed evidence entry for
121+
// (sessionID, member), regardless of attempt hash. The transition exchange calls
122+
// it when the session ends (its listener context is done), mirroring
123+
// clearObservedAttemptsForSession, so a signing whose attempts succeeded -- and
124+
// therefore never consumed the stash via BroadcastForcedSnapshot -- does not leave
125+
// entries behind.
126+
func clearPendingEvidenceForSession(sessionID string, member group.MemberIndex) {
127+
pendingEvidenceMu.Lock()
128+
defer pendingEvidenceMu.Unlock()
129+
for key := range pendingEvidenceRegistry {
130+
if key.sessionID == sessionID && key.member == member {
131+
delete(pendingEvidenceRegistry, key)
132+
}
133+
}
134+
}
135+
136+
// evictStalePendingEvidence sweeps the registry and removes entries older than
137+
// maxAge. Exposed at the package level so tests can invoke it directly with small
138+
// maxAge values and so the shared session-handle sweeper can fold it into one
139+
// background goroutine: a session that ends abnormally must not orphan a stash
140+
// entry past its backstop TTL.
141+
func evictStalePendingEvidence(maxAge time.Duration) int {
142+
cutoff := time.Now().Add(-maxAge)
143+
pendingEvidenceMu.Lock()
144+
defer pendingEvidenceMu.Unlock()
145+
evicted := 0
146+
for key, entry := range pendingEvidenceRegistry {
147+
if entry.createdAt.Before(cutoff) {
148+
delete(pendingEvidenceRegistry, key)
149+
evicted++
150+
}
151+
}
152+
return evicted
153+
}
154+
155+
// ResetPendingEvidenceRegistryForTest clears every stashed entry. Test-only seam;
156+
// not for production code paths.
157+
func ResetPendingEvidenceRegistryForTest() {
158+
pendingEvidenceMu.Lock()
159+
defer pendingEvidenceMu.Unlock()
160+
pendingEvidenceRegistry = map[pendingEvidenceKey]pendingEvidenceEntry{}
161+
}
162+
163+
// PendingEvidenceStashedForTest reports whether any stash entry exists for
164+
// (sessionID, member), regardless of attempt hash. Exported test seam so
165+
// downstream-package tests can assert the stash without reaching into the
166+
// unexported registry.
167+
func PendingEvidenceStashedForTest(sessionID string, member group.MemberIndex) bool {
168+
pendingEvidenceMu.Lock()
169+
defer pendingEvidenceMu.Unlock()
170+
for key := range pendingEvidenceRegistry {
171+
if key.sessionID == sessionID && key.member == member {
172+
return true
173+
}
174+
}
175+
return false
176+
}
177+
178+
// copyEvidence returns a deep copy of an attempt.Evidence: the three maps are
179+
// re-allocated and the per-sender RejectEntry slices are cloned. RejectEntry is a
180+
// flat value type (Reason string, Count uint), so a slice copy is a full deep
181+
// copy. nil maps stay nil (a nil category means "did not fire", which
182+
// NewLocalEvidenceSnapshot and NextAttempt both treat as empty).
183+
func copyEvidence(e attempt.Evidence) attempt.Evidence {
184+
out := attempt.Evidence{}
185+
if e.Overflows != nil {
186+
out.Overflows = make(map[group.MemberIndex]uint, len(e.Overflows))
187+
for k, v := range e.Overflows {
188+
out.Overflows[k] = v
189+
}
190+
}
191+
if e.Rejects != nil {
192+
out.Rejects = make(map[group.MemberIndex][]attempt.RejectEntry, len(e.Rejects))
193+
for k, v := range e.Rejects {
194+
out.Rejects[k] = append([]attempt.RejectEntry(nil), v...)
195+
}
196+
}
197+
if e.Conflicts != nil {
198+
out.Conflicts = make(map[group.MemberIndex]uint, len(e.Conflicts))
199+
for k, v := range e.Conflicts {
200+
out.Conflicts[k] = v
201+
}
202+
}
203+
return out
204+
}

0 commit comments

Comments
 (0)