Skip to content

Commit b3ff29a

Browse files
mswilkisonclaude
andcommitted
feat(frost): route coarse ROAST evidence into the transition bundle (7.3 PR2b-2 step 2)
The transition exchange aggregated FORCED-EMPTY proof-of-attendance snapshots, so NextAttempt's f+1 accuser tally never saw the real rejects/overflows/conflicts the coarse receive loop collects -- blame/exclusion could not fire. submitSnapshotIfActive recorded that evidence to the DRIVE handle, which is never aggregated (the exchange is the sole bundle producer, keyed by the OBSERVE handle), so it was a write-only dead end. Bridge the two: a member-keyed pending-evidence stash keyed by (RoastSessionID, member, attemptHash) -- the namespace-independent coordinate shared by the drive and observe worlds -- carries the raw recorder snapshot. submitSnapshotIfActive stashes it (dropping the dead drive-handle RecordEvidence); BroadcastForcedSnapshot consumes it and builds the ONE signed snapshot it broadcasts, so the elected coordinator's AggregateBundle includes the real evidence and computeNextAttempt can establish + exclude an f+1-accused member. Lifecycle mirrors the observe registry: consume on broadcast, clear on local success and session end, TTL sweep as a backstop; the Evidence is deep-copied on store. The stale selector comment is retired (the ROAST selector is already wired under frost_roast_retry). Design locked via Codex + Gemini consult (both PROCEED, no holes). Scoped to the coarse f+1 path; coordinator-equivocation proofs (interactive Round2Collector) are step 2b. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 8c369c3 commit b3ff29a

10 files changed

Lines changed: 666 additions & 397 deletions

pkg/frost/signing/roast_retry_attempt_handle_frost_roast_retry.go

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -161,6 +161,10 @@ func sessionHandleSweepLoop(stop <-chan struct{}) {
161161
// anything past the TTL.
162162
evictStaleObservedAttempts(ObservedAttemptRegistryTTL)
163163
evictStaleRoastTransitions(RoastTransitionRegistryTTL)
164+
// Stashed coarse-path evidence is normally consumed by
165+
// BroadcastForcedSnapshot or cleared on success/session-end (RFC-21
166+
// Phase 7.3 PR2b-2 step 2); sweep any orphaned by an abnormal end.
167+
evictStalePendingEvidence(PendingEvidenceRegistryTTL)
164168
}
165169
}
166170
}
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
//go:build !frost_roast_retry
2+
3+
package signing
4+
5+
import (
6+
"github.com/keep-network/keep-core/pkg/frost/roast/attempt"
7+
"github.com/keep-network/keep-core/pkg/protocol/group"
8+
)
9+
10+
// stashPendingEvidence is a no-op in the default build: with no ROAST-retry
11+
// orchestration there is never a session-handle binding, so submitSnapshotIfActive
12+
// returns before reaching this call. The build-tagged implementation does the real
13+
// stashing the transition exchange's BroadcastForcedSnapshot consumes.
14+
func stashPendingEvidence(
15+
_ string,
16+
_ group.MemberIndex,
17+
_ [attempt.MessageDigestLength]byte,
18+
_ attempt.Evidence,
19+
) {
20+
}
Lines changed: 204 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,204 @@
1+
//go:build frost_roast_retry
2+
3+
package signing
4+
5+
import (
6+
"sync"
7+
"time"
8+
9+
"github.com/keep-network/keep-core/pkg/frost/roast/attempt"
10+
"github.com/keep-network/keep-core/pkg/protocol/group"
11+
)
12+
13+
// PendingEvidenceRegistryTTL is how long a stashed evidence entry is retained
14+
// before the background sweeper evicts it. Matches the observe-binding TTL: a
15+
// stashed snapshot is useless once the session it tracks is archived.
16+
const PendingEvidenceRegistryTTL = SessionHandleBindingTTL
17+
18+
// pendingEvidenceKey scopes one local seat's captured coarse-path evidence to a
19+
// specific attempt of a ROAST session.
20+
//
21+
// RFC-21 Phase 7.3 PR2b-2 step 2 (the blame bridge): the coarse receive loop
22+
// captures real rejects/overflows/conflicts into an EvidenceRecorder and, at
23+
// end-of-collect, stashes the raw snapshot here (submitSnapshotIfActive). The
24+
// transition exchange's BroadcastForcedSnapshot then reads it so the seat's
25+
// broadcast snapshot -- and therefore the elected coordinator's aggregated bundle
26+
// -- carries real evidence instead of a forced-empty proof-of-attendance one, so
27+
// NextAttempt's f+1 accuser tally can finally fire.
28+
//
29+
// The key is (RoastSessionID, member, attemptHash): the same coordinate space the
30+
// observe binding uses (attemptHash == ctx.Hash() is namespace-independent, and
31+
// the stashing ctx.SessionID is the STABLE RoastSessionID == the exchange's
32+
// e.roastSessionID). The member component isolates a multi-seat operator's sibling
33+
// seats: they share a RoastSessionID and attemptHash but each stashes -- and
34+
// broadcasts -- its OWN evidence, so neither overwrites the other.
35+
type pendingEvidenceKey struct {
36+
sessionID string
37+
member group.MemberIndex
38+
attemptHash [attempt.MessageDigestLength]byte
39+
}
40+
41+
type pendingEvidenceEntry struct {
42+
evidence attempt.Evidence
43+
createdAt time.Time
44+
}
45+
46+
var (
47+
pendingEvidenceMu sync.Mutex
48+
pendingEvidenceRegistry = map[pendingEvidenceKey]pendingEvidenceEntry{}
49+
)
50+
51+
// stashPendingEvidence stores a deep COPY of the captured evidence for
52+
// (sessionID, member, attemptHash). A later call for the same key overwrites the
53+
// earlier one (a re-driven attempt re-stashes). The deep copy means a later
54+
// mutation of the caller's recorder snapshot cannot race the exchange's read.
55+
func stashPendingEvidence(
56+
sessionID string,
57+
member group.MemberIndex,
58+
attemptHash [attempt.MessageDigestLength]byte,
59+
evidence attempt.Evidence,
60+
) {
61+
key := pendingEvidenceKey{sessionID, member, attemptHash}
62+
pendingEvidenceMu.Lock()
63+
defer pendingEvidenceMu.Unlock()
64+
pendingEvidenceRegistry[key] = pendingEvidenceEntry{
65+
evidence: copyEvidence(evidence),
66+
createdAt: time.Now(),
67+
}
68+
}
69+
70+
// takePendingEvidence returns the stashed evidence for (sessionID, member,
71+
// attemptHash) and a presence flag, REMOVING it (consume-on-read). The returned
72+
// value is the sole reference -- the entry is deleted from the registry -- so the
73+
// caller owns it exclusively without a further copy. BroadcastForcedSnapshot calls
74+
// it: a present entry means the coarse receive loop observed real evidence for the
75+
// attempt; absent means none was captured (the broadcast then carries an empty
76+
// proof-of-attendance snapshot).
77+
func takePendingEvidence(
78+
sessionID string,
79+
member group.MemberIndex,
80+
attemptHash [attempt.MessageDigestLength]byte,
81+
) (attempt.Evidence, bool) {
82+
key := pendingEvidenceKey{sessionID, member, attemptHash}
83+
pendingEvidenceMu.Lock()
84+
defer pendingEvidenceMu.Unlock()
85+
entry, ok := pendingEvidenceRegistry[key]
86+
if !ok {
87+
return attempt.Evidence{}, false
88+
}
89+
delete(pendingEvidenceRegistry, key)
90+
return entry.evidence, true
91+
}
92+
93+
// clearPendingEvidence removes any stashed evidence for (sessionID, member,
94+
// attemptHash).
95+
func clearPendingEvidence(
96+
sessionID string,
97+
member group.MemberIndex,
98+
attemptHash [attempt.MessageDigestLength]byte,
99+
) {
100+
key := pendingEvidenceKey{sessionID, member, attemptHash}
101+
pendingEvidenceMu.Lock()
102+
defer pendingEvidenceMu.Unlock()
103+
delete(pendingEvidenceRegistry, key)
104+
}
105+
106+
// ClearPendingEvidenceOnLocalSuccess removes the stashed evidence for an attempt
107+
// this seat completed successfully. A succeeded attempt produces no transition
108+
// bundle, so BroadcastForcedSnapshot never consumes the stash; clearing here
109+
// prevents a leak until the TTL backstop. Exported so the pkg/tbtc transition
110+
// controller can call it alongside ClearObservedAttemptOnLocalSuccess (RFC-21
111+
// Phase 7.3 PR2b-2).
112+
func ClearPendingEvidenceOnLocalSuccess(
113+
sessionID string,
114+
member group.MemberIndex,
115+
attemptHash [attempt.MessageDigestLength]byte,
116+
) {
117+
clearPendingEvidence(sessionID, member, attemptHash)
118+
}
119+
120+
// clearPendingEvidenceForSession removes every stashed evidence entry for
121+
// (sessionID, member), regardless of attempt hash. The transition exchange calls
122+
// it when the session ends (its listener context is done), mirroring
123+
// clearObservedAttemptsForSession, so a signing whose attempts succeeded -- and
124+
// therefore never consumed the stash via BroadcastForcedSnapshot -- does not leave
125+
// entries behind.
126+
func clearPendingEvidenceForSession(sessionID string, member group.MemberIndex) {
127+
pendingEvidenceMu.Lock()
128+
defer pendingEvidenceMu.Unlock()
129+
for key := range pendingEvidenceRegistry {
130+
if key.sessionID == sessionID && key.member == member {
131+
delete(pendingEvidenceRegistry, key)
132+
}
133+
}
134+
}
135+
136+
// evictStalePendingEvidence sweeps the registry and removes entries older than
137+
// maxAge. Exposed at the package level so tests can invoke it directly with small
138+
// maxAge values and so the shared session-handle sweeper can fold it into one
139+
// background goroutine: a session that ends abnormally must not orphan a stash
140+
// entry past its backstop TTL.
141+
func evictStalePendingEvidence(maxAge time.Duration) int {
142+
cutoff := time.Now().Add(-maxAge)
143+
pendingEvidenceMu.Lock()
144+
defer pendingEvidenceMu.Unlock()
145+
evicted := 0
146+
for key, entry := range pendingEvidenceRegistry {
147+
if entry.createdAt.Before(cutoff) {
148+
delete(pendingEvidenceRegistry, key)
149+
evicted++
150+
}
151+
}
152+
return evicted
153+
}
154+
155+
// ResetPendingEvidenceRegistryForTest clears every stashed entry. Test-only seam;
156+
// not for production code paths.
157+
func ResetPendingEvidenceRegistryForTest() {
158+
pendingEvidenceMu.Lock()
159+
defer pendingEvidenceMu.Unlock()
160+
pendingEvidenceRegistry = map[pendingEvidenceKey]pendingEvidenceEntry{}
161+
}
162+
163+
// PendingEvidenceStashedForTest reports whether any stash entry exists for
164+
// (sessionID, member), regardless of attempt hash. Exported test seam so
165+
// downstream-package tests can assert the stash without reaching into the
166+
// unexported registry.
167+
func PendingEvidenceStashedForTest(sessionID string, member group.MemberIndex) bool {
168+
pendingEvidenceMu.Lock()
169+
defer pendingEvidenceMu.Unlock()
170+
for key := range pendingEvidenceRegistry {
171+
if key.sessionID == sessionID && key.member == member {
172+
return true
173+
}
174+
}
175+
return false
176+
}
177+
178+
// copyEvidence returns a deep copy of an attempt.Evidence: the three maps are
179+
// re-allocated and the per-sender RejectEntry slices are cloned. RejectEntry is a
180+
// flat value type (Reason string, Count uint), so a slice copy is a full deep
181+
// copy. nil maps stay nil (a nil category means "did not fire", which
182+
// NewLocalEvidenceSnapshot and NextAttempt both treat as empty).
183+
func copyEvidence(e attempt.Evidence) attempt.Evidence {
184+
out := attempt.Evidence{}
185+
if e.Overflows != nil {
186+
out.Overflows = make(map[group.MemberIndex]uint, len(e.Overflows))
187+
for k, v := range e.Overflows {
188+
out.Overflows[k] = v
189+
}
190+
}
191+
if e.Rejects != nil {
192+
out.Rejects = make(map[group.MemberIndex][]attempt.RejectEntry, len(e.Rejects))
193+
for k, v := range e.Rejects {
194+
out.Rejects[k] = append([]attempt.RejectEntry(nil), v...)
195+
}
196+
}
197+
if e.Conflicts != nil {
198+
out.Conflicts = make(map[group.MemberIndex]uint, len(e.Conflicts))
199+
for k, v := range e.Conflicts {
200+
out.Conflicts[k] = v
201+
}
202+
}
203+
return out
204+
}

0 commit comments

Comments
 (0)