Commit 9f4590e

feat(frost/signing): RFC-21 Phase 7.1 -- AggregateBundle + bundle registry (#3985)
## Summary

First Phase-7 PR. Wires `AggregateBundle` production into the orchestration cleanup path so the elected coordinator's node automatically produces a `TransitionMessage` at the end of each attempt that is still `Collecting`. The bundle is stashed in a per-session registry that Phase 7.2's ROAST-driven `signingParticipantSelector` reads to compute the next attempt's `IncludedSet`.

Stacked on #3984 (Phase 6.4).

## What lands

| File | Build tag | Role |
|---|---|---|
| `roast_retry_bundle_registry_default_build.go` | `!frost_roast_retry` | Permanent no-op stubs. Default-build selector always falls back to legacy. |
| `roast_retry_bundle_registry_frost_roast_retry.go` | `frost_roast_retry` | Real mutex-protected map. TTL matches `SessionHandleBindingTTL` (2h). Later `Record` calls overwrite (latest transition wins). |
| `roast_retry_orchestration.go` (extended) | untagged | New `maybeProduceTransitionBundle` helper called from cleanup. |

## How cleanup produces a bundle

After `BeginOrchestrationForSession` returns, the deferred cleanup fires at session end. It:

1. Verifies the local node is the elected coordinator (skip if not).
2. Checks the attempt is still `Collecting` (skip if already transitioned -- e.g. signature succeeded, no bundle needed).
3. Calls `Coordinator.AggregateBundle`.
4. Stashes the result via `RecordTransitionBundleForSession` (no-op stub in default build).

**Failures along the path are silent.** Cleanup must never panic and must never propagate errors into the signing flow's defer chain. A missing bundle just means the next attempt's selector falls back to legacy.

## Test coverage

| File | Build | Cases |
|---|---|---|
| `roast_retry_bundle_registry_test.go` | `!frost_roast_retry` | 1 (default stub is observable no-op) |
| `roast_retry_bundle_registry_frost_roast_retry_test.go` | `frost_roast_retry` | 6 (round-trip, latest-wins, clear, nil-discard, TTL eviction, TTL matches session-handle TTL) |
| `roast_retry_orchestration_bundle_test.go` | `frost_roast_retry` | 3 (elected coordinator records, non-elected does not, double-cleanup is safe) |

## Verification

| Command | Result |
|---|---|
| `go build ./...` | clean |
| `go test ./pkg/frost/...` | pass (5 packages) |
| `go test -tags 'frost_roast_retry' ./pkg/frost/signing/...` | pass |
| `staticcheck -checks '-SA1019' ./pkg/frost/...` | silent |
| `gofmt -l ./pkg/frost/signing/` | silent |

## Phase 7 plan

| PR | Scope | State |
|---|---|---|
| **7.1 (this)** | **AggregateBundle + bundle registry** | **open** |
| 7.2 | ROAST-driven signingParticipantSelector (consumes registry) | next |
| 7.3+ | Readiness manifest entry + integration testnet evidence + manifest flip | post-7.2 |

## Test plan

- [ ] CI green.
- [ ] Reviewer confirms the silent-error discipline in cleanup is appropriate (alternative: log at WARN level).
- [ ] Reviewer confirms the TTL = session-handle TTL alignment is intentional (alternative: longer-lived bundles).

2 parents 6472a40 + 86a5446 commit 9f4590e

6 files changed

Lines changed: 554 additions & 0 deletions
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+//go:build !frost_roast_retry
+
+package signing
+
+import "github.com/keep-network/keep-core/pkg/frost/roast"
+
+// RecordTransitionBundleForSession is a no-op in the default build:
+// the per-session bundle registry is not active without the
+// frost_roast_retry tag. The signing-loop ROAST selector (when
+// installed via Phase 7's build) reads this registry to consume
+// the most recent TransitionMessage for a message.
+func RecordTransitionBundleForSession(_ string, _ *roast.TransitionMessage) {}
+
+// TransitionBundleForSession returns (nil, false) in the default
+// build, signalling to callers that no ROAST bundle is available
+// and the legacy retry shuffle should be used.
+func TransitionBundleForSession(_ string) (*roast.TransitionMessage, bool) {
+	return nil, false
+}
+
+// ClearTransitionBundleForSession is a no-op in the default build.
+func ClearTransitionBundleForSession(_ string) {}
+
+// ResetTransitionBundleRegistryForTest is a no-op in the default
+// build.
+func ResetTransitionBundleRegistryForTest() {}
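
Reviewer note: the stub's `(nil, false)` return is what forces the legacy path in the default build. The following is a minimal sketch of how a caller might branch on that presence flag. The Phase 7.2 selector is not part of this commit, so `selectRetryStrategy`, `bundleLookup`, and the local `TransitionMessage` type are hypothetical stand-ins for illustration only.

```go
package main

import "fmt"

// TransitionMessage is a hypothetical stand-in for roast.TransitionMessage;
// the field type is an assumption made for this sketch.
type TransitionMessage struct {
	CoordinatorIDValue uint32
}

// bundleLookup mirrors the shape of TransitionBundleForSession so the
// sketch can swap in either the default-build stub or a populated registry.
type bundleLookup func(sessionID string) (*TransitionMessage, bool)

// selectRetryStrategy (hypothetical) returns "roast" only when a non-nil
// bundle is present, and "legacy" in every other case.
func selectRetryStrategy(lookup bundleLookup, sessionID string) string {
	bundle, ok := lookup(sessionID)
	if !ok || bundle == nil {
		return "legacy"
	}
	return "roast"
}

func main() {
	// Behaves like the default-build stub: never reports a bundle.
	stub := func(string) (*TransitionMessage, bool) { return nil, false }
	// Behaves like a registry that holds a bundle for every session.
	populated := func(string) (*TransitionMessage, bool) {
		return &TransitionMessage{CoordinatorIDValue: 7}, true
	}

	fmt.Println(selectRetryStrategy(stub, "session-A"))
	fmt.Println(selectRetryStrategy(populated, "session-A"))
}
```

Treating a `nil` bundle with `ok == true` as "legacy" is a defensive choice in this sketch; the tagged registry already discards `nil` bundles on record.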
Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
+//go:build frost_roast_retry
+
+package signing
+
+import (
+	"sync"
+	"time"
+
+	"github.com/keep-network/keep-core/pkg/frost/roast"
+)
+
+// TransitionBundleRegistryTTL is how long a session's most recent
+// TransitionMessage is retained before the background sweeper
+// evicts it. Matches the session-handle TTL: a bundle's usefulness
+// to retry-driven participant selection expires when the session
+// it describes is itself archived.
+const TransitionBundleRegistryTTL = SessionHandleBindingTTL
+
+// sessionBundleEntry pairs a TransitionMessage with the wall-clock
+// time at which it was recorded so the sweeper can evict stale
+// entries.
+type sessionBundleEntry struct {
+	bundle    *roast.TransitionMessage
+	createdAt time.Time
+}
+
+var (
+	sessionBundleRegistryMu sync.RWMutex
+	sessionBundleRegistry   = map[string]sessionBundleEntry{}
+)
+
+// RecordTransitionBundleForSession stores the most recent
+// TransitionMessage produced by the elected coordinator for the
+// named session. The bundle is later consumed by the ROAST-driven
+// signingParticipantSelector to compute the next attempt's
+// IncludedSet via EvaluateRoastRetryForSigning.
+//
+// A later call for the same session overwrites the earlier bundle
+// -- the registry tracks only the most recent transition.
+func RecordTransitionBundleForSession(
+	sessionID string,
+	bundle *roast.TransitionMessage,
+) {
+	if bundle == nil {
+		return
+	}
+	sessionBundleRegistryMu.Lock()
+	defer sessionBundleRegistryMu.Unlock()
+	sessionBundleRegistry[sessionID] = sessionBundleEntry{
+		bundle:    bundle,
+		createdAt: time.Now(),
+	}
+}
+
+// TransitionBundleForSession returns the most recent transition
+// message for the named session, plus a presence flag. Callers
+// (the ROAST selector) treat (nil, false) as "no bundle; fall back
+// to legacy".
+func TransitionBundleForSession(
+	sessionID string,
+) (*roast.TransitionMessage, bool) {
+	sessionBundleRegistryMu.RLock()
+	defer sessionBundleRegistryMu.RUnlock()
+	entry, ok := sessionBundleRegistry[sessionID]
+	if !ok {
+		return nil, false
+	}
+	return entry.bundle, true
+}
+
+// ClearTransitionBundleForSession removes any bundle for the named
+// session. Called when a session terminates.
+func ClearTransitionBundleForSession(sessionID string) {
+	sessionBundleRegistryMu.Lock()
+	defer sessionBundleRegistryMu.Unlock()
+	delete(sessionBundleRegistry, sessionID)
+}
+
+// ResetTransitionBundleRegistryForTest clears every bundle. Test-
+// only seam.
+func ResetTransitionBundleRegistryForTest() {
+	sessionBundleRegistryMu.Lock()
+	defer sessionBundleRegistryMu.Unlock()
+	sessionBundleRegistry = map[string]sessionBundleEntry{}
+}
+
+// evictStaleTransitionBundles sweeps the registry and removes
+// entries older than maxAge. Exposed at the package level so
+// tests can invoke it directly with small maxAge values. The
+// production sweeper invokes it from sessionHandleSweepLoop
+// (Phase 5.2) so the bundle and handle registries share a single
+// background goroutine.
+func evictStaleTransitionBundles(maxAge time.Duration) int {
+	cutoff := time.Now().Add(-maxAge)
+	sessionBundleRegistryMu.Lock()
+	defer sessionBundleRegistryMu.Unlock()
+	evicted := 0
+	for sessionID, entry := range sessionBundleRegistry {
+		if entry.createdAt.Before(cutoff) {
+			delete(sessionBundleRegistry, sessionID)
+			evicted++
+		}
+	}
+	return evicted
+}
Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
+//go:build frost_roast_retry
+
+package signing
+
+import (
+	"testing"
+	"time"
+
+	"github.com/keep-network/keep-core/pkg/frost/roast"
+)
+
+func TestTransitionBundleRegistry_RoundTrip(t *testing.T) {
+	ResetTransitionBundleRegistryForTest()
+	t.Cleanup(ResetTransitionBundleRegistryForTest)
+
+	bundle := &roast.TransitionMessage{
+		CoordinatorIDValue: 7,
+	}
+	RecordTransitionBundleForSession("session-A", bundle)
+
+	got, ok := TransitionBundleForSession("session-A")
+	if !ok {
+		t.Fatal("expected bundle to be present after Record")
+	}
+	if got.CoordinatorIDValue != 7 {
+		t.Fatalf(
+			"bundle round-trip mismatch: got coordinator %d, want 7",
+			got.CoordinatorIDValue,
+		)
+	}
+}
+
+func TestTransitionBundleRegistry_LaterRecordOverwrites(t *testing.T) {
+	ResetTransitionBundleRegistryForTest()
+	t.Cleanup(ResetTransitionBundleRegistryForTest)
+
+	RecordTransitionBundleForSession("session-B", &roast.TransitionMessage{CoordinatorIDValue: 1})
+	RecordTransitionBundleForSession("session-B", &roast.TransitionMessage{CoordinatorIDValue: 2})
+	got, ok := TransitionBundleForSession("session-B")
+	if !ok {
+		t.Fatal("expected bundle to be present")
+	}
+	if got.CoordinatorIDValue != 2 {
+		t.Fatalf(
+			"later Record must overwrite earlier: got %d, want 2",
+			got.CoordinatorIDValue,
+		)
+	}
+}
+
+func TestTransitionBundleRegistry_ClearRemovesBundle(t *testing.T) {
+	ResetTransitionBundleRegistryForTest()
+	t.Cleanup(ResetTransitionBundleRegistryForTest)
+
+	RecordTransitionBundleForSession("session-clear", &roast.TransitionMessage{})
+	if _, ok := TransitionBundleForSession("session-clear"); !ok {
+		t.Fatal("setup: bundle must exist")
+	}
+	ClearTransitionBundleForSession("session-clear")
+	if _, ok := TransitionBundleForSession("session-clear"); ok {
+		t.Fatal("bundle must be removed after Clear")
+	}
+}
+
+func TestTransitionBundleRegistry_NilBundleIsIgnored(t *testing.T) {
+	ResetTransitionBundleRegistryForTest()
+	t.Cleanup(ResetTransitionBundleRegistryForTest)
+
+	RecordTransitionBundleForSession("session-nil", nil)
+	if _, ok := TransitionBundleForSession("session-nil"); ok {
+		t.Fatal("nil bundle must be discarded")
+	}
+}
+
+func TestEvictStaleTransitionBundles_RemovesOldEntries(t *testing.T) {
+	ResetTransitionBundleRegistryForTest()
+	t.Cleanup(ResetTransitionBundleRegistryForTest)
+
+	RecordTransitionBundleForSession("session-old", &roast.TransitionMessage{CoordinatorIDValue: 1})
+	// Backdate.
+	sessionBundleRegistryMu.Lock()
+	entry := sessionBundleRegistry["session-old"]
+	entry.createdAt = time.Now().Add(-10 * time.Minute)
+	sessionBundleRegistry["session-old"] = entry
+	sessionBundleRegistryMu.Unlock()
+
+	RecordTransitionBundleForSession("session-new", &roast.TransitionMessage{CoordinatorIDValue: 2})
+
+	evicted := evictStaleTransitionBundles(5 * time.Minute)
+	if evicted != 1 {
+		t.Fatalf("expected 1 eviction, got %d", evicted)
+	}
+	if _, ok := TransitionBundleForSession("session-old"); ok {
+		t.Fatal("old bundle must be evicted")
+	}
+	if _, ok := TransitionBundleForSession("session-new"); !ok {
+		t.Fatal("new bundle must survive")
+	}
+}
+
+func TestTransitionBundleRegistryTTL_MatchesSessionHandleTTL(t *testing.T) {
+	if TransitionBundleRegistryTTL != SessionHandleBindingTTL {
+		t.Fatalf(
+			"bundle TTL %s != session-handle TTL %s; bundles must not outlive sessions",
+			TransitionBundleRegistryTTL,
+			SessionHandleBindingTTL,
+		)
+	}
+}
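
Reviewer note: the six cases above exercise the registry from a single goroutine, while the PR describes it as a mutex-protected map. The following is a minimal sketch of a concurrent exercise that could be run under the race detector (for example with `go run -race`). It uses a toy `registry` type rather than the package-level map, and `hammer`, `record`, and `lookup` are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a toy stand-in for the package-level bundle map: a plain
// map guarded by a sync.RWMutex, as in the tagged registry file.
type registry struct {
	mu      sync.RWMutex
	entries map[string]int
}

func newRegistry() *registry {
	return &registry{entries: map[string]int{}}
}

// record stores v under id while holding the write lock.
func (r *registry) record(id string, v int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.entries[id] = v
}

// lookup reads the entry for id while holding the read lock.
func (r *registry) lookup(id string) (int, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	v, ok := r.entries[id]
	return v, ok
}

// hammer starts one goroutine per writer; each records and then reads
// its own session. It returns the number of entries after all finish.
func hammer(r *registry, writers int) int {
	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			id := fmt.Sprintf("session-%d", n)
			r.record(id, n)
			r.lookup(id)
		}(i)
	}
	wg.Wait()

	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.entries)
}

func main() {
	fmt.Println(hammer(newRegistry(), 64))
}
```

Each goroutine writes a distinct key, so the final entry count equals the number of writers; the value of the exercise is mainly that the race detector would flag any unguarded map access.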
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+//go:build !frost_roast_retry
+
+package signing
+
+import (
+	"testing"
+
+	"github.com/keep-network/keep-core/pkg/frost/roast"
+)
+
+func TestTransitionBundleRegistry_DefaultBuildIsNoOp(t *testing.T) {
+	// In the default build the registry is a permanent stub:
+	// RecordTransitionBundleForSession discards; TransitionBundleForSession
+	// always returns (nil, false). The ROAST selector must therefore
+	// always fall back to legacy retry in the default build.
+	RecordTransitionBundleForSession(
+		"session-default-build-test",
+		&roast.TransitionMessage{},
+	)
+	got, ok := TransitionBundleForSession("session-default-build-test")
+	if ok {
+		t.Fatalf(
+			"default build registry must report not-present; got bundle %v",
+			got,
+		)
+	}
+	if got != nil {
+		t.Fatalf("default build must return nil bundle; got %v", got)
+	}
+
+	// Clear and reset must not panic.
+	ClearTransitionBundleForSession("session-default-build-test")
+	ResetTransitionBundleRegistryForTest()
+}

pkg/frost/signing/roast_retry_orchestration.go

Lines changed: 65 additions & 0 deletions
@@ -6,6 +6,7 @@ import (
 
 	"github.com/keep-network/keep-core/pkg/frost/roast"
 	"github.com/keep-network/keep-core/pkg/frost/roast/attempt"
+	"github.com/keep-network/keep-core/pkg/protocol/group"
 )
 
 // ErrNoRoastRetryCoordinatorRegistered is returned by
@@ -71,11 +72,75 @@ func BeginOrchestrationForSession(
 	}
 	SetCurrentAttemptHandleForSession(sessionID, handle, ctx)
 	cleanup := func() {
+		// RFC-21 Phase 7.1: if this node is the elected
+		// coordinator and the attempt is still in the Collecting
+		// state at cleanup time (i.e. it did not succeed via
+		// signature aggregation), produce the TransitionMessage
+		// and stash it in the per-session bundle registry. Phase
+		// 7.2's ROAST signingParticipantSelector consumes the
+		// stashed bundle to compute the next attempt's
+		// IncludedSet via EvaluateRoastRetryForSigning.
+		//
+		// Failures are best-effort and silent: a panic in the
+		// deferred cleanup is materially worse than a missing
+		// transition bundle (the next attempt's selector falls
+		// back to the legacy retry shuffle), so we swallow errors
+		// rather than propagate them.
+		maybeProduceTransitionBundle(sessionID, handle, deps)
 		ClearCurrentAttemptHandleForSession(sessionID)
 	}
 	return handle, cleanup, nil
 }
 
+// maybeProduceTransitionBundle attempts to call AggregateBundle on
+// the registered Coordinator when (a) the local node is the
+// elected coordinator for the attempt and (b) the attempt has not
+// already transitioned. The result is stashed via
+// RecordTransitionBundleForSession (a no-op in default build); on
+// any error path the function returns silently because cleanup
+// must not break the signing-flow contract.
+//
+// In the default build this still compiles because
+// RecordTransitionBundleForSession is a no-op stub; calls to
+// roast.Coordinator methods compile because pkg/frost/roast is
+// not build-tagged.
+func maybeProduceTransitionBundle(
+	sessionID string,
+	handle roast.AttemptHandle,
+	deps RoastRetryDeps,
+) {
+	if deps.Coordinator == nil {
+		return
+	}
+	if deps.SelfMember == 0 {
+		// Without a known self-member, we cannot determine
+		// whether to aggregate. Skip.
+		return
+	}
+	elected, err := deps.Coordinator.SelectedCoordinator(handle)
+	if err != nil {
+		return
+	}
+	if elected != group.MemberIndex(deps.SelfMember) {
+		return
+	}
+	state, err := deps.Coordinator.State(handle)
+	if err != nil {
+		return
+	}
+	if state != roast.AttemptStateCollecting {
+		// Already transitioned or succeeded -- nothing to do.
+		return
+	}
+	bundle, err := deps.Coordinator.AggregateBundle(handle)
+	if err != nil {
+		// Best-effort; the next attempt's selector will fall
+		// back to the legacy retry shuffle.
+		return
+	}
+	RecordTransitionBundleForSession(sessionID, bundle)
+}
+
 // EndOrchestrationForSession is a convenience for callers that
 // did not capture the cleanup function from
 // BeginOrchestrationForSession (e.g. callers that pass session