
Commit 2f1fd39

feat(frost/roast): RFC-21 Phase 3.1 -- coordinator skeleton + seed bridge (#3968)
## Summary

First Phase-3 implementation PR for **RFC-21**. Introduces the ROAST coordinator state-machine surface (`Coordinator` interface, in-memory implementation, attempt-handle identity, state enum) plus the sterile seed-folding adapter that lets the new `[32]byte` `AttemptSeed` drive the legacy `SelectCoordinator` helper without modifying it.

**No production code path uses the new `Coordinator` yet.** Phase 3 "ships unused" per the RFC. Phase 4 wires it into receivers behind the `frost_roast_retry` build tag.

## What lands

### `pkg/frost/roast/coordinator_state.go`

| Surface | Role |
|---|---|
| `AttemptState` enum | `Pending / Collecting / Aggregating / Succeeded / Transitioned` with `String()`. |
| `AttemptHandle` | Opaque per-attempt identity. `ContextHash()` accessor cross-checks the bound context. |
| `Coordinator` interface | `BeginAttempt(ctx) → handle`, `State(handle) → state`, `SelectedCoordinator(handle) → member`. Later Phase-3 PRs (3.2 / 3.3 / 3.4) extend with `TransitionMessage`, `AggregateBundle`, `VerifyBundle`, and `NextAttempt`. |
| `NewInMemoryCoordinator()` | Concurrent-safe via `sync.Mutex` + `atomic.Uint64` next-id counter. |
| `ErrUnknownAttempt` | Sentinel for handle/instance mismatch. |

### `pkg/frost/roast/seed_bridge.go`

| Surface | Role |
|---|---|
| `foldAttemptSeed(seed [32]byte) int64` | First 8 bytes BE → int64 reinterpretation. Sterile, named, non-cryptographic adapter. Documented contract: byte-identical input must produce byte-identical output on every honest signer. |

`BeginAttempt` calls `foldAttemptSeed` and forwards to the existing `SelectCoordinator` to elect the attempt's coordinator. The legacy helper itself is **not modified** -- the bridge is the only thing between RFC-21 contexts and the legacy seed format.

## Why the seed bridge

The legacy `SelectCoordinator` takes `(seed int64, attemptNumber uint)` and is correct in isolation. RFC-21 widens `AttemptSeed` to `[32]byte` for the canonical-hash binding. We could rewrite the shuffle, but rewriting cryptographic-consensus logic that already agrees across the network is the wrong trade-off; the audit and behaviour are settled. The bridge satisfies the resolved decision in RFC-21:

> "BeginAttempt wraps it with a sterile bridge that folds the new
> [32]byte AttemptSeed into the legacy parameter shape... The bridge
> is named, isolated, and exhaustively tested so later edits cannot
> accidentally desynchronise it."

## Test coverage

### `coordinator_state_test.go` (9 tests)

- `TestBeginAttempt_ReturnsHandleWithMatchingContextHash`
- `TestBeginAttempt_HandlesAreDistinctAcrossAttempts`
- `TestBeginAttempt_RejectsEmptyIncludedSet` (defence-in-depth)
- `TestState_ReturnsCollectingAfterBegin`
- `TestState_UnknownHandleReturnsSentinel`
- `TestSelectedCoordinator_ReturnsMemberFromIncludedSet`
- `TestSelectedCoordinator_IsDeterministicForSameContext` -- two independent `Coordinator` instances agree on the elected member
- `TestSelectedCoordinator_DifferentAttemptNumbersCanProduceDifferentLeaders` -- 16 attempts produce ≥2 distinct leaders, defending the ROAST leader-rotation property
- `TestSelectedCoordinator_UnknownHandleReturnsSentinel`
- `TestInMemoryCoordinator_ConcurrentBeginAttemptsAreRaceSafe` -- 16 goroutines × 50 calls each, all handles unique
- `TestAttemptState_String` -- all enum values + unknown sentinel

### `seed_bridge_test.go` (5 tests)

- `TestFoldAttemptSeed_IsDeterministic`
- `TestFoldAttemptSeed_TakesFirst8BytesBigEndian` -- specific byte pattern verified
- `TestFoldAttemptSeed_IgnoresBytesAfterIndex7` -- documents the contract: bytes 8..31 don't influence output (still bound at the `AttemptContext.Hash()` layer)
- `TestFoldAttemptSeed_FirstByteSwept` -- 256-value sweep of the high byte produces 256 distinct outputs (no collisions)
- `TestFoldAttemptSeed_GoldenFixture` -- literal int64 value locks the wire-format reduction; literal drift caught at code review

### Verification

| Command | Result |
|---|---|
| `go build ./...` | clean |
| `go test ./pkg/frost/roast/...` | pass (14 cases) |
| `go test -race ./pkg/frost/roast/...` | pass |
| `go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/...` | pass (5 packages) |
| `staticcheck -checks '-SA1019' ./pkg/frost/roast/...` | silent |
| `go vet ./pkg/frost/roast/...` | clean |

## Test plan

- [ ] CI green.
- [ ] Reviewer confirms the seed bridge's discard of bytes 8..31 is acceptable. (Bytes 8..31 still appear in `AttemptContext.Hash()`, so any mutation is detected at the protocol-message layer in Phase 1B; the bridge merely reduces 256-bit input to the 64-bit width `SelectCoordinator` needs.)
- [ ] Reviewer confirms the `Coordinator` interface scope is appropriate for Phase 3.1 (state surface only). Phase 3.2 will extend with `TransitionMessage` types.

Refs RFC-21 Phase 3 (`docs/rfc/rfc-21-*`). Stacked at the integration tip after #3967 merged.
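The seed-folding step described under "Why the seed bridge" can be sketched as follows. `seed_bridge.go` itself is not shown in this commit view, so the body below is an assumption built only from the description's wording ("First 8 bytes BE → int64 reinterpretation"); the real function may differ in detail.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// foldAttemptSeed is a sketch of the bridge as the description states it:
// read the first 8 bytes of the seed as a big-endian uint64 and reinterpret
// the bits as an int64. Bytes 8..31 do not influence the result.
// Assumption: the actual body in seed_bridge.go is not visible here.
func foldAttemptSeed(seed [32]byte) int64 {
	return int64(binary.BigEndian.Uint64(seed[:8]))
}

func main() {
	var seed [32]byte
	seed[7] = 0x2a // lowest-order byte of the 8-byte prefix

	fmt.Println(foldAttemptSeed(seed)) // 42

	seed[31] = 0xff // outside the folded prefix, so the output is unchanged
	fmt.Println(foldAttemptSeed(seed)) // 42

	seed[0] = 0x80 // highest-order bit becomes the int64 sign bit
	fmt.Println(foldAttemptSeed(seed) < 0) // true
}
```

The last case shows why the description calls this a reinterpretation rather than a conversion: half of all seeds fold to a negative `int64`, which is fine as long as every signer folds the same way.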
2 parents 6214eec + cd7e222 commit 2f1fd39

4 files changed

Lines changed: 606 additions & 0 deletions

pkg/frost/roast/coordinator_state.go

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
package roast

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"

	"github.com/keep-network/keep-core/pkg/frost/roast/attempt"
	"github.com/keep-network/keep-core/pkg/protocol/group"
)

// AttemptState is the phase an attempt is in within the Coordinator
// state machine. The lifecycle is monotonic:
//
//	AttemptStatePending -> AttemptStateCollecting -> AttemptStateAggregating
//	  -> {AttemptStateSucceeded, AttemptStateTransitioned}
//
// AttemptStateSucceeded means the attempt produced a final signature.
// AttemptStateTransitioned means the attempt timed out or hit an
// unrecoverable reject and the coordinator emitted a
// TransitionMessage that drives the next attempt's context. Phase 3.1
// (this file) introduces the state surface only; later phases drive
// the transitions.
type AttemptState uint8

const (
	// AttemptStatePending is the zero value -- not a real state, used
	// only as the default-initialised "unknown" sentinel returned with
	// ErrUnknownAttempt.
	AttemptStatePending AttemptState = iota
	// AttemptStateCollecting -- the attempt has been started, the
	// included set is fixed, and the coordinator is accepting signed
	// evidence snapshots from peers.
	AttemptStateCollecting
	// AttemptStateAggregating -- the coordinator has stopped
	// accepting evidence and is building the TransitionMessage
	// bundle.
	AttemptStateAggregating
	// AttemptStateSucceeded -- the attempt produced a final
	// signature; no transition message is needed.
	AttemptStateSucceeded
	// AttemptStateTransitioned -- the attempt timed out or failed
	// and the coordinator has emitted a TransitionMessage; the next
	// attempt's context can now be computed by NextAttempt.
	AttemptStateTransitioned
)

func (s AttemptState) String() string {
	switch s {
	case AttemptStatePending:
		return "pending"
	case AttemptStateCollecting:
		return "collecting"
	case AttemptStateAggregating:
		return "aggregating"
	case AttemptStateSucceeded:
		return "succeeded"
	case AttemptStateTransitioned:
		return "transitioned"
	default:
		return fmt.Sprintf("unknown(%d)", uint8(s))
	}
}

// AttemptHandle is the opaque per-attempt identity returned by
// Coordinator.BeginAttempt. Handles are not interchangeable across
// coordinator instances: a handle minted by coordinator A cannot be
// passed to coordinator B. Callers must not mutate handles directly.
type AttemptHandle struct {
	id          uint64
	contextHash [attempt.MessageDigestLength]byte
}

// ContextHash returns the canonical AttemptContext.Hash() value bound
// to this handle. Useful for cross-checking a handle against a
// context after the fact.
func (h AttemptHandle) ContextHash() [attempt.MessageDigestLength]byte {
	return h.contextHash
}

// Coordinator is the ROAST coordinator state machine introduced by
// RFC-21 Phase 3. It owns per-attempt state, the deterministic
// participant selection (via the existing SelectCoordinator helper),
// and -- in later Phase-3 PRs -- signed-evidence aggregation,
// transition-message construction, and the NextAttempt policy.
//
// Phase 3.1 (this file) introduces only:
//   - BeginAttempt: initialise tracking for a new attempt.
//   - State: read the current AttemptState for a handle.
//   - SelectedCoordinator: report the member elected as coordinator
//     for the attempt.
//
// Phase 3.2 adds the TransitionMessage / LocalEvidenceSnapshot types.
// Phase 3.3 adds AggregateBundle and VerifyBundle. Phase 3.4 adds the
// NextAttempt policy function.
//
// Implementations must be safe for concurrent calls from multiple
// goroutines; production keep-core code paths are network-driven.
type Coordinator interface {
	// BeginAttempt initialises tracking for a new attempt with the
	// given context. It selects the attempt's coordinator
	// deterministically from ctx.IncludedSet via SelectCoordinator
	// (with the legacy int64 seed produced by foldAttemptSeed) and
	// stores the result on the returned handle.
	BeginAttempt(ctx attempt.AttemptContext) (AttemptHandle, error)
	// State returns the current AttemptState for the given handle.
	// Returns ErrUnknownAttempt if the handle was not produced by
	// this Coordinator instance.
	State(handle AttemptHandle) (AttemptState, error)
	// SelectedCoordinator returns the member elected as coordinator
	// for the attempt identified by the handle. Returns
	// ErrUnknownAttempt if the handle is not tracked.
	SelectedCoordinator(handle AttemptHandle) (group.MemberIndex, error)
}

// ErrUnknownAttempt indicates an AttemptHandle does not correspond to
// any attempt tracked by this Coordinator. Either the handle was
// minted by a different coordinator instance, or the attempt has
// been pruned.
var ErrUnknownAttempt = errors.New("coordinator: unknown attempt handle")

// NewInMemoryCoordinator returns a Coordinator that tracks attempts
// in-process. Phase 3 production paths use this implementation.
// Later phases may add persistent variants once persistence is
// designed (RFC-21 Open question on signer restart).
func NewInMemoryCoordinator() Coordinator {
	return &inMemoryCoordinator{
		attempts: map[uint64]*attemptRecord{},
	}
}

type attemptRecord struct {
	handle      AttemptHandle
	context     attempt.AttemptContext
	coordinator group.MemberIndex
	state       AttemptState
}

type inMemoryCoordinator struct {
	mu       sync.Mutex
	nextID   atomic.Uint64
	attempts map[uint64]*attemptRecord
}

func (c *inMemoryCoordinator) BeginAttempt(
	ctx attempt.AttemptContext,
) (AttemptHandle, error) {
	if len(ctx.IncludedSet) == 0 {
		return AttemptHandle{}, fmt.Errorf(
			"coordinator: cannot begin attempt with empty included set",
		)
	}
	coord, err := SelectCoordinator(
		ctx.IncludedSet,
		foldAttemptSeed(ctx.AttemptSeed),
		uint(ctx.AttemptNumber),
	)
	if err != nil {
		return AttemptHandle{}, fmt.Errorf(
			"coordinator: selection failed: %w",
			err,
		)
	}
	handle := AttemptHandle{
		id:          c.nextID.Add(1),
		contextHash: ctx.Hash(),
	}
	record := &attemptRecord{
		handle:      handle,
		context:     ctx,
		coordinator: coord,
		state:       AttemptStateCollecting,
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.attempts[handle.id] = record
	return handle, nil
}

func (c *inMemoryCoordinator) State(
	handle AttemptHandle,
) (AttemptState, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	record, ok := c.attempts[handle.id]
	if !ok {
		return AttemptStatePending, ErrUnknownAttempt
	}
	return record.state, nil
}

func (c *inMemoryCoordinator) SelectedCoordinator(
	handle AttemptHandle,
) (group.MemberIndex, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	record, ok := c.attempts[handle.id]
	if !ok {
		return 0, ErrUnknownAttempt
	}
	return record.coordinator, nil
}
