Skip to content

Commit 12ec8ba

Browse files
authored
feat(raftengine): Stage 6C-2c — encryptionScanner against etcdraft.MemoryStorage (#783)
## Summary Stage 6C-2c per the [PR #762 plan](#762) and the 6C sub-decomposition landed in [PR #781](#781) / [PR #782](#782). Ships the **raftengine-side scanner** that implements the `encryption.EncryptionRelevantScanner` interface defined in [PR #782 (Stage 6C-2b)](#782). The `main.go` startup-guard wiring that ties this scanner to the encryption-side guard is split into a new **6C-2d** sub-milestone per the design doc update in this PR. ## What this PR ships ### `encryptionScanner` against `etcdraft.MemoryStorage` ```go func (e *Engine) EncryptionScanner() encryption.EncryptionRelevantScanner ``` Walks `raftpb.Entry` rows in `(startExclusive, endInclusive]` via `storage.Entries(lo, hi, scanMaxBytes)`, filters out: - `EntryConfChange` / `EntryConfChangeV2` — config changes never carry FSM payloads; their `Data[0]` could coincidentally fall in the encryption opcode range without this gate - Zero-length normal entries — etcd-raft leadership bumps after election; would `panic: index out of range` on a naive `Data[0]` read Returns `IsEncryptionRelevantOpcode(ent.Data[0])` over surviving entries — i.e., asks whether any entry's opcode is in `[OpBootstrap, OpEncryptionMax]` = `[0x04, 0x07]`. Note: **OpRegistration (0x03) is correctly excluded** per the predicate's narrowed range from PR #782. ### Error semantics | Path | Behavior | |---|---| | Compacted range (sidecar below snapshot horizon) | Wrapped `ErrCompacted` propagated; caller routes as scanner-error refusal **distinct** from `ErrSidecarBehindRaftLog` | | `nil` storage (engine not opened) | Returns error rather than nil-deref | | Empty range | `(false, nil)` fast path; storage NOT consulted | ### Tests (8 cases covering every branch) | Test | Pins | |---|---| | `EmptyRange` (3 sub-cases) | equal / start-ahead / both-zero — storage never consulted | | `NoRelevantEntries` | OpRegistration (0x03) explicitly NOT hit — pins the PR #782 predicate narrowing end-to-end | | `BootstrapHit` | OpBootstrap in range → hit | | `RotationHit` | OpRotation in range → hit | | `NonNormalEntriesSkipped` | ConfChange / ConfChangeV2 with `Data[0]` in encryption range → NOT hit | | `EmptyDataSkipped` | nil / empty `Data` → NOT hit AND no panic | | `CompactedRange` | Snapshot at index 10, scan `[0, 5)` → error, hit=false | | `NilStorageError` | Zero-value scanner returns an error | ## Design doc updates - **6C-2c** row rewritten to reflect the scanner-only scope of this PR (engine implementation + `Engine.EncryptionScanner()` accessor, **no** `main.go` wiring). - **6C-2d** row added — `main.go` startup-guard phase that ties 6C-2b's pure-function guard to 6C-2c's engine scanner. Will ship as its own PR. - Shipping order updated: `6C-1 ✅ → 6C-2 ✅ → 6C-2b ✅ → 6C-2c (this PR) → 6C-2d → 6D (bundles 6C-3) → 6E (bundles 6C-4)`. - Rationale section updated with the 6C-2c / 6C-2d split. ## Why split 6C-2c into "scanner" vs "wiring" Keeping the engine scanner separate from the `main.go` wiring lets: - Reviewers focus on one concern per PR. - Stage 6D-and-later code paths that need the scanner (capability fan-out gap-coverage check) depend on the engine package without waiting for 6C-2d. - 6C-2d's main.go integration touches the multi-shard lifecycle, which is a different concern from the engine-internal scanner implementation here. ## Caller audit No public API changes outside `Engine.EncryptionScanner()`, which is net-additive. `encryptionScanner` and `isEncryptionRelevantEntry` are package-internal. No existing caller references the new accessor. The implementation depends only on: - `encryption.EncryptionRelevantScanner` (interface from PR #782) - `encryption.IsEncryptionRelevantOpcode` (predicate from PR #782, OpRegistration-excluded) - `etcdraft.MemoryStorage.Entries` (existing engine internal) - `raftpb.Entry` / `raftpb.EntryConfChange` / V2 (existing types) No changes to the engine's apply path or any other lifecycle. ## Five-lens self-review 1. **Data loss** — net-positive. The scanner enables the §9.1 refusal that catches partial-write crashes between an encryption-relevant Raft commit and the sidecar update. Until 6C-2d wires the scanner into main.go the scanner is operator-inert, but its failure modes (compacted-range error, conf-change skip, empty-data skip, nil-storage error) are all correct. 2. **Concurrency / distributed failures** — scanner reads from `MemoryStorage` (thread-safe per etcd-raft contract). Scanner itself is stateless. 3. **Performance** — zero hot-path impact. Called once per shard per process start in 6C-2d, with at most one `Entries()` call in the typical case (gap size << 64 MiB). 4. **Data consistency** — the scanner respects MemoryStorage's compaction boundary: never silently returns "no hit" for a range below the snapshot horizon; propagates the error so the guard routes it as a distinct scanner-error refusal. 5. **Test coverage** — 8 tests cover all 6 predicate paths, all 2 filter gates (type, length), AND the 2 error paths (compacted, nil-storage). The OpRegistration exclusion is pinned end-to-end with a dedicated test case. ## Test plan - [x] `go test -race -timeout=120s ./internal/raftengine/etcd/... ./internal/encryption/...` — PASS - [x] `golangci-lint run ./internal/raftengine/etcd/...` — 0 issues - [ ] Full Jepsen suite — not run; scanner has no caller in this PR (operator-inert until 6C-2d) ## Plan Per the design doc updates in this PR, next sub-milestones: - **6C-2d** — `main.go` startup-guard phase that calls `encryption.GuardSidecarBehindRaftLog` with the scanner from this PR - **6C-3** — `ErrNodeIDCollision` + `ErrLocalEpochRollback` (bundles with Stage 6D) - **6D** — `enable-storage-envelope` admin RPC + §7.1 Phase-1 cutover (unblocked once 6C-2d lands) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Advances encryption-at-rest implementation with internal validation mechanisms to ensure system consistency during startup. * **Documentation** * Updated design documentation clarifying the encryption-at-rest implementation phases and dependency chain. * **Tests** * Added comprehensive test coverage for encryption validation logic, including edge cases and error scenarios. <!-- review_stack_entry_start --> [![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/bootjp/elastickv/pull/783?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai -->
2 parents 3763bfd + 894d2f9 commit 12ec8ba

3 files changed

Lines changed: 505 additions & 14 deletions

File tree

docs/design/2026_04_29_partial_data_at_rest_encryption.md

Lines changed: 26 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,8 @@ Date: 2026-04-29
2323
| 6C-1 | Subset of §9.1 startup refusal guards that depend only on flags + sidecar state: sidecar present without `--encryption-enabled` (`ErrSidecarPresentWithoutFlag`, downgrade prevention), `--encryption-enabled` without `--kekFile` (`ErrKEKRequiredWithFlag`, fail-fast), KEK loaded but sidecar wrapped DEKs do not unwrap (`ErrKEKMismatch`, operator-error catch). `internal/encryption/startup.go`'s `CheckStartupGuards` runs once in `main.go run()` before `buildShardGroups`. | shipped | PR #778 |
2424
| 6C-2 | Process-local §9.1 guards that read only the encryption package's own state: `ErrLocalEpochExhausted` (any active DEK with `local_epoch == 0xFFFF` — refuses to start until rotated, preventing GCM nonce reuse on the next write), `ErrUnsupportedFilesystem` startup probe (`ProbeSidecarFilesystem` exercises the §5.1 write+rename+dir.Sync sequence on a sentinel file in the sidecar directory at startup so filesystem-capability failures surface BEFORE the first encryption-relevant Raft entry, not on the first WriteSidecar call hours into operation). Also includes the gemini-r3 medium parseErr-annotation fix deferred from PR #778 (`ErrKEKMismatch` now wraps both `parseErr` and the unwrap error when the defensive non-decimal-key_id branch fires). The asymmetric `ErrLocalEpochRollback` peer is split into 6C-3 because it requires the writer-registry record (Stage 7). The `ErrSidecarBehindRaftLog` guard moves to 6C-2b because it requires raftengine integration — outside the encryption package boundary. | open | — |
2525
| 6C-2b | `ErrSidecarBehindRaftLog` guard PRIMITIVE — encryption-side sentinel, `IsEncryptionRelevantOpcode` predicate (§5.5), `EncryptionRelevantScanner` cross-package interface, and `GuardSidecarBehindRaftLog` pure-function guard that takes both indices + a scanner callback. Tests use a fake scanner to exercise every branch (caught-up / gap-not-covered / gap-covered / scanner-error / nil-scanner-fails-closed). Wires nothing into main.go; the raftengine-side implementation lands in 6C-2c. | shipped-in-this-PR | this PR |
26-
| 6C-2c | Raftengine implementation of `EncryptionRelevantScanner` (WAL-based scan of Raft entries in the supplied index range against the §5.5 predicate) + `main.go` startup-guard phase that runs after `buildShardGroups` for each shard with encryption enabled, before any gRPC server starts serving. Inherits the encryption-side primitive from 6C-2b. | open ||
26+
| 6C-2c | Raftengine implementation of `EncryptionRelevantScanner` against `etcdraft.MemoryStorage.Entries`. `Engine.EncryptionScanner()` exposes the scanner as the encryption-side interface so callers (Stage 6C-2d, 6D, 6E) inherit it without coupling to the engine internals. Tests pin every branch: empty range, no-relevant entries (with OpRegistration explicitly), bootstrap hit, rotation hit, conf-change skipped, empty-data skipped, compacted-range error propagation, nil-storage error. | shipped-in-this-PR | this PR |
27+
| 6C-2d | `main.go` startup-guard phase that runs after `buildShardGroups` for each shard with encryption enabled, before any gRPC server starts serving. Calls `encryption.GuardSidecarBehindRaftLog` with the shard's engine `AppliedIndex()` + sidecar `RaftAppliedIndex` + the engine's `EncryptionScanner()` from 6C-2c. The guard's existing branches (caught-up / gap-not-covered / gap-covered / scanner-error / nil-scanner-fails-closed) determine refusal behavior. | open ||
2728
| 6C-3 | Cluster-wide §9.1 guards that require the Voters ∪ Learners membership view: `ErrNodeIDCollision` (two members hash to the same 16-bit `node_id`), `ErrLocalEpochRollback` (sidecar `local_epoch` ≤ writer-registry record). Naturally bundles with the Stage 6D capability fan-out which already needs that membership view. | open ||
2829
| 6C-4 | Phase-2-specific §9.1 guards: `ErrEnvelopeCutoverDivergence` (sidecar `raft_envelope_cutover_index` disagrees with snapshot header), `ErrEncryptionNotBootstrapped` (`enable-raft-envelope` before bootstrap), `ErrLocalEpochOutOfRange` on the wire side of `GetCapability` / `GetSidecarState`. Bundles with Stage 6E (`enable-raft-envelope` admin RPC + Phase-2 cutover). | open ||
2930
| 6D | §6.6 `enable-storage-envelope` admin RPC + §7.1 Phase-1 storage cutover (§6.2 toggle ON) + Voters ∪ Learners capability gate (depends on 6B for mutator wiring AND 6C-1+6C-2 for §9.1 startup-refusal guards; bundles 6C-3 for the membership-view-dependent collision check — see rationale) | open ||
@@ -200,18 +201,28 @@ production-safe state, and unblocks the next one.
200201
ONLY on the encryption package without waiting on the
201202
raftengine-side scan helper.
202203

203-
* **6C-2c** — Raftengine implementation of the
204-
`EncryptionRelevantScanner` interface defined in 6C-2b. Walks
205-
the WAL across the supplied entry-index range, decodes each
206-
entry's opcode byte, and applies the §5.5 predicate. Wires
207-
a new startup-guard phase into `main.go` that runs AFTER
204+
* **6C-2c** — Raftengine-side implementation of the
205+
`EncryptionRelevantScanner` interface defined in 6C-2b.
206+
Implemented against `etcdraft.MemoryStorage.Entries` so the
207+
scan reads in-memory state that the engine has just replayed
208+
from the WAL + snapshot at Open() time. Filters out
209+
EntryConfChange / EntryConfChangeV2 and zero-length normal
210+
entries (etcd-raft leadership bumps) so those don't falsely
211+
trigger the predicate. Compacted-range errors propagate
212+
wrapped — the caller (Stage 6C-2d's main.go phase) surfaces
213+
them as a scanner-error refusal distinct from
214+
ErrSidecarBehindRaftLog. Exposed via the new
215+
`Engine.EncryptionScanner()` accessor.
216+
217+
* **6C-2d**`main.go` startup-guard phase that ties 6C-2b's
218+
pure-function guard to 6C-2c's engine scanner. Runs AFTER
208219
`buildShardGroups` returns and each shard's engine has opened,
209220
but BEFORE any gRPC server starts serving — a different
210221
lifecycle phase from the existing 6C-1 / 6C-2 guards (which
211222
run before engine startup). For each shard with encryption
212-
enabled, the phase calls `GuardSidecarBehindRaftLog` with the
213-
shard's engine `AppliedIndex()` and a scanner backed by the
214-
engine's WAL.
223+
enabled, calls `GuardSidecarBehindRaftLog` with the shard's
224+
engine `AppliedIndex()` + sidecar `RaftAppliedIndex` + the
225+
engine's `EncryptionScanner()`.
215226

216227
* **6C-3** — Membership-view guards. `ErrNodeIDCollision`
217228
requires a Voters ∪ Learners enumeration of every Raft
@@ -229,11 +240,12 @@ production-safe state, and unblocks the next one.
229240
`local_epoch` values to the §4.1 16-bit field. Bundles with
230241
Stage 6E's `enable-raft-envelope` admin RPC.
231242

232-
**Shipping order:** 6C-1 → 6C-2 → 6C-2b (primitive) → 6C-2c
233-
(raftengine implementation + main.go wiring) → 6D (bundles
234-
6C-3) → 6E (bundles 6C-4). Stage 6D is unblocked once 6C-1
235-
+ 6C-2 + 6C-2b + 6C-2c have all shipped (the latter pair
236-
is the load-bearing `ErrSidecarBehindRaftLog` work); 6D
243+
**Shipping order:** 6C-1 → 6C-2 → 6C-2b (encryption-side
244+
primitive) → 6C-2c (raftengine scanner) → 6C-2d (main.go
245+
startup-guard wiring) → 6D (bundles 6C-3) → 6E (bundles
246+
6C-4). Stage 6D is unblocked once 6C-1 + 6C-2 + 6C-2b +
247+
6C-2c + 6C-2d have all shipped (the latter three together
248+
are the load-bearing `ErrSidecarBehindRaftLog` work); 6D
237249
does not need to wait for the 6C-3/6C-4 work that bundles
238250
into it.
239251

Lines changed: 165 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,165 @@
1+
package etcd
2+
3+
import (
4+
"math"
5+
6+
"github.com/bootjp/elastickv/internal/encryption"
7+
"github.com/cockroachdb/errors"
8+
etcdraft "go.etcd.io/raft/v3"
9+
raftpb "go.etcd.io/raft/v3/raftpb"
10+
)
11+
12+
// scanMaxBytes caps the byte size of a single Entries() fetch from
13+
// MemoryStorage during a gap scan. The §9.1 ErrSidecarBehindRaftLog
14+
// guard runs once per process start, so a comfortable margin over
15+
// any expected gap (typical: 1-N entries from a partial-write
16+
// crash) keeps the scan to a single Entries() call without the
17+
// per-iteration ceremony of a pagination loop. If a real gap ever
18+
// exceeds this size the iteration falls back to additional calls
19+
// (see scanEntries() below) so the cap is not a correctness
20+
// boundary, only a per-call memory cap.
21+
const scanMaxBytes uint64 = 64 << 20 // 64 MiB
22+
23+
// encryptionScanner implements encryption.EncryptionRelevantScanner
24+
// against the engine's in-memory raft storage. It wraps the
25+
// engine's *MemoryStorage; a fresh value is constructed per
26+
// Engine.EncryptionScanner() call (cheap — no internal state).
27+
// Callers get access to it via Engine.EncryptionScanner().
28+
//
29+
// The scan walks raftpb.Entry rows in (startExclusive, endInclusive]
30+
// via etcdraft.MemoryStorage.Entries(lo, hi, maxSize), filters out
31+
// non-normal entries (conf-change, empty leadership bump) since the
32+
// §5.5 IsEncryptionRelevantOpcode predicate only applies to normal
33+
// FSM entries, and reports whether any survives the predicate.
34+
//
35+
// Compacted-out-of-MemoryStorage ranges return raft's
36+
// ErrCompacted (or ErrUnavailable) which the scanner wraps and
37+
// propagates. GuardSidecarBehindRaftLog routes that as a scanner
38+
// error (NOT as ErrSidecarBehindRaftLog), so the operator sees a
39+
// distinct refusal: "the gap exists but is below the snapshot
40+
// horizon; run encryption resync-sidecar to advance past the
41+
// compacted entries". A sidecar that far behind is itself the
42+
// operator's primary problem; the scanner just reports honestly.
43+
type encryptionScanner struct {
44+
storage *etcdraft.MemoryStorage
45+
}
46+
47+
// HasEncryptionRelevantEntryInRange satisfies the
48+
// encryption.EncryptionRelevantScanner contract: returns true iff
49+
// at least one raftpb.Entry with index in (startExclusive,
50+
// endInclusive] is a §5.5 sidecar-mutating opcode per
51+
// encryption.IsEncryptionRelevantOpcode.
52+
//
53+
// Empty range (startExclusive >= endInclusive) returns (false, nil)
54+
// per the interface contract. The encryption.GuardSidecarBehindRaftLog
55+
// caller precomputes the non-empty-gap branch and only calls scan
56+
// when there's a range to inspect, but the defensive guard stays.
57+
func (s *encryptionScanner) HasEncryptionRelevantEntryInRange(startExclusive, endInclusive uint64) (bool, error) {
58+
// The EncryptionRelevantScanner contract (audit.go) requires
59+
// the empty-range case to return (false, nil) UNCONDITIONALLY
60+
// — including on nil-storage receivers. Reorder the guards so
61+
// an empty range short-circuits before any state-dependent
62+
// failure path. GuardSidecarBehindRaftLog pre-filters the
63+
// empty-gap case in production, so this is contract conformance
64+
// rather than a reachable bug; pinning the order keeps any
65+
// future caller (6C-2d, capability fan-out) from triggering
66+
// the surprising "empty range on nil scanner returns error"
67+
// path.
68+
if startExclusive >= endInclusive {
69+
return false, nil
70+
}
71+
if s.storage == nil {
72+
return false, errors.New("encryption scanner: storage is nil (engine not opened?)")
73+
}
74+
// MemoryStorage.Entries(lo, hi, ...) returns indices [lo, hi).
75+
// We want (startExclusive, endInclusive] = [startExclusive+1,
76+
// endInclusive+1). Guard against uint64 wraparound at
77+
// MaxUint64 (otherwise hi = 0 and the loop never runs).
78+
if endInclusive == math.MaxUint64 {
79+
return false, errors.New("encryption scanner: endInclusive == math.MaxUint64 cannot be expressed as half-open Entries() range")
80+
}
81+
lo := startExclusive + 1
82+
hi := endInclusive + 1
83+
cursor := lo
84+
for cursor < hi {
85+
batch, err := s.storage.Entries(cursor, hi, scanMaxBytes)
86+
if err != nil {
87+
return false, errors.Wrapf(err,
88+
"encryption scanner: read raft entries [%d, %d) from MemoryStorage",
89+
cursor, hi)
90+
}
91+
if len(batch) == 0 {
92+
// Fail closed: an empty batch with cursor < hi means
93+
// MemoryStorage didn't advance even though the requested
94+
// range is non-empty. Treating this as "no hit" would
95+
// silently classify an unscanned range as harmless, so
96+
// we surface it as a scanner error. The caller routes
97+
// scanner errors as a refusal distinct from
98+
// ErrSidecarBehindRaftLog.
99+
return false, errors.WithStack(errors.Newf(
100+
"encryption scanner: Entries(cursor=%d, hi=%d) returned no entries and no error (unscanned gap)",
101+
cursor, hi))
102+
}
103+
for _, ent := range batch {
104+
if isEncryptionRelevantEntry(ent) {
105+
return true, nil
106+
}
107+
}
108+
// Continue from one past the last returned index.
109+
cursor = batch[len(batch)-1].Index + 1
110+
}
111+
return false, nil
112+
}
113+
114+
// isEncryptionRelevantEntry centralises the per-Entry predicate.
115+
// Returns false for:
116+
//
117+
// - EntryConfChange / EntryConfChangeV2 — never carry encryption
118+
// FSM payloads.
119+
// - Entries whose Data does not decode as a valid proposal
120+
// envelope (zero-length leadership bumps, malformed entries).
121+
// - Envelopes wrapping an empty FSM payload.
122+
//
123+
// For valid envelopes, returns IsEncryptionRelevantOpcode(payload[0])
124+
// where payload is the post-envelope-strip FSM byte sequence. The
125+
// engine wraps every EntryNormal in a 9-byte proposal envelope
126+
// (1 byte version + 8 byte proposal ID) via encodeProposalEnvelope
127+
// at propose time, so the raw Data[0] is always the envelope
128+
// version 0x01 — NEVER the FSM opcode. A scanner that misreads
129+
// the layout would return false for every entry in a real Raft
130+
// log, silently defeating the §9.1 ErrSidecarBehindRaftLog guard.
131+
func isEncryptionRelevantEntry(ent raftpb.Entry) bool {
132+
if ent.Type != raftpb.EntryNormal {
133+
return false
134+
}
135+
_, payload, ok := decodeProposalEnvelope(ent.Data)
136+
if !ok || len(payload) == 0 {
137+
return false
138+
}
139+
return encryption.IsEncryptionRelevantOpcode(payload[0])
140+
}
141+
142+
// EncryptionScanner returns the engine's
143+
// encryption.EncryptionRelevantScanner implementation. The
144+
// returned value is safe to call after Open() has populated
145+
// MemoryStorage from the on-disk WAL + snapshot, and only after
146+
// — calling it before Open() returns a nil-storage error from
147+
// HasEncryptionRelevantEntryInRange.
148+
//
149+
// Used by main.go's startup-guard phase (Stage 6C-2d) to invoke
150+
// encryption.GuardSidecarBehindRaftLog against the engine's
151+
// applied index and the sidecar's raft_applied_index.
152+
//
153+
// Nil-receiver-safe: a nil *Engine returns a zero-value
154+
// encryptionScanner whose HasEncryptionRelevantEntryInRange will
155+
// surface the nil-storage error. This matches the rest of the
156+
// package's defensive nil-receiver posture (Status() / AppliedIndex()
157+
// return zero values on nil receivers) and lets call sites in
158+
// the lifecycle that may forward a maybe-nil engine pointer
159+
// avoid a panic during startup/refusal triage.
160+
func (e *Engine) EncryptionScanner() encryption.EncryptionRelevantScanner {
161+
if e == nil {
162+
return &encryptionScanner{}
163+
}
164+
return &encryptionScanner{storage: e.storage}
165+
}

0 commit comments

Comments
 (0)