Commit 3763bfd
feat(encryption): Stage 6C-2b — ErrSidecarBehindRaftLog guard primitive (encryption-side) (#782)
## Summary
Stage 6C-2b per the [PR #762 plan](#762) and the 6C
sub-decomposition landed in [PR #781](#781).
Ships the **encryption-side primitive** for the
`ErrSidecarBehindRaftLog` §9.1 guard so Stage 6D-and-later code paths
(e.g., §7.1 capability fan-out's gap-coverage check) can depend on the
predicate without waiting for the raftengine-side scanner helper. The
actual raftengine WAL-scan implementation + `main.go` startup-guard
phase wiring lands in a follow-up **6C-2c** PR.
## What this PR ships
### One new sentinel
| Sentinel | Catches |
|---|---|
| `ErrSidecarBehindRaftLog` | Partial-write crash: encryption-relevant Raft entry applied to engine but §5.1 sidecar write didn't complete. Recovery: `encryption resync-sidecar`. |
Without this guard the encryption package would silently rebuild
keystore state from an outdated wrapped-DEK snapshot, and a fail-closed
read would fire on the next post-cutover entry with `unknown_key_id`
(halting apply).
### Three new exported primitives in `internal/encryption/audit.go`
```go
// §5.5 predicate
func IsEncryptionRelevantOpcode(opcode byte) bool

// Cross-package contract (raftengine implements in 6C-2c)
type EncryptionRelevantScanner interface {
	HasEncryptionRelevantEntryInRange(startExclusive, endInclusive uint64) (bool, error)
}

// Pure-function guard
func GuardSidecarBehindRaftLog(sidecarAppliedIdx, engineAppliedIdx uint64,
	scanner EncryptionRelevantScanner) error
```
The encryption package owns the **semantic-level** predicate; `fsmwire`
owns the **wire-level** constants (`OpEncryptionMin` /
`OpEncryptionMax`). Decoupling means a future widening of the encryption
opcode range (Stage 6E reserves 0x06 / 0x07) is a single-line change in
`fsmwire`, not a cross-package edit.
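The split between the semantic predicate and the wire-level constants can be sketched as below. This is an illustration only: the constant values are placeholders chosen for the example (the real ones live in `fsmwire` and are not shown in this description), and the constants are declared locally so the snippet compiles on its own.

```go
package main

import "fmt"

// Hypothetical stand-ins for the fsmwire wire-level constants. The real
// values are defined in fsmwire; these are assumptions for illustration.
const (
	OpEncryptionMin byte = 0x03
	OpEncryptionMax byte = 0x05
)

// IsEncryptionRelevantOpcode reports whether opcode falls inside the
// inclusive [OpEncryptionMin, OpEncryptionMax] range. Widening the range
// only requires changing the constants, not this function.
func IsEncryptionRelevantOpcode(opcode byte) bool {
	return opcode >= OpEncryptionMin && opcode <= OpEncryptionMax
}

func main() {
	probes := []byte{OpEncryptionMin - 1, OpEncryptionMin, OpEncryptionMax, OpEncryptionMax + 1}
	for _, op := range probes {
		fmt.Printf("0x%02x relevant=%v\n", op, IsEncryptionRelevantOpcode(op))
	}
}
```

Because the predicate only compares against the two bounds, a later change to `OpEncryptionMax` would be picked up without touching the encryption package.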
### Guard contract (6 branches, all tested)
| Branch | Behavior |
|---|---|
| `sidecarAppliedIdx >= engineAppliedIdx` | Caught up. Return nil. **Scanner NEVER called.** |
| Non-empty gap, scanner says no hit | Return nil. |
| Non-empty gap, scanner says hit | Fire `ErrSidecarBehindRaftLog` with both indices in annotation. |
| Scanner error | Propagate wrapped, **NOT** marked with `ErrSidecarBehindRaftLog` (scanner failure ≠ gap-coverage refusal). |
| `nil` scanner with non-empty gap | **Fail closed** with `ErrSidecarBehindRaftLog` (cannot prove safety). |
| `nil` scanner on caught-up path | Return nil (scanner is irrelevant when there's no gap). |
## Design doc updates
- **6C-2b** row rewritten to reflect the primitive-only scope of this PR
(encryption package only, no `main.go` wiring).
- **6C-2c** row added — raftengine implementation of the scanner +
`main.go` startup-guard phase wiring.
- Shipping order updated: `6C-1 ✅ → 6C-2 ✅ → 6C-2b (this PR) → 6C-2c →
6D (bundles 6C-3) → 6E (bundles 6C-4)`.
- Rationale section updated with the 6C-2b / 6C-2c split: the primitive
lands in 6C-2b so Stage 6D-and-later RPC paths that need the same
predicate (capability fan-out gap check) can depend on the encryption
package without waiting for 6C-2c.
## Why split 6C-2b into "primitive" vs "integration"
The §5.5 predicate (`IsEncryptionRelevantOpcode`) is needed by more than
just the startup guard. Stage 6E's capability fan-out checks the same
predicate when refusing a cutover RPC if any peer's sidecar is behind.
Shipping the predicate + sentinel + interface without the raftengine
WAL-scan implementation lets 6D / 6E proceed against the encryption
package boundary without waiting on the cross-package plumbing.
## Caller audit
No public function signatures changed. All four new exports
(`IsEncryptionRelevantOpcode`, `EncryptionRelevantScanner`,
`GuardSidecarBehindRaftLog`, `ErrSidecarBehindRaftLog`) are
**net-additive**. No existing callers to audit.
## Five-lens self-review
1. **Data loss** — net-positive. The primitive enables the §9.1 refusal
that catches partial-write crashes between an encryption-relevant Raft
commit and the sidecar update. Until 6C-2c wires it into `main.go` the
refusal is operator-inert, but the primitive's failure modes (nil
scanner → fail closed, scanner error → propagate distinct) are correct.
2. **Concurrency / distributed failures** — pure function, no shared
state. Scanner interface is single-call per guard invocation;
implementations are free to lock internally.
3. **Performance** — zero hot-path impact. The guard is called once per
shard per process start (in 6C-2c); within the guard it's one integer
comparison + at most one scanner call.
4. **Data consistency** — `ErrSidecarBehindRaftLog` is the only §9.1
guard that catches the "Raft committed but sidecar missed it"
partial-write window. Without it the next post-cutover read would
HaltApply with `unknown_key_id`; with it the operator gets a single
typed startup refusal pointing at the right recovery runbook.
5. **Test coverage** — 7 new tests cover the predicate (range-member +
known-opcode + below/above boundary) AND every branch of the guard
(caught-up, gap-not-covered, gap-covered, scanner-error,
nil-scanner-fails-closed, nil-scanner-caught-up). The `fakeScanner`
exercises the interface contract in isolation from any real raftengine.
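For readers who want a feel for what a scanner test double might look like, the sketch below backs the interface with an in-memory map and counts calls. It is not the `fakeScanner` from this PR's test file: the struct fields are hypothetical, and the half-open `(startExclusive, endInclusive]` reading is inferred from the parameter names.

```go
package main

import "fmt"

// EncryptionRelevantScanner mirrors the interface from the PR description.
type EncryptionRelevantScanner interface {
	HasEncryptionRelevantEntryInRange(startExclusive, endInclusive uint64) (bool, error)
}

// fakeScanner is a hypothetical in-memory double. relevant holds the log
// indices of encryption-relevant entries; calls counts invocations so a
// test can assert that the guard did or did not consult the scanner.
type fakeScanner struct {
	relevant map[uint64]bool
	calls    int
}

// HasEncryptionRelevantEntryInRange reports whether any recorded index
// lies in (startExclusive, endInclusive].
func (f *fakeScanner) HasEncryptionRelevantEntryInRange(startExclusive, endInclusive uint64) (bool, error) {
	f.calls++
	for idx, isRelevant := range f.relevant {
		if isRelevant && idx > startExclusive && idx <= endInclusive {
			return true, nil
		}
	}
	return false, nil
}

// Compile-time check that the double satisfies the interface.
var _ EncryptionRelevantScanner = (*fakeScanner)(nil)

func main() {
	s := &fakeScanner{relevant: map[uint64]bool{7: true}}
	inGap, _ := s.HasEncryptionRelevantEntryInRange(5, 10)
	atStart, _ := s.HasEncryptionRelevantEntryInRange(7, 10)
	atEnd, _ := s.HasEncryptionRelevantEntryInRange(5, 7)
	fmt.Println(inGap, atStart, atEnd, s.calls)
}
```

The boundary probes matter because the sidecar's applied index is the exclusive start: an entry at exactly that index has already been reflected in the sidecar and should not trigger a refusal.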
## Test plan
- [x] `go test -race -timeout=60s ./internal/encryption/...` — PASS
- [x] `golangci-lint run ./internal/encryption/...` — 0 issues
- [ ] Full Jepsen suite — not run; the primitive has no caller in this
PR (operator-inert until 6C-2c)
## Plan
Per the design doc updates in this PR, next sub-milestones:
- **6C-2c** — raftengine implementation of `EncryptionRelevantScanner`
(WAL-based) + `main.go` startup-guard phase wiring
- **6C-3** — `ErrNodeIDCollision` + `ErrLocalEpochRollback` (bundles
with Stage 6D)
- **6D** — `enable-storage-envelope` admin RPC + §7.1 Phase-1 cutover
(unblocked once 6C-2c lands)
## Summary by CodeRabbit
* **New Features**
  * Added a safety guard to prevent sidecar lag from silently bypassing encryption-relevant log entries.
  * Added error handling with operator guidance for sidecar synchronization situations.
* **Tests**
  * Added comprehensive test coverage for encryption relevance detection and sidecar lag validation.

4 files changed
Lines changed: 440 additions & 14 deletions
File tree
- docs/design
- internal/encryption
Lines changed: 34 additions & 14 deletions