Commit 6ab8a33

docs(encryption): address PR707 round-10 (claude[bot] r7 critical/medium/minor)
Critical (claude[bot] r7 #1): writer registry §4.1 only covered process-start and post-rotation registration. Phase 1 cluster-wide activation left every running node in a window where it encrypts before having proposed RegisterEncryptionWriter, defeating the registry collision guarantee. Add a third trigger: the bootstrap-encryption Raft entry (§5.6 step 1a) carries a batch writer-registry payload covering every voting member that passed the capability pre-check. FSM apply writes the wrapped DEKs and the registry entries in the same Pebble batch, closing the window without a separate round-trip. §4.1 reorganised to enumerate the three triggers explicitly with their coordinator-side gating.

Medium (claude[bot] r7 #2): §6.3 gained an explicit specification for the three new encryption-internal Raft tag constants (0x03 RegisterEncryptionWriter, 0x04 bootstrap, 0x05 rotation/rewrap/retire/enable-storage-envelope/enable-raft-envelope). FSM Apply dispatch order is now spelled out (envelope unwrap > encryption-internal > HLC-lease > regular). §6.3 also notes that registration entries are replayed via Raft and are therefore NOT in isEncryptionRelevant() (sidecar lag check); only the bootstrap/rotation family is.

Minor (claude[bot] r7 #3): §4.6 pebbleSnapshotMagic was a stale reference to a different snapshot layer; replaced with hlcSnapshotMagic / hlcSnapshotMagicV2 cross-referenced to §4.4.
1 parent 86a2e33 commit 6ab8a33

1 file changed

Lines changed: 97 additions & 11 deletions

docs/design/2026_04_29_proposed_data_at_rest_encryption.md

@@ -207,11 +207,38 @@ Envelope format (single byte stream, stored as the Pebble value):
 last_seen_local_epoch: uint16 }
 ```

-Before a node performs **its first write under a DEK**
-(whether at process start under an existing DEK, or
-immediately after a rotation that created a new active
-DEK), it proposes a `RegisterEncryptionWriter(dek_id,
-full_node_id, local_epoch)` Raft entry. FSM apply checks:
+Before a node performs **its first write under a DEK** it
+must appear in the registry for that DEK. Three
+registration paths exist, each closing a distinct rollout
+window:
+
+- **Bootstrap (Phase 1 cluster-wide activation).** When
+the leader proposes the `bootstrap-encryption` Raft entry
+(§5.6 step 1a), the entry body carries a batch
+registration set covering every voting member that
+passed the capability pre-check. FSM apply inserts the
+registry entries in the same Pebble batch as the
+bootstrap. This handles the common case where the
+whole cluster transitions from cleartext to encrypted
+at once — no per-node `RegisterEncryptionWriter`
+round-trip is required for the initial cohort.
+- **Process start under an existing DEK.** A node that
+restarts (or joins a cluster that already has active
+DEKs) proposes `RegisterEncryptionWriter(dek_id,
+full_node_id, local_epoch)` from the coordinator
+**before the coordinator accepts any client write** that
+would land as an encrypted entry. FSM apply may run on
+Raft entries proposed by the leader during this window
+(those are decrypted using DEKs the node already has from
+its sidecar), but no self-originated encrypted entry can
+be queued by the local coordinator until registration
+commits.
+- **Post-rotation.** When a node observes a `rotate-dek`
+apply, it proposes `RegisterEncryptionWriter` against
+the new DEK with the same coordinator-side gate as the
+process-start case above.
+
+For each path, FSM apply checks:

 1. If no entry exists at the registry key → insert,
 allowing the registration. The node may now write under
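
The coordinator-side gate named in the process-start and post-rotation paths of this hunk could look roughly like the Go sketch below. It is an illustration only: `writerGate`, `MarkRegistered`, `CheckWrite`, and `ErrWriterNotRegistered` are hypothetical names, the `uint32` DEK ID width is an assumption, and the design document does not show this code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrWriterNotRegistered is a hypothetical sentinel returned while this
// node's registration for the active DEK has not yet committed.
var ErrWriterNotRegistered = errors.New("encryption writer registration not yet committed")

// writerGate records, per DEK ID, whether this node is known to be in the
// writer registry. The uint32 key width is an assumption for illustration.
type writerGate struct {
	mu         sync.Mutex
	registered map[uint32]bool
}

func newWriterGate() *writerGate {
	return &writerGate{registered: make(map[uint32]bool)}
}

// MarkRegistered is called once the node observes its registry entry for
// dekID being applied (via the bootstrap batch or its own
// RegisterEncryptionWriter proposal).
func (g *writerGate) MarkRegistered(dekID uint32) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.registered[dekID] = true
}

// CheckWrite is consulted by the coordinator before it queues a
// self-originated entry that would be encrypted under activeDEK.
func (g *writerGate) CheckWrite(activeDEK uint32) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !g.registered[activeDEK] {
		return ErrWriterNotRegistered
	}
	return nil
}

func main() {
	gate := newWriterGate()
	const oldDEK, newDEK = 1, 2

	fmt.Println(gate.CheckWrite(oldDEK) != nil) // true: blocked before registration
	gate.MarkRegistered(oldDEK)
	fmt.Println(gate.CheckWrite(oldDEK) == nil) // true: writes allowed
	fmt.Println(gate.CheckWrite(newDEK) != nil) // true: a rotated DEK needs its own registration
}
```

Keying the gate by DEK ID means a rotation closes the gate again automatically, which matches the "same coordinator-side gate" wording of the post-rotation path.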
@@ -498,7 +525,9 @@ For absolute clarity, the following remain unencrypted:
 - Raft metadata: term, index, entry type, configuration changes.
 (Membership changes carry node IDs and addresses, which are
 topology, not user data.)
-- The `pebbleSnapshotMagic` header on FSM snapshot streams.
+- The `hlcSnapshotMagic` / `hlcSnapshotMagicV2` header on FSM
+snapshot streams (v1/v2 respectively; see §4.4 for the
+versioned format).
 - The encryption sidecar file itself (§5.1) — it stores **wrapped**
 DEKs only; the wrap key (KEK) is held externally.

@@ -895,11 +924,40 @@ same way as every subsequent rotation:
 and `active.raft` fields are unset (`key_id == 0`). If so, it
 first proposes a `bootstrap-encryption` Raft entry containing
 freshly-generated `dek_storage` + `dek_raft` (CSPRNG via
-`crypto/rand`, wrapped under the current KEK), then proposes
-the actual `enable-storage-envelope` flag entry. The two
-entries are pipelined but not atomic; the order is fixed
-so the flag entry is always preceded by a bootstrap that
-has already populated every node's sidecar.
+`crypto/rand`, wrapped under the current KEK) **plus the
+batch writer registry** described in step 1a below, then
+proposes the actual `enable-storage-envelope` flag entry.
+The two entries are pipelined but not atomic; the order is
+fixed so the flag entry is always preceded by a bootstrap
+that has already populated every node's sidecar and the
+writer registry.
+
+1a. **Batch writer registry in the same bootstrap entry.**
+Without this step, the FSM apply path on every node
+would start encrypting from the first post-flag entry
+before any node had a `RegisterEncryptionWriter`
+proposal in flight — and FSM apply is synchronous and
+deterministic, so it cannot block for a Raft round-trip
+to register itself. The leader closes that window by
+collecting `(full_node_id, local_epoch_at_capability_check)`
+from every voting member during the §7.1 Phase 0 / Phase 1
+capability pre-check (the same `EncryptionAdmin.GetCapability`
+fan-out) and including the resulting batch as a field
+on the `bootstrap-encryption` Raft entry body. The FSM
+handler atomically writes the new wrapped DEKs to the
+sidecar AND inserts the registry entries
+`!encryption|writers|<dek_storage_id>|<uint16(node_id)>`
+and `!encryption|writers|<dek_raft_id>|<uint16(node_id)>`
+in the same Pebble batch as the bootstrap apply. From
+the next Raft index onward, every encrypting node is
+already in the registry, so the §4.1 collision check
+has its full input cluster-wide. Nodes added to the
+cluster *after* bootstrap fall back to the
+process-start trigger described in §4.1 (their first
+`RegisterEncryptionWriter` is proposed before they
+accept any encrypted write at the coordinator, before
+the FSM apply path can be called for a self-originated
+proposal).
 2. **Idempotency.** `bootstrap-encryption` is **rejected** by
 FSM apply if the sidecar already has an active storage DEK
 (the leader's pre-check above is a fast path; the FSM check
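
As a rough illustration of step 1a together with the idempotency rule in step 2, the Go sketch below stages one registry write per voting member for each of the two new DEKs and refuses to run when a storage DEK is already active. Everything here is an assumption made for the example: the type and function names, the `uint64`/`uint32` widths, the decimal rendering of the key segments, and the use of a map as a stand-in for a Pebble batch. It also omits the §4.1 collision check.

```go
package main

import (
	"errors"
	"fmt"
)

// writerRecord is a hypothetical registry value. Field widths other than
// the uint16 epoch are assumptions.
type writerRecord struct {
	FullNodeID uint64
	LocalEpoch uint16
}

// bootstrapPayload stands in for the bootstrap-encryption entry body:
// the two new DEK IDs plus the batch writer registry.
type bootstrapPayload struct {
	DEKStorageID uint32
	DEKRaftID    uint32
	Writers      []writerRecord
}

var errAlreadyBootstrapped = errors.New("bootstrap-encryption rejected: active storage DEK already present")

// registryKey follows the !encryption|writers|<dek_id>|<uint16(node_id)>
// layout. Rendering the segments as decimal text is an assumption.
func registryKey(dekID uint32, fullNodeID uint64) string {
	return fmt.Sprintf("!encryption|writers|%d|%d", dekID, uint16(fullNodeID))
}

// stageBootstrap returns the registry writes that would go into the same
// batch as the bootstrap apply. activeStorageKeyID == 0 means "unset".
func stageBootstrap(activeStorageKeyID uint32, p bootstrapPayload) (map[string]writerRecord, error) {
	if activeStorageKeyID != 0 {
		return nil, errAlreadyBootstrapped
	}
	batch := make(map[string]writerRecord, 2*len(p.Writers))
	for _, w := range p.Writers {
		batch[registryKey(p.DEKStorageID, w.FullNodeID)] = w
		batch[registryKey(p.DEKRaftID, w.FullNodeID)] = w
	}
	return batch, nil
}

func main() {
	p := bootstrapPayload{
		DEKStorageID: 1,
		DEKRaftID:    2,
		Writers: []writerRecord{
			{FullNodeID: 1, LocalEpoch: 1},
			{FullNodeID: 2, LocalEpoch: 1},
			{FullNodeID: 3, LocalEpoch: 4},
		},
	}
	batch, err := stageBootstrap(0, p)
	fmt.Println(len(batch), err) // 6 <nil>

	_, err = stageBootstrap(1, p)
	fmt.Println(err != nil) // true: second bootstrap is rejected
}
```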
@@ -1025,6 +1083,34 @@ coordinator does not pre-encrypt; the FSM does not bypass this path.
 - `internal/raftengine/etcd/engine.go` — no changes. It transports
 opaque bytes; whether they are cleartext or ciphertext is
 invisible to it.
+- **Encryption-internal Raft entry types added to `kv/fsm.go`.**
+Three new tag constants extend the existing
+`raftEncodeSingle` / `raftEncodeBatch` / `raftEncodeHLCLease`
+space:
+- `raftEncodeEncryptionRegistration = 0x03` — body is a
+single `RegisterEncryptionWriter(dek_id, full_node_id,
+local_epoch)` per §4.1.
+- `raftEncodeEncryptionBootstrap = 0x04` — body is the
+`bootstrap-encryption` payload per §5.6 (initial wrapped
+DEK pair plus the batch writer registry covering every
+voting member that passed the capability pre-check).
+- `raftEncodeEncryptionRotation = 0x05` — body is the
+rotate / rewrap-deks / retire-dek / enable-storage-envelope /
+enable-raft-envelope payload per §5.2 / §5.4 / §7.1 with
+a sub-tag in the body distinguishing them.
+
+FSM `Apply` dispatches in this order: (1) raft envelope
+unwrap if `index > raft_envelope_cutover_index`; (2)
+encryption-internal tags above (`0x03..0x05`) handled by
+`internal/encryption/` package via a callback registered at
+FSM construction; (3) HLC-lease (`0x02`); (4) regular
+`decodeRaftRequests` (`0x00`/`0x01`). Encryption-internal
+entries write to Pebble keys under `!encryption|...` and to
+the sidecar (for the `0x04`/`0x05` cases). They are
+replayed naturally via the Raft log on restart, so the
+`RegisterEncryptionWriter` entries themselves are NOT in
+the §5.5 `isEncryptionRelevant()` predicate (only the
+`0x04`/`0x05` family is, because those mutate `keys.json`).

 ### 6.4 Compress-then-encrypt
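
The dispatch order that this hunk adds to §6.3 can be sketched in Go as follows. This is a simplified illustration, not the project's `kv/fsm.go`: the `route` function, its string return values, and the `unwrap` callback are hypothetical, and which of `raftEncodeSingle` / `raftEncodeBatch` maps to `0x00` versus `0x01` is an assumption (the text only says the regular path covers `0x00`/`0x01`).

```go
package main

import (
	"errors"
	"fmt"
)

const (
	raftEncodeSingle                 byte = 0x00 // assumed value
	raftEncodeBatch                  byte = 0x01 // assumed value
	raftEncodeHLCLease               byte = 0x02
	raftEncodeEncryptionRegistration byte = 0x03
	raftEncodeEncryptionBootstrap    byte = 0x04
	raftEncodeEncryptionRotation     byte = 0x05
)

// route names the handler family for one Raft entry, following the
// order: envelope unwrap, encryption-internal, HLC-lease, regular.
// unwrap is a hypothetical stand-in for the raft envelope decryption.
func route(index, cutover uint64, entry []byte, unwrap func([]byte) ([]byte, error)) (string, error) {
	if index > cutover {
		plain, err := unwrap(entry)
		if err != nil {
			return "", err
		}
		entry = plain
	}
	if len(entry) == 0 {
		return "", errors.New("empty raft entry")
	}
	switch tag := entry[0]; tag {
	case raftEncodeEncryptionRegistration, raftEncodeEncryptionBootstrap, raftEncodeEncryptionRotation:
		return "encryption", nil
	case raftEncodeHLCLease:
		return "hlc-lease", nil
	case raftEncodeSingle, raftEncodeBatch:
		return "regular", nil
	default:
		return "", fmt.Errorf("unknown raft tag 0x%02x", tag)
	}
}

// isEncryptionRelevant mirrors the stated rule: only the 0x04/0x05 family
// counts, because registration entries do not touch the sidecar.
func isEncryptionRelevant(tag byte) bool {
	return tag == raftEncodeEncryptionBootstrap || tag == raftEncodeEncryptionRotation
}

func main() {
	// strip pretends the envelope is a single leading byte.
	strip := func(b []byte) ([]byte, error) {
		if len(b) == 0 {
			return nil, errors.New("short envelope")
		}
		return b[1:], nil
	}

	h, _ := route(5, 10, []byte{0x04}, strip)
	fmt.Println(h) // encryption
	h, _ = route(11, 10, []byte{0xEE, 0x02}, strip)
	fmt.Println(h) // hlc-lease
	fmt.Println(isEncryptionRelevant(0x03), isEncryptionRelevant(0x05)) // false true
}
```

How an unset `raft_envelope_cutover_index` is represented is not covered by this sketch; it applies the `index > cutover` comparison literally.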
