@@ -207,11 +207,38 @@ Envelope format (single byte stream, stored as the Pebble value):
207207 last_seen_local_epoch: uint16 }
208208 ```
209209
210- Before a node performs **its first write under a DEK**
211- (whether at process start under an existing DEK, or
212- immediately after a rotation that created a new active
213- DEK), it proposes a `RegisterEncryptionWriter(dek_id,
214- full_node_id, local_epoch)` Raft entry. FSM apply checks:
210+ Before a node performs **its first write under a DEK** it
211+ must appear in the registry for that DEK. Three
212+ registration paths exist, each closing a distinct rollout
213+ window:
214+
215+ - **Bootstrap (Phase 1 cluster-wide activation).** When
216+ the leader proposes the `bootstrap-encryption` Raft entry
217+ (§5.6 step 1a), the entry body carries a batch
218+ registration set covering every voting member that
219+ passed the capability pre-check. FSM apply inserts the
220+ registry entries in the same Pebble batch as the
221+ bootstrap. This handles the common case where the
222+ whole cluster transitions from cleartext to encrypted
223+ at once — no per-node `RegisterEncryptionWriter`
224+ round-trip is required for the initial cohort.
225+ - **Process start under an existing DEK.** A node that
226+ restarts (or joins a cluster that already has active
227+ DEKs) proposes `RegisterEncryptionWriter(dek_id,
228+ full_node_id, local_epoch)` from the coordinator
229+ **before the coordinator accepts any client write** that
230+ would land as an encrypted entry. FSM apply may run on
231+ Raft entries proposed by the leader during this window
232+ (those are decrypted using DEKs the node already has from
233+ its sidecar), but no self-originated encrypted entry can
234+ be queued by the local coordinator until registration
235+ commits.
236+ - **Post-rotation.** When a node observes a `rotate-dek`
237+ apply, it proposes `RegisterEncryptionWriter` against
238+ the new DEK with the same coordinator-side gate as the
239+ process-start case above.
240+
241+ For each path, FSM apply checks:
215242
216243 1. If no entry exists at the registry key → insert,
217244 allowing the registration. The node may now write under
@@ -498,7 +525,9 @@ For absolute clarity, the following remain unencrypted:
498525- Raft metadata: term, index, entry type, configuration changes.
499526 (Membership changes carry node IDs and addresses, which are
500527 topology, not user data.)
501- - The `pebbleSnapshotMagic` header on FSM snapshot streams.
528+ - The `hlcSnapshotMagic` / `hlcSnapshotMagicV2` header on FSM
529+ snapshot streams (v1/v2 respectively; see §4.4 for the
530+ versioned format).
502531- The encryption sidecar file itself (§5.1): it stores **wrapped**
503532 DEKs only; the wrap key (KEK) is held externally.
504533
@@ -895,11 +924,40 @@ same way as every subsequent rotation:
895924 and `active.raft` fields are unset (`key_id == 0`). If so, it
896925 first proposes a `bootstrap-encryption` Raft entry containing
897926 freshly-generated `dek_storage` + `dek_raft` (CSPRNG via
898- `crypto/rand`, wrapped under the current KEK), then proposes
899- the actual `enable-storage-envelope` flag entry. The two
900- entries are pipelined but not atomic; the order is fixed
901- so the flag entry is always preceded by a bootstrap that
902- has already populated every node's sidecar.
927+ `crypto/rand`, wrapped under the current KEK) **plus the
928+ batch writer registry** described in step 1a below, then
929+ proposes the actual `enable-storage-envelope` flag entry.
930+ The two entries are pipelined but not atomic; the order is
931+ fixed so the flag entry is always preceded by a bootstrap
932+ that has already populated every node's sidecar and the
933+ writer registry.
934+
935+ 1a. **Batch writer registry in the same bootstrap entry.**
936+ Without this step, the FSM apply path on every node
937+ would start encrypting from the first post-flag entry
938+ before any node had a `RegisterEncryptionWriter`
939+ proposal in flight. FSM apply is synchronous and
940+ deterministic, so it cannot block for a Raft round-trip
941+ to register itself. The leader closes that window by
942+ collecting `(full_node_id, local_epoch_at_capability_check)`
943+ from every voting member during the §7.1 Phase 0 / Phase 1
944+ capability pre-check (the same `EncryptionAdmin.GetCapability`
945+ fan-out) and including the resulting batch as a field
946+ on the `bootstrap-encryption` Raft entry body. The FSM
947+ handler atomically writes the new wrapped DEKs to the
948+ sidecar AND inserts the registry entries
949+ `!encryption|writers|<dek_storage_id>|<uint16(node_id)>`
950+ and `!encryption|writers|<dek_raft_id>|<uint16(node_id)>`
951+ in the same Pebble batch as the bootstrap apply. From
952+ the next Raft index onward, every encrypting node is
953+ already in the registry, so the §4.1 collision check
954+ has its full input cluster-wide. Nodes added to the
955+ cluster *after* bootstrap fall back to the
956+ process-start trigger described in §4.1 (their first
957+ `RegisterEncryptionWriter` is proposed before they
958+ accept any encrypted write at the coordinator, and so
959+ before the FSM apply path can be called for a
960+ self-originated proposal).
9039612. **Idempotency.** `bootstrap-encryption` is **rejected** by
904962 FSM apply if the sidecar already has an active storage DEK
905963 (the leader's pre-check above is a fast path; the FSM check
@@ -1025,6 +1083,34 @@ coordinator does not pre-encrypt; the FSM does not bypass this path.
10251083- `internal/raftengine/etcd/engine.go`: no changes. It transports
10261084 opaque bytes; whether they are cleartext or ciphertext is
10271085 invisible to it.
1086+ - **Encryption-internal Raft entry types added to `kv/fsm.go`.**
1087+ Three new tag constants extend the existing
1088+ `raftEncodeSingle` / `raftEncodeBatch` / `raftEncodeHLCLease`
1089+ space:
1090+ - `raftEncodeEncryptionRegistration = 0x03`: body is a
1091+ single `RegisterEncryptionWriter(dek_id, full_node_id,
1092+ local_epoch)` per §4.1.
1093+ - `raftEncodeEncryptionBootstrap = 0x04`: body is the
1094+ `bootstrap-encryption` payload per §5.6 (initial wrapped
1095+ DEK pair plus the batch writer registry covering every
1096+ voting member that passed the capability pre-check).
1097+ - `raftEncodeEncryptionRotation = 0x05`: body is the
1098+ rotate / rewrap-deks / retire-dek / enable-storage-envelope /
1099+ enable-raft-envelope payload per §5.2 / §5.4 / §7.1, with
1100+ a sub-tag in the body distinguishing them.
1101+
1102+ FSM `Apply` dispatches in this order: (1) raft envelope
1103+ unwrap if `index > raft_envelope_cutover_index`; (2)
1104+ encryption-internal tags above (`0x03..0x05`), handled by
1105+ the `internal/encryption/` package via a callback registered
1106+ at FSM construction; (3) HLC-lease (`0x02`); (4) regular
1107+ `decodeRaftRequests` (`0x00`/`0x01`). Encryption-internal
1108+ entries write to Pebble keys under `!encryption|...` and to
1109+ the sidecar (for the `0x04`/`0x05` cases). They are
1110+ replayed naturally via the Raft log on restart, so the
1111+ `RegisterEncryptionWriter` entries themselves are NOT in
1112+ the §5.5 `isEncryptionRelevant()` predicate (only the
1113+ `0x04`/`0x05` family is, because those mutate `keys.json`).
10281114
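The dispatch order described above can be sketched roughly as follows. This is a minimal illustration, not the code in `kv/fsm.go`: the `sketchFSM` type, its `route` method, the `newSketchFSM` helper, and the one-byte `0xEE` placeholder envelope are all hypothetical, and the assignment of `0x00` to single and `0x01` to batch is an assumption (the text only states that `decodeRaftRequests` covers `0x00`/`0x01`). The tag values `0x02` through `0x05` follow the list above.

```go
package main

import (
	"errors"
	"fmt"
)

// Tag values. 0x02..0x05 follow the design text; which of 0x00/0x01 is
// "single" and which is "batch" is an assumption made for this sketch.
const (
	raftEncodeSingle                 byte = 0x00
	raftEncodeBatch                  byte = 0x01
	raftEncodeHLCLease               byte = 0x02
	raftEncodeEncryptionRegistration byte = 0x03
	raftEncodeEncryptionBootstrap    byte = 0x04
	raftEncodeEncryptionRotation     byte = 0x05
)

// sketchFSM is a hypothetical stand-in for the real FSM type.
type sketchFSM struct {
	cutoverIndex    uint64                               // raft_envelope_cutover_index
	unwrap          func(wrapped []byte) ([]byte, error) // raft envelope unwrap
	applyEncryption func(tag byte, body []byte) error    // callback set at construction
}

// route reports which handler family an entry belongs to, applying the
// four steps in the stated order.
func (f *sketchFSM) route(index uint64, entry []byte) (string, error) {
	// (1) Unwrap the raft envelope for entries past the cutover index.
	if index > f.cutoverIndex {
		plain, err := f.unwrap(entry)
		if err != nil {
			return "", fmt.Errorf("raft envelope unwrap at index %d: %w", index, err)
		}
		entry = plain
	}
	if len(entry) == 0 {
		return "", errors.New("empty raft entry")
	}
	tag, body := entry[0], entry[1:]
	switch {
	// (2) Encryption-internal tags go to the registered callback.
	case tag >= raftEncodeEncryptionRegistration && tag <= raftEncodeEncryptionRotation:
		return "encryption", f.applyEncryption(tag, body)
	// (3) HLC lease.
	case tag == raftEncodeHLCLease:
		return "hlc-lease", nil
	// (4) Regular requests.
	case tag == raftEncodeSingle || tag == raftEncodeBatch:
		return "requests", nil
	default:
		return "", fmt.Errorf("unknown raft entry tag 0x%02x", tag)
	}
}

// newSketchFSM wires in placeholder callbacks. The "envelope" here is a
// single 0xEE marker byte, standing in for real decryption.
func newSketchFSM(cutover uint64) *sketchFSM {
	return &sketchFSM{
		cutoverIndex: cutover,
		unwrap: func(wrapped []byte) ([]byte, error) {
			if len(wrapped) == 0 || wrapped[0] != 0xEE {
				return nil, errors.New("missing placeholder envelope marker")
			}
			return wrapped[1:], nil
		},
		applyEncryption: func(tag byte, body []byte) error { return nil },
	}
}

func main() {
	f := newSketchFSM(100)
	handler, err := f.route(7, []byte{raftEncodeEncryptionBootstrap, 0xAA})
	fmt.Println(handler, err)
}
```

Checking the encryption range before the HLC-lease and request tags mirrors the stated order; since the tag ranges are disjoint, the order of the last three cases only matters for readability, while the envelope unwrap must come first because the tag byte is not visible until the entry is unwrapped.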
10291115### 6.4 Compress-then-encrypt
10301116