Commit 2dda2cb

docs: propose data-at-rest encryption design (#707)
## Summary

- Add `docs/design/2026_04_29_proposed_data_at_rest_encryption.md` (Status: Proposed) covering encryption-at-rest for elastickv.
- Threat model: protect against disk theft, decommissioned-drive recovery, backup leak, and Raft WAL/snapshot leak. Without the externally-held key material, the persisted state cannot be decrypted.
- Encryption boundary: per-value AES-256-GCM envelope at the storage layer + the same envelope wrapping every Raft proposal `Data []byte`. Keeps the same ciphertext flowing through Raft → WAL → Pebble SST → FSM snapshot, so no surface holds cleartext values.
- Key hierarchy: external KEK (AWS KMS / GCP KMS / Vault / file) wraps DEKs; the data dir only holds wrapped DEKs in `encryption/keys.json`. DEK rotation is operator-driven via Raft so every replica observes the new key at the same log index.
- Migration: rolling restart with envelope-version byte (`0x00` cleartext, `0x01` encrypted) plus a rate-limited rewrite job. Reverse migration is intentionally unsupported (dump-and-reload).
- Self-review per CLAUDE.md (data loss / concurrency / performance / consistency / test coverage) included; Jepsen Redis + DynamoDB suites against an encrypted 3-node cluster are the implementation acceptance gate.

Follows the design-doc-first workflow in CLAUDE.md; implementation PRs will land after review of this proposal.

## Test plan

- [ ] Doc-only change; no code or tests in this PR.
- [ ] Reviewer: confirm filename / header follow `docs/design/README.md` conventions.
- [ ] Reviewer: confirm threat model scope (in/out) matches the operational stance you want.
- [ ] Reviewer: weigh open question §11.5 (interaction with Lua commit batching) before the implementation PR is opened.
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **Documentation**
  * Added comprehensive design proposal for data‑at‑rest encryption: per‑value ciphertext everywhere, encrypted replication envelopes, external key management and DEK lifecycle, cluster-wide nonce/uniqueness and refusal conditions, crash‑durable key sidecar and startup validation, MVCC encryption state handling, multi‑phase rollout and admin commands, snapshot/joiner semantics, observability constraints, performance/Jepsen gates, and a full test plan.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
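
For reviewers who want a concrete picture of the envelope described in the summary, here is a minimal sketch in Go. It is not taken from the design doc. The byte layout (`version | nonce | ciphertext+tag`), the random 12-byte nonce, the use of the version byte as associated data, and the names `Seal`, `Open`, `versionCleartext`, and `versionEncrypted` are all assumptions made for illustration. The proposal defines its own cluster-wide nonce scheme and DEK handling, which this sketch does not attempt to reproduce.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// Envelope-version bytes named in the commit message.
const (
	versionCleartext byte = 0x00
	versionEncrypted byte = 0x01
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte DEK.
func newGCM(dek []byte) (cipher.AEAD, error) {
	if len(dek) != 32 {
		return nil, errors.New("DEK must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(dek)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Seal wraps plaintext as: version(1) | nonce(12) | ciphertext+tag.
// ASSUMPTION: hypothetical layout and random nonce, for illustration only.
func Seal(dek, plaintext []byte) ([]byte, error) {
	aead, err := newGCM(dek)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	out := make([]byte, 0, 1+len(nonce)+len(plaintext)+aead.Overhead())
	out = append(out, versionEncrypted)
	out = append(out, nonce...)
	// Bind the version byte as associated data so it cannot be altered silently.
	aad := []byte{versionEncrypted}
	return aead.Seal(out, nonce, plaintext, aad), nil
}

// Open dispatches on the version byte: 0x00 passes the payload through,
// 0x01 authenticates and decrypts it.
func Open(dek, envelope []byte) ([]byte, error) {
	if len(envelope) == 0 {
		return nil, errors.New("empty envelope")
	}
	switch envelope[0] {
	case versionCleartext:
		return envelope[1:], nil
	case versionEncrypted:
		aead, err := newGCM(dek)
		if err != nil {
			return nil, err
		}
		ns := aead.NonceSize()
		if len(envelope) < 1+ns+aead.Overhead() {
			return nil, errors.New("envelope too short")
		}
		nonce, ct := envelope[1:1+ns], envelope[1+ns:]
		return aead.Open(nil, nonce, ct, envelope[:1])
	default:
		return nil, fmt.Errorf("unknown envelope version 0x%02x", envelope[0])
	}
}

func main() {
	dek := make([]byte, 32)
	if _, err := rand.Read(dek); err != nil {
		panic(err)
	}
	env, err := Seal(dek, []byte("value"))
	if err != nil {
		panic(err)
	}
	pt, err := Open(dek, env)
	if err != nil {
		panic(err)
	}
	fmt.Printf("envelope=%d bytes, version=0x%02x, plaintext=%q\n", len(env), env[0], pt)
}
```

Because the reader dispatches on the leading byte, a store under migration could hold `0x00` and `0x01` values side by side while a rewrite job converts the former, which is the property the rolling-restart plan relies on.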
2 parents 65e725a + 07b2fd9 commit 2dda2cb

1 file changed

Lines changed: 2407 additions & 0 deletions
