Skip to content

Commit 9b82d8a

Browse files
authored
backup: Redis zset encoder (Phase 0a) (#790)
## Summary Adds the Redis sorted-set (zset) encoder for the Phase 0 logical snapshot decoder (`docs/design/2026_04_29_proposed_snapshot_logical_decoder.md`, lines 334-335). Mirrors the hash/list/set encoders shipped in #725/#755/#758. After this lands, Phase 0a's remaining Redis work is the stream encoder + HLL TTL sidecar; the per-adapter encoders themselves will be complete. Output shape per the design: ```json {"format_version": 1, "members": [{"member": "alice", "score": 100}], "expire_at_ms": null} ``` Members sorted by raw byte order (not by score) so `diff -r` between two dumps with the same logical contents stays line-stable across score-only mutations. Scores emit as JSON numbers for finite values, ZADD-conventional `"+inf"` / `"-inf"` strings for ±Inf (`json.Marshal` rejects non-finite floats). NaN scores fail closed at intake — Redis's ZADD rejects NaN at the wire level, so a NaN at backup time indicates corruption. Wire layout (mirrors `store/zset_helpers.go`): - `!zs|meta|<userKeyLen(4)><userKey>` → 8-byte BE Len - `!zs|mem|<userKeyLen(4)><userKey><member>` → 8-byte IEEE 754 score - `!zs|scr|<userKeyLen(4)><userKey><sortableScore(8)><member>` → silently discarded (secondary index; `!zs|mem|` is the source of truth) - `!zs|meta|d|<userKeyLen(4)><userKey><commitTS(8)><seqInTxn(4)>` → silently skipped (delta family; same policy as hash/set encoders) TTL routing: `!redis|ttl|<userKey>` for a registered zset folds into the JSON `expire_at_ms` field, matching the set/list/hash inlining, so a restorer replays ZADD + EXPIRE in one shot. ## Self-review (5 lenses) 1. **Data loss** — `!zs|mem|` is the source of truth; `!zs|scr|` and delta records are intentional silent skips with audit notes. Empty zsets (Len=0 but meta seen) still emit a file because `TYPE k → zset` is observable to clients. 2. **Concurrency / distributed** — `RedisDB` is sequential per scope (matches the existing per-DB encoder contract); no shared state. 3. **Performance** — per-zset state in a map, flushed once at Finalize. Bounded by `maxWideColumnItems` on the live side; sort is O(n log n) on member-name bytes. Identical cost shape to hash/set encoders. 4. **Data consistency** — JSON field order pinned via struct tags (not map). Inf score uses `json.RawMessage` so the same `score` key emits as either a number or a string. NaN fails closed. 5. **Test coverage** — 17 table-driven tests under `internal/backup/redis_zset_test.go`: - round-trip basic / empty / TTL inlining - binary member via base64 envelope - delta-key skip (both `HandleZSetMeta` entry + `parseZSetMetaKey` guard) - `HandleZSetMetaDelta` explicit entry - `HandleZSetScore` silent-discard - malformed meta length / overflow / MaxInt64 boundary - malformed member-value length / NaN rejection - ±Inf string-form serialization - members-without-meta still emits file - duplicate-members latest-wins (ZADD semantics) ## Caller audit (per `/loop` standing instruction) Semantics-changing edit: new `case redisKindZSet:` branch in `HandleTTL` (redis_string.go:310). Purely additive — the new branch fires only when `zsetState()` has previously registered the key. No existing handler maps to `redisKindZSet`, so no prior call site changes behavior. Verified: ``` grep -n 'redisKindZSet' internal/backup/ # internal/backup/redis_string.go:88 # internal/backup/redis_string.go:310 # internal/backup/redis_zset.go:166 ``` Three references, all new in this PR. No legacy caller assumes the prior behavior. ## Test plan - [x] `go test -race ./internal/backup/` → ok - [x] `golangci-lint run ./internal/backup/...` → 0 issues - [x] `go build ./...` → ok - [x] `go vet ./internal/backup/...` → ok <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added Redis sorted-set (ZSET) backup/restore with deterministic JSON output, proper ±Inf score serialization, and legacy-blob reconciliation. * TTLs for ZSETs and sets can be inlined so restored items retain expiry; pending TTLs are drained when a key is first typed. * Orphan-TTL buffering with configurable byte-cap and a fail-closed overflow behavior; option to disable buffering. * **Tests** * Extensive tests covering ZSET semantics, TTL buffering/drain, legacy-blob handling, byte-cap edge cases, and overflow behavior. <!-- review_stack_entry_start --> [![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/bootjp/elastickv/pull/790?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai -->
2 parents f615443 + cbeae91 commit 9b82d8a

6 files changed

Lines changed: 2082 additions & 30 deletions

File tree

internal/backup/redis_hash.go

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -237,7 +237,11 @@ type hashFieldRecord struct {
237237
Value json.RawMessage `json:"value"`
238238
}
239239

240-
func marshalHashJSON(st *redisHashState) ([]byte, error) {
240+
// nolint comment lives at the function head: dupl pairs this with
241+
// marshalZSetJSON, which carries the rationale (parallel design-spec
242+
// wrappers that can't collapse into a shared helper without breaking
243+
// JSON field-order determinism). See redis_zset.go:marshalZSetJSON.
244+
func marshalHashJSON(st *redisHashState) ([]byte, error) { //nolint:dupl // see comment above + redis_zset.go
241245
// Sort by raw byte order for deterministic output across runs.
242246
names := make([]string, 0, len(st.fields))
243247
for name := range st.fields {

internal/backup/redis_set.go

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -114,12 +114,24 @@ func (r *RedisDB) HandleSetMetaDelta(_, _ []byte) error { return nil }
114114
// setState lazily creates per-key state. Mirrors the hash/list
115115
// kindByKey-registration pattern so HandleSetMeta, HandleSetMember,
116116
// and the HandleTTL back-edge all agree on the kind.
117+
//
118+
// On first registration we drain any pendingTTL for the user key.
119+
// `!redis|ttl|<k>` lex-sorts BEFORE `!st|...` (because `r` < `s`),
120+
// so in real snapshot order the TTL arrives FIRST; HandleTTL parks
121+
// it in pendingTTL, and this function inlines it into the set's
122+
// `expire_at_ms`. Without this drain step, every TTL'd set would
123+
// restore as permanent — a latent bug in PR #758 surfaced by codex
124+
// on PR #790. Phase 0a tests added in the same PR pin the ordering.
117125
func (r *RedisDB) setState(userKey []byte) *redisSetState {
118126
uk := string(userKey)
119127
if st, ok := r.sets[uk]; ok {
120128
return st
121129
}
122130
st := &redisSetState{members: make(map[string]struct{})}
131+
if expireAtMs, ok := r.claimPendingTTL(userKey); ok {
132+
st.expireAtMs = expireAtMs
133+
st.hasTTL = true
134+
}
123135
r.sets[uk] = st
124136
r.kindByKey[uk] = redisKindSet
125137
return st

0 commit comments

Comments
 (0)