Skip to content

Commit 71ecc12

Browse files
committed
backup: #904 v14 - fan out Redis encoder across all db_<N> directories
## Codex P1 v13: hardcoded db_0 silently dropped non-default Redis DBs The Redis adapter runner in adapterRunners was hardcoded to NewRedisEncoder(root, 0). RedisEncoder is scoped to a single redis/db_<n>/ subdirectory; any Phase 1 multi-DB dump that included redis/db_3/, redis/db_5/, etc. would silently drop those DBs from the produced .fsm — a header-only encode for those keys, exactly the silent-data-loss pattern the encoder is supposed to fail closed against. With --self-test on, the same valid multi-DB dump would be rejected at the diff stage (the decoded scratch tree would not match the input's db_3 content). Phase 0a/0b inputs only emit redis/db_0/, so this is a latent bug that fires the moment Phase 1's multi-DB dumper lands; codex flagged it as P1. ## Fix: enumerate redis/db_<N> directories and fan out per DB Two new helpers in encode_snapshot.go: - enumerateRedisDBs(inRoot) returns the sorted dbIndex values for which <inRoot>/redis/db_<N>/ exists as a directory. A missing redis/ returns nil (no-op, matching the per-DB encoder's missing-subdir convention). Non-canonical entries (non-numeric suffix, negative, leading-zero like "db_01", wrong prefix, regular files at the redis/ level) are silently skipped — they cannot be produced by the canonical decoder. A symlinked or regular-file redis/ path is rejected with ErrRedisEncodeNotDir (mirrors the per-DB encoder's symlink refusal). - encodeAllRedisDBs(b, inRoot) invokes NewRedisEncoder per enumerated index in ascending order, wrapping per-DB errors with "redis encoder db_%d" for traceability. The redis adapter runner now uses encodeAllRedisDBs as its function value directly (gocritic unlambda). ## Pinned by - TestEnumerateRedisDBsMissingDir: missing redis/ returns nil indices. - TestEnumerateRedisDBsMixedEntries: only canonical db_<N> dirs kept; non-numeric, negative, leading-zero, empty-suffix, wrong- prefix, and regular-file entries are skipped; result sorted ascending (asserted via direct index comparison). - TestEnumerateRedisDBsRedisIsRegularFile: regular file at redis/ path triggers ErrRedisEncodeNotDir. - TestEncodeSnapshotRedisMultiDB: a fixture under redis/db_3/ ONLY (no db_0) produces strictly more bytes than an empty-redis baseline through EncodeSnapshot. Pre-fix, the hardcoded db_0 path would have produced an identical header-only .fsm for both inputs. ## Caller audit per CLAUDE.md semantic-change rule - adapterRunners.redis: closure replaced with encodeAllRedisDBs function value (gocritic unlambda). Error contract: - errors.Is(err, ErrEncodeAdapterData) — still true (errors.Mark in runAdapterEncoders is unchanged). - errors.Is(err, ErrRedisEncodeNotDir) etc. — still true (errors.Mark + errors.Wrap preserve the inner chain). - Error prefix changed from "redis encoder: ..." to "redis encoder db_%d: ..." (or "redis encoder enumerate: ..." for the enumeration step). More specific, not less. - NewRedisEncoder direct callers (encode_redis_test.go, encode_redis_coll_test.go, encode_redis_hardlink_unix_test.go): unaffected — they call NewRedisEncoder directly, never through the adapter runner. - enumerateRedisDBs / parseRedisDBDir / checkRedisRoot / encodeAllRedisDBs are new package-private functions; no external callers exist yet. ## Lint compliance Two cyclop refactor splits required: - enumerateRedisDBs (initially 12 branches) split into checkRedisRoot + parseRedisDBDir helpers. Each is ≤6 branches. - TestEnumerateRedisDBs (initially 14 branches via t.Run) split into three top-level tests. Each is ≤4 branches. Tests + lint green.
1 parent 48b0b0e commit 71ecc12

2 files changed

Lines changed: 220 additions & 3 deletions

File tree

internal/backup/encode_snapshot.go

Lines changed: 101 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -356,6 +356,106 @@ func encodeBuffered(b *snapshotBuilder, opts EncodeOptions, enabled []string, ou
356356
return result, nil
357357
}
358358

359+
// redisDBDirPrefix is the canonical "db_" prefix produced by the
360+
// decoder for redis/db_<N>/ directories. Mirrored by encoder
361+
// enumeration (encodeAllRedisDBs) so a multi-DB dump round-trips.
362+
const redisDBDirPrefix = "db_"
363+
364+
// enumerateRedisDBs returns the sorted dbIndex values for which
365+
// <inRoot>/redis/db_<N>/ exists as a directory. A missing redis/
366+
// directory returns nil; the caller treats it as no-op (same convention
367+
// as the per-DB encoder, which is a no-op when its db_<n> subdir is
368+
// absent). Non-db_<N> entries (regular files, symlinks at the redis/
369+
// level, non-numeric or non-canonical suffixes like "db_-1" or
370+
// "db_01") are silently skipped — they cannot have been produced by
371+
// the canonical decoder and are not the encoder's concern.
372+
//
373+
// Codex P1 v13 #904: replaces the prior hardcoded NewRedisEncoder(_, 0)
374+
// in adapterRunners that silently dropped non-default DBs from any
375+
// future Phase 1 multi-DB dump.
376+
func enumerateRedisDBs(inRoot string) ([]int, error) {
377+
redisDir := filepath.Join(inRoot, "redis")
378+
if err := checkRedisRoot(redisDir); err != nil {
379+
return nil, err
380+
}
381+
entries, err := os.ReadDir(redisDir)
382+
if err != nil {
383+
if errors.Is(err, os.ErrNotExist) {
384+
return nil, nil
385+
}
386+
return nil, errors.WithStack(err)
387+
}
388+
var indices []int
389+
for _, ent := range entries {
390+
if idx, ok := parseRedisDBDir(ent); ok {
391+
indices = append(indices, idx)
392+
}
393+
}
394+
sort.Ints(indices)
395+
return indices, nil
396+
}
397+
398+
// checkRedisRoot stats <inRoot>/redis/ and rejects symlink / non-dir
399+
// shapes. Missing is allowed (caller returns nil indices). Split out
400+
// of enumerateRedisDBs to keep that function under the cyclop bound.
401+
func checkRedisRoot(redisDir string) error {
402+
info, err := os.Lstat(redisDir)
403+
switch {
404+
case errors.Is(err, os.ErrNotExist):
405+
return nil
406+
case err != nil:
407+
return errors.WithStack(err)
408+
case info.Mode()&os.ModeSymlink != 0:
409+
// Symlinked redis/ would let os.OpenRoot in the per-DB encoder
410+
// resolve outside the dump tree (mirrors the per-DB encoder's
411+
// symlink refusal on redis/db_<n>).
412+
return errors.Wrapf(ErrRedisEncodeNotDir, "redis path %q is a symlink", redisDir)
413+
case !info.IsDir():
414+
return errors.Wrapf(ErrRedisEncodeNotDir, "redis path %q is not a directory", redisDir)
415+
}
416+
return nil
417+
}
418+
419+
// parseRedisDBDir returns (dbIndex, true) when ent names a canonical
420+
// db_<N> directory (N is a non-negative decimal with no leading zeros).
421+
// Non-matching entries return (0, false) so the caller can skip without
422+
// erroring — they cannot have been produced by the canonical decoder.
423+
// Reject non-canonical decimals so a hypothetical Phase 1 dumper cannot
424+
// double-emit the same db under two distinct directory names.
425+
func parseRedisDBDir(ent os.DirEntry) (int, bool) {
426+
if !ent.IsDir() {
427+
return 0, false
428+
}
429+
name := ent.Name()
430+
if !strings.HasPrefix(name, redisDBDirPrefix) {
431+
return 0, false
432+
}
433+
suffix := name[len(redisDBDirPrefix):]
434+
idx, err := strconv.Atoi(suffix)
435+
if err != nil || idx < 0 || strconv.Itoa(idx) != suffix {
436+
return 0, false
437+
}
438+
return idx, true
439+
}
440+
441+
// encodeAllRedisDBs invokes NewRedisEncoder per discovered db_<N>
442+
// directory in ascending index order. A missing redis/ directory is a
443+
// no-op. Codex P1 v13 #904: replaces the prior hardcoded db_0 fan-out
444+
// which would silently drop non-default DBs from any Phase 1 multi-DB
445+
// dump. Per-DB errors are wrapped with the db index for traceability.
446+
func encodeAllRedisDBs(b *snapshotBuilder, inRoot string) error {
447+
indices, err := enumerateRedisDBs(inRoot)
448+
if err != nil {
449+
return errors.Wrap(err, "redis encoder enumerate")
450+
}
451+
for _, idx := range indices {
452+
if err := NewRedisEncoder(inRoot, idx).Encode(b); err != nil {
453+
return errors.Wrapf(err, "redis encoder db_%d", idx)
454+
}
455+
}
456+
return nil
457+
}
458+
359459
// adapterRunner pairs an enabled-check with an Encode call, keeping
360460
// runAdapterEncoders's per-iteration body to two branches (cyclop).
361461
type adapterRunner struct {
@@ -366,9 +466,7 @@ type adapterRunner struct {
366466

367467
func adapterRunners() []adapterRunner {
368468
return []adapterRunner{
369-
{"redis", func(s AdapterSet) bool { return s.Redis }, func(b *snapshotBuilder, root string) error {
370-
return errors.Wrap(NewRedisEncoder(root, 0).Encode(b), "redis encoder")
371-
}},
469+
{"redis", func(s AdapterSet) bool { return s.Redis }, encodeAllRedisDBs},
372470
{"dynamodb", func(s AdapterSet) bool { return s.DynamoDB }, func(b *snapshotBuilder, root string) error {
373471
return errors.Wrap(NewDynamoDBEncoder(root).Encode(b), "dynamodb encoder")
374472
}},

internal/backup/encode_snapshot_test.go

Lines changed: 119 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -388,6 +388,125 @@ func TestEncodeSnapshotRejectsZeroAdapterSet(t *testing.T) {
388388
}
389389
}
390390

391+
// TestEnumerateRedisDBsMissingDir pins codex P1 v13 #904: a missing
392+
// redis/ directory returns nil indices (no-op), matching the per-DB
393+
// encoder's "missing db_<n> = nothing to encode" convention.
394+
func TestEnumerateRedisDBsMissingDir(t *testing.T) {
395+
t.Parallel()
396+
indices, err := enumerateRedisDBs(t.TempDir())
397+
if err != nil {
398+
t.Fatalf("err = %v, want nil", err)
399+
}
400+
if indices != nil {
401+
t.Errorf("indices = %v, want nil for missing redis/", indices)
402+
}
403+
}
404+
405+
// TestEnumerateRedisDBsMixedEntries pins codex P1 v13 #904: only
406+
// canonical db_<N> entries are kept; non-numeric, negative, leading-
407+
// zero, empty-suffix, wrong-prefix, and non-directory entries are
408+
// silently skipped. The returned slice is sorted ascending.
409+
func TestEnumerateRedisDBsMixedEntries(t *testing.T) {
410+
t.Parallel()
411+
in := t.TempDir()
412+
for _, name := range []string{"db_0", "db_1", "db_5"} {
413+
if err := os.MkdirAll(filepath.Join(in, "redis", name), 0o755); err != nil {
414+
t.Fatalf("MkdirAll %s: %v", name, err)
415+
}
416+
}
417+
// Entries that MUST be skipped:
418+
// db_garbage — non-numeric suffix
419+
// db_-1 — negative
420+
// db_01 — non-canonical leading zero
421+
// db_ — empty suffix
422+
// notdb_2 — wrong prefix
423+
for _, name := range []string{"db_garbage", "db_-1", "db_01", "db_", "notdb_2"} {
424+
if err := os.MkdirAll(filepath.Join(in, "redis", name), 0o755); err != nil {
425+
t.Fatalf("MkdirAll %s: %v", name, err)
426+
}
427+
}
428+
// A regular file under redis/ must be skipped (not enumerable).
429+
if err := os.WriteFile(filepath.Join(in, "redis", "README"), []byte("x"), 0o600); err != nil {
430+
t.Fatalf("WriteFile README: %v", err)
431+
}
432+
indices, err := enumerateRedisDBs(in)
433+
if err != nil {
434+
t.Fatalf("enumerateRedisDBs: %v", err)
435+
}
436+
want := []int{0, 1, 5}
437+
if len(indices) != len(want) {
438+
t.Fatalf("indices = %v, want %v", indices, want)
439+
}
440+
for i, v := range want {
441+
if indices[i] != v {
442+
t.Errorf("indices[%d] = %d, want %d", i, indices[i], v)
443+
}
444+
}
445+
}
446+
447+
// TestEnumerateRedisDBsRedisIsRegularFile pins fail-closed when the
448+
// "redis" path inside the dump is a regular file rather than a
449+
// directory — distinct from the missing case.
450+
func TestEnumerateRedisDBsRedisIsRegularFile(t *testing.T) {
451+
t.Parallel()
452+
in := t.TempDir()
453+
if err := os.WriteFile(filepath.Join(in, "redis"), []byte("not a dir"), 0o600); err != nil {
454+
t.Fatalf("WriteFile: %v", err)
455+
}
456+
_, err := enumerateRedisDBs(in)
457+
if !errors.Is(err, ErrRedisEncodeNotDir) {
458+
t.Errorf("err = %v, want errors.Is ErrRedisEncodeNotDir", err)
459+
}
460+
}
461+
462+
// TestEncodeSnapshotRedisMultiDB pins codex P1 v13 #904: the Redis
463+
// fan-out in adapterRunners enumerates redis/db_<N>/ and invokes the
464+
// per-DB encoder for each. The fixture places a single string under
465+
// redis/db_3/ ONLY (no db_0). Pre-fix, the encoder hardcoded
466+
// NewRedisEncoder(root, 0) and produced a header-only .fsm for this
467+
// input — silent data loss. Post-fix, the db_3 string is included.
468+
//
469+
// Assertion is content-free: compare encoded byte count against an
470+
// empty-redis baseline. With multi-DB fan-out, the db_3 fixture
471+
// produces MORE bytes than an empty tree. Without it, both encodes
472+
// would produce identical header-only output.
473+
func TestEncodeSnapshotRedisMultiDB(t *testing.T) {
474+
t.Parallel()
475+
emptyIn := t.TempDir()
476+
var emptyBuf bytes.Buffer
477+
emptyResult, err := EncodeSnapshot(EncodeOptions{
478+
InputRoot: emptyIn,
479+
Adapters: AdapterSet{Redis: true},
480+
LastCommitTS: 1,
481+
}, &emptyBuf)
482+
if err != nil {
483+
t.Fatalf("EncodeSnapshot empty: %v", err)
484+
}
485+
486+
in := t.TempDir()
487+
encKey := EncodeSegment([]byte("k3"))
488+
db3Strings := filepath.Join(in, "redis", "db_3", "strings")
489+
if err := os.MkdirAll(db3Strings, 0o755); err != nil {
490+
t.Fatalf("MkdirAll db_3/strings: %v", err)
491+
}
492+
if err := os.WriteFile(filepath.Join(db3Strings, encKey+".bin"), []byte("v3"), 0o600); err != nil {
493+
t.Fatalf("WriteFile db_3 string: %v", err)
494+
}
495+
var buf bytes.Buffer
496+
result, err := EncodeSnapshot(EncodeOptions{
497+
InputRoot: in,
498+
Adapters: AdapterSet{Redis: true},
499+
LastCommitTS: 1,
500+
}, &buf)
501+
if err != nil {
502+
t.Fatalf("EncodeSnapshot db_3-only: %v", err)
503+
}
504+
if result.BytesWritten <= emptyResult.BytesWritten {
505+
t.Errorf("BytesWritten with redis/db_3 fixture (%d) <= empty (%d); pre-fix, hardcoded db_0 fan-out dropped db_3 silently",
506+
result.BytesWritten, emptyResult.BytesWritten)
507+
}
508+
}
509+
391510
// TestEncodeSnapshotMarksAdapterDataErrors pins codex P2 v9 #904: when
392511
// an adapter encoder rejects the input tree's contents (e.g. a
393512
// malformed DynamoDB _schema.json), EncodeSnapshot must surface the

0 commit comments

Comments
 (0)