Skip to content

backup: KEYMAP.jsonl writer/reader and MANIFEST.json schema (Phase 0a)#712

Open
bootjp wants to merge 7 commits intofeat/backup-phase0a-filenamefrom
feat/backup-phase0a-keymap-manifest
Open

backup: KEYMAP.jsonl writer/reader and MANIFEST.json schema (Phase 0a)#712
bootjp wants to merge 7 commits intofeat/backup-phase0a-filenamefrom
feat/backup-phase0a-keymap-manifest

Conversation

@bootjp
Copy link
Copy Markdown
Owner

@bootjp bootjp commented Apr 29, 2026

Summary

Stacked on top of #711 (filename encoding). Adds two more foundation pieces of the Phase 0 logical-backup decoder.

KEYMAP.jsonl (internal/backup/keymap.go)

  • Append-only JSONL stream of {encoded, original (b64url), kind} records.
  • Records exist only when the original bytes are NOT recoverable from the encoded filename alone:
    • KindSHAFallback — segments rendered as <sha32>__<truncated>
    • KindS3LeafData — S3 path collisions renamed to .elastickv-leaf-data
    • KindMetaCollision — user object key ending in .elastickv-meta.json
  • KeymapWriter: streaming append, JSON encoder configured to skip HTML escapes so user-key bytes round-trip cleanly. Refuses empty encoded or kind so producer bugs surface loudly. Count() exposed for the "omit empty file" decision.
  • KeymapReader: line-by-line scanner with bounded buffer (1 MiB); blank lines surface as ErrInvalidKeymapRecord rather than being silently skipped so truncated dumps are recognised.
  • LoadKeymap: convenience helper that materialises the file as a map (last-wins on duplicates).

MANIFEST.json (internal/backup/manifest.go)

  • Structs matching the schema in docs/design/2026_04_29_proposed_snapshot_logical_decoder.md.
  • CurrentFormatVersion = 1; ReadManifest refuses format_version > current and format_version == 0 (ErrUnsupportedFormatVersion).
  • Phase discriminator: Phase 0 must not set Live, Phase 1 must not set Source — both validated at write and read time.
  • DisallowUnknownFields on read so format drift surfaces loudly.
  • Pretty-printed output (2-space indent, no HTML escapes) since MANIFEST.json is operator-facing.
  • NewPhase0SnapshotManifest seeds policy fields with the documented defaults.

Test plan

  • go test -race ./internal/backup/... — pass.
  • golangci-lint run ./internal/backup/... — clean.
  • Tests cover round-trip, sticky-error semantics, last-wins dedup, HTML-escape suppression, future-version refusal, unknown-field refusal, unknown-phase refusal, cross-phase Source/Live exclusion.

Self-review

  • Data loss — N/A (read/write helpers). KeymapReader returns sticky errors so partial reads cannot be silently treated as success.
  • ConcurrencyKeymapWriter/KeymapReader are not goroutine-safe (per-scope use); manifest helpers are pure. -race clean.
  • Performancebufio.Writer(64 KiB) for the JSONL stream; bounded scanner buffer (1 MiB) on read.
  • Data consistencyDisallowUnknownFields + format-version gate prevent silent drift. The phase discriminator's structural rules are enforced symmetrically at write and read.
  • Test coverage — 7 keymap tests + 10 manifest tests covering the documented happy/sad paths.

Stacking

Base branch is feat/backup-phase0a-filename (PR #711). When that lands, this PR's base will switch to main automatically.

Builds on PR #711 (filename encoding). Adds two more foundation
pieces of the Phase 0 logical-backup decoder.

KEYMAP.jsonl (internal/backup/keymap.go)
- Append-only JSONL stream of {encoded, original (b64url), kind}
  records.
- Records exist only for entries whose original bytes are NOT
  recoverable from the encoded filename alone:
  - KindSHAFallback (segments rendered as <sha32>__<truncated>)
  - KindS3LeafData (S3 path collisions renamed to .elastickv-leaf-data)
  - KindMetaCollision (user object key ending in .elastickv-meta.json)
- KeymapWriter: streaming append, json encoder configured to skip
  HTML escapes so user-key bytes round-trip cleanly. Refuses empty
  encoded or kind so producer bugs surface loudly. Count() exposed
  for the "omit empty file" decision.
- KeymapReader: line-by-line scanner with bounded buffer (1 MiB);
  blank lines surface as ErrInvalidKeymapRecord rather than being
  silently skipped so truncated dumps are recognised.
- LoadKeymap: convenience helper that materialises the file as a
  map (last-wins on duplicates).

MANIFEST.json (internal/backup/manifest.go)
- Manifest, Source, Live, Adapters, Adapter, Exclusions structs
  matching the schema in
  docs/design/2026_04_29_proposed_snapshot_logical_decoder.md.
- CurrentFormatVersion = 1; ReadManifest refuses
  format_version > current and format_version == 0
  (ErrUnsupportedFormatVersion).
- Phase discriminator constants for Phase 0 ("phase0-snapshot-decode")
  and Phase 1 ("phase1-live-pinned"); Phase 0 manifests must not
  set Live, Phase 1 must not set Source -- both validated at write
  and read time.
- DisallowUnknownFields on read so format drift surfaces loudly.
- Pretty-printed output (2-space indent, no HTML escapes) since
  MANIFEST.json is operator-facing.
- NewPhase0SnapshotManifest seeds the policy fields with the
  documented defaults so callers can focus on populating
  per-dump metadata.

Tests cover round-trip, sticky-error semantics, last-wins dedup,
HTML-escape suppression, future-version refusal, unknown-field
refusal, unknown-phase refusal, and the cross-phase Source/Live
exclusion rules.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 29, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dc32b772-ec37-4914-9dcf-9d190ef0c664

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/backup-phase0a-keymap-manifest

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the metadata infrastructure for a backup system, implementing the MANIFEST.json and KEYMAP.jsonl file formats. The manifest provides a central record of backup versions, phases, and adapter scopes, while the keymap facilitates the reversal of encoded filenames to their original keys. Feedback highlights an opportunity to improve the manifest's representation of adapter scopes using pointers to distinguish between excluded and empty states, and suggests renaming a misleading test case in the keymap suite.

Comment on lines +93 to +98
type Adapters struct {
DynamoDB Adapter `json:"dynamodb"`
S3 Adapter `json:"s3"`
Redis Adapter `json:"redis"`
SQS Adapter `json:"sqs"`
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The documentation for the Adapters struct describes a distinction between nil and empty values to indicate whether an adapter was included in the scope filter. However, the current implementation uses non-pointer Adapter structs, which means they are always present in the JSON. This can lead to state inconsistencies where the 'not included' state is silently dropped or misrepresented during serialization. Following the rule to avoid silently dropping entries during serialization or normalization, consider using pointers for the fields in the Adapters struct and applying omitempty to them to ensure the intended state is preserved on disk.

type Adapters struct {
	DynamoDB *Adapter json:"dynamodb,omitempty"
	S3       *Adapter json:"s3,omitempty"
	Redis    *Adapter json:"redis,omitempty"
	SQS      *Adapter json:"sqs,omitempty"
}
References
  1. Avoid silently dropping entries during serialization or normalization. If placeholder or incomplete entries are valid in some contexts, provide a specific, more lenient handling path for them to ensure they are preserved on disk and avoid state inconsistencies.

Comment thread internal/backup/keymap_test.go Outdated
}
}

func TestKeymapReader_AcceptsBlankLinesByPolicy(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test name TestKeymapReader_AcceptsBlankLinesByPolicy is misleading because the test actually asserts that blank lines are rejected with ErrInvalidKeymapRecord. This matches the intended strict JSONL behavior described in the comments. The test name should be updated to reflect that blank lines are rejected.

Suggested change
func TestKeymapReader_AcceptsBlankLinesByPolicy(t *testing.T) {
func TestKeymapReader_RejectsBlankLines(t *testing.T) {

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a0d92449a4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/keymap.go
Comment on lines +171 to +179
if err := json.Unmarshal(line, &rec); err != nil {
r.err = errors.Wrap(ErrInvalidKeymapRecord, err.Error())
return KeymapRecord{}, false, r.err
}
if rec.Encoded == "" || rec.Kind == "" {
r.err = errors.Wrap(ErrInvalidKeymapRecord, "missing encoded or kind")
return KeymapRecord{}, false, r.err
}
return rec, true, nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reject malformed base64 in KEYMAP records

KeymapReader.Next documents malformed base64 as an invalid record, but it currently only unmarshals JSON and checks encoded/kind, so records with invalid original are accepted and cached by LoadKeymap until a later Original() call fails. This defers corruption detection and can make a damaged KEYMAP appear valid during initial ingest.

Useful? React with 👍 / 👎.

Comment on lines +191 to +194
dec := json.NewDecoder(r)
dec.DisallowUnknownFields() // surface format drift loudly
if err := dec.Decode(&m); err != nil {
return Manifest{}, errors.Wrap(ErrInvalidManifest, err.Error())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject trailing bytes after MANIFEST JSON object

ReadManifest decodes exactly one JSON value and returns success without checking for EOF, so inputs like {"...":...}{"extra":1} (or other trailing junk) are accepted. That weakens the manifest's strictness guarantees and can hide file corruption or concatenation errors that should fail validation.

Useful? React with 👍 / 👎.

bootjp added 2 commits April 30, 2026 19:23
…tickv into feat/backup-phase0a-keymap-manifest
- Adapters fields are now *Adapter pointers (Gemini #98). Nil ->
  excluded, non-nil empty -> included-with-no-scopes, non-nil
  populated -> normal. Previous non-pointer struct collapsed both
  "excluded" and "included-empty" into the same on-disk shape;
  gemini's reference (don't silently drop entries during
  serialization) was the right call. Test
  TestAdaptersStruct_NilVsEmptyDistinguishedOnDisk covers it.
- KeymapReader.Next now validates the base64-encoded `original`
  field at parse time (Codex P1 #179). Previously the JSON parsed
  fine and the bad base64 surfaced lazily on Original() — that
  defers corruption detection past initial ingest. Test
  TestKeymapReader_RejectsMalformedBase64AtParseTime locks in the
  parse-time validation.
- ReadManifest now rejects trailing bytes via dec.More() (Codex P2
  #194). Two manifests concatenated, garbage tails, partial-write
  artifacts: all surface as ErrInvalidManifest. Tests
  TestReadManifest_RejectsTrailingBytes and
  TestReadManifest_RejectsTrailingNonWhitespace.
- Test name TestKeymapReader_AcceptsBlankLinesByPolicy renamed to
  TestKeymapReader_RejectsBlankLines (Gemini #160). The test was
  already asserting rejection; only the name was misleading.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 30, 2026

Review feedback addressed in c3d1e6fd:

Gemini MEDIUM #98 — Adapters nil-vs-empty must be distinguishable on disk
The non-pointer Adapter fields collapsed "excluded by --adapter filter" and "included with no scopes" into the same on-disk shape. Switched to *Adapter pointers + omitempty. Now: nil → excluded (key absent), non-nil empty &Adapter{} → included with no scopes ("dynamodb":{}), non-nil populated → normal. New TestAdaptersStruct_NilVsEmptyDistinguishedOnDisk covers it.

Codex P1 #179 — Validate base64 at parse time
KeymapReader.Next previously deferred base64 validation to rec.Original(). A malformed dump would parse fine and only fail on a much later read. Now the base64 is decoded inside Next() so corruption surfaces on first ingest. New TestKeymapReader_RejectsMalformedBase64AtParseTime.

Codex P2 #194 — Reject trailing bytes after MANIFEST JSON
ReadManifest now uses dec.More() after the single decode and rejects with ErrInvalidManifest if anything decodable remains. Catches concatenation bugs, partial writes, and double-write corruption. New TestReadManifest_RejectsTrailingBytes and TestReadManifest_RejectsTrailingNonWhitespace.

Gemini MEDIUM #160 — Test name fix
TestKeymapReader_AcceptsBlankLinesByPolicyTestKeymapReader_RejectsBlankLines. The test was already asserting rejection; only the name lied.

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 2m 38s —— View job


Review of c3d1e6fd — KEYMAP/MANIFEST follow-up fixes

Tasks:

  • Read trigger comment and understand scope
  • Read changed files (keymap.go, keymap_test.go, manifest.go, manifest_test.go)
  • Diff against base branch to isolate the c3d1e6fd changes
  • Review all four fixes claimed in the trigger comment
  • Post full review

All four stated fixes are correctly implemented. One minor comment inaccuracy to address, everything else is clean.


Fix 1 — *Adapter pointers (Gemini #98) ✅

manifest.go:106-111 — The change from value-type Adapter fields to *Adapter + omitempty is correct and complete. nil → key absent (excluded), &Adapter{}"dynamodb":{} (included, no scopes), populated *Adapter → normal. The field comment (manifest.go:91-105) explains the three states clearly.

TestAdaptersStruct_NilVsEmptyDistinguishedOnDisk (manifest_test.go:241-261) is well-targeted: it asserts "dynamodb":{} is present and that "s3", "redis", "sqs" are absent. ✓

One observation: TestManifest_Phase0RoundTrip (manifest_test.go:22-27) now populates all four adapters. There's no test for a round-trip where some adapters are nil (excluded) — this would complement the serialization-only test. Not a blocker, just worth noting.


Fix 2 — Parse-time base64 validation in Next() (Codex P1 #179) ✅

keymap.go:184-187 — The base64.RawURLEncoding.DecodeString call placed after the empty-field guard is exactly right. It decodes and discards the bytes in Next(), then callers who need the bytes call rec.Original() which decodes again. The double-decode is a minor overhead but correct — LoadKeymap uses it this way already via assertKeymapRecord. For the expected keymap sizes this is not a concern.

TestKeymapReader_RejectsMalformedBase64AtParseTime (keymap_test.go:207-219) uses "!!!" as the deliberately invalid base64. ✓


Fix 3 — Trailing-bytes rejection via dec.More() (Codex P2 #194) ✅ (with one comment bug)

manifest.go:216-219dec.More() correctly returns true if any non-whitespace content follows the decoded object. The two tests confirm:

  • TestReadManifest_RejectsTrailingBytes — doubled manifest JSON → caught ✓
  • TestReadManifest_RejectsTrailingNonWhitespace"garbage" suffix → caught ✓

Minor bug: the comment is wrong in two places (manifest.go:209-215):

// MANIFEST.json is exactly one JSON object. Trailing bytes
// (a second object, junk, even whitespace-only padding) point at
// ...
// must surface here rather than be silently discarded. We use
// io.Discard rather than parsing because we only care that
// nothing-decodable is present; ...
  1. "even whitespace-only padding"dec.More() skips whitespace before peeking, so it returns false for a trailing newline or spaces. Trailing whitespace is not caught (and that's the correct behaviour — text editors add trailing newlines; this should not be an error). The comment incorrectly claims it is caught.
  2. "We use io.Discard rather than parsing" — there is no io.Discard anywhere in this function. This sentence is stale (describes a design that was not used).

The correct comment would be something like: "dec.More() returns true for any non-whitespace decodable content; trailing whitespace/newlines are benign and pass."

Fix this →


Fix 4 — Test name rename (Gemini #160) ✅

TestKeymapReader_AcceptsBlankLinesByPolicyTestKeymapReader_RejectsBlankLines. The body already asserted rejection; only the name was wrong. The updated name and comment (keymap_test.go:161-165) now agree with the assertion. ✓


Overall

The four fixes address the review findings accurately. The only actionable item is correcting the misleading two-part comment in ReadManifest around dec.More(). Everything else is correct.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c3d1e6fdf3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +293 to +295
case PhasePhase1LivePinned:
if m.Source != nil {
return errors.Wrap(ErrInvalidManifest, "phase1 must not set source")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Require live metadata for phase1 manifests

Phase 1 manifests are accepted even when live is completely absent, because validation only rejects source in the phase1 branch. That allows WriteManifest/ReadManifest to succeed for a phase1-live-pinned dump with no pinned read_ts, which breaks the manifest’s ability to describe the consistency point and can cause downstream restore/audit logic to treat an incomplete manifest as valid.

Useful? React with 👍 / 👎.

bootjp added 2 commits April 30, 2026 19:58
, round 2)

Codex P1 #295: validation only rejected source on phase1 manifests
but accepted a phase1 dump with live entirely absent. A phase1
dump's whole point is the cluster-wide read_ts pin recorded under
Live; a manifest that omits Live cannot describe its consistency
point and downstream restore/audit logic must not silently treat
it as valid.

validatePhaseSpecific now requires Live != nil AND
Live.ReadTS != 0 for phase1 dumps. Tests:
TestManifest_Phase1MustSetLive,
TestManifest_Phase1RejectsZeroReadTS,
TestManifest_Phase1WithLiveAndNonZeroReadTSIsValid.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 30, 2026

Codex P1 #295 addressed in e4cd7d8e. Phase 1 manifests now require Live != nil AND Live.ReadTS != 0; both WriteManifest and ReadManifest reject a phase1 dump that omits its cluster-wide pin record. New tests: TestManifest_Phase1MustSetLive, TestManifest_Phase1RejectsZeroReadTS, TestManifest_Phase1WithLiveAndNonZeroReadTSIsValid.

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 2m 29s —— View job


Review of e4cd7d8e — Phase 1 Live metadata validation (Codex P1 #295)

Tasks:

  • Read trigger comment and understand scope
  • Read manifest.go and manifest_test.go to review the fix
  • Diff against base branch to isolate e4cd7d8e changes
  • Review phase1 Live validation logic
  • Review new tests
  • Post full review

The core fix is correct and the tests cover all three branches. Two minor issues to address:

Fix — validatePhaseSpecific phase1 checks ✅

manifest.go:294-307 — The two additions are exactly right:

if m.Live == nil {
    return errors.Wrap(ErrInvalidManifest, "phase1 must set live")
}
if m.Live.ReadTS == 0 {
    return errors.Wrap(ErrInvalidManifest, "phase1 live.read_ts must be non-zero")
}

Ordering is safe: nil check before field dereference. Since validate() is shared by WriteManifest and ReadManifest, both paths enforce the constraint symmetrically — a hand-crafted phase1 JSON with live absent or read_ts: 0 will fail at ReadManifest too.

Tests ✅

All three new tests are well-targeted:

  • TestManifest_Phase1MustSetLive — nil LiveErrInvalidManifest
  • TestManifest_Phase1RejectsZeroReadTSLive{ReadTS: 0}ErrInvalidManifest
  • TestManifest_Phase1WithLiveAndNonZeroReadTSIsValid — round-trip with ReadTS: 12345, asserts got.Live.ReadTS == 12345

Minor: issue reference in comment violates CLAUDE.md

manifest.go:297-301:

// A phase1 dump's whole point is the cluster-wide read_ts
// pin recorded under Live. A manifest that omits Live cannot
// describe its consistency point and downstream restore /
// audit logic must not silently accept it as valid (Codex
// P1 #295).

CLAUDE.md says: "Do not reference the current task, fix, or callers… since those belong in the PR description and rot as the codebase evolves." Drop (Codex P1 #295) — the substantive explanation above it is sufficient.

Fix this →

Carry-over: stale comment from previous round still unaddressed

manifest.go:209-215 (added in c3d1e6fd, not fixed in e4cd7d8e):

// MANIFEST.json is exactly one JSON object. Trailing bytes
// (a second object, junk, even whitespace-only padding) point at
// concatenation bugs or partial-write corruption — both of which
// must surface here rather than be silently discarded. We use
// io.Discard rather than parsing because we only care that
// nothing-decodable is present; structural validation lives in
// validate().

Still has both issues flagged in the previous review round:

  1. "even whitespace-only padding"dec.More() skips whitespace so trailing newlines/spaces are not caught. That's the right behaviour, but the comment claims they are.
  2. "We use io.Discard rather than parsing" — there is no io.Discard call in this function; this describes a design path that wasn't taken.

Suggested replacement:

// MANIFEST.json is exactly one JSON object. Any non-whitespace
// decodable content after it (a second object, junk) points at
// concatenation bugs or partial-write corruption. Trailing
// whitespace / newlines are benign and not flagged.

Fix this →


Summary

The Codex P1 #295 fix in e4cd7d8e is correct and complete. Two non-blocking cleanup items remain: drop the issue reference from the new comment, and fix the stale comment in ReadManifest that was flagged last round.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e4cd7d8ea7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/manifest.go Outdated
func ReadManifest(r io.Reader) (Manifest, error) {
var m Manifest
dec := json.NewDecoder(r)
dec.DisallowUnknownFields() // surface format drift loudly
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Allow unknown MANIFEST fields for same-major evolution

ReadManifest rejects any unknown JSON key via DisallowUnknownFields, which makes older readers fail as soon as a newer producer adds an optional field while keeping format_version at the same major value. That breaks the stated same-major compatibility model (optional field additions) and turns otherwise valid manifests into hard read failures during mixed-version operation.

Useful? React with 👍 / 👎.

, round 3)

Codex P1 #205 (round 2): DisallowUnknownFields broke the
documented same-major minor-evolution path. A newer producer
that adds an optional field at the same format_version turns
older readers into hard read failures during mixed-version
operation, even though both nodes are on the "same major" by
contract.

The format_version field IS the breaking-change signal. Major
bumps are gated by ErrUnsupportedFormatVersion; same-major
minor evolution must silently tolerate optional new fields.

Removed dec.DisallowUnknownFields() from ReadManifest. Replaced
TestReadManifest_RejectsUnknownFields with
TestReadManifest_AcceptsUnknownFieldsForSameMajorMinorEvolution.
The trailing-bytes guard (dec.More() check) is preserved -- it
catches concatenation bugs which is a different concern from
optional-field tolerance.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 30, 2026

Codex P1 #205 (round 2) addressed in 6f231fe7. The earlier DisallowUnknownFields was too strict — it broke the documented same-major minor-evolution path. A newer producer that adds an optional field at the same format_version would turn older readers into hard read failures during mixed-version operation.

format_version IS the breaking-change signal: major bumps are gated by ErrUnsupportedFormatVersion, same-major minor evolution must silently tolerate optional new fields. Removed dec.DisallowUnknownFields(). The trailing-bytes guard (Codex P2 #194 from round 1) is preserved — it catches concatenation/corruption, which is a different concern.

Test TestReadManifest_RejectsUnknownFields replaced by TestReadManifest_AcceptsUnknownFieldsForSameMajorMinorEvolution.

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 1m 50s —— View job


Review of 6f231fe7 — Unknown-fields tolerance for same-major evolution

Tasks:

  • Read trigger comment and understand scope
  • Read manifest.go and manifest_test.go
  • Diff against base to isolate 6f231fe7 changes
  • Review unknown-fields handling and test replacement
  • Check carry-over items from previous rounds
  • Post full review

Fix — Remove DisallowUnknownFields (Codex P1 #205 round 2) ✅

The removal is correct. The three-tier contract is now properly implemented:

  • format_version > CurrentFormatVersion → hard refuse via ErrUnsupportedFormatVersion
  • Same format_version + unknown fields → silently ignored (same-major minor evolution)
  • dec.More() → still catches concatenation/corruption (distinct concern, correctly preserved)

TestReadManifest_AcceptsUnknownFieldsForSameMajorMinorEvolution (manifest_test.go:144-173) correctly asserts that a future_optional_field key passes through ReadManifest without error, and that the known fields still decode correctly. ✓


Issue reference in new comment (manifest.go:205-215) ⚠️

The new comment explaining the intentional omission of DisallowUnknownFields contains a ticket reference:

// We intentionally do NOT call DisallowUnknownFields here.
// The format-version contract (Codex P1, follow-up) is:

CLAUDE.md: "Do not reference the current task, fix, or callers… since those belong in the PR description and rot as the codebase evolves." Drop (Codex P1, follow-up) — the substantive explanation below it is self-sufficient.

Same issue in test comment (manifest_test.go:147-150) ⚠️

// Codex P1 #205 (round 2) caught the earlier
// DisallowUnknownFields strictness which broke the documented
// same-major compatibility model.

Same rule. The first two lines of the comment already convey the full rationale; the Codex reference adds nothing durable.


Carry-over: two items flagged in previous rounds still unaddressed

1. Stale trailing-bytes comment (manifest.go:219-225) — flagged in round 1 and round 2:

// MANIFEST.json is exactly one JSON object. Trailing bytes
// (a second object, junk, even whitespace-only padding) point at
// concatenation bugs or partial-write corruption — both of which
// must surface here rather than be silently discarded. We use
// io.Discard rather than parsing because we only care that
// nothing-decodable is present; structural validation lives in
// validate().

Both errors remain:

  • "even whitespace-only padding"dec.More() skips whitespace before peeking, so trailing newlines/spaces return false. They are not caught. That is the correct behaviour, but the comment claims they are.
  • "We use io.Discard rather than parsing" — there is no io.Discard call anywhere in ReadManifest. This describes a design path that wasn't taken.

Fix this →

2. Issue reference in validatePhaseSpecific (manifest.go:307-311) — flagged in round 2:

// A phase1 dump's whole point is the cluster-wide read_ts
// pin recorded under Live. A manifest that omits Live cannot
// describe its consistency point and downstream restore /
// audit logic must not silently accept it as valid (Codex
// P1 #295).

(Codex P1 #295) is an issue reference that will rot. The three lines above it already convey the full rationale; drop the parenthetical.

Fix this →


Summary

The Codex P1 #205 fix in 6f231fe7 is correct. Three non-blocking comment cleanup items remain — all flagged in prior rounds — none of which affect correctness:

Item File Status
Stale "whitespace" + "io.Discard" in trailing-bytes comment manifest.go:219-225 Flagged rounds 1+2
(Codex P1 #295) issue ref in validatePhaseSpecific manifest.go:310-311 Flagged round 2
(Codex P1, follow-up) in new comment manifest.go:206 New this round
Codex P1 #205 (round 2) in test comment manifest_test.go:148-150 New this round

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f231fe790

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/manifest.go Outdated
Comment on lines +144 to +146
Adapters Adapters `json:"adapters"`
Exclusions Exclusions `json:"exclusions"`
ChecksumAlgorithm string `json:"checksum_algorithm"`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject manifests that omit adapters/exclusions

ReadManifest accepts JSON where adapters and/or exclusions are missing, because these are decoded into non-pointer struct fields and validate() only checks scalar policy fields. In that case the sections silently become zero values and the manifest is treated as valid, which can hide truncated/corrupted MANIFEST files and drop provenance about what scopes/filters were actually used. This should fail validation instead of defaulting.

Useful? React with 👍 / 👎.

Codex P2 #146 round 3: ReadManifest accepted manifests where the
adapters and/or exclusions sections were omitted entirely from
the JSON, because they were decoded into non-pointer struct
fields and validate() only checked scalar policy fields. The
empty zero-value sections were treated as valid, hiding
truncated or corrupted MANIFEST files and dropping provenance
about which scopes/filters the producer used.

Manifest.Adapters and Manifest.Exclusions are now *Adapters /
*Exclusions pointer types. validateRequiredFields rejects nil
for either with ErrInvalidManifest. NewPhase0SnapshotManifest
seeds them as &Adapters{} / &Exclusions{} so producers continue
to satisfy the contract by default; populating individual
scopes reaches in via the now-non-nil pointer.

Tests TestReadManifest_RejectsMissingAdapters and
TestReadManifest_RejectsMissingExclusions cover the new
validation path.
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 30, 2026

Codex P2 #146 (round 3) addressed in 73641330. ReadManifest previously accepted manifests where adapters and/or exclusions sections were omitted because they decoded into non-pointer struct fields and validate() only checked scalar policy fields. The empty zero-value sections passed silently, dropping provenance and masking truncation.

Manifest.Adapters / Manifest.Exclusions are now pointer types. validateRequiredFields rejects nil for either with ErrInvalidManifest. NewPhase0SnapshotManifest seeds both to non-nil zero values so producers satisfy the contract by default.

Tests: TestReadManifest_RejectsMissingAdapters, TestReadManifest_RejectsMissingExclusions.

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 30, 2026

Claude Code is working…

I'll analyze this and get back to you.

View job run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant