backup: S3 encoder for buckets, objects, and blob reassembly (Phase 0a) #718
bootjp wants to merge 51 commits into feat/backup-phase0a-dynamodb from
Conversation
The Phase 0a logical-backup decoder (internal/backup) needs to route each !s3|blob| record to its assembled object body without holding a live cluster. Today the package only exports parsers for object manifests and upload parts; blob keys are constructable via BlobKey / VersionedBlobKey but not parseable.

ParseBlobKey decodes the 6-component form (BlobKey output) and the 7-component form (VersionedBlobKey output with partVersion != 0) into all components: bucket, generation, object, uploadID, partNo, chunkNo, partVersion. Truncated keys, malformed segments, and trailers that aren't either zero bytes or exactly one u64 are rejected with ok=false. The implementation is split into parseBlobKeyHead (the 6-component head) and parseOptionalPartVersion (the trailer) so cyclomatic complexity stays under the package cap without a //nolint marker.

Tests cover the un-versioned round-trip, the versioned round-trip, the partVersion=0 fallback to the un-versioned shape (matching VersionedBlobKey's documented behaviour), rejection of non-blob keys (bucket meta, object manifest, upload part, junk), and rejection of trailing-garbage keys.
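The trailer rule above (empty trailer or exactly one u64) is the crux of parseOptionalPartVersion. A minimal sketch, assuming a big-endian u64 trailer encoding — the real s3keys wire format is not shown in this thread, so the layout is illustrative:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// parseOptionalPartVersion mirrors the trailer rule described in the PR:
// an empty trailer means an un-versioned blob key (partVersion=0), a
// trailer of exactly 8 bytes decodes as one u64, and anything else is
// rejected with ok=false. Big-endian is an assumption of this sketch.
func parseOptionalPartVersion(trailer []byte) (partVersion uint64, ok bool) {
	switch len(trailer) {
	case 0:
		return 0, true // 6-component BlobKey form
	case 8:
		return binary.BigEndian.Uint64(trailer), true // VersionedBlobKey form
	default:
		return 0, false // truncated or trailing-garbage key
	}
}

func main() {
	v, ok := parseOptionalPartVersion([]byte{0, 0, 0, 0, 0, 0, 0, 9})
	fmt.Println(v, ok) // versioned trailer accepted
	_, ok = parseOptionalPartVersion([]byte{1, 2, 3})
	fmt.Println(ok) // garbage trailer rejected
}
```

Splitting this trailer check out of the head parser is what keeps each function's cyclomatic complexity low, per the PR description.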
Builds on PR #716 (DynamoDB) and PR #717 (s3keys.ParseBlobKey). Adds the S3 encoder for the Phase 0 logical-backup decoder.

Snapshot prefixes handled:

- !s3|bucket|meta|<bucket> -> s3/<bucket>/_bucket.json (live s3BucketMeta projected into the dump-format s3PublicBucket; cluster-internal fields like CreatedAtHLC and Generation stripped).
- !s3|obj|head|<bucket><gen><object> -> s3/<bucket>/<object>.elastickv-meta.json (live s3ObjectManifest projected into the dump-format s3PublicManifest; UploadID and per-part chunk arrays stripped).
- !s3|blob|<bucket><gen><object><uploadID><partNo><chunkNo>[<partVersion>]: spilled to a per-(bucket, object) scratch directory as it arrives; assembled at Finalize by walking sortChunkKeys output (partNo, partVersion, chunkNo) and concatenating into the final body file via writeFileAtomic-style tmp+rename. Memory: O(num_objects); body bytes never held in memory.
- !s3|upload|meta|, !s3|upload|part|: excluded by default; --include-incomplete-uploads emits them as opaque {prefix, key, value} JSONL records under s3/<bucket>/_incomplete_uploads/.
- !s3|bucket|gen|, !s3|gc|upload|, !s3route|: HandleIgnored no-op so the master pipeline can dispatch all !s3|* prefixes uniformly.

Reserved-suffix collision:

- A user object key ending in .elastickv-meta.json is rejected with ErrS3MetaSuffixCollision by default; --rename-collisions appends .user-data and records the rename in s3/<bucket>/KEYMAP.jsonl with KindMetaCollision so the on-disk reverse map is sound.
- Per-bucket KEYMAP.jsonl is opened lazily and dropped if Finalize finds zero records (the spec's "omit empty file" rule).

Body assembly:

- Each blob chunk is written atomically to scratch as it arrives; the chunk path is registered in the object's chunkPaths map keyed by (uploadID, partNo, chunkNo, partVersion).
- Finalize sorts the chunk keys by (partNo, partVersion, chunkNo) and concatenates the matching scratch files into a single <bodyPath> via tmp+rename.
- The scratch tree is rm -rf'd on Finalize via deferred RemoveAll, so a successful run leaves nothing behind.

Tests: bucket meta + single-chunk object round-trip, multipart ordering verification (chunks emitted out of order, body assembled in part/chunk order), orphan-chunks warning, .elastickv-meta.json collision rejection (default) and rename (with flag) including KEYMAP entry, malformed JSON rejection on both manifest and bucket paths, ignored-prefix no-op, --include-incomplete-uploads round-trip, default no _incomplete_uploads dir, versioned blob assembly.
Code Review
This pull request introduces an S3Encoder to process S3 backup records, including bucket metadata, object manifests, and blob chunks, with final assembly performed during the Finalize step. It also adds a ParseBlobKey utility for decoding S3 blob keys. Feedback focuses on critical improvements for data integrity and performance, such as filtering blobs by uploadID during assembly to avoid corruption from multiple upload attempts, optimizing file system operations by caching directory creation status and file handles, and correctly implementing the includeOrphans flag. Additionally, the reviewer advised against ignoring errors from resource cleanup methods.
```go
type s3ObjectState struct {
	bucket     string
	generation uint64
	object     string
	manifest   *s3PublicManifest
	// chunkPaths maps (uploadID, partNo, chunkNo) -> scratch path.
	chunkPaths map[s3ChunkKey]string
}
```
The s3ObjectState should store the uploadID from the object manifest to ensure that only blobs belonging to the active upload are assembled. Additionally, adding a dirCreated flag can optimize directory creation in HandleBlob.
```go
type s3ObjectState struct {
	bucket     string
	generation uint64
	object     string
	uploadID   string
	dirCreated bool
	manifest   *s3PublicManifest
	// chunkPaths maps (uploadID, partNo, chunkNo) -> scratch path.
	chunkPaths map[s3ChunkKey]string
}
```

```go
jl, err := openJSONL(filepath.Join(dir, "records.jsonl"))
if err != nil {
	return err
}
defer func() { _ = closeJSONL(jl) }()
```
Opening and closing the records.jsonl file for every incomplete upload record is inefficient. Consider lazily opening the file and caching the writer in s3BucketState to amortize the I/O cost. Furthermore, if openJSONL uses os.Create, this logic will incorrectly truncate the file on every call, preserving only the last record. Avoid silently dropping entries during serialization to ensure state consistency.
References
- Avoid silently dropping entries during serialization or normalization. If placeholder or incomplete entries are valid in some contexts, provide a specific, more lenient handling path for them to ensure they are preserved on disk and avoid state inconsistencies.
```go
func (st *s3ObjectState) attachManifestParts(_ string, _ []s3LivePart) {
	// Intentionally empty: assembleObjectBody consumes the manifest
	// directly via st.manifest at Finalize. Kept as a hook so the
	// callsite reads symmetrically with HandleBlob.
}
```
```go
chunks := sortChunkKeys(obj.chunkPaths)
for _, k := range chunks {
	path := obj.chunkPaths[k]
	if err := appendFile(tmp, path); err != nil {
		_ = tmp.Close()
		return err
	}
}
```
Filter the chunks by the uploadID specified in the manifest. This prevents data corruption if the snapshot contains blobs from multiple upload attempts for the same object key. Additionally, ensure that errors from tmp.Close() are not silently ignored.
```go
chunks := sortChunkKeys(obj.chunkPaths)
for _, k := range chunks {
	if k.uploadID != obj.uploadID {
		continue
	}
	path := obj.chunkPaths[k]
	if err := appendFile(tmp, path); err != nil {
		if closeErr := tmp.Close(); closeErr != nil {
			// Log the error as per repository rules
		}
		return err
	}
}
```

References
- Do not silently ignore errors from Close() methods on resources like network connections. Log them to ensure resource leaks or other cleanup problems are visible.
```go
dir := filepath.Join(s.scratchRoot, EncodeSegment([]byte(bucket)), EncodeSegment([]byte(object)))
if err := os.MkdirAll(dir, 0o755); err != nil { //nolint:mnd // 0755 == standard dir mode
	return errors.WithStack(err)
}
```
Calling os.MkdirAll for every blob chunk is inefficient. Use the dirCreated flag in s3ObjectState to perform this operation only once per object.
```go
if !st.dirCreated {
	dir := filepath.Join(s.scratchRoot, EncodeSegment([]byte(bucket)), EncodeSegment([]byte(object)))
	if err := os.MkdirAll(dir, 0o755); err != nil { //nolint:mnd // 0755 == standard dir mode
		return errors.WithStack(err)
	}
	st.dirCreated = true
}
```

```go
if err != nil {
	return err
}
defer func() { _ = closeJSONL(jl) }()
```
The error from closeJSONL is silently ignored. For a writable resource, failing to close properly might mean data was not flushed to disk. Per general rules, errors from Close() methods should not be silently ignored.
References
- Do not silently ignore errors from Close() methods on resources like network connections. Log them to ensure resource leaks or other cleanup problems are visible.
```go
if obj.manifest == nil {
	s.emitWarn("s3_orphan_chunks",
		"bucket", b.name,
		"object", obj.object,
		"chunks", len(obj.chunkPaths),
		"hint", "blob chunks present but no !s3|obj|head record matched")
	return nil
}
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 554dd42276
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```go
// vs `path/to/sub`) are deferred until the master pipeline
// detects them across multiple manifests; this PR's per-object
// flush trusts the caller's collision detection.
return obj.object, "", nil
```
Sanitize object keys before deriving output paths
resolveObjectFilename returns the raw S3 object key and flushObject then does filepath.Join(bucketDir, objectName), so object names like ../x or /tmp/x will escape the bucket directory and write outside the dump root. Because S3 object keys are user-controlled, this enables path traversal/arbitrary file overwrite during backup decode and can corrupt host files rather than just the export tree.
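The boundary check this finding calls for — join, clean, then verify the result is still rooted under the bucket directory — can be sketched as below. The helper name is hypothetical; the PR's eventual safeJoinUnderRoot goes further and rejects "." / ".." segments outright:

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errEscapesRoot = errors.New("object key escapes bucket directory")

// joinUnderRoot joins a user-controlled key under root, cleans the
// result, and rejects anything that lands outside root. This is the
// minimal confinement test; it does not by itself preserve dot-segment
// keys as distinct objects.
func joinUnderRoot(root, key string) (string, error) {
	cleanRoot := filepath.Clean(root)
	joined := filepath.Clean(filepath.Join(cleanRoot, key))
	if joined != cleanRoot && !strings.HasPrefix(joined, cleanRoot+string(filepath.Separator)) {
		return "", errEscapesRoot
	}
	return joined, nil
}

func main() {
	if _, err := joinUnderRoot("/dump/bucket", "../../etc/passwd"); err != nil {
		fmt.Println("rejected:", err)
	}
	p, err := joinUnderRoot("/dump/bucket", "a/b")
	if err != nil {
		panic(err)
	}
	fmt.Println("accepted:", p)
}
```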
```go
chunks := sortChunkKeys(obj.chunkPaths)
for _, k := range chunks {
	path := obj.chunkPaths[k]
	if err := appendFile(tmp, path); err != nil {
```
Restrict body assembly to the manifest-selected chunks
assembleObjectBody concatenates every entry in obj.chunkPaths after sorting, but never filters by the manifest's upload_id/declared parts, so stale chunks from a different upload attempt for the same (bucket,generation,object) can be merged into the output body. This produces corrupted object bytes whenever old or concurrent multipart chunks are still present in the blob keyspace.
```go
jl, err := openJSONL(filepath.Join(dir, "records.jsonl"))
if err != nil {
	return err
}
defer func() { _ = closeJSONL(jl) }()
```
Append incomplete-upload records instead of truncating
HandleIncompleteUpload reopens records.jsonl on every record via openJSONL(...), and that helper uses create/truncate semantics, so each call overwrites prior lines. With --include-incomplete-uploads, only the last upload-meta/part record survives per bucket, defeating the stated “one JSON line per record” behavior and losing forensic data.
…to feat/backup-phase0a-s3
Three correctness bugs and several quality issues, all valid.

CRITICAL: path traversal (Codex P1 #425). S3 object keys are user-controlled. A key like "../../../etc/passwd" would, under the prior filepath.Join-and-write code, escape the bucket directory and overwrite host files. Added safeJoinUnderRoot, which Cleans the joined path and asserts it stays rooted under the bucket dir; ".." traversal is rejected with ErrS3MalformedKey. Absolute paths (leading "/") are normalised under the bucket dir by filepath.Join, which is the safest outcome. Tests: TestS3_PathTraversalAttemptRejected, TestS3_AbsolutePathObjectKeyConfinedUnderBucket.

CRITICAL: stale upload-id chunks merged into body (Codex P1 #500, Gemini HIGH #106/#476/#504). A snapshot mid-delete-and-recreate or mid-retry can carry blob chunks for multiple upload attempts under the same (bucket, gen, object). The prior assembleObjectBody concatenated every chunk regardless of upload_id, producing corrupted bytes. Now:

- s3ObjectState gains uploadID; HandleObjectManifest sets it.
- New filterChunksForManifest takes the chunkPaths map and the manifest's uploadID, and returns only matching chunks sorted by (partNo, partVersion, chunkNo). Stale-uploadID chunks never enter the assembled body.

Test: TestS3_StaleUploadIDChunksFilteredFromAssembledBody.

CRITICAL: incomplete-uploads file truncated per record (Codex P2 #318, Gemini HIGH+MEDIUM #318). HandleIncompleteUpload re-opened records.jsonl on every call; openJSONL uses os.Create, which truncates. Only the last record survived per bucket. Now:

- s3BucketState carries an incompleteUploadsJL *jsonlFile lazily opened on the first record and cached.
- flushBucket closes it and surfaces the error (was silently ignored).

Test: TestS3_IncompleteUploadsAppendsAcrossCalls (3 records, 3 lines on disk).

QUALITY:

- Gemini MEDIUM #285 (MkdirAll per blob): s3ObjectState gains a scratchDirCreated bool; HandleBlob runs MkdirAll once.
- Gemini MEDIUM #318 (closeJSONL error ignored): keymap and incomplete-uploads writers now surface close errors via closeBucketKeymap / an explicit closeJSONL return-check.
- Gemini MEDIUM #386 (includeOrphans flag ignored): orphan chunks for objects without manifests now write to <bucket>/_orphans/<encoded-object>/<chunk>.bin under WithIncludeOrphans(true). Test: TestS3_OrphanChunksWrittenWhenIncludeOrphans.

Also removed the now-unused attachManifestParts placeholder and sortChunkKeys helper; their logic moved into filterChunksForManifest, which combines uploadID filtering with the sort.
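The filterChunksForManifest behaviour described above — drop stale-uploadID chunks, then sort survivors by (partNo, partVersion, chunkNo) — can be sketched as follows. The s3ChunkKey layout is an assumption inferred from the PR text:

```go
package main

import (
	"fmt"
	"sort"
)

// s3ChunkKey mirrors the fields named in the PR; the concrete struct
// layout is illustrative.
type s3ChunkKey struct {
	uploadID        string
	partNo, chunkNo uint32
	partVersion     uint64
}

// filterChunksForManifest drops chunks whose uploadID differs from the
// manifest's, then sorts the survivors into assembly order.
func filterChunksForManifest(chunks map[s3ChunkKey]string, uploadID string) []s3ChunkKey {
	keys := make([]s3ChunkKey, 0, len(chunks))
	for k := range chunks {
		if k.uploadID != uploadID {
			continue // stale upload attempt; never enters the body
		}
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		a, b := keys[i], keys[j]
		if a.partNo != b.partNo {
			return a.partNo < b.partNo
		}
		if a.partVersion != b.partVersion {
			return a.partVersion < b.partVersion
		}
		return a.chunkNo < b.chunkNo
	})
	return keys
}

func main() {
	chunks := map[s3ChunkKey]string{
		{uploadID: "new", partNo: 2, chunkNo: 0}: "p2c0",
		{uploadID: "new", partNo: 1, chunkNo: 1}: "p1c1",
		{uploadID: "old", partNo: 1, chunkNo: 0}: "stale",
		{uploadID: "new", partNo: 1, chunkNo: 0}: "p1c0",
	}
	for _, k := range filterChunksForManifest(chunks, "new") {
		fmt.Println(chunks[k]) // p1c0, p1c1, p2c0 — "stale" never appears
	}
}
```

A later round extends this same function with a declared-parts allow-list; this sketch shows only the round-1 uploadID filter.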
3 critical correctness bugs + 3 quality issues addressed.

Critical:
- Codex P1 #425 — Path traversal.
- Codex P1 #500 + Gemini HIGH #106/#476/#504 — Stale upload-id chunks.
- Codex P2 #318 + Gemini HIGH+MEDIUM #318 — Truncated incomplete-uploads file.

Quality:
- Gemini MEDIUM #285 — MkdirAll per blob.
- Gemini MEDIUM #318 — closeJSONL error ignored.
- Gemini MEDIUM #386 — includeOrphans flag ignored.

Removed the now-unused attachManifestParts placeholder and sortChunkKeys helper.

@claude review
Claude finished @bootjp's task in 4m 18s — Review
| Item | Severity | Location |
|---|---|---|
| "." object key hits EISDIR instead of ErrS3MalformedKey | Low | s3.go:safeJoinUnderRoot |
| Silent bypass when manifest upload_id is empty | Low | s3.go:HandleObjectManifest |
| Scratch dir recomputed on every blob after gate set | Negligible | s3.go:HandleBlob |
| Orphan chunks loaded into memory via os.ReadFile | Low | s3.go:flushOrphanObject |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 19ae3289ee
```go
if manifestUploadID != "" && k.uploadID != manifestUploadID {
	continue
}
keys = append(keys, k)
```
Assemble only manifest-declared part versions
Fresh evidence: this revision now filters by upload_id, but filterChunksForManifest still appends every (partNo, partVersion, chunkNo) for that upload without checking the manifest’s declared parts. In S3, overwritten parts can leave older blob versions present until async cleanup runs, so this logic can concatenate stale part versions (or silently miss required chunks) and produce corrupted object bodies even when the manifest points to a single canonical part version.
```go
joined := filepath.Clean(filepath.Join(cleanRoot, rel))
rootSep := cleanRoot + string(filepath.Separator)
if joined != cleanRoot && !strings.HasPrefix(joined, rootSep) {
```
Avoid normalizing dot segments in object keys
safeJoinUnderRoot uses filepath.Clean(filepath.Join(...)), which collapses keys like a/../b to b and . to the bucket root. Because S3 object keys treat these bytes literally, this can make distinct keys collide/overwrite each other, and keys that clean to the root can cause finalize-time write errors instead of emitting the object data.
```go
// flush trusts the caller's collision detection. Path-traversal
// sanitisation runs in safeJoinUnderRoot, downstream of this
// function, where the bucket-directory root is in scope.
return obj.object, "", nil
```
Prevent cross-generation objects from sharing one output path
Object state is tracked per (object,generation), but resolveObjectFilename drops generation and returns only obj.object, so two generations of the same bucket/object can write to the same output path and whichever map iteration runs last wins. Since old-generation keys are intentionally isolated rather than guaranteed absent, this creates nondeterministic exports when stale generations coexist in the snapshot.
…to feat/backup-phase0a-s3
Three follow-up issues from Codex on top of round 1.

#619 Codex P1 -- assemble only manifest-declared parts. filterChunksForManifest previously matched on uploadID only; the manifest declares specific (partNo, partVersion) tuples, and S3's overwrite-then-async-cleanup window can leave older partVersion chunks present under the same (bucket, generation, object, uploadID). Mixing them produces corrupted bytes. s3ObjectState gains declaredParts map[s3PartKey]struct{}, populated by HandleObjectManifest. filterChunksForManifest takes the set and restricts emission to declared (partNo, partVersion) tuples. Test TestS3_StalePartVersionExcludedFromAssembledBody asserts a stale partVersion=7 cannot leak when the manifest declares partVersion=9.

#497 Codex P2 -- preserve dot segments. safeJoinUnderRoot used filepath.Clean, which collapses "a/../b" to "b". S3 treats those bytes literally; "a/../b" and "b" are distinct keys that would have silently merged into one output file. Now explicitly rejects any object key whose segments are "." or ".." with ErrS3MalformedKey. NUL bytes are also rejected. Test TestS3_DotSegmentObjectKeyRejected covers the four forms ("a/../b", "a/./b", "..", ".").

#521 Codex P2 -- cross-generation collision. s3BucketState gains activeGen, captured from the bucket-meta record. flushBucket suppresses objects whose generation differs from activeGen (under --include-orphans those flow to _orphans/; by default they're dropped with an s3_stale_generation_objects warning). Test: TestS3_StaleGenerationObjectExcluded. flushBucket cyclomatic complexity stayed under the cap by extracting flushBucketObjects.
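The #497 guard — reject rather than normalize, so distinct S3 keys never merge — can be sketched as below. The standalone function name is hypothetical; the PR folds this check into safeJoinUnderRoot:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errMalformedKey = errors.New("malformed object key")

// rejectDotSegments refuses any key containing a "." or ".." segment or
// an embedded NUL byte, instead of letting filepath.Clean collapse
// "a/../b" into "b" and silently merging two distinct S3 keys.
func rejectDotSegments(key string) error {
	if strings.ContainsRune(key, 0) {
		return errMalformedKey
	}
	for _, seg := range strings.Split(key, "/") {
		if seg == "." || seg == ".." {
			return errMalformedKey
		}
	}
	return nil
}

func main() {
	for _, key := range []string{"a/../b", "a/./b", "..", "."} {
		fmt.Println(key, "->", rejectDotSegments(key)) // all rejected
	}
	fmt.Println("a/b/c ->", rejectDotSegments("a/b/c")) // accepted
}
```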
3 Codex follow-ups addressed:
- P1 #619 — Assemble only manifest-declared parts.
- P2 #497 — Preserve dot segments.
- P2 #521 — Cross-generation collision.
@claude review
Claude finished @bootjp's task in 3m 49s — Review
| Item | Status |
|---|---|
| Silent bypass when manifest.upload_id == "" | Still present (by design — documented in comment) |
| Scratch dir recomputed after scratchDirCreated gate set | Still present (negligible) |
| Orphan chunks loaded via os.ReadFile in flushOrphanObject | Still present (low priority) |
Summary
All three Codex P1/P2 findings from round 2 (declaredParts allow-list, dot-segment rejection, generation guard) are correctly implemented and tested. No new blocking issues found. The PR is ready to merge once the stacked bases land.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 92ee22efad
```go
// flush trusts the caller's collision detection. Path-traversal
// sanitisation runs in safeJoinUnderRoot, downstream of this
// function, where the bucket-directory root is in scope.
return obj.object, "", nil
```
Handle file-vs-directory S3 key collisions before flush
When a bucket contains both path/to and path/to/sub, resolveObjectFilename returns the raw object key for both objects, so flushObject later tries to materialize mutually incompatible filesystem paths (one as a file, one as a parent directory). On POSIX this causes MkdirAll/Rename failures during finalize, so valid S3 keysets can make the backup fail instead of applying the documented leaf-data rename + KEYMAP record strategy.
…to feat/backup-phase0a-s3
Codex P1 #615: when a bucket contains both `path/to` and `path/to/sub`, the natural-path mapping puts the shorter key at a file path while the longer key requires that path to be a parent directory. POSIX cannot represent both; the prior code would fail with EEXIST/ENOTDIR at MkdirAll/Rename during finalize, breaking the documented leaf-data-suffix rename strategy.

A pre-flush computeDirPrefixes() walks every active-gen object key and accumulates each parent prefix (a key "a/b/c" contributes "a" and "a/b"). flushBucketObjects checks whether each object's key is itself a prefix consumed by another active key; if so, flushObjectWithCollision is called with needsLeafDataRename=true, which routes resolveObjectFilename onto the existing S3LeafDataSuffix path. KEYMAP.jsonl records the rename with KindS3LeafData so restore can reverse it. flushObject (the no-collision wrapper) was inlined into flushObjectWithCollision since it was unused after the call sites moved to the explicit-flag form.

Test TestS3_FileVsDirectoryKeyCollisionGetsLeafDataRename emits "path/to" and "path/to/sub", and asserts:
- "path/to/sub" lands at its natural path with body "CHILD"
- "path/to" is renamed to "path/to.elastickv-leaf-data" with body "LEAF"
- KEYMAP.jsonl carries one record with KindS3LeafData and original "path/to"
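The prefix-accumulation pass can be sketched as below. Names follow the commit message, but the exact signatures are assumptions:

```go
package main

import (
	"fmt"
	"strings"
)

// computeDirPrefixes collects every parent prefix of every object key:
// a key "a/b/c" contributes "a" and "a/b". A key that appears in this
// set is itself a directory prefix consumed by some other key.
func computeDirPrefixes(keys []string) map[string]struct{} {
	prefixes := make(map[string]struct{})
	for _, key := range keys {
		segs := strings.Split(key, "/")
		for i := 1; i < len(segs); i++ {
			prefixes[strings.Join(segs[:i], "/")] = struct{}{}
		}
	}
	return prefixes
}

// needsLeafDataRename reports whether key collides with a directory
// prefix needed by another key, i.e. must take the leaf-data suffix.
func needsLeafDataRename(key string, prefixes map[string]struct{}) bool {
	_, ok := prefixes[key]
	return ok
}

func main() {
	prefixes := computeDirPrefixes([]string{"path/to", "path/to/sub"})
	fmt.Println(needsLeafDataRename("path/to", prefixes))     // true: rename
	fmt.Println(needsLeafDataRename("path/to/sub", prefixes)) // false: natural path
}
```

One pass over the keyset plus O(1) lookups per object keeps the check linear in total key length, which matters since it runs for every bucket at finalize.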
Codex P1 #615 addressed in
@claude review
Claude finished @bootjp's task in 3m 30s — Review
| Item | Severity | Location |
|---|---|---|
| Renamed path could itself be a directory prefix for another user object (e.g., path/to, path/to/sub, path/to.elastickv-leaf-data/x): the renamed path/to writes path/to.elastickv-leaf-data as a file, then path/to.elastickv-leaf-data/x needs it as a directory → ENOTDIR. Error surfaces correctly but is opaque (no ErrS3MalformedKey mention). | Low | s3.go:flushObjectWithCollision |
| No test for 3-object multi-rename ("a", "a/b", "a/b/c") — the algorithm handles it but the test suite doesn't verify both KEYMAP records are emitted. | Low | s3_test.go |
| emitObjectAtGen fixture emits part_version as 0 (default); the new declaredParts filter for {partNo:1, partVersion:0} passes correctly via ParseBlobKey → unversioned branch (partVersion=0). The filter is exercised, but at the same version as the "no filter" escape hatch. | Negligible | s3_test.go:emitObjectAtGen |
Carry-overs from prior rounds

| Item | Status |
|---|---|
| Silent bypass when manifest.upload_id == "" | Still present (by design, documented) |
| Scratch dir recomputed after scratchDirCreated gate set | Still present (negligible) |
| Orphan chunks loaded via os.ReadFile in flushOrphanObject | Still present (low priority) |
The fix is correct and ready to merge once stacked bases land. None of the remaining observations are blocking.
Codex P2 #146 round 3: ReadManifest accepted manifests where the adapters and/or exclusions sections were omitted entirely from the JSON, because they were decoded into non-pointer struct fields and validate() only checked scalar policy fields. The empty zero-value sections were treated as valid, hiding truncated or corrupted MANIFEST files and dropping provenance about which scopes/filters the producer used.

Manifest.Adapters and Manifest.Exclusions are now *Adapters / *Exclusions pointer types. validateRequiredFields rejects nil for either with ErrInvalidManifest. NewPhase0SnapshotManifest seeds them as &Adapters{} / &Exclusions{} so producers continue to satisfy the contract by default; populating individual scopes reaches in via the now-non-nil pointer.

Tests TestReadManifest_RejectsMissingAdapters and TestReadManifest_RejectsMissingExclusions cover the new validation path.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2c44292fba
```go
chunks := filterChunksForManifest(obj.chunkPaths, obj.uploadID, obj.declaredParts)
for _, k := range chunks {
	path := obj.chunkPaths[k]
	if err := appendFile(tmp, path); err != nil {
```
Validate manifest chunk completeness before writing body
assembleObjectBody filters/sorts whatever chunk files happen to exist and writes them, but it never verifies that every manifest-declared chunk is present (e.g., via parts[].chunk_count/chunk_sizes). If one or more blob keys are missing in the snapshot (partial data, race, or corruption), finalize still succeeds and emits a truncated object body without any error signal, which silently corrupts the backup output.
```go
for _, seg := range strings.Split(rel, "/") {
	switch seg {
	case ".", "..":
		return "", errors.Wrapf(ErrS3MalformedKey,
```
Reject empty slash segments in object keys
The path guard rejects . and .. segments, but it allows empty segments, and filepath.Join then normalizes keys like a/, a, and a//b onto the same filesystem path. Since S3 treats these keys as distinct, this causes collisions/last-write-wins overwrites during finalize for valid object names containing repeated or trailing / characters.
…est' into feat/backup-phase0a-s3
…718, round 4) Two Codex P1 follow-ups.

#729 -- chunk completeness check. assembleObjectBody filtered + sorted whatever chunk files happened to exist and wrote them, but never verified the manifest's declared chunk_count was actually present. A partial / racy / corrupted snapshot would silently emit a truncated body. Added verifyChunkCompleteness, which counts chunks per (partNo, partVersion) and asserts the count matches manifest.parts[].chunk_count AND the highest chunkNo equals chunk_count-1. A mismatch surfaces as ErrS3IncompleteBlobChunks before any bytes are written. The declaredParts map's value type changed from struct{} to s3DeclaredPart{chunkCount} to carry the contract.

#614 -- empty slash segments. safeJoinUnderRoot rejected `.` and `..` but allowed empty segments ("a//b", "a/", trailing "/"). filepath.Join collapses these, so distinct S3 keys would silently overwrite each other at finalize. The dot-segment guard now also rejects "" segments anywhere except the leading position (where a leading "/" produces an initial empty segment that filepath.Join handles safely under the already-tested absolute-path-confined-under-bucket behaviour).

Tests: TestS3_IncompleteChunksRejected, TestS3_EmptySlashSegmentsRejected (covers a//b, a/, /a//, x/).
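The #729 completeness invariant — count matches the declared chunk_count AND the highest chunkNo equals chunk_count-1 (catching duplicates that keep the count right while leaving a gap) — can be sketched as below. For brevity this sketch keys per partNo only, collapsing the (partNo, partVersion) pair the PR uses; all names are illustrative:

```go
package main

import (
	"errors"
	"fmt"
)

var errIncompleteChunks = errors.New("incomplete blob chunks")

// verifyChunkCompleteness checks, for every declared part, that the
// chunks present on disk number exactly chunkCount and that the highest
// chunk index is chunkCount-1. Duplicates satisfy the count but fail
// the max-index check; gaps fail the count check.
func verifyChunkCompleteness(present map[uint32][]uint32, declared map[uint32]int) error {
	for partNo, want := range declared {
		chunks := present[partNo]
		var maxIdx uint32
		for _, c := range chunks {
			if c > maxIdx {
				maxIdx = c
			}
		}
		if len(chunks) != want || (want > 0 && int(maxIdx)+1 != want) {
			return fmt.Errorf("part %d: have %d chunks, want %d: %w",
				partNo, len(chunks), want, errIncompleteChunks)
		}
	}
	return nil
}

func main() {
	declared := map[uint32]int{1: 3}
	fmt.Println(verifyChunkCompleteness(map[uint32][]uint32{1: {0, 1, 2}}, declared))
	fmt.Println(verifyChunkCompleteness(map[uint32][]uint32{1: {0, 2}}, declared))    // gap -> error
	fmt.Println(verifyChunkCompleteness(map[uint32][]uint32{1: {0, 0, 0}}, declared)) // dup -> error
}
```

This mirrors the Claude review's trace table: {0,2} fails the count check, {0,0,0} passes the count but fails the max-index check.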
2 Codex P1 follow-ups addressed:
- #729 — Chunk completeness.
- #614 — Empty slash segments.

@claude review
Claude finished @bootjp's task in 3m 43s — Review
| chunks for part | declared count | count check | maxIndex check | outcome |
|---|---|---|---|---|
| chunkNo=0, chunkNo=2 (missing 1) | 3 | 2 ≠ 3 → fail | — | error ✅ |
| chunkNo=0 × 3 (dup of 0) | 3 | 3 = 3 | maxIndex+1=1 ≠ 3 → fail | error ✅ |
| chunkNo=0, chunkNo=1, chunkNo=2 | 3 | 3 = 3 | maxIndex+1=3 = 3 | pass ✅ |
| no chunks | N > 0 | 0 ≠ N → fail | — | error ✅ |
The nil escape hatch is safe and correctly limits backward-compat scope to tests that pre-date the manifest-parts feature.
Minor — want.chunkCount == 0 bypass. The guard want.chunkCount > 0 skips validation for zero-chunk declared parts. If a snapshot contains stale chunks for a part declared with chunk_count=0, filterChunksForManifest passes them through (key is in declaredParts), but verifyChunkCompleteness skips the check — those bytes are silently assembled into the output. A chunk_count=0 part is pathological in the first place (an S3 part with no data) so this is an extreme edge case, not a regression.
Fix 2 — safeJoinUnderRoot empty segment rejection ✅
The guard now rejects "" at i != 0 (s3.go:688). The leading-"" carve-out for rel starting with / is correct: filepath.Join(cleanRoot, "/a/b") = filepath.Clean(cleanRoot + "/" + "/a/b") = cleanRoot/a/b on POSIX — the absolute-path is absorbed under the bucket dir, which is covered by TestS3_AbsolutePathObjectKeyConfinedUnderBucket from round 1. The residual boundary check (strings.HasPrefix(joined, rootSep)) still guards against any future join anomaly.
All four test cases traced:
| key | segs | first "" at i | result |
|---|---|---|---|
| "a//b" | ["a","","b"] | i=1 | ErrS3MalformedKey ✅ |
| "a/" | ["a",""] | i=1 | ErrS3MalformedKey ✅ |
| "/a//" | ["","a","",""] | i=2 | ErrS3MalformedKey ✅ |
| "x/" | ["x",""] | i=1 | ErrS3MalformedKey ✅ |
The flushOrphanObject path is exempt (it uses EncodeSegment([]byte(obj.object)) for the directory name, not the raw key), so no gap there.
The error propagates cleanly: safeJoinUnderRoot → flushObjectWithCollision → flushBucketObjects → Finalize with no swallowing — TestS3_EmptySlashSegmentsRejected asserts errors.Is(err, ErrS3MalformedKey) on the Finalize call. ✓
Carry-overs from prior rounds

| Item | Status |
|---|---|
| Silent bypass when manifest.upload_id == "" | Still present (by design, documented) |
| Scratch dir recomputed after scratchDirCreated gate set | Still present (negligible) |
| Orphan chunks loaded via os.ReadFile in flushOrphanObject | Still present (low priority) |
| No test for --include-orphans + stale-gen routing | Still present (low priority) |
Both P1 findings are correctly fixed and tested. No new blocking issues. The PR is ready to merge once the stacked bases (feat/backup-phase0a-dynamodb, feat/s3keys-parse-blob-key) land.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ba33df8e5a
```go
if i != 0 {
	return "", errors.Wrapf(ErrS3MalformedKey,
		"object name has empty path segment %q", rel)
}
```
Reject leading-slash keys to avoid object overwrite collisions
safeJoinUnderRoot intentionally permits an empty first segment, so object keys like /a are accepted and then normalized by filepath.Join to the same output path as a. Because S3 treats /a and a as distinct keys, a bucket containing both will silently collide and whichever object flushes last wins, corrupting backup output without any keymap record. This is triggered whenever a bucket has leading-slash keys alongside equivalent non-leading-slash keys.
Codex P2 round 5 raised two correctness issues:
1. KeymapReader.Next accepted records missing the `original` field
because base64.RawURLEncoding.DecodeString("") succeeds. A
corrupted line dropping `original` silently rewrote the
encoded->original mapping to empty bytes, breaking exact key
recovery for SHA-fallback / collision-renamed entries. Now the
reader decodes into a presence-aware map first and rejects
records missing any of {encoded, original, kind}. Explicit empty
string remains valid.
2. ReadManifest decoded into Manifest before applying the
format_version gate. A future-major manifest that also changed
the JSON type of a known field (e.g. `phase` string -> int) was
surfacing as ErrInvalidManifest instead of
ErrUnsupportedFormatVersion, breaking the advertised
version-branching contract. Now ReadManifest probes
format_version with a relaxed shape first, branches on the
result, and only runs the strict struct decode on a
known-supported version.
Tests: TestKeymapReader_RejectsMissingOriginalField,
TestKeymapReader_AcceptsExplicitEmptyOriginal,
TestReadManifest_FutureMajorVersionTakesPrecedenceOverTypeMismatch.
… round 7) Codex round 7 raised two issues on commit 0afb5e3:

1. P1: KEYMAP records with `"original": null` were silently accepted as empty bytes. The presence-aware decode added in round 5 only checked field-name presence; a null literal round-trips through json.Unmarshal into "" for a string field, and base64.RawURLEncoding.DecodeString("") succeeds. Add an explicit json.RawMessage compare against the literal `null` token so corrupted/truncated records fail at parse time instead of rewriting mappings to empty bytes. Apply the same guard to `encoded` and `kind` for symmetry.
2. P2: Exclusions sub-fields (booleans) defaulted to `false` on absence. validateRequiredFields only checked Exclusions != nil; a manifest that omitted `preserve_sqs_visibility` (or any other flag) silently passed with altered exclusion semantics, losing the producer-side provenance the section is meant to capture. Add `validateExclusionsFieldsPresent` that walks the raw JSON payload and rejects manifests missing or null'ing any of the four flags.

Tests:
- TestKeymapReader_RejectsExplicitNullField (3 sub-cases)
- TestReadManifest_RejectsMissingExclusionFlag (5 sub-cases including explicit-null)
…d 5) Codex P2 round 7 (commit d965b65): `syscall.O_NOFOLLOW` is not defined on Windows, so the round-3 fix that used it directly broke the cross-platform build (`GOOS=windows GOARCH=amd64 go build ./internal/backup` failed at `undefined: syscall.O_NOFOLLOW`). Extract the platform-specific open into a small build-tagged helper:

- open_nofollow_unix.go (`//go:build unix`) keeps the atomic O_NOFOLLOW open with ELOOP wrapping.
- open_nofollow_windows.go (`//go:build windows`) uses Lstat then OpenFile. The residual TOCTOU window is acceptable on Windows because mounting a successful attack on the dump tree there already requires write access to the output directory plus SeCreateSymbolicLinkPrivilege — a much higher bar than the unix case the round-3 fix was hardening against.

Both `openJSONL` and the keymap-creation path in `recordIfFallback` now go through `openSidecarFile`, eliminating the duplicate inline OpenFile blocks.

Verified: `go test -race` passes on Linux/macOS; `GOOS=windows GOARCH=amd64 go build ./internal/backup/...` now succeeds (was the Codex repro).
Codex P2 round 8 (commit 86fbf3a): When `Finalize` encountered a queue state without a !sqs|queue|meta record (most commonly post-DeleteQueue, where the meta row was removed but tombstones remain), it warned and skipped the queue entirely — including any buffered side records. With --include-sqs-side-records on, that silently dropped exactly the deletion-era state operators opted in to capture, contradicting the opt-in contract.

Now the orphan branch flushes side records to a `<encoded>.orphan` sub-directory while still warning that messages were dropped. The `.orphan` suffix prevents a restore tool from mistaking it for a real queue dir produced by a successful meta flush; the encoded prefix is the only stable identifier available without meta. Orphan messages still drop because restoring them without queue config would silently create a default-settings queue. The branch is gated on `s.includeSideRecords` so the default-off contract is preserved: without the flag, no `.orphan` directory is ever created.

Tests:
- TestSQS_OrphanQueueSideRecordsPreserved — drives a vis-record without a meta row, asserts the .orphan/_internals path is populated and the orphan-messages warning fires.
- TestSQS_OrphanQueueSideRecordsSuppressedWhenOptOut — same scenario without --include-sqs-side-records; asserts no .orphan dir is created.
Codex P2 round 8 (commit 6b2dea0):

1. Missing `format_version` was misclassifying corrupted/truncated manifests as ErrUnsupportedFormatVersion. The version probe decoded straight into `uint32`, so an absent field collapsed silently to 0 and the gate said "upgrade required". A truncated manifest belongs in ErrInvalidManifest so callers branching on errors.Is can distinguish "this file is broken" from "newer producer". Add a presence-and-null probe on the raw JSON `format_version` token before the typed unmarshal. Extracted the whole probe into `probeManifestFormatVersion` to keep ReadManifest's cyclomatic complexity under the project's linter ceiling.
2. `WriteManifest` did not enforce `FormatVersion <= CurrentFormatVersion`. A caller mutating `m.FormatVersion = CurrentFormatVersion + 1` could write a manifest that the same package's ReadManifest then rejects — self-incompatible backup metadata for a one-line field mutation. validateRequiredFields now refuses that case.

Tests:
- TestReadManifest_RejectsMissingFormatVersion (also asserts the error is *not* ErrUnsupportedFormatVersion).
- TestReadManifest_RejectsNullFormatVersion (mirror for explicit JSON `null`).
- TestWriteManifest_RejectsFutureFormatVersion (round-trip self-compat check).
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 19d33a6511
```go
f, err := os.Create(path) //nolint:gosec // path composed from output root
if err != nil {
	return errors.WithStack(err)
}
b.keymap = NewKeymapWriter(f)
```
Close KEYMAP file descriptors after writing
recordKeymap opens KEYMAP.jsonl with os.Create and only stores a KeymapWriter; later closeBucketKeymap calls KeymapWriter.Close(), which flushes buffers but does not close the underlying file descriptor. In dumps that produce keymaps for many buckets (e.g., repeated collision renames), descriptors accumulate until EMFILE and subsequent bucket flushes fail, leaving backup output incomplete.
```go
func (s *S3Encoder) recordKeymap(b *s3BucketState, bucketDir, encodedFilename string, original []byte, kind string) error {
	if b.keymap == nil {
		path := filepath.Join(bucketDir, "KEYMAP.jsonl")
		f, err := os.Create(path) //nolint:gosec // path composed from output root
```
Open KEYMAP sidecar with nofollow protections
This path uses os.Create directly for KEYMAP.jsonl, which follows symlinks. If an attacker (or stale prior run) places a symlink at that location, finalize can truncate/overwrite an arbitrary target file when the first keymap record is written. Other backup sidecars already use openSidecarFile to prevent this class of overwrite; KEYMAP should use the same guarded open path.
Codex P2 round 9 (commit 87971ae): The round-5 unix sidecar opener used O_NOFOLLOW so the kernel refuses symlinks atomically, but it kept O_TRUNC on the same call. A local adversary who can write the output directory could pre-create strings_ttl.jsonl / hll_ttl.jsonl / KEYMAP.jsonl as a hard link to a writable file outside the dump tree (e.g. /etc/passwd) and the open's O_TRUNC would zero that inode despite the symlink guard.

Restructure the unix open:

1. Open with O_WRONLY|O_CREATE|O_NOFOLLOW (no O_TRUNC).
2. fstat() the descriptor.
3. If sysStat.Nlink > 1, refuse with a wrapped error and close.
4. Otherwise call f.Truncate(0) — atomic with the file we just verified is single-linked.

The Windows build (open_nofollow_windows.go) is unchanged because its symlink/hard-link attack model is materially different (SeCreateSymbolicLinkPrivilege required, ACLs gate the directory) and the platform's link-count semantics under fstat are nontrivial.

Test:
- TestRedisDB_OpenJSONLRefusesHardLinkClobber — pre-creates strings_ttl.jsonl as a hard link to a bait file, exercises HandleString (which lazy-opens the sidecar), asserts the "refusing to overwrite hard-linked file" error and that the bait file is untouched.
Codex round 9 raised two issues on commit ab38eb0:

1. P1: closeBucketKeymap leaked file descriptors. recordKeymap stored only the *KeymapWriter; closeBucketKeymap called KeymapWriter.Close() which flushes the bufio buffer but does NOT close the underlying *os.File. A dump producing keymaps for many buckets accumulated descriptors until EMFILE, after which subsequent bucket flushes failed and the dump output was incomplete. Track the *os.File on s3BucketState and close it from closeBucketKeymap alongside the KeymapWriter flush.
2. P2: recordKeymap used os.Create for KEYMAP.jsonl, which follows symlinks and clobbers hard links. The redis encoder already routes through openSidecarFile for the same kind of sidecar; mirror that path so a stale prior run (or local adversary) cannot turn a missing KEYMAP into an arbitrary-write primitive against /etc/passwd or similar.

Test: TestS3_KeymapRefusesSymlinkAtFinalize pre-creates KEYMAP.jsonl as a symlink to a bait file, drives a meta-suffix rename (so recordKeymap fires), and asserts both that the finalize returns the symlink-refusal error and that the bait file is untouched.
…d 7) Codex round 9 raised two issues on commit 87971ae:

1. P1: O_NOFOLLOW only blocks the trailing component. A pre-existing directory symlink anywhere up the path (e.g. `<outRoot>/redis/db_0/strings -> /tmp/outside`) let MkdirAll silently honor it, then writeFileAtomic / openSidecarFile wrote outside outRoot. ensureDir now Lstat-walks each ancestor under outRoot via assertNoSymlinkAncestors and refuses if any is a symlink. This is best-effort against TOCTOU (an adversary can race us at any boundary), but closes the much more common case of a stale or misconfigured symlink in the dump tree. Full race-free traversal would need os.Root / openat which is out of scope at this layer.
2. P2: IsBlobAtomicWriteOutOfSpace only checked syscall.ENOSPC, missing Windows disk-full failures (ERROR_DISK_FULL = 112, ERROR_HANDLE_DISK_FULL = 39). Retry/alarm logic that keys off this helper would silently misclassify Windows disk exhaustion. Move the platform-specific check into disk_full_unix.go and disk_full_windows.go; the latter accepts both Win32 codes plus ENOSPC for the Go-translated case.

Tests:
- TestRedisDB_RefusesSymlinkedAncestor: pre-places a `redis/db_0/strings -> /tmp/...` symlink before HandleString and asserts the ancestor refusal + that the bait dir is untouched.

Verified:
- go test -race ./internal/backup/... clean.
- GOOS=windows GOARCH=amd64 go build ./internal/backup/... clean.
- golangci-lint clean.
@claude review Round 7 fixes for Codex round 9 on commit ab38eb0 (commit
Claude finished @bootjp's task in 2m 29s — Review
| File | Opener |
|---|---|
| KEYMAP.jsonl | openSidecarFile ← this fix |
| _incomplete_uploads/records.jsonl | openSidecarFile (via openJSONL) |
The Unix path uses O_NOFOLLOW atomically — no TOCTOU race. The Windows path uses Lstat-then-OpenFile with the documented acceptable residual window. Both produce the "refusing to overwrite symlink at %s" error string the test checks. ✓
return err (not errors.WithStack(err)) at s3.go:769 is correct: openSidecarFile already wraps the error via cockroachdberr.WithStack. ✓
Test: TestS3_KeymapRefusesSymlinkAtFinalize ✅
The test (s3_test.go:292) correctly:

- Pre-creates bucketDir/KEYMAP.jsonl as a symlink to a bait file
- Triggers recordKeymap via WithRenameCollisions(true) + a .elastickv-meta.json key
- Asserts Finalize returns an error containing "refusing to overwrite symlink"
- Asserts os.ReadFile(bait) still contains "stay-out" — the bait was not clobbered ✓
The propagation path is clean: Finalize → flushBucket → closeBucketKeymap is called first (line 496), but recordKeymap fires earlier via flushObjectWithCollision → the error surfaces there and bubbles through flushBucketObjects → flushBucket → Finalize. ✓
Stale comment (carry-over from rounds 5 and 6)
s3.go:706–709 still reads:
```go
// Use filepath.Join here — its only behavioural change vs. raw
// concatenation after the dot-segment guard above is normalising
// a leading "/" off `rel` (which is what we want: absolute-path
// keys collapse safely under bucketDir).
```

Since round 5 (leading-slash rejection), no leading / can reach filepath.Join — it is caught as an empty segment by the guard. The phrase "which is what we want: absolute-path keys collapse safely" actively contradicts the current design. Still low priority, still not blocking.
Carry-overs from prior rounds

| Item | Status |
|---|---|
| Silent bypass when manifest.upload_id == "" | Still present (by design, documented) |
| Scratch dir recomputed after scratchDirCreated gate | Still present (negligible) |
| Orphan chunks loaded via os.ReadFile in flushOrphanObject | Still present (low priority) |
| No test for --include-orphans + stale-gen routing | Still present (low priority) |
| Stale comment at safeJoinUnderRoot re: absolute-path confinement | Still present (low priority) |
Both P1 and P2 findings are correctly fixed and tested. No new blocking issues. The PR is ready to merge once the stacked bases (feat/backup-phase0a-dynamodb, feat/s3keys-parse-blob-key) land.
Codex round 9 (commit 8d08827): `parseSQSMessageDataKey` and `parseSQSGenericKey` only handled the legacy SQS key shape:

```
!sqs|msg|<family>|<encQueue><gen 8B><rest>
```

HT-FIFO partitioned queues use the discriminator-prefixed shape (adapter/sqs_keys.go SqsPartitionedMsg*Prefix):

```
!sqs|msg|<family>|p|<encQueue>|<part 4B><gen 8B><rest>
```

The legacy heuristic read "p" as the queue segment and aborted with ErrSQSMalformedKey. In environments running partitioned FIFO queues HandleMessageData would refuse every record and the dump pass would abort entirely (P1); side records would similarly collapse into one mis-routed bucket under --include-sqs-side-records (P2).

Recognise the partitioned shape via the literal "p|" prefix on the post-family suffix and route into a new parseSQSPartitionedQueueAndTrailer helper that:

- peels the "p|" discriminator
- reads the queue segment up to the literal '|' terminator
- validates that the trailer carries at least partition u32 + gen u64 bytes
- if the caller is parsing a msg-data key, validates the msg-id segment decodes as base64url

Tests:
- TestSQS_ParsePartitionedMessageDataKey
- TestSQS_ParsePartitionedSideRecordKey (vis/byage/dedup/group)
- TestSQS_ParsePartitionedMessageDataKey_RejectsTruncatedTrailer

The legacy parser path is unchanged for non-partitioned keys.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1dc688429b
```go
if needsLeafDataRename {
	return obj.object + S3LeafDataSuffix, KindS3LeafData, nil
}
```
Prevent leaf-data rename from colliding with user keys
When needsLeafDataRename is true, this always rewrites "<object>" to "<object>.elastickv-leaf-data" without checking whether that target is already a real S3 object key in the same bucket. A valid bucket containing both path/to + path/to/sub and an existing path/to.elastickv-leaf-data will have two distinct objects mapped to one output path, so finalize becomes last-write-wins and backup data is silently lost.
```go
if strings.HasSuffix(obj.object, S3MetaSuffixReserved) {
	if !s.renameCollisions {
		return "", "", errors.Wrapf(ErrS3MetaSuffixCollision,
			"bucket %q object %q", b.name, obj.object)
	}
	return obj.object + ".user-data", KindMetaCollision, nil
}
```
Guard meta-collision rename target against key conflicts
In rename-collision mode, objects ending with .elastickv-meta.json are rewritten to "<object>.user-data" unconditionally. If the bucket already contains a real key with that .user-data suffix, both objects are emitted to the same filesystem path and one overwrites the other, and KEYMAP cannot represent both originals, so restore cannot reconstruct the original keyset.
```go
st.manifest = &s3PublicManifest{
	FormatVersion:      1,
	ETag:               live.ETag,
	SizeBytes:          live.SizeBytes,
	ContentType:        live.ContentType,
	ContentEncoding:    live.ContentEncoding,
	CacheControl:       live.CacheControl,
	ContentDisposition: live.ContentDisposition,
	UserMetadata:       live.UserMetadata,
}
```
Populate sidecar last_modified from manifest timestamp
The decoded live manifest includes last_modified_hlc, and s3PublicManifest exposes a last_modified field, but this struct literal never sets it. As a result every exported .elastickv-meta.json drops object modification time metadata, reducing backup fidelity for tooling that relies on the sidecar to preserve HEAD-visible timestamps.
Summary
Stacked on top of #716 (DynamoDB) and #717 (s3keys.ParseBlobKey). Adds the S3 encoder for the Phase 0 logical-backup decoder — the most complex per-adapter piece because it must reassemble multipart object bodies from independent blob chunks.
Snapshot prefixes handled:
- `!s3|bucket|meta|<bucket>` → `s3/<bucket>/_bucket.json`
- `!s3|obj|head|<bucket><gen><object>` → `s3/<bucket>/<object>.elastickv-meta.json` (sidecar)
- `!s3|blob|<bucket><gen><object><uploadID><partNo><chunkNo>[<partVersion>]` → spilled to scratch on arrival; concatenated into `s3/<bucket>/<object>` at Finalize in `(partNo, partVersion, chunkNo)` order
- `!s3|upload|meta|` / `!s3|upload|part|`: excluded by default; opt in via `--include-incomplete-uploads` to emit under `_incomplete_uploads/records.jsonl`
- `!s3|bucket|gen|`, `!s3|gc|upload|`, `!s3route|`: ignored

How body assembly works
Each blob chunk is written atomically to a per-(bucket, object) scratch directory as it arrives, registered under the `(uploadID, partNo, chunkNo, partVersion)` routing key. At Finalize, chunks for each object are sorted and concatenated into a single body file via tmp+rename. Body bytes are never held in memory — only the scratch path map. The scratch tree is removed on Finalize.

Reserved-suffix collision handling
A user object key ending in `.elastickv-meta.json` is rejected with `ErrS3MetaSuffixCollision` by default. `WithRenameCollisions(true)` appends `.user-data` to the body file and records the rename in `s3/<bucket>/KEYMAP.jsonl` with `KindMetaCollision` so the dump remains reversible. Per-bucket `KEYMAP.jsonl` is opened lazily and dropped on Finalize if no records were written.

Test plan
- `go test -race ./internal/backup/...` — pass.
- `golangci-lint run ./internal/backup/...` — clean.
- `--include-incomplete-uploads` round-trip + default skip, versioned blob assembly.

Stacking
Base: `feat/backup-phase0a-dynamodb` (PR #716). Also depends on `feat/s3keys-parse-blob-key` (PR #717).