backup: S3 encoder for buckets, objects, and blob reassembly (Phase 0a)#718

Open
bootjp wants to merge 51 commits into feat/backup-phase0a-dynamodb from feat/backup-phase0a-s3

Conversation

@bootjp
Owner

@bootjp bootjp commented Apr 30, 2026

Summary

Stacked on top of #716 (DynamoDB) and #717 (s3keys.ParseBlobKey). Adds the S3 encoder for the Phase 0 logical-backup decoder — the most complex per-adapter piece because it must reassemble multipart object bodies from independent blob chunks.

Snapshot prefixes handled:

  • !s3|bucket|meta|<bucket> → s3/<bucket>/_bucket.json
  • !s3|obj|head|<bucket><gen><object> → s3/<bucket>/<object>.elastickv-meta.json (sidecar)
  • !s3|blob|<bucket><gen><object><uploadID><partNo><chunkNo>[<partVersion>] → spilled to scratch on arrival; concatenated into s3/<bucket>/<object> at Finalize in (partNo, partVersion, chunkNo) order
  • !s3|upload|meta| / !s3|upload|part|: excluded by default; opt in via --include-incomplete-uploads to emit under _incomplete_uploads/records.jsonl
  • !s3|bucket|gen|, !s3|gc|upload|, !s3route|: ignored

How body assembly works

Each blob chunk is written atomically to a per-(bucket, object) scratch directory as it arrives, registered under the (uploadID, partNo, chunkNo, partVersion) routing key. At Finalize, chunks for each object are sorted and concatenated into a single body file via tmp+rename. Body bytes are never held in memory — only the scratch path map. The scratch tree is removed on Finalize.
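The (partNo, partVersion, chunkNo) ordering can be sketched as follows. Type and field names here are illustrative, not the PR's actual identifiers; only the sort order described above is taken from the source.

```go
package main

import (
	"fmt"
	"sort"
)

// chunkKey mirrors the routing key described above; names are illustrative.
type chunkKey struct {
	uploadID    string
	partNo      uint64
	partVersion uint64
	chunkNo     uint64
}

// sortChunkKeys returns the keys of m in (partNo, partVersion, chunkNo)
// order — the concatenation order used at Finalize.
func sortChunkKeys(m map[chunkKey]string) []chunkKey {
	keys := make([]chunkKey, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		a, b := keys[i], keys[j]
		if a.partNo != b.partNo {
			return a.partNo < b.partNo
		}
		if a.partVersion != b.partVersion {
			return a.partVersion < b.partVersion
		}
		return a.chunkNo < b.chunkNo
	})
	return keys
}

func main() {
	m := map[chunkKey]string{
		{uploadID: "u1", partNo: 2, chunkNo: 0}: "p2c0",
		{uploadID: "u1", partNo: 1, chunkNo: 1}: "p1c1",
		{uploadID: "u1", partNo: 1, chunkNo: 0}: "p1c0",
	}
	for _, k := range sortChunkKeys(m) {
		fmt.Println(m[k])
	}
}
```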

Reserved-suffix collision handling

A user object key ending in .elastickv-meta.json is rejected with ErrS3MetaSuffixCollision by default. WithRenameCollisions(true) appends .user-data to the body file and records the rename in s3/<bucket>/KEYMAP.jsonl with KindMetaCollision so the dump remains reversible. Per-bucket KEYMAP.jsonl is opened lazily and dropped on Finalize if no records were written.
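The rename flow can be sketched like this. The KEYMAP record fields and the helper name are hypothetical (the PR's actual schema is not shown in this thread); only the `.user-data` suffix, the reject-by-default behaviour, and the meta-collision kind come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// keymapRecord is a hypothetical shape for one KEYMAP.jsonl line.
type keymapRecord struct {
	Kind string `json:"kind"` // e.g. "meta_collision"
	From string `json:"from"` // the user's original object key
	To   string `json:"to"`   // the on-disk body filename
}

const metaSuffix = ".elastickv-meta.json"

// resolveBodyName rejects a reserved-suffix collision by default and,
// when renameCollisions is set, appends ".user-data" and emits a keymap
// record so the dump stays reversible.
func resolveBodyName(key string, renameCollisions bool) (string, *keymapRecord, error) {
	if !strings.HasSuffix(key, metaSuffix) {
		return key, nil, nil
	}
	if !renameCollisions {
		return "", nil, fmt.Errorf("reserved suffix collision on %q", key)
	}
	renamed := key + ".user-data"
	return renamed, &keymapRecord{Kind: "meta_collision", From: key, To: renamed}, nil
}

func main() {
	name, rec, err := resolveBodyName("report"+metaSuffix, true)
	if err != nil {
		panic(err)
	}
	line, _ := json.Marshal(rec)
	fmt.Println(name, string(line))
}
```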

Test plan

  • go test -race ./internal/backup/... — pass.
  • golangci-lint run ./internal/backup/... — clean.
  • 11 tests covering bucket-meta round-trip, multipart out-of-order chunk reassembly, orphan-chunks warning, sidecar collision (rejected/renamed), JSON-decode rejection on both paths, ignored-prefix no-op, --include-incomplete-uploads round-trip + default skip, versioned blob assembly.

Stacking

Base: feat/backup-phase0a-dynamodb (PR #716). Also depends on feat/s3keys-parse-blob-key (PR #717).

bootjp added 2 commits April 30, 2026 18:52
The Phase 0a logical-backup decoder (internal/backup) needs to route
each !s3|blob| record to its assembled object body without holding a
live cluster. Today the package only exports parsers for object
manifests and upload parts; blob keys are constructable via
BlobKey / VersionedBlobKey but not parseable.

ParseBlobKey decodes the 6-component form (BlobKey output) and the
7-component form (VersionedBlobKey output with partVersion != 0)
into all components: bucket, generation, object, uploadID, partNo,
chunkNo, partVersion. Truncated keys, malformed segments, and trailers
that aren't either zero bytes or exactly one u64 are rejected with
ok=false.

Implementation is split into parseBlobKeyHead (the 6-component head)
and parseOptionalPartVersion (the trailer) so cyclomatic complexity
stays under the package cap without a //nolint marker.

Tests cover the un-versioned round-trip, versioned round-trip,
partVersion=0 fallback to un-versioned shape (matching
VersionedBlobKey's documented behaviour), rejection of non-blob keys
(bucket meta, object manifest, upload part, junk), and rejection of
trailing-garbage keys.
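The trailer rule above (zero bytes or exactly one u64) can be sketched as follows. The 8-byte big-endian encoding is an assumption for illustration; the real key encoding lives in the s3keys package and may differ.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// parseOptionalPartVersion sketches the trailer rule: the bytes after the
// 6-component head must be empty (un-versioned BlobKey form) or exactly
// one u64 (VersionedBlobKey form). Anything else is rejected with ok=false.
// Assumption: the u64 is stored as 8 big-endian bytes.
func parseOptionalPartVersion(trailer []byte) (partVersion uint64, ok bool) {
	switch len(trailer) {
	case 0:
		return 0, true // 6-component form
	case 8:
		return binary.BigEndian.Uint64(trailer), true // 7-component form
	default:
		return 0, false // truncated or trailing garbage
	}
}

func main() {
	v, ok := parseOptionalPartVersion([]byte{0, 0, 0, 0, 0, 0, 0, 9})
	fmt.Println(v, ok)
	_, ok = parseOptionalPartVersion([]byte{1, 2, 3})
	fmt.Println(ok)
}
```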
Builds on PR #716 (DynamoDB) and PR #717 (s3keys.ParseBlobKey).
Adds the S3 encoder for the Phase 0 logical-backup decoder.

Snapshot prefixes handled:
- !s3|bucket|meta|<bucket> -> s3/<bucket>/_bucket.json (live
  s3BucketMeta projected into the dump-format s3PublicBucket; cluster-
  internal fields like CreatedAtHLC and Generation stripped).
- !s3|obj|head|<bucket><gen><object> -> s3/<bucket>/<object>.elastickv-meta.json
  (live s3ObjectManifest projected into the dump-format
  s3PublicManifest; UploadID and per-part chunk arrays stripped).
- !s3|blob|<bucket><gen><object><uploadID><partNo><chunkNo>[<partVersion>]:
  spilled to a per-(bucket, object) scratch directory as it arrives;
  assembled at Finalize by walking sortChunkKeys output (partNo,
  partVersion, chunkNo) and concatenating into the final body file
  via writeFileAtomic-style tmp+rename. Memory: O(num_objects); body
  bytes never held in memory.
- !s3|upload|meta|, !s3|upload|part|: excluded by default;
  --include-incomplete-uploads emits them as opaque {prefix, key,
  value} JSONL records under s3/<bucket>/_incomplete_uploads/.
- !s3|bucket|gen|, !s3|gc|upload|, !s3route|: HandleIgnored no-op so
  the master pipeline can dispatch all !s3|* prefixes uniformly.

Reserved-suffix collision:
- A user object key ending in .elastickv-meta.json is rejected with
  ErrS3MetaSuffixCollision by default; --rename-collisions appends
  .user-data and records the rename in s3/<bucket>/KEYMAP.jsonl with
  KindMetaCollision so the on-disk reverse map is sound.
- Per-bucket KEYMAP.jsonl is opened lazily and dropped if Finalize
  finds zero records (the spec's "omit empty file" rule).

Body assembly:
- Each blob chunk is written atomically to scratch as it arrives;
  the chunk path is registered in the object's chunkPaths map keyed
  by (uploadID, partNo, chunkNo, partVersion).
- Finalize sorts the chunk keys by (partNo, partVersion, chunkNo)
  and concatenates the matching scratch files into a single
  <bodyPath> via tmp+rename.
- The scratch tree is rm -rf'd on Finalize via deferred RemoveAll,
  so a successful run leaves nothing behind.

Tests: bucket meta + single-chunk object round-trip, multipart
ordering verification (chunks emitted out of order, body assembled
in part/chunk order), orphan-chunks warning, .elastickv-meta.json
collision rejection (default) and rename (with flag) including
KEYMAP entry, malformed JSON rejection on both manifest and bucket
paths, ignored-prefix no-op, --include-incomplete-uploads round-trip,
default no _incomplete_uploads dir, versioned blob assembly.
@coderabbitai

coderabbitai Bot commented Apr 30, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an S3Encoder to process S3 backup records, including bucket metadata, object manifests, and blob chunks, with final assembly performed during the Finalize step. It also adds a ParseBlobKey utility for decoding S3 blob keys. Feedback focuses on critical improvements for data integrity and performance, such as filtering blobs by uploadID during assembly to avoid corruption from multiple upload attempts, optimizing file system operations by caching directory creation status and file handles, and correctly implementing the includeOrphans flag. Additionally, the reviewer advised against ignoring errors from resource cleanup methods.

Comment thread internal/backup/s3.go
Comment on lines +99 to +106
type s3ObjectState struct {
	bucket     string
	generation uint64
	object     string
	manifest   *s3PublicManifest
	// chunkPaths maps (uploadID, partNo, chunkNo) -> scratch path.
	chunkPaths map[s3ChunkKey]string
}
Contributor


high

The s3ObjectState should store the uploadID from the object manifest to ensure that only blobs belonging to the active upload are assembled. Additionally, adding a dirCreated flag can optimize directory creation in HandleBlob.

type s3ObjectState struct {
	bucket     string
	generation uint64
	object     string
	uploadID   string
	dirCreated bool
	manifest   *s3PublicManifest
	// chunkPaths maps (uploadID, partNo, chunkNo) -> scratch path.
	chunkPaths map[s3ChunkKey]string
}

Comment thread internal/backup/s3.go Outdated
Comment on lines +314 to +318
jl, err := openJSONL(filepath.Join(dir, "records.jsonl"))
if err != nil {
	return err
}
defer func() { _ = closeJSONL(jl) }()
Contributor


high

Opening and closing the records.jsonl file for every incomplete upload record is inefficient. Consider lazily opening the file and caching the writer in s3BucketState to amortize the I/O cost. Furthermore, if openJSONL uses os.Create, this logic will incorrectly truncate the file on every call, preserving only the last record. Avoid silently dropping entries during serialization to ensure state consistency.

References
  1. Avoid silently dropping entries during serialization or normalization. If placeholder or incomplete entries are valid in some contexts, provide a specific, more lenient handling path for them to ensure they are preserved on disk and avoid state inconsistencies.

Comment thread internal/backup/s3.go Outdated
Comment on lines +472 to +476
func (st *s3ObjectState) attachManifestParts(_ string, _ []s3LivePart) {
	// Intentionally empty: assembleObjectBody consumes the manifest
	// directly via st.manifest at Finalize. Kept as a hook so the
	// callsite reads symmetrically with HandleBlob.
}
Contributor


high

Capture the uploadID from the manifest so it can be used to filter blobs during assembly.

func (st *s3ObjectState) attachManifestParts(uploadID string, _ []s3LivePart) {
	st.uploadID = uploadID
}

Comment thread internal/backup/s3.go Outdated
Comment on lines +497 to +504
chunks := sortChunkKeys(obj.chunkPaths)
for _, k := range chunks {
	path := obj.chunkPaths[k]
	if err := appendFile(tmp, path); err != nil {
		_ = tmp.Close()
		return err
	}
}
Contributor


high

Filter the chunks by the uploadID specified in the manifest. This prevents data corruption if the snapshot contains blobs from multiple upload attempts for the same object key. Additionally, ensure that errors from tmp.Close() are not silently ignored.

	chunks := sortChunkKeys(obj.chunkPaths)
	for _, k := range chunks {
		if k.uploadID != obj.uploadID {
			continue
		}
		path := obj.chunkPaths[k]
		if err := appendFile(tmp, path); err != nil {
			if closeErr := tmp.Close(); closeErr != nil {
				// Log the error as per repository rules
			}
			return err
		}
	}
References
  1. Do not silently ignore errors from Close() methods on resources like network connections. Log them to ensure resource leaks or other cleanup problems are visible.

Comment thread internal/backup/s3.go
Comment on lines +282 to +285
dir := filepath.Join(s.scratchRoot, EncodeSegment([]byte(bucket)), EncodeSegment([]byte(object)))
if err := os.MkdirAll(dir, 0o755); err != nil { //nolint:mnd // 0755 == standard dir mode
	return errors.WithStack(err)
}
Contributor


medium

Calling os.MkdirAll for every blob chunk is inefficient. Use the dirCreated flag in s3ObjectState to perform this operation only once per object.

	if !st.dirCreated {
		dir := filepath.Join(s.scratchRoot, EncodeSegment([]byte(bucket)), EncodeSegment([]byte(object)))
		if err := os.MkdirAll(dir, 0o755); err != nil { //nolint:mnd // 0755 == standard dir mode
			return errors.WithStack(err)
		}
		st.dirCreated = true
	}

Comment thread internal/backup/s3.go Outdated
if err != nil {
	return err
}
defer func() { _ = closeJSONL(jl) }()
Contributor


medium

The error from closeJSONL is silently ignored. For a writable resource, failing to close properly might mean data was not flushed to disk. Per general rules, errors from Close() methods should not be silently ignored.

References
  1. Do not silently ignore errors from Close() methods on resources like network connections. Log them to ensure resource leaks or other cleanup problems are visible.

Comment thread internal/backup/s3.go
Comment on lines +379 to +386
if obj.manifest == nil {
	s.emitWarn("s3_orphan_chunks",
		"bucket", b.name,
		"object", obj.object,
		"chunks", len(obj.chunkPaths),
		"hint", "blob chunks present but no !s3|obj|head record matched")
	return nil
}
Contributor


medium

The includeOrphans flag is currently ignored. The implementation only emits a warning but does not surface the orphan chunks under _orphans/ as described in the documentation and the WithIncludeOrphans method.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 554dd42276

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/s3.go
// vs `path/to/sub`) are deferred until the master pipeline
// detects them across multiple manifests; this PR's per-object
// flush trusts the caller's collision detection.
return obj.object, "", nil


P1 Badge Sanitize object keys before deriving output paths

resolveObjectFilename returns the raw S3 object key and flushObject then does filepath.Join(bucketDir, objectName), so object names like ../x or /tmp/x will escape the bucket directory and write outside the dump root. Because S3 object keys are user-controlled, this enables path traversal/arbitrary file overwrite during backup decode and can corrupt host files rather than just the export tree.


Comment thread internal/backup/s3.go Outdated
Comment on lines +497 to +500
chunks := sortChunkKeys(obj.chunkPaths)
for _, k := range chunks {
	path := obj.chunkPaths[k]
	if err := appendFile(tmp, path); err != nil {


P1 Badge Restrict body assembly to the manifest-selected chunks

assembleObjectBody concatenates every entry in obj.chunkPaths after sorting, but never filters by the manifest's upload_id/declared parts, so stale chunks from a different upload attempt for the same (bucket,generation,object) can be merged into the output body. This produces corrupted object bytes whenever old or concurrent multipart chunks are still present in the blob keyspace.


Comment thread internal/backup/s3.go Outdated
Comment on lines +314 to +318
jl, err := openJSONL(filepath.Join(dir, "records.jsonl"))
if err != nil {
	return err
}
defer func() { _ = closeJSONL(jl) }()


P2 Badge Append incomplete-upload records instead of truncating

HandleIncompleteUpload reopens records.jsonl on every record via openJSONL(...), and that helper uses create/truncate semantics, so each call overwrites prior lines. With --include-incomplete-uploads, only the last upload-meta/part record survives per bucket, defeating the stated “one JSON line per record” behavior and losing forensic data.


bootjp added 2 commits April 30, 2026 19:38
Three correctness bugs and several quality issues, all valid.

CRITICAL: path traversal (Codex P1 #425).
S3 object keys are user-controlled. A key like "../../../etc/passwd"
would, under the prior filepath.Join-and-write code, escape the
bucket directory and overwrite host files. Added safeJoinUnderRoot
which Cleans the joined path and asserts it stays rooted under the
bucket dir; ".." traversal is rejected with ErrS3MalformedKey.
Absolute paths (leading "/") are normalised under the bucket dir by
filepath.Join, which is the safest outcome. Tests:
TestS3_PathTraversalAttemptRejected,
TestS3_AbsolutePathObjectKeyConfinedUnderBucket.

CRITICAL: stale upload-id chunks merged into body (Codex P1 #500,
Gemini HIGH #106/#476/#504).
A snapshot mid-delete-and-recreate or mid-retry can carry blob
chunks for multiple upload attempts under the same (bucket, gen,
object). The prior assembleObjectBody concatenated every chunk
regardless of upload_id, producing corrupted bytes. Now:
- s3ObjectState gains uploadID; HandleObjectManifest sets it.
- New filterChunksForManifest takes the chunkPaths map and the
  manifest's uploadID, returns only matching chunks sorted by
  (partNo, partVersion, chunkNo). Stale-uploadID chunks never
  enter the assembled body.
Test: TestS3_StaleUploadIDChunksFilteredFromAssembledBody.

CRITICAL: incomplete-uploads file truncated per record (Codex P2
#318, Gemini HIGH+MEDIUM #318).
HandleIncompleteUpload re-opened records.jsonl on every call;
openJSONL uses os.Create which truncates. Only the last record
survived per bucket. Now:
- s3BucketState carries an incompleteUploadsJL *jsonlFile lazily
  opened on the first record and cached.
- flushBucket closes it and surfaces the error (was silently
  ignored).
Test: TestS3_IncompleteUploadsAppendsAcrossCalls (3 records, 3
lines on disk).

QUALITY:
- Gemini MEDIUM #285 (MkdirAll per blob): s3ObjectState gains a
  scratchDirCreated bool; HandleBlob runs MkdirAll once.
- Gemini MEDIUM #318 (closeJSONL error ignored): keymap and
  incomplete-uploads writers now surface close errors via
  closeBucketKeymap / explicit closeJSONL return-check.
- Gemini MEDIUM #386 (includeOrphans flag ignored): orphan chunks
  for objects without manifests now write to
  <bucket>/_orphans/<encoded-object>/<chunk>.bin under
  WithIncludeOrphans(true). Test:
  TestS3_OrphanChunksWrittenWhenIncludeOrphans.

Also removed the now-unused attachManifestParts placeholder and
sortChunkKeys helper; their logic moved into
filterChunksForManifest which combines uploadID filtering with the
sort.
@bootjp
Owner Author

bootjp commented Apr 30, 2026

3 critical correctness bugs + 3 quality issues addressed in 19ae3289. All driven by reviewer findings.

Critical

Codex P1 #425 — Path traversal. A user-supplied key like ../../../etc/passwd would have escaped the bucket dir and overwritten host files. Added safeJoinUnderRoot which filepath.Cleans the joined path and asserts it stays rooted under the bucket dir; .. traversal returns ErrS3MalformedKey. Absolute paths (leading /) collapse safely under the bucket dir via filepath.Join, which I assert with TestS3_AbsolutePathObjectKeyConfinedUnderBucket.

Codex P1 #500 + Gemini HIGH #106/#476/#504 — Stale upload-id chunks. A snapshot mid-delete-and-recreate or mid-retry can carry blob chunks for multiple upload attempts under the same (bucket, gen, object). The prior assembleObjectBody concatenated every chunk regardless of upload_id → corrupted bytes. s3ObjectState.uploadID is now set by HandleObjectManifest; new filterChunksForManifest returns only matching chunks. TestS3_StaleUploadIDChunksFilteredFromAssembledBody.

Codex P2 #318 + Gemini HIGH+MEDIUM #318 — Truncated records.jsonl. HandleIncompleteUpload reopened the file on every call (os.Create → truncate); only the last record survived. Now s3BucketState.incompleteUploadsJL is opened lazily once per bucket, cached, closed in flushBucket (with error surfaced). TestS3_IncompleteUploadsAppendsAcrossCalls.

Quality

Gemini MEDIUM #285 — MkdirAll per blob. s3ObjectState.scratchDirCreated makes the call once per object.

Gemini MEDIUM #318 — closeJSONL error. closeBucketKeymap and the incompleteUploadsJL close path now both surface the error.

Gemini MEDIUM #386 — includeOrphans unimplemented. Orphan chunks (chunks present without matching manifest) now write to <bucket>/_orphans/<encoded-object>/<chunk>.bin under WithIncludeOrphans(true). TestS3_OrphanChunksWrittenWhenIncludeOrphans.

Removed the now-unused attachManifestParts placeholder; sortChunkKeys's logic moved into filterChunksForManifest.

@claude review

@claude

claude Bot commented Apr 30, 2026

claude Bot finished @bootjp's task in 4m 18s


Review — 19ae3289 post-reviewer-feedback pass

  • Gather context and read changed files
  • Review path traversal fix (safeJoinUnderRoot)
  • Review stale upload-id chunk filtering (filterChunksForManifest)
  • Review records.jsonl truncation fix (lazy open + cached writer)
  • Review MkdirAll per-blob optimization (scratchDirCreated)
  • Review closeJSONL error surfacing
  • Review includeOrphans implementation
  • Check test coverage for each fix
  • Check for any remaining issues

All six reviewer-flagged defects are correctly addressed. Here's the per-fix verdict plus a few remaining observations.

Fix 1 — Path traversal: safeJoinUnderRoot

The guard (filepath.Clean + HasPrefix(joined, cleanRoot+sep)) correctly rejects ../../../etc/passwd-style keys. The test TestS3_PathTraversalAttemptRejected covers the canonical attack; TestS3_AbsolutePathObjectKeyConfinedUnderBucket documents the deliberate "absolute path → confined under bucket dir" semantics.

Minor — object key "." passes the guard. filepath.Join(cleanRoot, ".") cleans to cleanRoot, satisfying the joined == cleanRoot branch. flushObject would then attempt to write a body file at the bucket directory path, which the OS would reject with EISDIR. The error surfaces, so there's no silent data corruption — but the error message would point to os.Rename rather than key validation, making it hard to diagnose. Worth catching explicitly:

if rel == "." || rel == ".." {
    return "", errors.Wrapf(ErrS3MalformedKey, "object name %q is a directory alias", rel)
}

Fix 2 — Stale upload-ID chunks: filterChunksForManifest

st.uploadID is correctly captured from live.UploadID in HandleObjectManifest (line 288) and consumed in assembleObjectBody (line 589). The filter correctly rejects chunks whose uploadID doesn't match. Test TestS3_StaleUploadIDChunksFilteredFromAssembledBody covers the multi-attempt scenario.

Minor — empty uploadID bypass is silent. The escape hatch (manifestUploadID != "" && k.uploadID != manifestUploadID) is documented in the comment. However, if live.UploadID is empty (malformed snapshot or very old format), the filter is a no-op and all chunks merge regardless of upload attempt — potentially corrupting the output. Consider adding a structured warning when live.UploadID == "" in HandleObjectManifest:

if live.UploadID == "" {
    s.emitWarn("s3_manifest_missing_upload_id",
        "bucket", bucket, "object", object,
        "hint", "stale-chunk filtering disabled for this object")
}

Fix 3 — records.jsonl truncation: lazy open + cached writer ✅

The lazy-open pattern (b.incompleteUploadsJL == nil check in HandleIncompleteUpload) is correct: openJSONL uses os.Create (truncating), but since it's called exactly once per bucket and the file handle is cached, no subsequent call can truncate the file. The flushBucket path closes the writer and surfaces the error. Test TestS3_IncompleteUploadsAppendsAcrossCalls asserting lines == 3 would catch any regression.

Note: openJSONL also calls os.MkdirAll, but HandleIncompleteUpload already calls os.MkdirAll(dir, ...) on the parent before calling openJSONL. The double MkdirAll is harmless but redundant.

Fix 4 — MkdirAll per-blob: scratchDirCreated flag ✅

s3ObjectState.scratchDirCreated gates the MkdirAll call correctly. The scratch dir path is still recomputed on every HandleBlob call even after the gate is set — computing filepath.Join(...) is cheap, but caching dir on the struct would eliminate the repeated allocation and the three EncodeSegment calls. Not blocking, just worth noting for very blob-heavy objects.

Fix 5 — closeJSONL error surfacing ✅

Both closeBucketKeymap (surfacing b.keymap.Close() errors) and flushBucket (surfacing closeJSONL(b.incompleteUploadsJL) errors) now return errors from Close. The closeJSONL implementation correctly flushes the bufio.Writer before closing — flush errors take precedence over close errors.

Fix 6 — includeOrphans implementation ✅

Orphan chunks are written to <bucketDir>/_orphans/<EncodeSegment(obj.object)>/ as individual .bin files (named via blobScratchName). Using EncodeSegment here is correct: it percent-encodes /, .., NUL, etc., so the encoded object key produces a safe directory name. Test TestS3_OrphanChunksWrittenWhenIncludeOrphans confirms the directory is created.

Minor — orphan chunks read into memory. flushOrphanObject uses os.ReadFile(scratchPath) (entire chunk in RAM), whereas the main assembly path streams via appendFile/io.Copy. For a diagnostic recovery path this is acceptable, but a very large chunk would spike memory. Low priority.


Unrelated: s3keys.ParseBlobKey

parsedBlobHead correctly factors the 6-component head; parseOptionalPartVersion handles the exact-u64Bytes trailer for versioned keys and treats excess bytes as a parse failure. Tests cover unversioned, versioned, version=0 (falls back to unversioned), and non-blob key rejection.


Summary

The three Codex P1/P2 bugs (path traversal, stale-upload corruption, records truncation) and three Gemini items (MkdirAll per-blob, closeJSONL error, includeOrphans unimplemented) are all fixed and tested. The remaining observations are minor and none is blocking:

Item | Severity | Location
"." object key hits EISDIR instead of ErrS3MalformedKey | Low | s3.go:safeJoinUnderRoot
Silent bypass when manifest upload_id is empty | Low | s3.go:HandleObjectManifest
Scratch dir recomputed on every blob after gate set | Negligible | s3.go:HandleBlob
Orphan chunks loaded into memory via os.ReadFile | Low | s3.go:flushOrphanObject


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 19ae3289ee


Comment thread internal/backup/s3.go
Comment on lines +616 to +619
if manifestUploadID != "" && k.uploadID != manifestUploadID {
	continue
}
keys = append(keys, k)


P1 Badge Assemble only manifest-declared part versions

Fresh evidence: this revision now filters by upload_id, but filterChunksForManifest still appends every (partNo, partVersion, chunkNo) for that upload without checking the manifest’s declared parts. In S3, overwritten parts can leave older blob versions present until async cleanup runs, so this logic can concatenate stale part versions (or silently miss required chunks) and produce corrupted object bodies even when the manifest points to a single canonical part version.


Comment thread internal/backup/s3.go Outdated
Comment on lines +495 to +497
joined := filepath.Clean(filepath.Join(cleanRoot, rel))
rootSep := cleanRoot + string(filepath.Separator)
if joined != cleanRoot && !strings.HasPrefix(joined, rootSep) {


P2 Badge Avoid normalizing dot segments in object keys

safeJoinUnderRoot uses filepath.Clean(filepath.Join(...)), which collapses keys like a/../b to b and . to the bucket root. Because S3 object keys treat these bytes literally, this can make distinct keys collide/overwrite each other, and keys that clean to the root can cause finalize-time write errors instead of emitting the object data.


Comment thread internal/backup/s3.go
// flush trusts the caller's collision detection. Path-traversal
// sanitisation runs in safeJoinUnderRoot, downstream of this
// function, where the bucket-directory root is in scope.
return obj.object, "", nil


P2 Badge Prevent cross-generation objects from sharing one output path

Object state is tracked per (object,generation), but resolveObjectFilename drops generation and returns only obj.object, so two generations of the same bucket/object can write to the same output path and whichever map iteration runs last wins. Since old-generation keys are intentionally isolated rather than guaranteed absent, this creates nondeterministic exports when stale generations coexist in the snapshot.


bootjp added 2 commits April 30, 2026 20:01
Three follow-up issues from Codex on top of round 1.

#619 Codex P1 -- assemble only manifest-declared parts.
filterChunksForManifest previously matched on uploadID only; the
manifest declares specific (partNo, partVersion) tuples, and S3's
overwrite-then-async-cleanup window can leave older partVersion
chunks present under the same (bucket, generation, object,
uploadID). Mixing them produces corrupted bytes.

s3ObjectState gains declaredParts map[s3PartKey]struct{}, populated
by HandleObjectManifest. filterChunksForManifest takes the set and
restricts emission to declared (partNo, partVersion) tuples. Test
TestS3_StalePartVersionExcludedFromAssembledBody asserts a stale
partVersion=7 cannot leak when the manifest declares partVersion=9.
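The allow-list filter can be sketched as follows. Type and field names are illustrative (the PR's real s3ChunkKey/s3PartKey are not shown in full); the uploadID match, the declared-(partNo, partVersion) restriction, and the sort order come from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

// Illustrative key shapes.
type partKey struct{ partNo, partVersion uint64 }
type chunkKey struct {
	uploadID    string
	partNo      uint64
	partVersion uint64
	chunkNo     uint64
}

// filterChunksForManifest keeps only chunks whose uploadID matches the
// manifest and whose (partNo, partVersion) tuple is declared, then sorts
// the survivors into (partNo, partVersion, chunkNo) assembly order.
func filterChunksForManifest(chunks map[chunkKey]string, uploadID string, declared map[partKey]struct{}) []chunkKey {
	keys := make([]chunkKey, 0, len(chunks))
	for k := range chunks {
		if uploadID != "" && k.uploadID != uploadID {
			continue // stale upload attempt
		}
		if _, ok := declared[partKey{k.partNo, k.partVersion}]; !ok {
			continue // part version not declared by the manifest
		}
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		a, b := keys[i], keys[j]
		if a.partNo != b.partNo {
			return a.partNo < b.partNo
		}
		if a.partVersion != b.partVersion {
			return a.partVersion < b.partVersion
		}
		return a.chunkNo < b.chunkNo
	})
	return keys
}

func main() {
	chunks := map[chunkKey]string{
		{uploadID: "u2", partNo: 1, partVersion: 9}: "fresh",
		{uploadID: "u2", partNo: 1, partVersion: 7}: "stale part version",
		{uploadID: "u1", partNo: 1, partVersion: 9}: "stale upload",
	}
	declared := map[partKey]struct{}{{1, 9}: {}}
	fmt.Println(filterChunksForManifest(chunks, "u2", declared))
}
```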

#497 Codex P2 -- preserve dot segments.
safeJoinUnderRoot used filepath.Clean which collapses "a/../b" to
"b". S3 treats those bytes literally; "a/../b" and "b" are distinct
keys that would have silently merged into one output file. Now
explicitly rejects any object key whose segments are "." or ".."
with ErrS3MalformedKey. NUL bytes also rejected. Test
TestS3_DotSegmentObjectKeyRejected covers the four forms ("a/../b",
"a/./b", "..", ".").

#521 Codex P2 -- cross-generation collision.
s3BucketState gains activeGen captured from the bucket-meta record.
flushBucket suppresses objects whose generation differs from
activeGen (under --include-orphans, those flow to _orphans/; by
default they're dropped with an s3_stale_generation_objects
warning). Tests TestS3_StaleGenerationObjectExcluded.

flushBucket cyclomatic complexity stayed under the cap by extracting
flushBucketObjects.
@bootjp
Owner Author

bootjp commented Apr 30, 2026

3 Codex follow-ups addressed in 92ee22ef:

P1 #619 — Assemble only manifest-declared parts. filterChunksForManifest previously matched on uploadID only; the manifest declares specific (partNo, partVersion) tuples, and S3's overwrite-then-async-cleanup window can leave older partVersion chunks present. Mixing them produced corrupted bytes. s3ObjectState.declaredParts is now populated from manifest.parts and used as a strict allow-list. TestS3_StalePartVersionExcludedFromAssembledBody.

P2 #497 — Preserve dot segments. safeJoinUnderRoot used filepath.Clean which collapsed a/../b to b. S3 treats those bytes literally; two distinct keys would have silently merged. Now rejects any object key whose path segments are . or .. with ErrS3MalformedKey. NUL bytes also rejected. TestS3_DotSegmentObjectKeyRejected.

P2 #521 — Cross-generation collision. s3BucketState.activeGen captures the bucket-meta record's generation. Objects from older incarnations of the same bucket name are now suppressed (with s3_stale_generation_objects warning) or routed to _orphans/ under --include-orphans. TestS3_StaleGenerationObjectExcluded.

flushBucket complexity kept under the cap by extracting flushBucketObjects.

@claude review

@claude

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 3m 49s

Review — 92ee22ef Codex follow-ups (round 2)


All three Codex findings are correctly addressed. Per-fix verdict follows.

Fix 1 — Manifest-declared part filtering: declaredParts allow-list ✅

s3PartKey{partNo, partVersion} is populated from live.Parts in HandleObjectManifest (line 315-317) and consumed in filterChunksForManifest (line 716-719). The dual-filter (uploadID first, then declaredParts) is correctly ordered. TestS3_StalePartVersionExcludedFromAssembledBody covers the canonical overwrite scenario: manifest declares partVersion=9, stale chunk at partVersion=7 is excluded, only "OKBYE" assembles.

Minor — empty Parts list creates a non-nil empty map. When a manifest JSON omits or nulls the parts array, live.Parts is nil; make(map[s3PartKey]struct{}, 0) still allocates, and the for range over the nil slice adds nothing, so the result is a non-nil empty map. In filterChunksForManifest, declaredParts != nil fires, and since no keys match, every chunk is filtered → the assembled body is empty. For a zero-byte object (SizeBytes=0) that is correct. For a malformed manifest with SizeBytes>0 but empty parts it is silently wrong, but that is a pre-existing manifest-format invariant, not a regression here.

Note — nil vs non-nil backward-compat path. The comment correctly documents that nil declaredParts means "no filter" for tests that pre-date the feature. All production callers go through HandleObjectManifest, which always sets a non-nil map. The escape hatch is safe.

Fix 2 — Dot-segment key rejection ✅

strings.Split(rel, "/") followed by a switch "." / ".." guard (lines 577-583) correctly rejects the four test cases (a/../b, a/./b, .., .). The comment in safeJoinUnderRoot explains why rejection beats normalization (S3 treats dot-segment bytes literally; filepath.Clean would merge distinct keys).

After the guard, filepath.Join(cleanRoot, rel) — which calls filepath.Clean internally — can only perform harmless normalization (double-slash collapse, trailing-slash strip) since no ./.. segments remain. The residual joined != cleanRoot check still catches any remaining escape attempts. ✓

The "." object-key issue I flagged in the prior round is now resolved by the segment guard (strings.Split(".", "/") = ["."] → caught). ✓

Minor — empty segment from trailing slash ("a/") passes the guard. strings.Split("a/", "/") = ["a", ""]; the "" segment matches neither "." nor "..". filepath.Join normalizes "a/" to "a", so the output path is <bucketDir>/a — a valid file. This is fine behaviour (trailing-slash keys are legal in S3 and produce a file at the key's literal path), but worth being aware of: "a/" and "a" would collide at the same output file. That's a path-collision concern, documented as deferred to the master pipeline.

Fix 3 — Cross-generation collision: activeGen guard ✅

b.activeGen is captured from live.Generation in HandleBucketMeta (line 268). flushBucketObjects skips objects whose obj.generation != b.activeGen when b.activeGen != 0 (line 457). The b.activeGen == 0 bypass ("no bucket-meta seen yet") is correct: without a reference generation, all objects flush and the existing orphan-warning path covers them.

TestS3_StaleGenerationObjectExcluded correctly verifies: gen=7 live object appears, gen=6 stale object absent, s3_stale_generation_objects warning fired. ✓

The cyclomatic-complexity reduction via flushBucketObjects extraction is clean; the stale counter accumulates even when --include-orphans routes the object to flushOrphanObject, so the summary warning count is always accurate.

Minor — stale-gen objects with --include-orphans emit two warning types. flushOrphanObject emits s3_orphan_chunks; the parent then emits s3_stale_generation_objects via the stale counter. An operator with --include-orphans set will see both events for the same objects. The semantics are not wrong, but the dual emission could be confusing in log aggregation. A future improvement would be to suppress s3_orphan_chunks for the stale-gen routing path, but it's not blocking.

Minor — no test for --include-orphans + stale-gen routing. TestS3_StaleGenerationObjectExcluded only exercises the suppression path. There is no test verifying that stale-gen objects land under _orphans/ when --include-orphans is set. Low priority but worth adding.

Minor — generation=0 bucket identity. If a bucket's live Generation field is 0 (e.g., very old data or a schema that omits it), HandleBucketMeta sets b.activeGen = 0, which the guard treats as "no filter". All objects flush regardless of their generation. This is the correct safe fallback — no object is incorrectly suppressed — but it does mean generation-0 buckets cannot use the stale-object suppression path. Acceptable given the server presumably never writes generation=0 as an active bucket generation.


Remaining carry-overs from prior review

  • Silent bypass when manifest.upload_id == "" -- still present (by design, documented in comment)
  • Scratch dir recomputed after scratchDirCreated gate set -- still present (negligible)
  • Orphan chunks loaded via os.ReadFile in flushOrphanObject -- still present (low priority)

Summary

All three Codex P1/P2 findings from round 2 (declaredParts allow-list, dot-segment rejection, generation guard) are correctly implemented and tested. No new blocking issues found. The PR is ready to merge once the stacked bases land.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 92ee22efad

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/s3.go
// flush trusts the caller's collision detection. Path-traversal
// sanitisation runs in safeJoinUnderRoot, downstream of this
// function, where the bucket-directory root is in scope.
return obj.object, "", nil

P1 Badge Handle file-vs-directory S3 key collisions before flush

When a bucket contains both path/to and path/to/sub, resolveObjectFilename returns the raw object key for both objects, so flushObject later tries to materialize mutually incompatible filesystem paths (one as a file, one as a parent directory). On POSIX this causes MkdirAll/Rename failures during finalize, so valid S3 keysets can make the backup fail instead of applying the documented leaf-data rename + KEYMAP record strategy.


bootjp added 2 commits April 30, 2026 20:17
Codex P1 #615: when a bucket contains both `path/to` and
`path/to/sub`, the natural-path mapping puts the shorter key at a
file path and the longer key requires that path to be a parent
directory. POSIX cannot represent both; the prior code would fail
with EEXIST/ENOTDIR at MkdirAll/Rename during finalize, breaking
the documented leaf-data-suffix rename strategy.

Pre-flush computeDirPrefixes() walks every active-gen object key
and accumulates each parent prefix (a key "a/b/c" contributes "a"
and "a/b"). flushBucketObjects checks whether each object's key is
itself a prefix consumed by another active key; if so,
flushObjectWithCollision is called with needsLeafDataRename=true,
which routes resolveObjectFilename onto the existing
S3LeafDataSuffix path. KEYMAP.jsonl records the rename with
KindS3LeafData so restore can reverse it.

flushObject (the no-collision wrapper) was inlined into
flushObjectWithCollision since it was unused after the call sites
moved to the explicit-flag form.

Test TestS3_FileVsDirectoryKeyCollisionGetsLeafDataRename emits
"path/to" and "path/to/sub", asserts:
  - "path/to/sub" lands at its natural path with body "CHILD"
  - "path/to" is renamed to "path/to.elastickv-leaf-data" with
    body "LEAF"
  - KEYMAP.jsonl carries one record with KindS3LeafData and
    original "path/to"
@bootjp
Owner Author

bootjp commented Apr 30, 2026

Codex P1 #615 addressed in 2c44292f. When a bucket holds both path/to and path/to/sub the natural-path mapping requires path/to to be a regular file AND a directory; POSIX can't, so the prior code failed with EEXIST/ENOTDIR at finalize.

flushBucketObjects now pre-computes the set of "directory prefixes" required by the union of active-gen object keys (object a/b/c contributes a and a/b). Each object whose own key is a member of that set is flagged for the documented leaf-data rename: path/to → path/to.elastickv-leaf-data + KEYMAP.jsonl record with KindS3LeafData. The longer key flushes at its natural path unchanged.

TestS3_FileVsDirectoryKeyCollisionGetsLeafDataRename covers the round-trip end-to-end (both files exist, KEYMAP record has KindS3LeafData and original path/to).

@claude review

@claude

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 3m 30s

  • Gather context and read changed files
  • Review file-vs-directory collision fix (computeDirPrefixes + flushBucketObjects)
  • Review leaf-data rename path and KEYMAP record emission
  • Review test coverage (TestS3_FileVsDirectoryKeyCollisionGetsLeafDataRename)
  • Check for remaining issues from prior rounds

Review — 2c44292f file-vs-directory collision fix (round 4)

The Codex P1 #615 fix is correct and clean. Detailed per-component verdict follows.

Fix: computeDirPrefixes + flushBucketObjects

computeDirPrefixes (s3.go:489) builds the prefix set by walking every / position in each active-gen object key. For "a/b/c" it emits "a" and "a/b". Stale-gen objects are correctly excluded (the b.activeGen != 0 && obj.generation != b.activeGen guard matches the one in flushBucketObjects), so a stale object whose key is a prefix of an active object does not incorrectly trigger a rename.

flushBucketObjects (s3.go:447) pre-computes the set before the flush loop, then consults it per-object via needsLeafDataRename := dirPrefixes[obj.object]. Since the prefix set is fixed before iteration begins, the non-deterministic Go map iteration order cannot affect which objects get renamed — the result is deterministic regardless of flush order. ✓

Fix: resolveObjectFilename + leaf-data rename ✅

When needsLeafDataRename is true, resolveObjectFilename (s3.go:641) returns obj.object + S3LeafDataSuffix with kind KindS3LeafData. This goes through safeJoinUnderRoot, so the renamed filename also passes the dot-segment and NUL guard. ✓

The sidecar for the renamed object becomes path/to.elastickv-leaf-data.elastickv-meta.json — a distinct path from the child's sidecar path/to/sub.elastickv-meta.json. No naming conflict. ✓

The meta-suffix check comes before the leaf-data check in resolveObjectFilename. For a key that ends in .elastickv-meta.json (handled first) AND is a directory prefix, only the meta-suffix rename fires — the leaf-data POSIX conflict is not resolved for that specific key. This is a pathological double-collision and the resulting ENOTDIR error would surface correctly (no silent data loss).

Fix: KEYMAP recording ✅

recordKeymap (s3.go:658) is called with encodedFilename = "path/to.elastickv-leaf-data" and original = []byte("path/to"). The KEYMAP.jsonl entry records both the renamed filesystem path and the original S3 key, which is the information restore needs to reverse the rename. closeBucketKeymap still removes a zero-count KEYMAP file, which is safe since any leaf-data rename always writes at least one record. ✓

Test: TestS3_FileVsDirectoryKeyCollisionGetsLeafDataRename

The test (s3_test.go:523) covers:

  • Body of the longer key (path/to/sub) at its natural path ✓
  • Body of the shorter key (path/to) at the leaf-data path ✓
  • KEYMAP Kind == KindS3LeafData
  • KEYMAP Original == "path/to" (round-trip) ✓

One gap worth noting: readKeymapFirstRecord assumes the only KEYMAP record is the leaf-data one. For the 2-object scenario this is correct. A 3-object scenario ("a", "a/b", "a/b/c") would produce two KEYMAP records in non-deterministic order; readKeymapFirstRecord would be unreliable there. This isn't a current bug but is worth a future readKeymapAllRecords-style helper if multi-rename tests are added.

Minor observations

  • Low (s3.go:flushObjectWithCollision): Renamed path could itself be a directory prefix for another user object (e.g., path/to, path/to/sub, path/to.elastickv-leaf-data/x): the renamed path/to writes path/to.elastickv-leaf-data as a file, then path/to.elastickv-leaf-data/x needs it as a directory → ENOTDIR. Error surfaces correctly but is opaque (no ErrS3MalformedKey mention).
  • Low (s3_test.go): No test for 3-object multi-rename ("a", "a/b", "a/b/c") — the algorithm handles it but the test suite doesn't verify both KEYMAP records are emitted.
  • Negligible (s3_test.go:emitObjectAtGen): emitObjectAtGen fixture emits part_version as 0 (default); the new declaredParts filter for {partNo:1, partVersion:0} passes correctly via ParseBlobKey → unversioned branch (partVersion=0). The filter is exercised but at the same version as the "no filter" escape hatch.

Carry-overs from prior rounds

  • Silent bypass when manifest.upload_id == "" -- still present (by design, documented)
  • Scratch dir recomputed after scratchDirCreated gate set -- still present (negligible)
  • Orphan chunks loaded via os.ReadFile in flushOrphanObject -- still present (low priority)

The fix is correct and ready to merge once stacked bases land. None of the remaining observations are blocking.

Codex P2 #146 round 3: ReadManifest accepted manifests where the
adapters and/or exclusions sections were omitted entirely from
the JSON, because they were decoded into non-pointer struct
fields and validate() only checked scalar policy fields. The
empty zero-value sections were treated as valid, hiding
truncated or corrupted MANIFEST files and dropping provenance
about which scopes/filters the producer used.

Manifest.Adapters and Manifest.Exclusions are now *Adapters /
*Exclusions pointer types. validateRequiredFields rejects nil
for either with ErrInvalidManifest. NewPhase0SnapshotManifest
seeds them as &Adapters{} / &Exclusions{} so producers continue
to satisfy the contract by default; populating individual
scopes reaches in via the now-non-nil pointer.

Tests TestReadManifest_RejectsMissingAdapters and
TestReadManifest_RejectsMissingExclusions cover the new
validation path.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2c44292fba


Comment thread internal/backup/s3.go
Comment on lines +726 to +729
chunks := filterChunksForManifest(obj.chunkPaths, obj.uploadID, obj.declaredParts)
for _, k := range chunks {
path := obj.chunkPaths[k]
if err := appendFile(tmp, path); err != nil {

P1 Badge Validate manifest chunk completeness before writing body

assembleObjectBody filters/sorts whatever chunk files happen to exist and writes them, but it never verifies that every manifest-declared chunk is present (e.g., via parts[].chunk_count/chunk_sizes). If one or more blob keys are missing in the snapshot (partial data, race, or corruption), finalize still succeeds and emits a truncated object body without any error signal, which silently corrupts the backup output.


Comment thread internal/backup/s3.go Outdated
Comment on lines +611 to +614
for _, seg := range strings.Split(rel, "/") {
switch seg {
case ".", "..":
return "", errors.Wrapf(ErrS3MalformedKey,

P1 Badge Reject empty slash segments in object keys

The path guard rejects . and .. segments, but it allows empty segments, and filepath.Join then normalizes keys like a/, a, and a//b onto the same filesystem path. Since S3 treats these keys as distinct, this causes collisions/last-write-wins overwrites during finalize for valid object names containing repeated or trailing / characters.


bootjp added 2 commits April 30, 2026 20:32
…718, round 4)

Two Codex P1 follow-ups.

#729 -- chunk completeness check.
assembleObjectBody filtered + sorted whatever chunk files happened
to exist and wrote them, but never verified the manifest's
declared chunk_count was actually present. A partial / racy /
corrupted snapshot would silently emit a truncated body. Added
verifyChunkCompleteness which counts chunks per (partNo,
partVersion) and asserts the count matches manifest.parts[].chunk_count
AND the highest chunkNo equals chunk_count-1. Mismatch surfaces as
ErrS3IncompleteBlobChunks before any bytes are written. The
declaredParts map's value type changed from struct{} to
s3DeclaredPart{chunkCount} to carry the contract.

#614 -- empty slash segments.
safeJoinUnderRoot rejected `.` and `..` but allowed empty segments
("a//b", "a/", trailing "/"). filepath.Join collapses these, so
distinct S3 keys would silently overwrite each other at finalize.
The dot-segment guard now also rejects "" segments anywhere except
the leading position (where a leading "/" produces an initial
empty segment that filepath.Join handles safely under the
already-tested absolute-path-confined-under-bucket behaviour).

Tests: TestS3_IncompleteChunksRejected,
TestS3_EmptySlashSegmentsRejected (covers a//b, a/, /a//, x/).
@bootjp
Owner Author

bootjp commented Apr 30, 2026

2 Codex P1 follow-ups addressed in ba33df8e:

#729 — Chunk completeness. assembleObjectBody previously appended whatever chunk files happened to exist; a partial/racy/corrupted snapshot would silently emit a truncated body. New verifyChunkCompleteness counts chunks per (partNo, partVersion) and asserts both the count equals manifest's chunk_count AND maxChunkNo + 1 == chunk_count. Mismatch returns ErrS3IncompleteBlobChunks before any bytes hit disk. declaredParts value type changed from struct{} to s3DeclaredPart{chunkCount}. Test TestS3_IncompleteChunksRejected (manifest declares 3 chunks, snapshot has chunkNo 0 + 2; chunk 1 missing → error).

#614 — Empty slash segments. safeJoinUnderRoot rejected . and .. but allowed "" segments ("a//b", "a/", trailing "/"). filepath.Join collapses these; distinct S3 keys would silently overwrite each other. Now rejects "" segments anywhere except the leading position (where the leading "/" → initial empty segment is handled safely by the already-tested absolute-path-confined-under-bucket behaviour). Test TestS3_EmptySlashSegmentsRejected covers "a//b", "a/", "/a//", "x/".

@claude review

@claude

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 3m 43s


Review — ba33df8e round 5 (chunk completeness + empty slash segments)

  • Gather context and read changed files
  • Review chunk completeness validation (verifyChunkCompleteness)
  • Review empty slash segment rejection in safeJoinUnderRoot
  • Check test coverage for each fix
  • Check for remaining issues from prior rounds

Both Codex P1 findings are correctly addressed. Detailed verdict follows.

Fix 1 — verifyChunkCompleteness: dual invariant ✅

The function is called between filterChunksForManifest and the assembly loop (s3.go:803), so ErrS3IncompleteBlobChunks fires before any bytes reach the tmp file. The dual invariant — o.count == want.chunkCount and o.maxIndex + 1 == want.chunkCount — is correctly motivated by the comment: a snapshot with N duplicates of chunkNo=0 satisfies the count check alone but fails on maxIndex. Tracing the four interesting cases:

  • chunkNo=0, chunkNo=2 (missing 1), declared count 3: count check fails (2 ≠ 3) → error ✅
  • chunkNo=0 × 3 (dup of 0), declared count 3: count check passes (3 = 3), maxIndex check fails (maxIndex+1 = 1 ≠ 3) → error ✅
  • chunkNo=0, chunkNo=1, chunkNo=2, declared count 3: count check passes (3 = 3), maxIndex check passes (maxIndex+1 = 3) → pass ✅
  • no chunks, declared count N > 0: count check fails (0 ≠ N) → error ✅

The nil escape hatch is safe and correctly limits backward-compat scope to tests that pre-date the manifest-parts feature.

Minor — want.chunkCount == 0 bypass. The guard want.chunkCount > 0 skips validation for zero-chunk declared parts. If a snapshot contains stale chunks for a part declared with chunk_count=0, filterChunksForManifest passes them through (key is in declaredParts), but verifyChunkCompleteness skips the check — those bytes are silently assembled into the output. A chunk_count=0 part is pathological in the first place (an S3 part with no data) so this is an extreme edge case, not a regression.

Fix 2 — safeJoinUnderRoot empty segment rejection ✅

The guard now rejects "" at i != 0 (s3.go:688). The leading-"" carve-out for rel starting with / is correct: filepath.Join(cleanRoot, "/a/b") = filepath.Clean(cleanRoot + "/" + "/a/b") = cleanRoot/a/b on POSIX — the absolute path is absorbed under the bucket dir, which is covered by TestS3_AbsolutePathObjectKeyConfinedUnderBucket from round 1. The residual boundary check (strings.HasPrefix(joined, rootSep)) still guards against any future join anomaly.

All four test cases traced:

  • "a//b": segments ["a","","b"], first "" at i=1 → ErrS3MalformedKey
  • "a/": segments ["a",""], first "" at i=1 → ErrS3MalformedKey
  • "/a//": segments ["","a","",""], first non-leading "" at i=2 → ErrS3MalformedKey
  • "x/": segments ["x",""], first "" at i=1 → ErrS3MalformedKey

The flushOrphanObject path is exempt (it uses EncodeSegment([]byte(obj.object)) for the directory name, not the raw key), so no gap there.

The error propagates cleanly: safeJoinUnderRoot → flushObjectWithCollision → flushBucketObjects → Finalize with no swallowing — TestS3_EmptySlashSegmentsRejected asserts errors.Is(err, ErrS3MalformedKey) on the Finalize call. ✓

Carry-overs from prior rounds

  • Silent bypass when manifest.upload_id == "" -- still present (by design, documented)
  • Scratch dir recomputed after scratchDirCreated gate set -- still present (negligible)
  • Orphan chunks loaded via os.ReadFile in flushOrphanObject -- still present (low priority)
  • No test for --include-orphans + stale-gen routing -- still present (low priority)

Both P1 findings are correctly fixed and tested. No new blocking issues. The PR is ready to merge once the stacked bases (feat/backup-phase0a-dynamodb, feat/s3keys-parse-blob-key) land.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba33df8e5a


Comment thread internal/backup/s3.go Outdated
Comment on lines +688 to +690
if i != 0 {
return "", errors.Wrapf(ErrS3MalformedKey,
"object name has empty path segment %q", rel)

P1 Badge Reject leading-slash keys to avoid object overwrite collisions

safeJoinUnderRoot intentionally permits an empty first segment, so object keys like /a are accepted and then normalized by filepath.Join to the same output path as a. Because S3 treats /a and a as distinct keys, a bucket containing both will silently collide and whichever object flushes last wins, corrupting backup output without any keymap record. This is triggered whenever a bucket has leading-slash keys alongside equivalent non-leading-slash keys.


bootjp added 4 commits April 30, 2026 20:43
Codex P2 round 5 raised two correctness issues:

1. KeymapReader.Next accepted records missing the `original` field
   because base64.RawURLEncoding.DecodeString("") succeeds. A
   corrupted line dropping `original` silently rewrote the
   encoded->original mapping to empty bytes, breaking exact key
   recovery for SHA-fallback / collision-renamed entries. Now the
   reader decodes into a presence-aware map first and rejects
   records missing any of {encoded, original, kind}. Explicit empty
   string remains valid.

2. ReadManifest decoded into Manifest before applying the
   format_version gate. A future-major manifest that also changed
   the JSON type of a known field (e.g. `phase` string -> int) was
   surfacing as ErrInvalidManifest instead of
   ErrUnsupportedFormatVersion, breaking the advertised
   version-branching contract. Now ReadManifest probes
   format_version with a relaxed shape first, branches on the
   result, and only runs the strict struct decode on a
   known-supported version.

Tests: TestKeymapReader_RejectsMissingOriginalField,
TestKeymapReader_AcceptsExplicitEmptyOriginal,
TestReadManifest_FutureMajorVersionTakesPrecedenceOverTypeMismatch.
bootjp added 10 commits April 30, 2026 21:08
… round 7)

Codex round 7 raised two issues on commit 0afb5e3:

1. P1: KEYMAP records with `"original": null` were silently
   accepted as empty bytes. The presence-aware decode added in
   round 5 only checked field-name presence; a null literal
   round-trips through json.Unmarshal into "" for a string field,
   and base64.RawURLEncoding.DecodeString("") succeeds. Add an
   explicit json.RawMessage compare against the literal `null`
   token so corrupted/truncated records fail at parse time
   instead of rewriting mappings to empty bytes. Apply the same
   guard to `encoded` and `kind` for symmetry.

2. P2: Exclusions sub-fields (booleans) defaulted to `false` on
   absence. validateRequiredFields only checked Exclusions != nil;
   a manifest that omitted `preserve_sqs_visibility` (or any other
   flag) silently passed with altered exclusion semantics, losing
   the producer-side provenance the section is meant to capture.
   Add `validateExclusionsFieldsPresent` that walks the raw JSON
   payload and rejects manifests missing or null'ing any of the
   four flags.

Tests:
  - TestKeymapReader_RejectsExplicitNullField (3 sub-cases)
  - TestReadManifest_RejectsMissingExclusionFlag (5 sub-cases
    including explicit-null)
…d 5)

Codex P2 round 7 (commit d965b65):

`syscall.O_NOFOLLOW` is not defined on Windows, so the round-3 fix
that used it directly broke the cross-platform build
(`GOOS=windows GOARCH=amd64 go build ./internal/backup` failed at
`undefined: syscall.O_NOFOLLOW`).

Extract the platform-specific open into a small build-tagged
helper:
  - open_nofollow_unix.go (`//go:build unix`) keeps the atomic
    O_NOFOLLOW open with ELOOP wrapping.
  - open_nofollow_windows.go (`//go:build windows`) uses Lstat
    then OpenFile. The residual TOCTOU window is acceptable on
    Windows because mounting a successful attack on the dump
    tree there already requires write access to the output
    directory plus SeCreateSymbolicLinkPrivilege — a much higher
    bar than the unix case the round-3 fix was hardening
    against.

Both `openJSONL` and the keymap-creation path in
`recordIfFallback` now go through `openSidecarFile`, eliminating
the duplicate inline OpenFile blocks.

Verified: `go test -race` passes on Linux/macOS;
`GOOS=windows GOARCH=amd64 go build ./internal/backup/...` now
succeeds (was the Codex repro).
Codex P2 round 8 (commit 86fbf3a):

When `Finalize` encountered a queue state without a !sqs|queue|meta
record (most commonly post-DeleteQueue, where the meta row was
removed but tombstones remain), it warned and skipped the queue
entirely — including any buffered side records. With
--include-sqs-side-records on, that silently dropped exactly the
deletion-era state operators opted in to capture, contradicting
the opt-in contract.

Now the orphan branch flushes side records to a `<encoded>.orphan`
sub-directory while still warning that messages were dropped. The
`.orphan` suffix prevents a restore tool from mistaking it for a
real queue dir produced by a successful meta flush; the encoded
prefix is the only stable identifier available without meta.
Orphan messages still drop because restoring them without queue
config would silently create a default-settings queue.

The branch is gated on `s.includeSideRecords` so the
default-off contract is preserved: without the flag, no `.orphan`
directory is ever created.

Tests:
  - TestSQS_OrphanQueueSideRecordsPreserved — drives a vis-record
    without a meta row, asserts the .orphan/_internals path is
    populated and the orphan-messages warning fires.
  - TestSQS_OrphanQueueSideRecordsSuppressedWhenOptOut — same
    scenario without --include-sqs-side-records; asserts no
    .orphan dir is created.
bootjp added 4 commits April 30, 2026 21:22
Codex P2 round 8 (commit 6b2dea0):

1. A missing `format_version` misclassified corrupted/truncated
   manifests as ErrUnsupportedFormatVersion. The version probe
   decoded straight into `uint32`, so an absent field collapsed
   silently to 0 and the gate reported "upgrade required". A truncated
   manifest belongs in ErrInvalidManifest so callers branching on
   errors.Is can distinguish "this file is broken" from "newer
   producer". Add a presence-and-null probe on the raw JSON
   `format_version` token before the typed unmarshal. Extracted
   the whole probe into `probeManifestFormatVersion` to keep
   ReadManifest's cyclomatic complexity under the project's
   linter ceiling.

2. `WriteManifest` did not enforce `FormatVersion <=
   CurrentFormatVersion`. A caller mutating
   `m.FormatVersion = CurrentFormatVersion + 1` could write a
   manifest that the same package's ReadManifest then rejects —
   self-incompatible backup metadata for a one-line field
   mutation. validateRequiredFields now refuses that case.

Tests:
  - TestReadManifest_RejectsMissingFormatVersion (also asserts
    the error is *not* ErrUnsupportedFormatVersion).
  - TestReadManifest_RejectsNullFormatVersion (mirror for
    explicit JSON `null`).
  - TestWriteManifest_RejectsFutureFormatVersion (round-trip
    self-compat check).
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 19d33a6511

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/backup/s3.go Outdated
Comment on lines +748 to +752
f, err := os.Create(path) //nolint:gosec // path composed from output root
if err != nil {
	return errors.WithStack(err)
}
b.keymap = NewKeymapWriter(f)

P1 Badge Close KEYMAP file descriptors after writing

recordKeymap opens KEYMAP.jsonl with os.Create and only stores a KeymapWriter; later closeBucketKeymap calls KeymapWriter.Close(), which flushes buffers but does not close the underlying file descriptor. In dumps that produce keymaps for many buckets (e.g., repeated collision renames), descriptors accumulate until EMFILE and subsequent bucket flushes fail, leaving backup output incomplete.


Comment thread internal/backup/s3.go Outdated
func (s *S3Encoder) recordKeymap(b *s3BucketState, bucketDir, encodedFilename string, original []byte, kind string) error {
	if b.keymap == nil {
		path := filepath.Join(bucketDir, "KEYMAP.jsonl")
		f, err := os.Create(path) //nolint:gosec // path composed from output root

P2 Badge Open KEYMAP sidecar with nofollow protections

This path uses os.Create directly for KEYMAP.jsonl, which follows symlinks. If an attacker (or stale prior run) places a symlink at that location, finalize can truncate/overwrite an arbitrary target file when the first keymap record is written. Other backup sidecars already use openSidecarFile to prevent this class of overwrite; KEYMAP should use the same guarded open path.


bootjp added 7 commits April 30, 2026 21:27
Codex P2 round 9 (commit 87971ae):

The round-5 unix sidecar opener used O_NOFOLLOW so the kernel
refuses symlinks atomically, but it kept O_TRUNC on the same call.
A local adversary who can write the output directory could
pre-create strings_ttl.jsonl / hll_ttl.jsonl / KEYMAP.jsonl as a
hard link to a writable file outside the dump tree (e.g.
/etc/passwd) and the open's O_TRUNC would zero that inode despite
the symlink guard.

Restructure the unix open:

  1. Open with O_WRONLY|O_CREATE|O_NOFOLLOW (no O_TRUNC).
  2. fstat() the descriptor.
  3. If sysStat.Nlink > 1, refuse with a wrapped error and close.
  4. Otherwise call f.Truncate(0) — atomic with the file we just
     verified is single-linked.

The Windows build (open_nofollow_windows.go) is unchanged because
its symlink/hard-link attack model is materially different
(SeCreateSymbolicLinkPrivilege required, ACLs gate the directory)
and the platform's link-count semantics under fstat are nontrivial.

Test:
  - TestRedisDB_OpenJSONLRefusesHardLinkClobber — pre-creates
    strings_ttl.jsonl as a hard link to a bait file, exercises
    HandleString (which lazy-opens the sidecar), asserts the
    "refusing to overwrite hard-linked file" error and that the
    bait file is untouched.
Codex round 9 raised two issues on commit ab38eb0:

1. P1: closeBucketKeymap leaked file descriptors. recordKeymap
   stored only the *KeymapWriter; closeBucketKeymap called
   KeymapWriter.Close() which flushes the bufio buffer but does
   NOT close the underlying *os.File. A dump producing keymaps
   for many buckets accumulated descriptors until EMFILE, after
   which subsequent bucket flushes failed and the dump output
   was incomplete. Track the *os.File on s3BucketState and
   close it from closeBucketKeymap alongside the KeymapWriter
   flush.

2. P2: recordKeymap used os.Create for KEYMAP.jsonl, which
   follows symlinks and clobbers hard links. The redis encoder
   already routes through openSidecarFile for the same kind of
   sidecar; mirror that path so a stale prior run (or local
   adversary) cannot turn a missing KEYMAP into an
   arbitrary-write primitive against /etc/passwd or similar.

Test: TestS3_KeymapRefusesSymlinkAtFinalize pre-creates
KEYMAP.jsonl as a symlink to a bait file, drives a meta-suffix
rename (so recordKeymap fires), and asserts both that the
finalize returns the symlink-refusal error and that the bait
file is untouched.
…d 7)

Codex round 9 raised two issues on commit 87971ae:

1. P1: O_NOFOLLOW only blocks the trailing component. A
   pre-existing directory symlink anywhere up the path (e.g.
   `<outRoot>/redis/db_0/strings -> /tmp/outside`) let MkdirAll
   silently honor it, then writeFileAtomic / openSidecarFile
   wrote outside outRoot. ensureDir now Lstat-walks each
   ancestor under outRoot via assertNoSymlinkAncestors and
   refuses if any is a symlink. This is best-effort against
   TOCTOU (an adversary can race us at any boundary), but
   closes the much more common case of a stale or
   misconfigured symlink in the dump tree. Full race-free
   traversal would need os.Root / openat which is out of
   scope at this layer.

2. P2: IsBlobAtomicWriteOutOfSpace only checked syscall.ENOSPC,
   missing Windows disk-full failures (ERROR_DISK_FULL = 112,
   ERROR_HANDLE_DISK_FULL = 39). Retry/alarm logic that keys
   off this helper would silently misclassify Windows disk
   exhaustion. Move the platform-specific check into
   disk_full_unix.go and disk_full_windows.go; the latter
   accepts both Win32 codes plus ENOSPC for the Go-translated
   case.

Tests:
  - TestRedisDB_RefusesSymlinkedAncestor: pre-places a
    `redis/db_0/strings -> /tmp/...` symlink before HandleString
    and asserts the ancestor refusal + that the bait dir is
    untouched.

Verified:
  - go test -race ./internal/backup/... clean.
  - GOOS=windows GOARCH=amd64 go build ./internal/backup/... clean.
  - golangci-lint clean.

@bootjp
Owner Author

bootjp commented Apr 30, 2026

@claude review

Round 7 fixes for Codex round 9 on commit ab38eb0 (commit 46cb56fe):

  1. P1 — KEYMAP file descriptor leak. recordKeymap stored only the *KeymapWriter; closeBucketKeymap called KeymapWriter.Close() which flushes the bufio buffer but does NOT close the underlying *os.File. A dump producing keymaps for many buckets accumulated descriptors until EMFILE and subsequent flushes failed. Now s3BucketState tracks the *os.File and closeBucketKeymap closes it explicitly.

  2. P2 — KEYMAP open followed symlinks. recordKeymap was using os.Create directly, which follows symlinks and clobbers hard links. The redis encoder already routes through openSidecarFile (atomic O_NOFOLLOW + nlink check); s3 now uses the same helper. Test: TestS3_KeymapRefusesSymlinkAtFinalize pre-places a symlink at the KEYMAP path and asserts the bait file is untouched.

go test -race and golangci-lint clean.

@claude

claude Bot commented Apr 30, 2026

Claude finished @bootjp's task in 2m 29s


Review — 46cb56f round 7 (KEYMAP fd leak + symlink protection)

  • Gather context and read changed files
  • Review P1: KEYMAP file descriptor leak fix
  • Review P2: KEYMAP symlink protection via openSidecarFile
  • Check test coverage (TestS3_KeymapRefusesSymlinkAtFinalize)
  • Check for remaining issues from prior rounds

Both Codex round-9 findings are correctly addressed. Detailed verdict follows.

Fix 1 — KEYMAP file descriptor leak: keymapFile *os.File

s3BucketState now tracks the *os.File separately from the *KeymapWriter (s3.go:156). closeBucketKeymap (s3.go:578) calls both in sequence: b.keymap.Close() flushes the bufio buffer, b.keymapFile.Close() closes the OS descriptor.

The error-priority logic is correct: if flushErr != nil, it takes precedence regardless of closeErr; if only closeErr is set, it's wrapped and surfaced. This is consistent with how closeJSONL in redis_string.go:550 handles the same two-phase close (flush→close).

Minor — b.keymapFile != nil guard is always true when b.keymap != nil. Both fields are set together in recordKeymap (s3.go:771-772) and neither is reset to nil independently, so the guard at s3.go:584 is technically dead. Harmless defense-in-depth, but a future reader may wonder when keymapFile could be nil while keymap is non-nil. Not blocking.

Minor — os.Remove runs after both closes. When Count() == 0, the empty KEYMAP file is removed via _ = os.Remove(...) (s3.go:591). This happens after both keymap.Close() and keymapFile.Close(), which is the correct order (close-then-remove avoids removing a file with open descriptors). ✓

Fix 2 — Symlink protection: openSidecarFile in recordKeymap

recordKeymap (s3.go:767) now calls openSidecarFile instead of os.Create. This is the same helper used by openJSONL (redis_string.go:540), making the two S3 sidecar files consistent:

File | Opener
KEYMAP.jsonl | openSidecarFile (this fix)
_incomplete_uploads/records.jsonl | openSidecarFile (via openJSONL)

The Unix path uses O_NOFOLLOW atomically — no TOCTOU race. The Windows path uses Lstat-then-OpenFile with the documented acceptable residual window. Both produce the "refusing to overwrite symlink at %s" error string the test checks. ✓

return err (not errors.WithStack(err)) at s3.go:769 is correct: openSidecarFile already wraps the error via cockroachdberr.WithStack. ✓

Test: TestS3_KeymapRefusesSymlinkAtFinalize

The test (s3_test.go:292) correctly:

  1. Pre-creates bucketDir/KEYMAP.jsonl as a symlink to a bait file
  2. Triggers recordKeymap via WithRenameCollisions(true) + a .elastickv-meta.json key
  3. Asserts Finalize returns an error containing "refusing to overwrite symlink"
  4. Asserts os.ReadFile(bait) still contains "stay-out" — the bait was not clobbered ✓

The propagation path is clean: Finalize → flushBucket → closeBucketKeymap is called first (line 496), but recordKeymap fires earlier via flushObjectWithCollision → the error surfaces there and bubbles through flushBucketObjects → flushBucket → Finalize. ✓

Stale comment (carry-over from rounds 5 and 6)

s3.go:706–709 still reads:

// Use filepath.Join here — its only behavioural change vs. raw
// concatenation after the dot-segment guard above is normalising
// a leading "/" off `rel` (which is what we want: absolute-path
// keys collapse safely under bucketDir).

Since round 5 (leading-slash rejection), no leading / can reach filepath.Join — it is caught as an empty segment by the guard. The phrase "which is what we want: absolute-path keys collapse safely" actively contradicts the current design. Still low priority, still not blocking.


Carry-overs from prior rounds

Item | Status
Silent bypass when manifest.upload_id == "" | Still present (by design, documented)
Scratch dir recomputed after scratchDirCreated gate | Still present (negligible)
Orphan chunks loaded via os.ReadFile in flushOrphanObject | Still present (low priority)
No test for --include-orphans + stale-gen routing | Still present (low priority)
Stale comment at safeJoinUnderRoot re: absolute-path confinement | Still present (low priority)

Both P1 and P2 findings are correctly fixed and tested. No new blocking issues. The PR is ready to merge once the stacked bases (feat/backup-phase0a-dynamodb, feat/s3keys-parse-blob-key) land.

bootjp added 2 commits April 30, 2026 21:39
Codex round 9 (commit 8d08827):

`parseSQSMessageDataKey` and `parseSQSGenericKey` only handled the
legacy SQS key shape:

	!sqs|msg|<family>|<encQueue><gen 8B><rest>

HT-FIFO partitioned queues use the discriminator-prefixed shape
(adapter/sqs_keys.go SqsPartitionedMsg*Prefix):

	!sqs|msg|<family>|p|<encQueue>|<part 4B><gen 8B><rest>

The legacy heuristic read "p" as the queue segment and aborted with
ErrSQSMalformedKey. In environments running partitioned FIFO queues
HandleMessageData would refuse every record and the dump pass
would abort entirely (P1); side records would similarly collapse
into one mis-routed bucket under --include-sqs-side-records (P2).

Recognise the partitioned shape via the literal "p|" prefix on the
post-family suffix and route into a new
parseSQSPartitionedQueueAndTrailer helper that:

  - peels the "p|" discriminator
  - reads the queue segment up to the literal '|' terminator
  - validates that the trailer carries at least partition u32 +
    gen u64 bytes
  - if the caller is parsing a msg-data key, validates the
    msg-id segment decodes as base64url

Tests:
  - TestSQS_ParsePartitionedMessageDataKey
  - TestSQS_ParsePartitionedSideRecordKey (vis/byage/dedup/group)
  - TestSQS_ParsePartitionedMessageDataKey_RejectsTruncatedTrailer

The legacy parser path is unchanged for non-partitioned keys.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1dc688429b


Comment thread internal/backup/s3.go
Comment on lines +751 to +753
if needsLeafDataRename {
	return obj.object + S3LeafDataSuffix, KindS3LeafData, nil
}

P1 Badge Prevent leaf-data rename from colliding with user keys

When needsLeafDataRename is true, this always rewrites "<object>" to "<object>.elastickv-leaf-data" without checking whether that target is already a real S3 object key in the same bucket. A valid bucket containing both path/to + path/to/sub and an existing path/to.elastickv-leaf-data will have two distinct objects mapped to one output path, so finalize becomes last-write-wins and backup data is silently lost.


Comment thread internal/backup/s3.go
Comment on lines +744 to +750
if strings.HasSuffix(obj.object, S3MetaSuffixReserved) {
	if !s.renameCollisions {
		return "", "", errors.Wrapf(ErrS3MetaSuffixCollision,
			"bucket %q object %q", b.name, obj.object)
	}
	return obj.object + ".user-data", KindMetaCollision, nil
}

P1 Badge Guard meta-collision rename target against key conflicts

In rename-collision mode, objects ending with .elastickv-meta.json are rewritten to "<object>.user-data" unconditionally. If the bucket already contains a real key with that .user-data suffix, both objects are emitted to the same filesystem path and one overwrites the other, and KEYMAP cannot represent both originals, so restore cannot reconstruct the original keyset.


Comment thread internal/backup/s3.go
Comment on lines +360 to +369
st.manifest = &s3PublicManifest{
	FormatVersion:      1,
	ETag:               live.ETag,
	SizeBytes:          live.SizeBytes,
	ContentType:        live.ContentType,
	ContentEncoding:    live.ContentEncoding,
	CacheControl:       live.CacheControl,
	ContentDisposition: live.ContentDisposition,
	UserMetadata:       live.UserMetadata,
}

P2 Badge Populate sidecar last_modified from manifest timestamp

The decoded live manifest includes last_modified_hlc, and s3PublicManifest exposes a last_modified field, but this struct literal never sets it. As a result every exported .elastickv-meta.json drops object modification time metadata, reducing backup fidelity for tooling that relies on the sidecar to preserve HEAD-visible timestamps.

