feat(git): add parallel range download for snapshot restore #361

Merged
alecthomas merged 1 commit into main from aat/parallel-snapshot-download on Jun 26, 2026

@alecthomas

Collaborator

Add a `--download-concurrency` flag (default 1) to `cachew git restore`
that fetches the snapshot with bounded concurrent range requests via
`client.ParallelGet`, downloading into a temp file and extracting from it.
A `--download-chunk-size-mb` flag (default 8) tunes the chunk size. A
concurrency of 1, an old server, or a missing ETag transparently falls
back to today's single streaming download.

`ParallelGet` drives the object-key API, but the snapshot lives behind the
`/git` endpoint, so add a `RangeReader` adapter that issues ranged GETs to
the snapshot URL and captures the freshen metadata (commit / bundle URL)
from the discovery response.

Route the cold-start serve paths through `ServeCacheHit` + `ConditionalOptions`
so cold serves advertise an `ETag` and `Accept-Ranges` and honour `Range`,
letting clients parallelise during mirror warm-up too.

@alecthomas alecthomas requested a review from a team as a code owner June 26, 2026 00:59
@alecthomas alecthomas requested review from joshfriend and removed request for a team June 26, 2026 00:59
@alecthomas alecthomas enabled auto-merge (squash) June 26, 2026 00:59
@alecthomas alecthomas merged commit 93d399c into main Jun 26, 2026
6 of 7 checks passed
@alecthomas alecthomas deleted the aat/parallel-snapshot-download branch June 26, 2026 01:01

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee4f7bf116

Comment thread cmd/cachew/git.go
// WriteAt so it cannot stream into extraction; the temp file is removed on
// return.
func (c *GitRestoreCmd) parallelFetchAndExtract(ctx context.Context, api *client.Client) (string, string, error) {
	tmp, err := os.CreateTemp("", "cachew-snapshot-*.tar.zst")

P2: Store parallel snapshots on the target filesystem

When users enable --download-concurrency for a multi-GB snapshot on hosts where /tmp is a small tmpfs or separate quota, this writes the entire compressed snapshot to the default temp directory before extraction. That can fail with ENOSPC even when c.Directory has enough space for the restore; creating the temp file under the target directory's filesystem, or making the temp location configurable, avoids making parallel restore unusable in those environments.
