feat(client): stream ParallelGet in-order to an io.Writer by worstell · Pull Request #360 · block/cachew

worstell · 2026-06-26T00:24:04Z

ParallelGet — and the DownloadGitSnapshot helper that wraps it — previously wrote chunks to an io.WriterAt, which requires a seekable destination (e.g. a temp file) and so prevents a consumer from overlapping the download with processing. They now fetch chunks in parallel but emit in-order bytes to a plain io.Writer via a bounded reorder buffer, letting a streaming consumer (e.g. a decompress/extract pipeline) run concurrently with the download.
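For illustration only (this is not the code in this PR), in-order emission from parallel fetches can be sketched roughly as below. The names `fetchFunc`, `streamInOrder`, and `reassemble` are hypothetical, the fetch is an in-memory stand-in for a ranged GET, and the sketch assumes `chunkSize > 0` and `concurrency >= 1`. Context cancellation and chunk-length validation are left out to keep it short.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"time"
)

// fetchFunc is a hypothetical stand-in for one ranged GET returning bytes [off, off+n).
type fetchFunc func(off, n int64) ([]byte, error)

type result struct {
	data []byte
	err  error
}

// streamInOrder fetches size bytes in chunkSize pieces, keeps at most
// concurrency chunks fetched-but-unwritten, and writes them to w in order.
func streamInOrder(w io.Writer, size, chunkSize int64, concurrency int, fetch fetchFunc) error {
	numChunks := int((size + chunkSize - 1) / chunkSize)
	// Ring of slots: chunk i is delivered through slots[i%concurrency].
	slots := make([]chan result, concurrency)
	for i := range slots {
		slots[i] = make(chan result, 1)
	}
	// window holds one token per chunk that is in flight or fetched but unwritten.
	window := make(chan struct{}, concurrency)
	done := make(chan struct{})
	defer close(done) // stops the dispatcher on early return

	go func() {
		for i := 0; i < numChunks; i++ {
			select {
			case window <- struct{}{}:
			case <-done:
				return
			}
			go func(i int) {
				off := int64(i) * chunkSize
				n := chunkSize
				if size-off < n {
					n = size - off // final chunk may be short
				}
				data, err := fetch(off, n)
				slots[i%concurrency] <- result{data, err}
			}(i)
		}
	}()

	for i := 0; i < numChunks; i++ {
		r := <-slots[i%concurrency] // wait for the next chunk in sequence
		if r.err != nil {
			return r.err
		}
		if _, err := w.Write(r.data); err != nil {
			return err
		}
		<-window // chunk i is written, so its slot may be reused
	}
	return nil
}

// reassemble runs streamInOrder over an in-memory source whose earlier
// chunks take longer, so chunks tend to complete out of order.
func reassemble(src string, chunkSize int64, concurrency int) (string, error) {
	fetch := func(off, n int64) ([]byte, error) {
		time.Sleep(time.Duration(int64(len(src))-off) * 100 * time.Microsecond)
		return []byte(src[off : off+n]), nil
	}
	var out strings.Builder
	err := streamInOrder(&out, int64(len(src)), chunkSize, concurrency, fetch)
	return out.String(), err
}

func main() {
	got, err := reassemble("the quick brown fox jumps over the lazy dog", 8, 3)
	fmt.Printf("%q %v\n", got, err)
}
```

Because a token is released only after a chunk has been written, chunk `i+concurrency` cannot be dispatched until chunk `i` has left its slot, which is what keeps the buffered data at no more than `concurrency` chunks.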

A concurrency-sized window caps fetched-but-unwritten chunks, and the reorder buffer is a ring of that many slots, bounding peak memory to O(concurrency * chunkSize) regardless of object size or consumer speed. A chunk whose body length differs from its requested range (short, or overlong from a backend that ignored the range) is rejected rather than spliced or truncated. Revision safety (ETag pinning via If-Range), empty-object handling, range-ignore degrade, and the concurrency == 1 shortcut are unchanged.
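A minimal sketch of the per-chunk checks described above, assuming a plain HTTP backend: the request pins the revision with `If-Range`, and the body is rejected unless it is exactly the requested length. `fetchRange`, `errChanged`, and `newServer` are hypothetical names, not the client's API. This sketch simply rejects any non-206 reply and does not model the real client's range-ignore degrade path.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// errChanged marks a reply that was not a 206 for the pinned revision.
var errChanged = errors.New("object changed or range ignored")

// fetchRange is a hypothetical helper: GET bytes [off, off+n) of url, pinned to etag.
func fetchRange(c *http.Client, url, etag string, off, n int64) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", off, off+n-1))
	req.Header.Set("If-Range", etag) // server sends the full body if the ETag differs
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusPartialContent {
		return nil, fmt.Errorf("%w: status %d", errChanged, resp.StatusCode)
	}
	// Read one byte past n so an overlong body is detected rather than truncated.
	body, err := io.ReadAll(io.LimitReader(resp.Body, n+1))
	if err != nil {
		return nil, err
	}
	if int64(len(body)) != n {
		return nil, fmt.Errorf("chunk at %d: got %d bytes, want %d", off, len(body), n)
	}
	return body, nil
}

// newServer serves content under the given ETag via http.ServeContent,
// which honours Range and If-Range.
func newServer(content, etag string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("ETag", etag)
		http.ServeContent(w, r, "obj", time.Time{}, strings.NewReader(content))
	}))
}

func main() {
	srv := newServer("0123456789abcdef", `"v1"`)
	defer srv.Close()

	body, err := fetchRange(srv.Client(), srv.URL, `"v1"`, 4, 6)
	fmt.Printf("%q %v\n", body, err)

	// A stale ETag makes the server ignore the range and reply 200.
	_, err = fetchRange(srv.Client(), srv.URL, `"stale"`, 4, 6)
	fmt.Println(errors.Is(err, errChanged))
}
```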

The io.WriterAt variant is removed: no consumer benefited from scatter-writes, and the only use — download-to-temp-file then extract — is slower than streaming because it gives up download/extract overlap. *os.File satisfies io.Writer, so the CLI caller is unaffected.
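To show the download/extract overlap that a plain io.Writer allows, here is a hedged sketch that connects a streaming download to a gzip reader through `io.Pipe`, with no temp file in between. `downloadFunc`, `streamDecompress`, and `roundTrip` are hypothetical; `downloadFunc` stands in for any function that writes an object's bytes in order to an io.Writer, and gzip is only an example consumer.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// downloadFunc is a hypothetical stand-in for a streaming download:
// it writes the object's bytes, in order, to w.
type downloadFunc func(w io.Writer) error

// streamDecompress runs the download and gzip decompression concurrently,
// connected by an in-memory pipe instead of a temp file.
func streamDecompress(download downloadFunc, dst io.Writer) error {
	pr, pw := io.Pipe()
	go func() {
		// CloseWithError(nil) is a normal close, so the reader sees io.EOF.
		pw.CloseWithError(download(pw))
	}()

	zr, err := gzip.NewReader(pr)
	if err != nil {
		pr.CloseWithError(err) // unblock the downloader if it is still writing
		return err
	}
	defer zr.Close()

	if _, err := io.Copy(dst, zr); err != nil {
		pr.CloseWithError(err)
		return err
	}
	return pr.Close()
}

// roundTrip gzips text in memory, then streams it through streamDecompress.
func roundTrip(text string) (string, error) {
	var packed bytes.Buffer
	zw := gzip.NewWriter(&packed)
	if _, err := zw.Write([]byte(text)); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil {
		return "", err
	}
	download := func(w io.Writer) error {
		_, err := io.Copy(w, &packed)
		return err
	}
	var out bytes.Buffer
	err := streamDecompress(download, &out)
	return out.String(), err
}

func main() {
	got, err := roundTrip("snapshot bytes extracted while they download")
	fmt.Printf("%q %v\n", got, err)
}
```

The pipe is unbuffered, so a slow consumer applies backpressure to the writer; any buffering between download and consumer would have to come from the downloader itself.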

Tests cover in-order reassembly, out-of-order completion, the single-worker/empty-object/range-ignore fallbacks, ETag-mismatch and overlong-chunk rejection, and error propagation, all under -race.

ParallelGet (and the DownloadGitSnapshot helper that wraps it) previously wrote chunks to an io.WriterAt, which requires a seekable destination (e.g. a temp file) and so prevents a consumer from overlapping the download with processing. They now fetch chunks in parallel but emit in-order bytes to a plain io.Writer via a bounded reorder buffer, letting a streaming consumer (e.g. a decompress/extract pipeline) run concurrently with the download.

A concurrency-sized window caps fetched-but-unwritten chunks, and the reorder buffer is a ring of that many slots, bounding peak memory to O(concurrency * chunkSize) regardless of object size or consumer speed. A chunk whose body length differs from its requested range (short, or overlong from a backend that ignored the range) is rejected rather than spliced or truncated. Revision safety (ETag pinning via If-Range), empty-object handling, range-ignore degrade, and the concurrency == 1 shortcut are unchanged.

The io.WriterAt variant is removed: no consumer benefited from scatter-writes, and the only use (download-to-temp-file then extract) is slower than streaming because it gives up download/extract overlap.

Amp-Thread-ID: https://ampcode.com/threads/T-019ef6a9-a407-7389-bc43-001405e3ae9e
Co-authored-by: Amp <amp@ampcode.com>

worstell force-pushed the worstell/parallel-get-stream branch from 4d97fb3 to 83e3e16 · June 26, 2026 00:30

worstell changed the title ~~feat(client): add ParallelGetStream for in-order streaming parallel downloads~~ feat(client): stream ParallelGet in-order to an io.Writer Jun 26, 2026

worstell force-pushed the worstell/parallel-get-stream branch from 83e3e16 to 250ab5b · June 26, 2026 18:20

worstell wants to merge 1 commit into main from worstell/parallel-get-stream
