Skip to content

Add optional background concatenation for final uploads#1374

Draft
m7kvqbe1 wants to merge 12 commits into
tus:mainfrom
m7kvqbe1:background-concatenation
Draft

Add optional background concatenation for final uploads#1374
m7kvqbe1 wants to merge 12 commits into
tus:mainfrom
m7kvqbe1:background-concatenation

Conversation

@m7kvqbe1
Copy link
Copy Markdown

Note

Draft, and stacked on #1366 (Idempotency-Key support). Until #1366 merges, the diff here also includes its commits. The background-concatenation work itself is the last 3 commits. I'll rebase once #1366 lands.

Summary

Follow-up to #1364 / #1366 that addresses @Acconut's point that idempotency-key alone does not solve concatenation timeouts: retrying a concat that timed out will likely time out again. The actual fix is to run the concatenation detached from the HTTP request so it can finish even when the client connection or a proxy times out.

  • Add Config.EnableBackgroundConcatenation (CLI: --enable-background-concatenation), opt-in and disabled by default
  • When enabled, final (concat) uploads are assembled in a goroutine detached from the request via context.WithoutCancel; the handler responds 201 Created immediately and the client polls with HEAD until offset == size
  • An in-process tracker records in-flight concatenations so retried requests (deduplicated via the Idempotency-Key from Add Idempotency-Key header support for upload creation requests #1366) do not start a second concat and do not misread a partial in-progress offset as a corrupted state

How this composes with Idempotency-Key (#1366)

  1. First request: creates the final upload, starts the background concat, stores the key to upload-ID mapping, returns 201
  2. Retry while concat is running: the key resolves to the existing upload, the in-progress tracker short-circuits and returns 201 (client keeps polling)
  3. Retry after concat finished: the key resolves to the completed upload and returns 201

Open questions for maintainers

These are the points I flagged in the issue discussion and would like your steer on:

  1. Opt-in flag vs. default behavior - currently opt-in to be safe.
  2. You noted resumable concat is feasible for filestore/gcsstore but not s3store. This PR detaches execution but does not yet add a store capability to gate or resume. Should background concat be gated on a capability, with non-supporting stores falling back to synchronous?
  3. The in-progress tracker is process-local and does not survive restarts. After a restart, a retried request simply starts a fresh background concat (best-effort resume). Is that acceptable for a first iteration, or should this state be persisted?
  4. Background goroutines currently run to completion and are not bound to a server-shutdown context. Worth wiring into graceful shutdown?

Test plan

  • TestBackgroundConcatenation/Complete - 201 returned immediately, finish event emitted asynchronously after concat completes
  • TestBackgroundConcatenation/RetryWhileInProgress - retry mid-concat returns 201, no second concat, no false corruption error
  • Existing handler and concat tests pass
  • go test -race ./pkg/handler/... is clean

m7kvqbe1 added 12 commits April 16, 2026 12:26
Introduce the IdempotencyKeyStore interface for persisting mappings from
client-provided Idempotency-Key header values to upload IDs, enabling
detection of retried upload creation requests.

Add the interface to StoreComposer following the same pattern as Locker,
with UseIdempotencyKeyStore() and Capabilities() support.

Ref: https://www.ietf.org/archive/id/draft-ietf-httpapi-idempotency-key-header-07.html
Persist idempotency key to upload ID mappings as JSON files on disk,
using {sha256(key)}.idempotency-key filenames. Stores the full original
key inside the file to guard against hash collisions.

Follows the filelocker pattern: New(path), UseIn(composer), and the
same directory can be shared with upload data.
In-memory store for cloud storage backends (S3, GCS, Azure) where no
local disk is available. Mappings are lost on process restart but still
protect against duplicate uploads within a single server lifetime.
When a client sends an Idempotency-Key header with a POST request and
an IdempotencyKeyStore is configured, the handler detects retried upload
creation requests:

- If the key maps to a completed concat upload: return it immediately
- If the key maps to an incomplete concat (offset==0): retry concat
- If the key maps to a corrupted concat (0 < offset < size): return error
- If the key maps to a non-concat upload: return it for PATCH resume
- If no mapping exists: create normally and store the key->ID mapping

Also adds Idempotency-Key to CORS AllowHeaders and updates CORS tests.
Configure the appropriate IdempotencyKeyStore for each storage backend:
- filestore: file-based store (persists to same directory as uploads)
- S3/GCS/Azure: in-memory store (lost on restart, but protects within
  a single server lifetime)

Add --disable-idempotency flag to opt out of Idempotency-Key support.
Cover the key scenarios:
- Concat retry when already complete (returns existing)
- Concat retry when incomplete at offset 0 (retries concat)
- Concat retry when corrupted (returns error)
- Regular upload retry (returns existing for PATCH resume)
- No idempotency store configured (header ignored)
- No header sent (normal flow)
- Deleted upload falls through to new creation
os.WriteFile uses O_TRUNC, so a crash between truncation and completed
write leaves an empty or partial JSON file. FindUploadID then returns a
raw json.Unmarshal error, which the handler treats as a hard 500 —
permanently breaking that idempotency key.

Fix both the cause and the symptom:
- StoreUploadID now writes to a .tmp file then renames atomically
- FindUploadID treats corrupted JSON as ErrNotFound (cache miss) so the
  handler falls through to create a new upload and overwrites the file
emitFinishEvents increments UploadsFinished metrics and sends to
CompleteUploads on every call. For an idempotent replay of an
already-completed concat upload, hooks and metrics should not fire
again — consistent with the non-concat replay path which correctly
skips them.

Only run PreFinishResponseCallback so callers can still inject
response headers on replay. Update the ConcatRetryComplete test to
no longer expect a spurious CompleteUploads event.
Add DirModePerm and FileModePerm fields to FileIdempotencyStore,
matching the filestore pattern. Fall back to 0775/0664 defaults when
zero. Wire the CLI flags through so .idempotency-key files are created
with the same permissions as upload data files.
When Config.EnableBackgroundConcatenation is set, final (concat) uploads
are assembled in a goroutine detached from the HTTP request lifetime via
context.WithoutCancel. The handler responds 201 Created immediately and
the client polls with HEAD to observe completion. This prevents large
concatenations from being aborted when a client or proxy times out — the
root cause behind the duplicate uploads in tus#1364.

An in-process tracker (concatTracker) records in-flight concatenations so
retried requests (deduplicated via Idempotency-Key) do not start a second
concat and do not misread a partial in-progress offset as a corrupted
state. Concatenation remains synchronous by default.
Wire the new EnableBackgroundConcatenation config through to the CLI,
opt-in and disabled by default.
- Complete: verifies a final upload responds 201 immediately and emits
  the finish event asynchronously once the background concat finishes.
- RetryWhileInProgress: verifies a retry (via Idempotency-Key) arriving
  mid-concat returns 201 without starting a second concat or misreading
  the partial offset as a corrupted state.

Verified clean under the race detector.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant