Add optional background concatenation for final uploads#1374
Draft
m7kvqbe1 wants to merge 12 commits into
Draft
Conversation
Introduce the IdempotencyKeyStore interface for persisting mappings from client-provided Idempotency-Key header values to upload IDs, enabling detection of retried upload creation requests. Add the interface to StoreComposer following the same pattern as Locker, with UseIdempotencyKeyStore() and Capabilities() support. Ref: https://www.ietf.org/archive/id/draft-ietf-httpapi-idempotency-key-header-07.html
Persist idempotency key to upload ID mappings as JSON files on disk,
using {sha256(key)}.idempotency-key filenames. Stores the full original
key inside the file to guard against hash collisions.
Follows the filelocker pattern: New(path), UseIn(composer), and the
same directory can be shared with upload data.
In-memory store for cloud storage backends (S3, GCS, Azure) where no local disk is available. Mappings are lost on process restart but still protect against duplicate uploads within a single server lifetime.
When a client sends an Idempotency-Key header with a POST request and an IdempotencyKeyStore is configured, the handler detects retried upload creation requests: - If the key maps to a completed concat upload: return it immediately - If the key maps to an incomplete concat (offset==0): retry concat - If the key maps to a corrupted concat (0 < offset < size): return error - If the key maps to a non-concat upload: return it for PATCH resume - If no mapping exists: create normally and store the key->ID mapping Also adds Idempotency-Key to CORS AllowHeaders and updates CORS tests.
Configure the appropriate IdempotencyKeyStore for each storage backend: - filestore: file-based store (persists to same directory as uploads) - S3/GCS/Azure: in-memory store (lost on restart, but protects within a single server lifetime) Add --disable-idempotency flag to opt out of Idempotency-Key support.
Cover the key scenarios: - Concat retry when already complete (returns existing) - Concat retry when incomplete at offset 0 (retries concat) - Concat retry when corrupted (returns error) - Regular upload retry (returns existing for PATCH resume) - No idempotency store configured (header ignored) - No header sent (normal flow) - Deleted upload falls through to new creation
os.WriteFile uses O_TRUNC, so a crash between truncation and completed write leaves an empty or partial JSON file. FindUploadID then returns a raw json.Unmarshal error, which the handler treats as a hard 500 — permanently breaking that idempotency key. Fix both the cause and the symptom: - StoreUploadID now writes to a .tmp file then renames atomically - FindUploadID treats corrupted JSON as ErrNotFound (cache miss) so the handler falls through to create a new upload and overwrites the file
emitFinishEvents increments UploadsFinished metrics and sends to CompleteUploads on every call. For an idempotent replay of an already-completed concat upload, hooks and metrics should not fire again — consistent with the non-concat replay path which correctly skips them. Only run PreFinishResponseCallback so callers can still inject response headers on replay. Update the ConcatRetryComplete test to no longer expect a spurious CompleteUploads event.
Add DirModePerm and FileModePerm fields to FileIdempotencyStore, matching the filestore pattern. Fall back to 0775/0664 defaults when zero. Wire the CLI flags through so .idempotency-key files are created with the same permissions as upload data files.
When Config.EnableBackgroundConcatenation is set, final (concat) uploads are assembled in a goroutine detached from the HTTP request lifetime via context.WithoutCancel. The handler responds 201 Created immediately and the client polls with HEAD to observe completion. This prevents large concatenations from being aborted when a client or proxy times out — the root cause behind the duplicate uploads in tus#1364. An in-process tracker (concatTracker) records in-flight concatenations so retried requests (deduplicated via Idempotency-Key) do not start a second concat and do not misread a partial in-progress offset as a corrupted state. Concatenation remains synchronous by default.
Wire the new EnableBackgroundConcatenation config through to the CLI, opt-in and disabled by default.
- Complete: verifies a final upload responds 201 immediately and emits the finish event asynchronously once the background concat finishes. - RetryWhileInProgress: verifies a retry (via Idempotency-Key) arriving mid-concat returns 201 without starting a second concat or misreading the partial offset as a corrupted state. Verified clean under the race detector.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Note
Draft, and stacked on #1366 (Idempotency-Key support). Until #1366 merges, the diff here also includes its commits. The background-concatenation work itself is the last 3 commits. I'll rebase once #1366 lands.
Summary
Follow-up to #1364 / #1366 that addresses @Acconut's point that idempotency-key alone does not solve concatenation timeouts: retrying a concat that timed out will likely time out again. The actual fix is to run the concatenation detached from the HTTP request so it can finish even when the client connection or a proxy times out.
Config.EnableBackgroundConcatenation(CLI:--enable-background-concatenation), opt-in and disabled by defaultcontext.WithoutCancel; the handler responds201 Createdimmediately and the client polls withHEADuntiloffset == sizeHow this composes with Idempotency-Key (#1366)
Open questions for maintainers
These are the points I flagged in the issue discussion and would like your steer on:
Test plan
TestBackgroundConcatenation/Complete- 201 returned immediately, finish event emitted asynchronously after concat completesTestBackgroundConcatenation/RetryWhileInProgress- retry mid-concat returns 201, no second concat, no false corruption errorgo test -race ./pkg/handler/...is clean