Add Idempotency-Key header support for upload creation requests#1366
Add Idempotency-Key header support for upload creation requests#1366m7kvqbe1 wants to merge 9 commits into
Conversation
Introduce the IdempotencyKeyStore interface for persisting mappings from client-provided Idempotency-Key header values to upload IDs, enabling detection of retried upload creation requests. Add the interface to StoreComposer following the same pattern as Locker, with UseIdempotencyKeyStore() and Capabilities() support. Ref: https://www.ietf.org/archive/id/draft-ietf-httpapi-idempotency-key-header-07.html
Persist idempotency key to upload ID mappings as JSON files on disk,
using {sha256(key)}.idempotency-key filenames. Stores the full original
key inside the file to guard against hash collisions.
Follows the filelocker pattern: New(path), UseIn(composer), and the
same directory can be shared with upload data.
In-memory store for cloud storage backends (S3, GCS, Azure) where no local disk is available. Mappings are lost on process restart but still protect against duplicate uploads within a single server lifetime.
When a client sends an Idempotency-Key header with a POST request and an IdempotencyKeyStore is configured, the handler detects retried upload creation requests: - If the key maps to a completed concat upload: return it immediately - If the key maps to an incomplete concat (offset==0): retry concat - If the key maps to a corrupted concat (0 < offset < size): return error - If the key maps to a non-concat upload: return it for PATCH resume - If no mapping exists: create normally and store the key->ID mapping Also adds Idempotency-Key to CORS AllowHeaders and updates CORS tests.
Configure the appropriate IdempotencyKeyStore for each storage backend: - filestore: file-based store (persists to same directory as uploads) - S3/GCS/Azure: in-memory store (lost on restart, but protects within a single server lifetime) Add --disable-idempotency flag to opt out of Idempotency-Key support.
Cover the key scenarios: - Concat retry when already complete (returns existing) - Concat retry when incomplete at offset 0 (retries concat) - Concat retry when corrupted (returns error) - Regular upload retry (returns existing for PATCH resume) - No idempotency store configured (header ignored) - No header sent (normal flow) - Deleted upload falls through to new creation
|
Review by Devin: https://app.devin.ai/review/tus/tusd/pull/1366 |
os.WriteFile uses O_TRUNC, so a crash between truncation and completed write leaves an empty or partial JSON file. FindUploadID then returns a raw json.Unmarshal error, which the handler treats as a hard 500 — permanently breaking that idempotency key. Fix both the cause and the symptom: - StoreUploadID now writes to a .tmp file then renames atomically - FindUploadID treats corrupted JSON as ErrNotFound (cache miss) so the handler falls through to create a new upload and overwrites the file
emitFinishEvents increments UploadsFinished metrics and sends to CompleteUploads on every call. For an idempotent replay of an already-completed concat upload, hooks and metrics should not fire again — consistent with the non-concat replay path which correctly skips them. Only run PreFinishResponseCallback so callers can still inject response headers on replay. Update the ConcatRetryComplete test to no longer expect a spurious CompleteUploads event.
Add DirModePerm and FileModePerm fields to FileIdempotencyStore, matching the filestore pattern. Fall back to 0775/0664 defaults when zero. Wire the CLI flags through so .idempotency-key files are created with the same permissions as upload data files.
Thanks @Murderlon. I've added some fixups to this PR based on Devin's review: |
|
cc @Acconut |
Acconut
left a comment
There was a problem hiding this comment.
Thank you for opening this PR. I'm wondering if this is actually capable of solving the timeout issues when concatenating large uploads. Tusd would just retry the concatenating operation when it detects an existing upload resource with the same idempotency key.
But if we assume that the concatenation operation times out because the underlying process is too slow, retrying the operation would likely not help. And tusd aborts the concatenation operation whenever the request times out on the client or proxy side. Unless the process is just sometimes too slow, and there is a chance that it will be faster at the new attempt, retrying offers little improvement. Tusd would actually have to resume the concatenation operation, but I'm not sure this is possible for all storage implementations. It might be doable for filestore and gcsstore, but not for s3store as far as I can tell.
Idempotency keys are a helpful tool to prevent duplicate upload resources from being created, but it might not be that helpful for solving concatenation failures on its own as I initially assumed. If we take a look at the original issue of duplicate uploads because of failed concatenations, the duplicates are just a symptom, but not the actual fault. That is the concatenation failing in the first place due to the request timing out. To prevent that the concatenation could be run in a background process, which has more time to complete than the short-lived request context. So even if the request times out after X seconds, the concatenation can continue. Deduplication via Idempotency-Key would still play an important role here as it ensures that client retrieves the original upload resources when it retries the upload concatenation request.
What do you think about this idea? The hurdle to getting this in tusd is higher since it currently doesn't have a system for running tasks in the background. But we might want to have this for other features as well (e.g. cleaning up old uploads).
Summary
POST) requests, allowing clients to safely retry without creating duplicate uploadsIdempotencyKeyStoreinterface with two implementations: file-based (for disk storage backends, persists across restarts) and in-memory (for cloud storage backends)Motivation
Closes #1364
Upload creation requests (particularly concatenation) are not currently idempotent. When a concat request is interrupted (e.g. by a WAF timeout), the client never receives the response and retries, creating a duplicate final upload.
This implements the approach suggested by @Acconut in #1364: a general-purpose
Idempotency-Keyheader that works for all upload creation requests, not just concatenation. For the filestore backend, key-to-upload-ID mappings are persisted to disk alongside upload data (as.idempotency-keyfiles), so they survive server restarts.See also #1365 for a previous concat-only approach.
Design
New interface (
IdempotencyKeyStore) added toStoreComposer, following the same pattern asLocker:Two implementations shipped:
fileidempotencystore(disk)memoryidempotencystore(memory)Handler logic (in
PostFile):Idempotency-KeyheaderOpt-out:
--disable-idempotencyCLI flag.Test plan
TestIdempotency/ConcatRetryComplete-- retried concat with already-complete upload returns existingTestIdempotency/ConcatRetryIncomplete-- retried concat with offset==0 retries the concatenationTestIdempotency/ConcatRetryCorrupted-- retried concat with partial data returns errorTestIdempotency/RegularUploadRetry-- retried regular upload returns existing with offsetTestIdempotency/NoStoreConfigured-- header ignored when no store configuredTestIdempotency/NoHeaderSent-- normal flow when no headerTestIdempotency/DeletedUploadFallsThrough-- stale mapping to deleted upload creates new uploadTestFileIdempotencyStore-- file-based store unit testsgo test ./pkg/...passes