fix: accept GCS gzip responses without Content-Length #782
Open
nkemnitz wants to merge 1 commit into
GCS serves large objects stored with `Content-Encoding: gzip` using chunked transfer with no `Content-Length` (and decompressive transcoding when the client does not accept gzip encoding). The GET path required `Content-Length` unconditionally and failed with `MissingContentLength`, even though a chunked body is a valid self-delimiting response (RFC 9112 §6.2 forbids `Content-Length` with `Transfer-Encoding: chunked`).

Add `HeaderConfig::stored_size_header`: when `Content-Length` is absent the size falls back to this header. GCS sets it to `x-goog-stored-content-length` (always present); S3, Azure and HTTP leave it `None`, so a missing `Content-Length` remains an error for them.

This fixes the reported `MissingContentLength` failure. Some transcoded GCS responses also omit the ETag and still fail with `MissingEtag`; that is left for a follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Which issue does this PR close?
Part of #774 (does not fully close it — see below).
Rationale for this change
GCS serves large objects stored with `Content-Encoding: gzip` using chunked transfer with no `Content-Length` (and decompressive transcoding when the client does not accept gzip encoding). `ObjectStore::get`/`head` on GCS required `Content-Length` unconditionally and failed with `Generic { store: "GCS", source: Header { source: MissingContentLength } }`, even though a chunked, self-delimiting body is a valid response (RFC 9112 §6.2 forbids `Content-Length` alongside `Transfer-Encoding: chunked`).
What changes are included in this PR?
`HeaderConfig` gains `stored_size_header: Option<&'static str>`. When `Content-Length` is absent, `header_meta` reads the object size from this header. GCS sets it to `x-goog-stored-content-length` (always present); S3, Azure and the HTTP store leave it `None`, so a missing `Content-Length` stays a hard error for them.
Are there any user-facing changes?
`get()`/`head()` now succeed on chunked gzip GCS objects. On a server-decompressed (transcoded) read, `ObjectMeta.size` is the stored (compressed) size, since the decompressed length is not known without reading the body; on a passthrough read (`Accept-Encoding: gzip`) it is exact.
Not fully resolved: some transcoded GCS responses (default reads without `Accept-Encoding: gzip`) also omit the ETag entirely and still fail with `MissingEtag`. Left for a follow-up.
🤖 AI disclaimer:
All the code was written by Claude. I made the changes as targeted and minimal as possible, for now focusing only on the chunked encoding, because that is my major blocker. Decompressive transcoding feels kind of niche, and the ETag handling needs more thought and knowledge about this repo. E.g. I think the different cloud vendors rely on custom metadata `version` headers to allow resuming downloads, rather than the ETag(?)...
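For reviewers, the fallback described under "What changes are included in this PR?" can be sketched roughly as below. This is a minimal standalone illustration, not the code in the diff: the `HeaderConfig` here is a hypothetical one-field stand-in, `object_size`, `headers` and `HeaderError::InvalidSize` are made-up names, and headers are modelled as a plain map with lower-cased keys rather than the crate's real header types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's `HeaderConfig`; only the field
/// added by this PR is modelled.
struct HeaderConfig {
    /// Header to consult for the object size when `Content-Length` is absent.
    stored_size_header: Option<&'static str>,
}

/// Illustrative error type (assumption, not the crate's real enum).
#[derive(Debug, PartialEq)]
enum HeaderError {
    MissingContentLength,
    InvalidSize(String),
}

/// Build a header map with lower-cased names (helper for the sketch only).
fn headers(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_ascii_lowercase(), v.to_string()))
        .collect()
}

/// Resolve the object size: prefer `Content-Length`, otherwise fall back to
/// the store-specific header if one is configured, otherwise fail.
fn object_size(
    headers: &HashMap<String, String>,
    cfg: &HeaderConfig,
) -> Result<u64, HeaderError> {
    let raw = headers
        .get("content-length")
        .or_else(|| cfg.stored_size_header.and_then(|name| headers.get(name)))
        .ok_or(HeaderError::MissingContentLength)?;
    raw.trim()
        .parse::<u64>()
        .map_err(|_| HeaderError::InvalidSize(raw.clone()))
}
```

With `stored_size_header: Some("x-goog-stored-content-length")`, a chunked response that carries only the stored-length header resolves to that value; with `None`, the same response still yields `MissingContentLength`, which matches the behavior described for S3, Azure and the HTTP store.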