Ranged GET can return an empty stream: AdjustedRangeSubscriber prematurely signals onComplete when the first chunk is smaller than the pending skip #517

@tbaeg

Description

Problem:

A ranged getObject through S3EncryptionClient (with enableLegacyUnauthenticatedModes(true)) can intermittently complete with 0 bytes even though the requested range contains data — no exception is surfaced to the caller.

For any plaintext-relative range starting at offset ≥ 1, the crypto range sent to S3 is aligned down to a cipher-block boundary and, for offsets ≥ 16, extended back by one further 16-byte block (see RangedGetUtils.getCipherBlockLowerBound). AdjustedRangeSubscriber must therefore skip numBytesToSkip (1–31) bytes of decrypted output before delivering data (see initializeForRead). The skip logic in onNext treats "chunk smaller than the remaining skip" as end-of-stream:

// legacy/internal/AdjustedRangeSubscriber.java (v3.6.1; byte-identical since at least v3.1.2, also on main)
if (numBytesToSkip > buf.length) {
    numBytesToSkip -= buf.length;
    wrappedSubscriber.onComplete();   // <-- premature: upstream is still delivering chunks
}

If the first decrypted chunk is smaller than the pending skip, onComplete() is signaled while more chunks are still in flight. There is also no return after the branch, so execution falls through and throws NullPointerException at Math.min(virtualAvailable, outputBuffer.length) (the outputBuffer field is still null on the first chunk) — but only after onComplete was already delivered. Downstream subscribers that latch the first terminal signal — e.g. InputStreamSubscriber used by AsyncResponseTransformer.toBlockingInputStream(), which the synchronous S3EncryptionClient.getObject path joins on — silently drop the subsequent chunks and the error signal. The caller observes a successful, empty stream: the first read() returns -1.
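For reference, the size of the pending skip follows from the block alignment described earlier. The sketch below is a minimal model of that arithmetic, not the library's source: the class SkipMath and its method names are hypothetical, and the formula is an assumption based on the described behavior of RangedGetUtils.getCipherBlockLowerBound (align down to a 16-byte boundary, back up one further block, clamp at zero).

```java
// Illustrative model only; not copied from the encryption client.
final class SkipMath {
    static final long BLOCK_SIZE = 16; // AES block size in bytes

    // Assumed shape of the lower-bound calculation: align down to a block
    // boundary, back up one more block, and clamp at zero.
    static long cipherBlockLowerBound(long plaintextStart) {
        long aligned = plaintextStart - (plaintextStart % BLOCK_SIZE);
        return Math.max(0, aligned - BLOCK_SIZE);
    }

    // Bytes of decrypted output to discard before the requested range begins.
    static long bytesToSkip(long plaintextStart) {
        return plaintextStart - cipherBlockLowerBound(plaintextStart);
    }

    public static void main(String[] args) {
        for (long start : new long[] {0, 1, 15, 16, 31, 32, 100}) {
            System.out.println("start " + start + " -> skip " + bytesToSkip(start));
        }
    }
}
```

Under these assumptions a range starting at offset 100 yields a skip of 20 bytes, which matches the value used in the reproduction, and every start offset ≥ 1 yields a skip between 1 and 31.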

Triggers

  • Any transport delivery where the first ByteBuffer reaching the subscriber chain is smaller than the skip (1–31 bytes) — e.g. TLS-record / TCP-segmentation fragmentation producing a tiny first chunk. Timing dependent, hence intermittent; a retry usually succeeds.
  • AES/CBC (v1-format) objects are especially exposed: CipherSubscriber intentionally emits ByteBuffer.allocate(0) when cipher.update produces no output (behavior introduced by #209, "fix: do not signal onComplete when the incoming buffer length is less than the cipher block"), and AES/CBC/PKCS5Padding decryption produces no output for a first input of ≤ 31 bytes (a full block is withheld for padding). An empty buffer always satisfies numBytesToSkip > buf.length.
  • For CTR (i.e. ranged GETs of AES-GCM objects), a tiny-but-nonempty first chunk smaller than the skip triggers the same branch directly.
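The CBC claim above can be checked with the JDK alone. This is a small standalone sketch, not code from the encryption client: the class name CbcUpdateDemo, the all-zero key and IV, and the 48-byte plaintext are arbitrary choices for illustration, and the result is assumed to hold for the JDK's default JCE provider (other providers may buffer differently).

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class CbcUpdateDemo {
    // Number of plaintext bytes that update() releases when it is handed only
    // the first firstChunkLen bytes of a 64-byte CBC/PKCS5Padding ciphertext.
    static int outputForFirstChunk(int firstChunkLen) {
        try {
            SecretKeySpec key = new SecretKeySpec(new byte[16], "AES"); // all-zero demo key
            IvParameterSpec iv = new IvParameterSpec(new byte[16]);     // all-zero demo IV

            Cipher enc = Cipher.getInstance("AES/CBC/PKCS5Padding");
            enc.init(Cipher.ENCRYPT_MODE, key, iv);
            byte[] ciphertext = enc.doFinal(new byte[48]); // 48 bytes plus one padding block

            Cipher dec = Cipher.getInstance("AES/CBC/PKCS5Padding");
            dec.init(Cipher.DECRYPT_MODE, key, iv);
            byte[] out = dec.update(Arrays.copyOf(ciphertext, firstChunkLen));
            // update() may return null or an empty array when nothing is released.
            return out == null ? 0 : out.length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (int n : new int[] {16, 31, 32}) {
            System.out.println(n + " ciphertext bytes in -> "
                    + outputForFirstChunk(n) + " plaintext bytes out");
        }
    }
}
```

With the default provider, first chunks of 16 and 31 bytes should release nothing, and 32 bytes should release a single 16-byte block. That is why an empty buffer can reach AdjustedRangeSubscriber as the first chunk.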

Expected behavior

The skip should consume bytes across successive chunks without signaling completion; onComplete should only propagate when the upstream actually completes.

Reproduction

Driving the subscriber chain directly with the released artifacts (amazon-s3-encryption-client-java:3.6.1, software.amazon.awssdk:utils:2.34.1):

  1. Construct AdjustedRangeSubscriber for desired range [100, 199] with a content range giving numBytesToSkip = 20, wrapping an InputStreamSubscriber.
  2. Call onSubscribe(...), then onNext(ByteBuffer.allocate(10)) — a first chunk smaller than the skip.
  3. Observe: the downstream InputStreamSubscriber receives onComplete; the same onNext call throws NullPointerException ("this.outputBuffer" is null); a subsequent 200-byte onNext and a late onError are silently ignored; read() returns -1, so readNBytes(buf, off, len) returns 0.
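The same sequence can be illustrated without the released artifacts. The following is a simplified, dependency-free model of the steps above, not the actual reproduction harness: LatchingSink stands in for a downstream that honors only the first terminal signal (as InputStreamSubscriber is described to do), BuggySkipper is a hypothetical reconstruction of the skip branch, and its else branches and the virtualAvailable bookkeeping are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Stand-in for a downstream that honors only the first terminal signal.
final class LatchingSink {
    final ByteArrayOutputStream received = new ByteArrayOutputStream();
    boolean terminated;

    void onNext(byte[] chunk) {
        if (!terminated) received.write(chunk, 0, chunk.length);
    }

    void onComplete() { terminated = true; }
}

// Model of the reported skip logic; the else branches are assumptions.
final class BuggySkipper {
    private final LatchingSink downstream;
    private int numBytesToSkip;
    private long virtualAvailable; // bytes of the desired range still to deliver
    private byte[] outputBuffer;   // stays null until a chunk survives the skip

    BuggySkipper(LatchingSink downstream, int numBytesToSkip, long rangeLength) {
        this.downstream = downstream;
        this.numBytesToSkip = numBytesToSkip;
        this.virtualAvailable = rangeLength;
    }

    void onNext(byte[] buf) {
        if (numBytesToSkip != 0) {
            if (numBytesToSkip > buf.length) {
                numBytesToSkip -= buf.length;
                downstream.onComplete(); // premature, and there is no return
            } else {
                outputBuffer = Arrays.copyOfRange(buf, numBytesToSkip, buf.length);
                numBytesToSkip = 0;
            }
        } else {
            outputBuffer = buf;
        }
        // Falls through with outputBuffer == null on a short first chunk.
        int n = (int) Math.min(virtualAvailable, outputBuffer.length);
        virtualAvailable -= n;
        downstream.onNext(Arrays.copyOf(outputBuffer, n));
    }
}

final class EmptyStreamRepro {
    // Returns { bytes the sink accepted, 1 if the first onNext threw NPE else 0 }.
    static int[] run() {
        LatchingSink sink = new LatchingSink();
        BuggySkipper skipper = new BuggySkipper(sink, 20, 100);
        int sawNpe = 0;
        try {
            skipper.onNext(new byte[10]); // first chunk shorter than the skip
        } catch (NullPointerException e) {
            sawNpe = 1;
        }
        skipper.onNext(new byte[200]);    // real data arrives after the terminal signal
        return new int[] { sink.received.size(), sawNpe };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("bytes delivered=" + r[0] + ", NPE on first chunk=" + (r[1] == 1));
    }
}
```

In this model the sink ends up terminated with zero bytes even though 200 bytes arrive afterwards, which mirrors the empty stream seen by the caller.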

In production we observe this as intermittent 0-byte results from ranged getObject calls through the encryption client (e.g. Parquet/ORC footer reads at large offsets) that succeed when retried. We have carried an application-level retry-on-zero-bytes workaround across client versions 3.1.2 → 3.6.1 because the behavior persists.

Versions affected

AdjustedRangeSubscriber.java is byte-identical in v3.1.2, v3.3.0, v3.4.0, v3.6.1, and current main (verified by diffing the tags), so all of these are affected.

Solution:

Possible solution

In the numBytesToSkip > buf.length branch, consume the chunk toward the skip and request the next chunk from the subscription instead of completing:

if (numBytesToSkip > buf.length) {
    numBytesToSkip -= buf.length;
    // need more data before anything can be delivered; do not complete —
    // upstream will deliver the remaining chunks and the real onComplete
    subscription.request(1);
    return;
}

(If the upstream genuinely ends before the skip is satisfied, the upstream onComplete still arrives through the normal path.) This also makes the empty-buffer emission from CipherSubscriber flow through harmlessly.
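To show the intended effect of the change, here is a minimal model of the corrected skip logic. It is a sketch under assumptions rather than a patch: FixedSkipper is a hypothetical class, the extraRequests counter stands in for subscription.request(1), and the ByteArrayOutputStream stands in for the wrapped subscriber.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Simplified model of the proposed fix; not the library class.
final class FixedSkipper {
    final ByteArrayOutputStream delivered = new ByteArrayOutputStream(); // models the wrapped subscriber
    boolean completed;
    int extraRequests;             // counts the modeled subscription.request(1) calls
    private int numBytesToSkip;
    private long virtualAvailable; // bytes of the requested range still to deliver

    FixedSkipper(int numBytesToSkip, long rangeLength) {
        this.numBytesToSkip = numBytesToSkip;
        this.virtualAvailable = rangeLength;
    }

    void onNext(byte[] buf) {
        if (numBytesToSkip > buf.length) {
            numBytesToSkip -= buf.length; // the whole chunk is consumed by the skip
            extraRequests++;              // modeled subscription.request(1)
            return;                       // no terminal signal, no fall-through
        }
        byte[] out = Arrays.copyOfRange(buf, numBytesToSkip, buf.length);
        numBytesToSkip = 0;
        int n = (int) Math.min(virtualAvailable, out.length);
        virtualAvailable -= n;
        delivered.write(out, 0, n);
    }

    void onComplete() { completed = true; } // reached only when upstream completes

    // Feeds an empty chunk, a short chunk, and the rest of a 220-byte stream
    // through a skipper configured for skip = 20 and a 100-byte range.
    static byte[] demo() {
        byte[] plain = new byte[220];
        for (int i = 0; i < plain.length; i++) plain[i] = (byte) i;
        FixedSkipper s = new FixedSkipper(20, 100);
        s.onNext(new byte[0]);                        // empty chunk, as CipherSubscriber may emit
        s.onNext(Arrays.copyOfRange(plain, 0, 10));   // short chunk, still inside the skip
        s.onNext(Arrays.copyOfRange(plain, 10, 220)); // skip finishes inside this chunk
        s.onComplete();
        return s.delivered.toByteArray();
    }

    public static void main(String[] args) {
        byte[] d = demo();
        System.out.println("delivered " + d.length + " bytes, first=" + d[0] + ", last=" + d[d.length - 1]);
    }
}
```

The skip is carried across the empty and the 10-byte chunk, and the model delivers exactly the 100 bytes at stream offsets 20 through 119. A real change would also need to respect whatever demand accounting the subscriber chain already performs.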
