Skip to content

repr: fix panic decoding an empty Avro decimal value#37130

Closed
def- wants to merge 1 commit into
MaterializeInc:mainfrom
def-:pr-avro-panic-decode
Closed

repr: fix panic decoding an empty Avro decimal value#37130
def- wants to merge 1 commit into
MaterializeInc:mainfrom
def-:pr-avro-panic-decode

Conversation

@def-

@def- def- commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

twos_complement_be_to_numeric_inner read input[0] to determine the sign before checking that the input was non-empty, so a zero-length two's-complement byte run panicked with an out-of-bounds index.

This is reachable from Avro source ingestion: a decimal logical type backed by bytes (or fixed) whose value is encoded as a zero-length byte array is a valid wire form, and that data arrives from external, possibly hostile, schema registries and producers. A conformant producer encodes zero as [0x00], but a non-conformant one can send empty bytes, crashing the decoder and so the source.

An empty big-endian two's-complement run canonically denotes 0. Guarding the sign check with !input.is_empty() makes the existing logic produce 0 for that case (head == 0, no chunks), with no other change needed. The only caller is the Avro decimal decode path.

Found by the avro_decode_fuzzed_schema cargo-fuzz target.

Tests: adds test_twos_complement_empty_is_zero, covering the wrapper at several scales and the generic inner at both numeric widths.

Release note: Fix a panic in Avro-formatted sources when decoding a decimal column whose value was encoded as a zero-length byte array.

`twos_complement_be_to_numeric_inner` read `input[0]` to determine the
sign before checking that the input was non-empty, so a zero-length
two's-complement byte run panicked with an out-of-bounds index.

This is reachable from Avro source ingestion: a `decimal` logical type
backed by `bytes` (or `fixed`) whose value is encoded as a zero-length
byte array is a valid wire form, and that data arrives from external,
possibly hostile, schema registries and producers. A conformant producer
encodes zero as `[0x00]`, but a non-conformant one can send empty bytes,
crashing the decoder and so the source.

An empty big-endian two's-complement run canonically denotes 0. Guarding
the sign check with `!input.is_empty()` makes the existing logic produce
0 for that case (`head == 0`, no chunks), with no other change needed.
The only caller is the Avro decimal decode path.

Found by the avro_decode_fuzzed_schema cargo-fuzz target.

Tests: adds `test_twos_complement_empty_is_zero`, covering the wrapper at
several scales and the generic inner at both numeric widths.

Release note: Fix a panic in Avro-formatted sources when decoding a
`decimal` column whose value was encoded as a zero-length byte array.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@def- def- requested a review from martykulma June 18, 2026 09:47
@def- def- requested a review from a team as a code owner June 18, 2026 09:47
@def-

def- commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

This fix is already part of #36984. Really need to get this stuff merged or I'll keep rerunning into the same bugs.

@def- def- closed this Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant