Skip to content

Commit 84d109e

Browse files
def-claude
andcommitted
repr: Fix panic decoding an Avro decimal with empty unscaled bytes
twos_complement_be_to_numeric_inner read input[0] without first checking that the slice is non-empty, panicking with "index out of bounds: the len is 0 but the index is 0" on an empty byte string. That input is reachable from the wire: an Avro `decimal` logical type whose unscaled value is encoded as zero-length `bytes` flows straight from a Kafka message body through AvroFlatDecoder into this function, so a single crafted (or buggy-producer) message crashes Avro source decoding — an availability bug for ingestion. Our own encoder never emits an empty byte string (zero is the single byte 0x00), and Java's Avro decoder likewise rejects empty input (BigInteger throws "Zero length BigInteger"), so the right behaviour is a clean decode error, not a silent zero. The caller already maps the Err to a DecodeError, so the malformed row now surfaces as a decode error instead of aborting the worker. Found by the new avro_decode_fuzzed_schema fuzz target. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 0455a96 commit 84d109e

1 file changed

Lines changed: 23 additions & 0 deletions

File tree

src/repr/src/adt/numeric.rs

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -399,6 +399,14 @@ pub fn twos_complement_be_to_numeric(
399399
pub fn twos_complement_be_to_numeric_inner<D: Dec<N>, const N: usize>(
400400
input: &mut [u8],
401401
) -> Result<Decimal<N>, anyhow::Error> {
402+
if input.is_empty() {
403+
// An empty byte string is not a valid two's-complement integer. Our own
404+
// encoder never emits one (zero is the single byte `0x00`), so this only
405+
// arises from untrusted input — e.g. an Avro `decimal` field whose
406+
// unscaled value was encoded as zero-length `bytes`. Reject it rather
407+
// than indexing `input[0]` below and panicking.
408+
bail!("cannot parse a numeric value from an empty byte string");
409+
}
402410
let is_neg = if (input[0] & 0x80) != 0 {
403411
// byte-level negate all negative values, guaranteeing all bytes are
404412
// readable as unsigned.
@@ -469,6 +477,21 @@ fn test_twos_complement_roundtrip() {
469477
inner("-7.2e-35");
470478
}
471479

480+
#[mz_ore::test]
481+
#[cfg_attr(miri, ignore)] // unsupported operation: can't call foreign function `decNumberFromInt32` on OS `linux`
482+
fn test_twos_complement_empty_is_error() {
483+
// An empty byte string is not a valid two's-complement integer and used to
484+
// panic with an out-of-bounds index. It only arrives from untrusted input
485+
// (e.g. an Avro `decimal` encoded as zero-length `bytes`), so it must be a
486+
// clean error, not a panic. Regression test for that fix.
487+
for scale in [0u8, 1, 38] {
488+
assert!(twos_complement_be_to_numeric(&mut [], scale).is_err());
489+
}
490+
assert!(
491+
twos_complement_be_to_numeric_inner::<Numeric, NUMERIC_DATUM_WIDTH_USIZE>(&mut []).is_err()
492+
);
493+
}
494+
472495
#[mz_ore::test]
473496
#[cfg_attr(miri, ignore)] // unsupported operation: can't call foreign function `decNumberFromInt32` on OS `linux`
474497
fn test_twos_comp_numeric_primitive() {

0 commit comments

Comments
 (0)