Skip to content

Commit b921999

Browse files
authored
fix[vortex-array]: fix rebuild_trim_elements overflow (#6995)
This commit just casts offsets and sizes to u64/i64 before summing for simplicitly. We could make the logic more complicated but not sure it's worth it. <!-- Thank you for submitting a pull request! We appreciate your time and effort. Please make sure to provide enough information so that we can review your pull request. The Summary and Testing sections below contain guidance on what to include. --> ## Summary <!-- If this PR is related to a tracked effort, please link to the relevant issue here (e.g., `Closes: #123`). Otherwise, feel free to ignore / delete this. In this section, please: 1. Explain the rationale for this change. 2. Summarize the changes included in this PR. A general rule of thumb is that larger PRs should have larger summaries. If there are a lot of changes, please help us review the code by explaining what was changed and why. If there is an issue or discussion attached, there is no need to duplicate all the details, but clarity is always preferred over brevity. --> Closes: #6973 <!-- ## API Changes Uncomment this section if there are any user-facing changes. Consider whether the change affects users in one of the following ways: 1. Breaks public APIs in some way. 2. Changes the underlying behavior of one of the engine integrations. 3. Should some documentation be updated to reflect this change? If a public API is changed in a breaking manner, make sure to add the appropriate label. You can run `./scripts/public-api.sh` locally to see if there are any public API changes (and this also runs in our CI). --> ## Testing <!-- Please describe how this change was tested. Here are some common categories for testing in Vortex: 1. Verifying existing behavior is maintained. 2. Verifying new behavior and functionality works correctly. 3. Serialization compatibility (backwards and forwards) should be maintained or explicitly broken. --> Regression test added Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
1 parent 876813b commit b921999

1 file changed

Lines changed: 27 additions & 14 deletions

File tree

vortex-array/src/arrays/listview/rebuild.rs

Lines changed: 27 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -14,8 +14,10 @@ use crate::arrays::ListViewArray;
1414
use crate::builders::builder_with_capacity;
1515
use crate::builtins::ArrayBuiltins;
1616
use crate::compute;
17+
use crate::dtype::DType;
1718
use crate::dtype::IntegerPType;
1819
use crate::dtype::Nullability;
20+
use crate::dtype::PType;
1921
use crate::match_each_integer_ptype;
2022
use crate::scalar::Scalar;
2123
use crate::scalar_fn::fns::operators::Operator;
@@ -294,21 +296,16 @@ impl ListViewArray {
294296
let last_size = self.size_at(self.len() - 1);
295297
last_offset + last_size
296298
} else {
297-
// Offsets and sizes can have different primitive types (e.g. u32 vs u16).
298-
// Cast the narrower to the wider since arithmetic requires identical operand types.
299-
let (offsets, sizes) = if self.offsets().dtype().as_ptype().byte_width()
300-
>= self.sizes().dtype().as_ptype().byte_width()
301-
{
302-
(
303-
self.offsets().clone(),
304-
self.sizes().cast(self.offsets().dtype().clone())?,
305-
)
299+
// Cast offsets and sizes to the widest integer type to prevent
300+
// overflow when computing offsets + sizes. The end offset may not
301+
// fit in the integer width otherwise.
302+
let wide_dtype = DType::from(if self.offsets().dtype().as_ptype().is_unsigned_int() {
303+
PType::U64
306304
} else {
307-
(
308-
self.offsets().cast(self.sizes().dtype().clone())?,
309-
self.sizes().clone(),
310-
)
311-
};
305+
PType::I64
306+
});
307+
let offsets = self.offsets().cast(wide_dtype.clone())?;
308+
let sizes = self.sizes().cast(wide_dtype)?;
312309

313310
let min_max = compute::min_max(
314311
&offsets
@@ -612,4 +609,20 @@ mod tests {
612609
// avg = 128 → LBL
613610
assert!(!ListViewArray::should_use_take(128_000, 1_000));
614611
}
612+
613+
/// Regression test for <https://github.com/vortex-data/vortex/issues/6973>.
614+
/// Both offsets and sizes are u8, and offset + size exceeds u8::MAX.
615+
#[test]
616+
fn test_rebuild_trim_elements_sum_overflows_type() -> VortexResult<()> {
617+
let elements = PrimitiveArray::from_iter(vec![0i32; 261]).into_array();
618+
let offsets = PrimitiveArray::from_iter(vec![215u8, 0]).into_array();
619+
let sizes = PrimitiveArray::from_iter(vec![46u8, 10]).into_array();
620+
621+
let listview = ListViewArray::new(elements, offsets, sizes, Validity::NonNullable);
622+
let trimmed = listview.rebuild(ListViewRebuildMode::TrimElements)?;
623+
624+
// min(offsets) = 0, so nothing to trim; output should equal input.
625+
assert_arrays_eq!(trimmed, listview);
626+
Ok(())
627+
}
615628
}

0 commit comments

Comments
 (0)