fix(layout): don't panic collecting an empty stream #8472
`CollectStrategy` panicked when its input stream was empty (writing a zero-row nullable struct column) because no chunk supplied a sequence id. Mint the collected chunk's id from `eof` instead.

Closes vortex-data#8347

Signed-off-by: Han Damin <miniex@daminstudio.net>
Merging this PR will degrade performance by 25.57%
Summary
`CollectStrategy` collects its whole input into a single chunk for a child that requires exactly one chunk. When the input stream is empty, e.g. writing a zero-row table with a nullable struct column whose validity substream is empty, no chunk supplied a sequence id and it panicked with `must have visited at least one chunk`. An empty stream now still yields a single empty chunk, taking its sequence id from `eof` so it stays ordered in the sequence universe.

The issue mentions the fuzzer's `assume()` guard could be dropped once this is fixed. I left it in place here: reading back a nullable struct nested in a struct is a separate bug (#8348), so the round-trip only works once both are fixed.

Closes: #8347
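For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the empty-stream behavior described in the summary. `SeqId`, `Chunk`, and `collect_single` are hypothetical stand-ins, not the vortex-layout types, and reusing the last visited chunk's id is an assumption made for illustration. The only point is the fallback to `eof` when no chunk was visited.

```rust
// Hypothetical stand-ins for illustration; these are NOT the vortex-layout types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct SeqId(u64);

#[derive(Debug, PartialEq)]
struct Chunk {
    id: SeqId,
    rows: Vec<i32>,
}

/// Collect all input chunks into exactly one output chunk.
///
/// Assumption: the collected chunk reuses the id of the last chunk seen.
/// `eof` stands for an id reserved at the end of the stream, so it orders
/// after every chunk id and is a safe fallback when nothing was visited.
fn collect_single(input: impl IntoIterator<Item = Chunk>, eof: SeqId) -> Chunk {
    let mut latest_id: Option<SeqId> = None;
    let mut rows = Vec::new();
    for chunk in input {
        latest_id = Some(chunk.id);
        rows.extend(chunk.rows);
    }
    // Unwrapping `latest_id` unconditionally would panic on an empty input.
    // Falling back to `eof` means one (possibly empty) chunk is always produced.
    Chunk {
        id: latest_id.unwrap_or(eof),
        rows,
    }
}
```

Calling `collect_single(Vec::new(), SeqId(10))` in this sketch returns an empty chunk whose id is `SeqId(10)`, while a non-empty input keeps an id taken from its own chunks.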
Testing
Added a regression test in `vortex-file/tests/test_write_table.rs` that writes a zero-row nullable struct column; it panicked before this change and passes after. The issue's Python repro (`vx.io.write` of an empty `struct` column) no longer panics.

`cargo nextest run -p vortex-layout -p vortex-file` passes (175 tests, including the segment-ordering tests). `fmt --all` + `clippy --all-targets --all-features` clean.

I'm Korean, so sorry if any wording reads a little awkward.