Commit 381faa1
Refactor: per-RG independent reverse scan (modeled after Atlas ReverseParquetSource)
Replace ReversedRowGroupStream's rg_row_counts boundary detection with
per-row-group independent reading. Each RG gets its own
ParquetRecordBatchStreamBuilder with RowFilter applied independently,
then batches are reversed per-RG. This fixes a correctness issue:
when RowFilter drops rows, a row group yields fewer rows than
rg_row_counts predicts, so count-based boundary detection is wrong.
Memory: O(largest RG), same as Atlas's ReverseParquetSource.
Added SLT test for exact reverse + pushdown_filters + predicate.
1 parent 67e72af commit 381faa1
1 file changed
Lines changed: 285 additions & 167 deletions
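The per-RG approach described in the commit message can be illustrated with a small std-only model. This is a sketch under stated assumptions, not the code from this commit: row groups are modeled as plain Vec<i64>, the RowFilter as a closure, and the names read_row_group and reverse_scan are hypothetical. It only shows why filtering each row group independently removes the need to predict row counts: each group is filtered, batched, and reversed on its own, so a filter that drops rows cannot shift a boundary.

```rust
// Simplified model, not the parquet crate API: a row group is a Vec<i64>
// and a "batch" is a Vec<i64>. All names here are illustrative only.

/// Read one row group in isolation: apply the predicate first, then split
/// the surviving rows into batches of at most `batch_size` rows.
fn read_row_group(
    rows: &[i64],
    batch_size: usize,
    keep: &dyn Fn(i64) -> bool,
) -> Vec<Vec<i64>> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let filtered: Vec<i64> = rows.iter().copied().filter(|v| keep(*v)).collect();
    filtered.chunks(batch_size).map(|c| c.to_vec()).collect()
}

/// Reverse scan: visit row groups last to first. Within each row group,
/// reverse the batch order and the rows inside each batch. Only one row
/// group's batches are buffered at a time, so memory is bounded by the
/// largest row group.
fn reverse_scan(
    row_groups: &[Vec<i64>],
    batch_size: usize,
    keep: &dyn Fn(i64) -> bool,
) -> Vec<Vec<i64>> {
    let mut out = Vec::new();
    for rg in row_groups.iter().rev() {
        let mut batches = read_row_group(rg, batch_size, keep);
        batches.reverse();
        for mut batch in batches {
            batch.reverse();
            out.push(batch);
        }
    }
    out
}
```

For example, with row groups [1, 2, 3, 4], [5, 6, 7, 8], [9, 10], a batch size of 3, and a predicate that keeps even values, reverse_scan yields the batches [10], [8, 6], [4, 2]. The filter leaves 2, 2, and 1 rows per group rather than the stored 4, 4, and 2, which is the mismatch that count-based boundary detection cannot handle.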