
Commit 72bbff2

fix(reader): auto-include equality delete key columns in projection
When a user scans with `.select(["col_a", "col_b"])` and the table has merge-on-read equality delete files keyed on a column NOT in the select list (e.g. `id`), the HashSet-based `apply_eq_delete_filter` fails with:

    Equality delete key column 'id' (field_id=1) not found in batch

The fix augments the Parquet projection mask and RecordBatchTransformer with any equality delete key field IDs that are missing from the user's projection. After applying equality deletes, the extra columns are stripped from the output batches so the user sees only their requested columns.

This matches the behavior of Spark, Flink, and Trino, which transparently widen the internal projection for delete evaluation.
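The widen, filter, and strip sequence described above can be sketched with the standard library alone. This is a simplified illustration, not the code in this commit: the function names `widen_projection` and `apply_eq_deletes_and_strip`, the `Row` alias, and the single-column `i64` key are all hypothetical, and rows stand in for Arrow record batches.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a record batch row: one value per projected field.
type Row = Vec<i64>;

/// Returns the user's projected field IDs followed by any equality delete
/// key field IDs missing from that projection, plus how many were appended.
fn widen_projection(projected: &[i32], eq_delete_keys: &[i32]) -> (Vec<i32>, usize) {
    let mut widened = projected.to_vec();
    for &key in eq_delete_keys {
        // Append only keys the user did not already select (also dedups keys).
        if !widened.contains(&key) {
            widened.push(key);
        }
    }
    let extra = widened.len() - projected.len();
    (widened, extra)
}

/// Drops rows whose key value appears in `deleted_keys`, then truncates each
/// surviving row back to the `user_width` columns the user asked for.
fn apply_eq_deletes_and_strip(
    rows: Vec<Row>,
    widened: &[i32],
    key_field_id: i32,
    deleted_keys: &HashSet<i64>,
    user_width: usize,
) -> Vec<Row> {
    // After widening, the key column is guaranteed to be present.
    let key_pos = widened
        .iter()
        .position(|&id| id == key_field_id)
        .expect("key column must be present after widening");

    rows.into_iter()
        .filter(|row| !deleted_keys.contains(&row[key_pos]))
        .map(|mut row| {
            // The extra columns were appended at the end, so truncation removes them.
            row.truncate(user_width);
            row
        })
        .collect()
}
```

Because the extra key columns are appended after the user's columns in this sketch, stripping them is a simple truncation. A projection that interleaves key columns would need to track their positions instead.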
1 parent 29cf9f6 commit 72bbff2

1 file changed

Lines changed: 452 additions & 12 deletions
