Commit 2c871b2
Project only accessed struct leaves in Parquet row filter pushdown (#20854)
## Which issue does this PR close?
- Related #20822
- Closes #20603
## Rationale for this change
This PR refines how the `FilterCandidateBuilder` projects struct columns
during Parquet row filter pushdown.
Previously, a filter like `s['value'] > 10` would cause the reader to
decode all leaf columns of a struct `s`, because `PushdownChecker` only
tracked the root column index and expanded it to every leaf. This wastes
I/O and decode time on fields the filter never touches.
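
To make the cost concrete, here is a minimal sketch that contrasts expanding a root column to all of its leaves with resolving only the leaves under an accessed path. It uses only the standard library. The `Field` enum and the `leaves_for_root` and `leaves_for_path` functions are hypothetical names invented for illustration; they are not DataFusion or `parquet` crate APIs, and the actual change operates on the real Parquet and Arrow schema types.

```rust
/// Simplified stand-in for a nested schema (hypothetical, not DataFusion's type):
/// a field is either a primitive leaf or a struct with named children.
#[derive(Debug, Clone, PartialEq)]
enum Field {
    Leaf(String),
    Struct(String, Vec<Field>),
}

impl Field {
    fn name(&self) -> &str {
        match self {
            Field::Leaf(n) | Field::Struct(n, _) => n,
        }
    }

    /// Number of leaf columns stored under this field.
    fn leaf_count(&self) -> usize {
        match self {
            Field::Leaf(_) => 1,
            Field::Struct(_, children) => children.iter().map(Field::leaf_count).sum(),
        }
    }
}

/// Root-based expansion: every leaf index under the top-level column `root`.
fn leaves_for_root(schema: &[Field], root: &str) -> Vec<usize> {
    let mut offset = 0;
    for field in schema {
        let n = field.leaf_count();
        if field.name() == root {
            return (offset..offset + n).collect();
        }
        offset += n;
    }
    Vec::new()
}

/// Path-based resolution: only the leaf indices under `path`, e.g. ["s", "value"].
/// Returns an empty vector if the path does not exist in the schema.
fn leaves_for_path(schema: &[Field], path: &[&str]) -> Vec<usize> {
    let mut fields = schema;
    let mut offset = 0;
    for (depth, segment) in path.iter().enumerate() {
        // Skip over preceding siblings, counting the leaves they occupy.
        let mut found = None;
        for field in fields {
            if field.name() == *segment {
                found = Some(field);
                break;
            }
            offset += field.leaf_count();
        }
        let field = match found {
            Some(f) => f,
            None => return Vec::new(),
        };
        let more_segments = depth + 1 < path.len();
        match field {
            // Descend into the struct and keep walking the path.
            Field::Struct(_, children) if more_segments => fields = children.as_slice(),
            // The path tries to descend through a primitive leaf.
            _ if more_segments => return Vec::new(),
            // End of the path: take every leaf under the final field.
            _ => return (offset..offset + field.leaf_count()).collect(),
        }
    }
    Vec::new()
}
```

For a schema with a leaf `id`, a struct `s` containing `value` and `label`, and a leaf `ts`, calling `leaves_for_root(&schema, "s")` selects both leaves of `s`, whereas `leaves_for_path(&schema, &["s", "value"])` selects only the single leaf the filter reads.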
Now, the builder resolves only the matching Parquet leaf columns. It
does this by building a pruned filter schema that reflects exactly what
the Parquet reader produces when projecting a subset of struct leaves,
ensuring the expression evaluates against the correct types.

1 parent d09ff92 commit 2c871b2
1 file changed, +395 -48 lines changed