Field pruning in nested struct scan#7327

Open
brancz wants to merge 2 commits into vortex-data:develop from polarsignals:prune-scan


@brancz brancz commented Apr 7, 2026

Summary

I'm trying to have scans read only a subset of the fields of a nested struct. The column has the shape List<Struct<a: Utf8, b: Int64>>, so I can't use regular projections in DataFusion, since those only support column indices. This is my best attempt, combined with an optimizer rule that modifies the table schema to the subset under certain circumstances.
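
To make the goal concrete, here is a minimal sketch of what pruning a nested schema down to a field subset could look like. It uses a toy `DataType` enum and a hypothetical `prune` helper defined only for this illustration; it is not the Vortex or DataFusion API, and it is not the code in this PR.

```rust
// Toy schema model for illustration only (not the Arrow/Vortex type system).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Int64,
    List(Box<DataType>),
    Struct(Vec<(String, DataType)>),
}

// Hypothetical helper: keep only the struct fields named in `keep`,
// looking through any list wrappers to reach the struct.
// Only the first struct level is pruned; kept fields are copied unchanged.
fn prune(dtype: &DataType, keep: &[&str]) -> DataType {
    match dtype {
        DataType::List(inner) => DataType::List(Box::new(prune(inner, keep))),
        DataType::Struct(fields) => DataType::Struct(
            fields
                .iter()
                .filter(|(name, _)| keep.contains(&name.as_str()))
                .cloned()
                .collect(),
        ),
        other => other.clone(),
    }
}
```

With this model, pruning `List<Struct<a: Utf8, b: Int64>>` to `["b"]` yields `List<Struct<b: Int64>>`, which is the reduced shape a scan would ideally read.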

Testing

Added a unit test that fails without the second commit.

Notes

Happy to think through other approaches to this; this was just the first thing that I came up with.

Comment on lines +63 to +67

    let can_fast_path = match target_fields {
        None => true,
        Some(fields) => fields.len() >= n_struct_fields,
    };
    if can_fast_path {
You can use is_none_or here.
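
As a sketch of that suggestion: `Option::is_none_or` (stable since Rust 1.82) returns `true` for `None` and otherwise applies the predicate to the contained value, so it can replace the `match`. The function wrappers and the `Option<&[&str]>` type below are assumptions made for illustration, since the excerpt does not show the real type of `target_fields`.

```rust
// Hypothetical stand-ins for the code in the diff; the actual types of
// `target_fields` and `n_struct_fields` are not visible in the excerpt.

// The original form: an explicit match on the Option.
fn can_fast_path_match(target_fields: Option<&[&str]>, n_struct_fields: usize) -> bool {
    match target_fields {
        None => true,
        Some(fields) => fields.len() >= n_struct_fields,
    }
}

// The suggested form: `is_none_or` yields true for None,
// and otherwise the result of the closure on the inner value.
fn can_fast_path(target_fields: Option<&[&str]>, n_struct_fields: usize) -> bool {
    target_fields.is_none_or(|fields| fields.len() >= n_struct_fields)
}
```

If `target_fields` is held by reference to an owned `Option`, a call such as `.as_ref()` or `.as_deref()` may be needed before `is_none_or`, because the method takes the `Option` by value.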

@connortsui20 connortsui20 added the changelog/performance (a performance improvement) and action/benchmark (trigger full benchmarks to run on this PR) labels Apr 7, 2026