Commit 2a2a060
Resolve MIN/MAX from Parquet metadata for Single-mode aggregates and CAST projections (#21651)
## Which issue does this PR close?
Related to improving ClickBench performance (metadata-only query
resolution)
## Rationale for this change
ClickBench Q6 (`SELECT MIN("EventDate"), MAX("EventDate") FROM hits`)
was doing a full column scan despite the answer being available in
Parquet row group statistics. Two issues prevented the
`AggregateStatistics` optimizer from firing:
1. **`take_optimizable` missed `Single` mode** — it only matched the
`Final → Partial` pair, not single-partition scans.
2. **Statistics lost through CAST projections** — `project_statistics`
returned `unknown` for any non-Column/Literal expression, discarding
Parquet min/max through casts like `CAST(CAST(EventDate AS Int32) AS
Date32)`.
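The second issue can be sketched with a toy model. Everything below is a hypothetical stand-in (`MinMax`, `CastTo`, `Expr`, `cast_bound`, `project_min_max` are illustration names, not DataFusion's types or functions). It assumes the casts involved preserve ordering and that a bound which does not fit the target type makes the statistics unknown; the source does not say how the real code treats that case.

```rust
// Toy model only: these types are hypothetical stand-ins, not DataFusion's API.

/// Min/max bounds known for a column.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MinMax {
    pub min: i64,
    pub max: i64,
}

/// Target types of the casts in this toy model.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CastTo {
    Int32,
    /// Days since the Unix epoch, stored as a 32-bit integer.
    Date32,
}

/// A tiny expression tree: column references, casts, and everything else.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(usize),
    Cast(Box<Expr>, CastTo),
    /// Any expression whose effect on min/max is not modelled here.
    Other,
}

/// Cast a single bound, giving up if the value does not fit the target type.
fn cast_bound(value: i64, to: CastTo) -> Option<i64> {
    match to {
        // Both targets are 32-bit in this model, so the same range check applies.
        CastTo::Int32 | CastTo::Date32 => {
            if value >= i64::from(i32::MIN) && value <= i64::from(i32::MAX) {
                Some(value)
            } else {
                None
            }
        }
    }
}

/// Derive min/max for `expr` from per-column input statistics.
/// Returns `None` ("unknown") when the bounds cannot be carried through.
pub fn project_min_max(expr: &Expr, input: &[Option<MinMax>]) -> Option<MinMax> {
    match expr {
        Expr::Column(idx) => input.get(*idx).copied().flatten(),
        Expr::Cast(inner, to) => {
            // Recurse first, so nested casts are handled one layer at a time.
            let inner_stats = project_min_max(inner, input)?;
            Some(MinMax {
                min: cast_bound(inner_stats.min, *to)?,
                max: cast_bound(inner_stats.max, *to)?,
            })
        }
        Expr::Other => None,
    }
}
```

In this model, a nested expression such as `Cast(Cast(Column(0), Int32), Date32)` keeps the bounds of column 0, whereas a projection that treated every `Cast` like `Other` would report them as unknown.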
With both fixed, Q6 no longer scans any columns, and its run time drops from ~6 ms to ~1.5 ms:
```
│ QQuery 6 │ 5.12 / 6.29 ±0.83 / 7.65 ms │ 1.26 / 1.43 ±0.26 / 1.93 ms │ +4.39x faster │
```
## What changes are included in this PR?
- **`aggregate_statistics.rs`**: `take_optimizable` now also matches
`Single`/`SinglePartitioned` aggregates.
- **`projection.rs`**: Added `project_column_statistics_through_expr()`
which propagates min/max statistics through `CastExpr`.
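The mode-matching change can be pictured with a simplified plan model. This is only a sketch: `AggMode`, `Plan`, and this `take_optimizable` are hypothetical stand-ins that borrow names from the description above, and the `scalar` flag (no GROUP BY) is an assumption about when metadata alone can answer the query.

```rust
// Toy model only: hypothetical stand-ins, not DataFusion's plan types.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AggMode {
    Partial,
    Final,
    Single,
    SinglePartitioned,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Plan {
    Aggregate {
        mode: AggMode,
        /// True when the aggregate has no GROUP BY, i.e. it yields one row.
        scalar: bool,
        input: Box<Plan>,
    },
    Scan,
}

/// Return the aggregate node whose input statistics could answer the query,
/// or `None` when the plan shape is not one this sketch recognises.
pub fn take_optimizable(plan: &Plan) -> Option<&Plan> {
    match plan {
        // Two-phase shape: a Final aggregate directly over a Partial one.
        Plan::Aggregate { mode: AggMode::Final, scalar: true, input } => match &**input {
            Plan::Aggregate { mode: AggMode::Partial, scalar: true, .. } => Some(&**input),
            _ => None,
        },
        // Single-phase shape: the aggregate itself sits directly over its input.
        Plan::Aggregate {
            mode: AggMode::Single | AggMode::SinglePartitioned,
            scalar: true,
            ..
        } => Some(plan),
        _ => None,
    }
}
```

Without the second match arm, a single-partition plan would fall through to `None`, which mirrors the behaviour described for Q6 before this change.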
Result: Q6 now resolves entirely from Parquet metadata (zero I/O).
## Are these changes tested?
Yes. Existing tests pass, and the ClickBench sqllogictest is updated with the new
expected plan for Q6.
## Are there any user-facing changes?
Scalar `MIN`/`MAX` aggregates over CAST projections now resolve from
file metadata when statistics are available, avoiding unnecessary I/O.
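As a rough illustration of what "resolving from file metadata" means, per-row-group bounds can be folded into bounds for the whole file. The sketch below uses hypothetical names (`RowGroupStats`, `file_min_max`) rather than the parquet or DataFusion API, and assumes that a single row group without statistics makes the metadata-only answer unavailable.

```rust
// Toy model only: hypothetical types, not the parquet or DataFusion API.

/// Min/max recorded for one row group; `None` when no statistics were stored.
pub type RowGroupStats = Option<(i32, i32)>;

/// Combine per-row-group bounds into bounds for the whole file.
/// Returns `None` if there are no row groups or any of them lacks statistics,
/// because the answer would then require reading column data.
pub fn file_min_max(row_groups: &[RowGroupStats]) -> Option<(i32, i32)> {
    let mut iter = row_groups.iter();
    // Seed the fold with the first row group's bounds.
    let first = (*iter.next()?)?;
    iter.try_fold(first, |(lo, hi), rg| {
        let (rg_lo, rg_hi) = (*rg)?;
        Some((lo.min(rg_lo), hi.max(rg_hi)))
    })
}
```

For example, row groups with bounds `(10, 20)`, `(5, 12)`, and `(15, 30)` combine to `(5, 30)` without touching any column pages.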
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 9a3d96a commit 2a2a060
4 files changed
Lines changed: 137 additions & 21 deletions
File tree
- datafusion
- physical-expr/src
- physical-optimizer/src
- sqllogictest/test_files