Commit 66a5a89
[SPARK-56467][SQL] Route scalar subquery partition filters into DSv2 runtime filtering
### What changes were proposed in this pull request?
Scalar subquery filters on partition columns (e.g., `WHERE d_date_sk = (SELECT min(d_date_sk) FROM ...)`) are excluded from pushdown in DSv2 at every stage. The filter lands as a `FilterExec` above `BatchScanExec`, evaluated row-by-row. The scan reads all partitions -- no partition pruning occurs.
DSv1 already handles this: `FileSourceStrategy` puts subquery filters in `partitionFilters`, `isDynamicFilter` classifies them as dynamic, and `getPartitionPruningFilterFromBroadcast` calls `ScalarSubquery.toLiteral` at execution time for partition pruning via `listFiles()`.
This PR routes partition-column scalar subquery filters into `BatchScanExec.runtimeFilters`, leveraging the existing `SupportsRuntimeV2Filtering.filter()` infrastructure:
- **DataSourceV2Strategy**: When the scan implements `SupportsRuntimeV2Filtering`, extract subquery filters from `postScanFilters` where references are a subset of partition columns. Add to `runtimeFilters` alongside existing DPP filters. They remain in `postScanFilters` as a correctness safety net (V2 `filter()` is advisory).
- **BatchScanExec**: In `filteredPartitions`, non-DPP runtime filters are literalized (replacing `ExecScalarSubquery` with its resolved literal) and translated to V2 predicates via `translateFilterV2`.
- **InMemoryTableWithV2Filter** (test infra): Added `=` predicate handling in `filter()` alongside existing `IN`, plus a `case _ =>` catch-all.
No new interfaces, no config flags, no connector changes needed.
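The selection step described for `DataSourceV2Strategy` can be illustrated with a toy model. This is a minimal sketch, not the actual Spark code: `SelectionSketch`, its tiny expression types, and `selectRuntimeFilters` are hypothetical stand-ins for Catalyst expressions. It assumes only what the description states, namely that a filter qualifies when it contains a scalar subquery and its references are a subset of the partition columns.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's Catalyst classes.
object SelectionSketch {
  sealed trait Expr {
    def references: Set[String]   // column names the expression reads
    def hasSubquery: Boolean      // true if a scalar subquery appears anywhere inside
  }
  case class Col(name: String) extends Expr {
    def references: Set[String] = Set(name)
    def hasSubquery: Boolean = false
  }
  case class Lit(value: Int) extends Expr {
    def references: Set[String] = Set.empty
    def hasSubquery: Boolean = false
  }
  case class ScalarSubquery(id: Int) extends Expr {
    def references: Set[String] = Set.empty
    def hasSubquery: Boolean = true
  }
  case class EqualTo(left: Expr, right: Expr) extends Expr {
    def references: Set[String] = left.references ++ right.references
    def hasSubquery: Boolean = left.hasSubquery || right.hasSubquery
  }

  // Pick the post-scan filters that are candidates for runtime filtering:
  // they contain a scalar subquery and touch only partition columns.
  def selectRuntimeFilters(postScanFilters: Seq[Expr], partitionCols: Set[String]): Seq[Expr] =
    postScanFilters.filter(f => f.hasSubquery && f.references.subsetOf(partitionCols))
}
```

In this sketch the selected filters are returned without being removed from the input, which mirrors the description's point that they stay in `postScanFilters` as a safety net.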
### Why are the changes needed?
TPC-DS queries with scalar subquery partition filters (e.g., Q5, Q12, Q16, Q20, Q37, Q77, Q80, Q92, Q94, Q95) read all partitions in DSv2 scans even though the subquery resolves to a single value at runtime. This causes significant I/O overhead that DSv1 avoids.
### Does this PR introduce _any_ user-facing change?
No API changes. Queries with scalar subquery filters on partition columns will now benefit from partition pruning in DSv2 scans, reducing I/O.
### How was this patch tested?
New unit test in `DataSourceV2SQLSuiteV2Filter`:
- Creates a 10-partition table and a dimension table
- Runs `SELECT * FROM t WHERE part = (SELECT max(val) FROM dim)`
- Asserts query correctness, scalar subquery presence in `runtimeFilters`, and exactly 1 partition after pruning
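The scenario the test exercises can be sketched without Spark. The following is a toy illustration under stated assumptions: `PruningSketch`, `ResolvedSubquery`, `literalize`, and `prune` are hypothetical names, partitions are modeled as plain integers, and only an equality predicate on the partition column is understood. It shows the two steps the description attributes to `BatchScanExec`: replacing the executed subquery with its literal, then using the resulting predicate to drop partitions.

```scala
// Toy model only: hypothetical names, not the Spark implementation.
object PruningSketch {
  sealed trait Expr
  case class Col(name: String) extends Expr
  case class Lit(value: Int) extends Expr
  case class ResolvedSubquery(result: Int) extends Expr // a subquery whose value is already known
  case class EqualTo(left: Expr, right: Expr) extends Expr

  // Replace every resolved subquery with a plain literal.
  def literalize(e: Expr): Expr = e match {
    case ResolvedSubquery(v) => Lit(v)
    case EqualTo(l, r)       => EqualTo(literalize(l), literalize(r))
    case other               => other
  }

  // Keep only partitions that can match the filter. Predicates this sketch
  // does not understand leave the partition list unchanged, since runtime
  // filtering is advisory and the post-scan filter still runs.
  def prune(partitions: Seq[Int], partCol: String, filter: Expr): Seq[Int] =
    literalize(filter) match {
      case EqualTo(Col(`partCol`), Lit(v)) => partitions.filter(_ == v)
      case EqualTo(Lit(v), Col(`partCol`)) => partitions.filter(_ == v)
      case _                               => partitions
    }
}
```

With ten partitions and a subquery that resolves to a single value, `prune` leaves one partition, which is the shape of the assertion in the unit test.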
### Was this patch authored or co-authored using generative AI tooling?
Yes, co-authored with Claude Code.
Closes #55335 from anton5798/scalar-subquery-dsv2-pruning.
Lead-authored-by: Anton Lykov <25360033+anton5798@users.noreply.github.com>
Co-authored-by: Anton Lykov <antony.lykov@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent 26a86f9 commit 66a5a89
5 files changed
Lines changed: 101 additions & 5 deletions
File tree
- sql
- catalyst/src
- main/scala/org/apache/spark/sql/execution/datasources/v2
- test/scala/org/apache/spark/sql/connector/catalog
- core/src
- main/scala/org/apache/spark/sql/execution/datasources/v2
- test/scala/org/apache/spark/sql/connector
Lines changed: 15 additions & 2 deletions
sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryTableWithV2Filter.scala
Lines changed: 16 additions & 0 deletions
Lines changed: 1 addition & 1 deletion
Lines changed: 32 additions & 2 deletions
Lines changed: 37 additions & 0 deletions