Commit ba7e30e
committed
Add EnsureRequirements: idempotent merged EnforceDistribution + EnforceSorting
## Summary
Replace the separate `EnforceDistribution` and `EnforceSorting` optimizer rules
with a single `EnsureRequirements` rule in the default optimizer chain. This makes
the composition idempotent by fixing distribution-awareness in `pushdown_sorts`
and fetch preservation in `EnforceDistribution`.
## Problem
`EnforceDistribution` and `EnforceSorting` are coupled through
`SortExec.preserve_partitioning` but run as independent rules. This caused:
1. **Production 502 errors**: `pushdown_sorts` set `preserve_partitioning=true`
without `SortPreservingMergeExec`, violating `SinglePartition` requirements
from `GlobalLimitExec` → `SanityCheckPlan` failure.
2. **Non-idempotent composition**: Running the rules multiple times produced
different (sometimes invalid) plans.
3. **Lost fetch values** (#14150): `EnforceDistribution` dropped `fetch` from
`SortPreservingMergeExec` when stripping and re-adding distribution operators.
DataFusion was the only major engine with separate rules — Spark (`EnsureRequirements`)
and Presto (`AddExchanges`) use a single rule.
## Changes
### `EnsureRequirements` rule (new)
- Composes `EnforceDistribution::optimize()` + `EnforceSorting::optimize()`
- Replaces both rules in the default optimizer chain
- 53 comprehensive tests including idempotency verification
### Distribution-aware `pushdown_sorts` (fix)
- Add `distribution_requirement` field to `ParentRequirements`
- New `add_sort_above_with_distribution()` inserts `SortPreservingMergeExec`
when parent requires `SinglePartition` and input has multiple partitions
- Propagate distribution through recursion with `stronger_distribution()`
- Reset distribution below partition-merging nodes (SPM, single-partition outputs)
### Fix `EnforceDistribution` fetch preservation (#14150)
- `remove_dist_changing_operators()` now saves fetch from removed SPM/Coalesce
- `add_merge_on_top()` re-applies saved fetch to new operators
## Testing
| Suite | Result |
|-------|--------|
| EnsureRequirements (new) | 53 passed |
| enforce_sorting (existing) | 124 passed, 0 regressions |
| enforce_distribution (existing) | 66 passed, 0 regressions |
| SLT (465 files) | 1 pre-existing failure only |
| **Total** | **243 unit + 464 SLT, 0 new failures** |
Idempotency verified:
- All partition counts 1-64
- Triple + 10x consecutive optimization passes
- SortMergeJoin, HashJoin, Window, Aggregate topologies
- PR #53/#54 regression scenarios
- #14150 fetch preservation across passes
Closes: #14150
Part of: #219731 parent bb86364 commit ba7e30e
10 files changed
Lines changed: 2812 additions & 68 deletions
File tree
- datafusion
- core/src
- physical-optimizer/src
- enforce_sorting
- ensure_requirements
- sqllogictest/test_files
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
74 | 74 | | |
75 | 75 | | |
76 | 76 | | |
77 | | - | |
| 77 | + | |
78 | 78 | | |
79 | | - | |
80 | | - | |
81 | | - | |
82 | | - | |
83 | | - | |
84 | | - | |
85 | | - | |
86 | | - | |
87 | | - | |
88 | | - | |
89 | | - | |
90 | | - | |
91 | | - | |
92 | | - | |
93 | | - | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
Lines changed: 50 additions & 15 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
970 | 970 | | |
971 | 971 | | |
972 | 972 | | |
973 | | - | |
| 973 | + | |
| 974 | + | |
| 975 | + | |
| 976 | + | |
974 | 977 | | |
975 | 978 | | |
976 | 979 | | |
| |||
979 | 982 | | |
980 | 983 | | |
981 | 984 | | |
982 | | - | |
983 | | - | |
984 | | - | |
985 | | - | |
986 | | - | |
| 985 | + | |
| 986 | + | |
| 987 | + | |
| 988 | + | |
| 989 | + | |
| 990 | + | |
| 991 | + | |
| 992 | + | |
| 993 | + | |
987 | 994 | | |
988 | 995 | | |
989 | | - | |
| 996 | + | |
| 997 | + | |
| 998 | + | |
990 | 999 | | |
991 | 1000 | | |
992 | 1001 | | |
| |||
1012 | 1021 | | |
1013 | 1022 | | |
1014 | 1023 | | |
| 1024 | + | |
| 1025 | + | |
| 1026 | + | |
| 1027 | + | |
| 1028 | + | |
| 1029 | + | |
| 1030 | + | |
| 1031 | + | |
| 1032 | + | |
1015 | 1033 | | |
1016 | 1034 | | |
1017 | | - | |
| 1035 | + | |
| 1036 | + | |
1018 | 1037 | | |
1019 | 1038 | | |
1020 | 1039 | | |
1021 | 1040 | | |
| 1041 | + | |
| 1042 | + | |
| 1043 | + | |
| 1044 | + | |
| 1045 | + | |
| 1046 | + | |
| 1047 | + | |
| 1048 | + | |
1022 | 1049 | | |
1023 | 1050 | | |
1024 | 1051 | | |
1025 | 1052 | | |
1026 | 1053 | | |
1027 | 1054 | | |
1028 | | - | |
| 1055 | + | |
| 1056 | + | |
| 1057 | + | |
| 1058 | + | |
1029 | 1059 | | |
1030 | 1060 | | |
1031 | 1061 | | |
| |||
1219 | 1249 | | |
1220 | 1250 | | |
1221 | 1251 | | |
1222 | | - | |
1223 | | - | |
1224 | | - | |
1225 | | - | |
1226 | | - | |
| 1252 | + | |
| 1253 | + | |
| 1254 | + | |
| 1255 | + | |
| 1256 | + | |
| 1257 | + | |
| 1258 | + | |
| 1259 | + | |
| 1260 | + | |
| 1261 | + | |
1227 | 1262 | | |
1228 | 1263 | | |
1229 | 1264 | | |
| |||
1359 | 1394 | | |
1360 | 1395 | | |
1361 | 1396 | | |
1362 | | - | |
| 1397 | + | |
1363 | 1398 | | |
1364 | 1399 | | |
1365 | 1400 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
262 | 262 | | |
263 | 263 | | |
264 | 264 | | |
265 | | - | |
| 265 | + | |
266 | 266 | | |
267 | 267 | | |
268 | 268 | | |
| |||
0 commit comments