Commit e292f33
authored
perf(physical-optimizer): skip ensure_distribution rebuild when children are unchanged (#22521)
## Which issue does this PR close?
- Closes #22520.
## Rationale for this change
`ensure_distribution` in
`datafusion/physical-optimizer/src/ensure_requirements/enforce_distribution.rs`
unconditionally calls `plan.with_new_children(children_plans)` after
collecting the (possibly redistributed) children, even when none of
those children were actually replaced. For nodes like `ProjectionExec`,
that path runs through `try_new` and recomputes the schema, equivalence
properties, output ordering, and output partitioning, then allocates a
new `Arc<dyn ExecutionPlan>`. When every child Arc is pointer-identical
to the input, that work produces a logically identical node — pure
overhead.
The cost is amplified by two factors:
1. **Plan depth.** Workloads dominated by point queries (no join /
aggregate / unmet ordering — i.e. nothing for `ensure_distribution` to
inject a `RepartitionExec` or `SortExec` for) hit this wasted rebuild at
every node in the plan. A 5–30 deep `ProjectionExec` stack pays the cost
N times.
2. **Schema width.** Most steps inside `ProjectionExec::try_new` are
`O(num_columns)`: per-column `data_type` / `nullable` lookup to build
the new schema, per-column remapping of equivalence classes through the
projection mapping, and per-column lookup when rewriting
`PhysicalSortExpr`s into the output ordering. Wide schemas (tens of
columns) make every wasted rebuild proportionally heavier.
Profiling a production point-query workload (wide schemas, deep
`ProjectionExec` stacks) showed `ProjectionExec::with_new_children` as
the single largest cost inside `ensure_distribution`:
- `ensure_distribution` total: 2.87s of a 60s CPU sample
- `ProjectionExec::with_new_children`: 1.94s (56% of the rule)
- `SortExec::with_new_children`: 0.11s
- Other ExecutionPlan nodes: 0.82s
## What changes are included in this PR?
After collecting `children_plans`, compare each new child Arc with the
original via `Arc::ptr_eq`. When every child is unchanged, reuse the
existing `plan` Arc and skip `with_new_children`. The `UnionExec` to
`InterleaveExec` special case still runs first because it intentionally
produces a new node even when child Arcs are unchanged.
This relies on the fact that `ensure_distribution` already produces
pointer-identical Arcs for children that need no redistribution (it
threads the original Arc through unchanged), so `Arc::ptr_eq` precisely
distinguishes "rewritten" from "untouched" children at O(1) per child.
## Are these changes tested?
Yes. The existing `enforce_distribution` suite passes unchanged (66/66):
```
cargo test --release -p datafusion --test core_integration -- physical_optimizer::enforce_distribution
```
The behavior is observable only as a CPU reduction; correctness is
preserved because `ExecutionPlan` nodes are immutable, so reusing the
original Arc produces the same plan tree as
`with_new_children(unchanged_children)` would have, just without the
schema / ordering / equivalence / partitioning recomputation.
## Are there any user-facing changes?
No. Same plans, lower planning time.
## Micro-benchmark
Plan shape: 30-deep `ProjectionExec` stack over a sorted parquet scan,
5000 iterations.
- Without fix: 852.74 ms total, 170.55 us/call
- With fix: 296.81 ms total, 59.36 us/call
- ~2.87x speedup, -65% CPU per call
Wider schemas (more projection expressions per node) widen the gap
further because each skipped `with_new_children` avoids more
O(num_columns) work.1 parent 11a79a6 commit e292f33
2 files changed
Lines changed: 48 additions & 2 deletions
File tree
- datafusion
- core/tests/physical_optimizer
- physical-optimizer/src/ensure_requirements
Lines changed: 35 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3971 | 3971 | | |
3972 | 3972 | | |
3973 | 3973 | | |
| 3974 | + | |
| 3975 | + | |
| 3976 | + | |
| 3977 | + | |
| 3978 | + | |
| 3979 | + | |
| 3980 | + | |
| 3981 | + | |
| 3982 | + | |
| 3983 | + | |
| 3984 | + | |
| 3985 | + | |
| 3986 | + | |
| 3987 | + | |
| 3988 | + | |
| 3989 | + | |
| 3990 | + | |
| 3991 | + | |
| 3992 | + | |
| 3993 | + | |
| 3994 | + | |
| 3995 | + | |
| 3996 | + | |
| 3997 | + | |
| 3998 | + | |
| 3999 | + | |
| 4000 | + | |
| 4001 | + | |
| 4002 | + | |
| 4003 | + | |
| 4004 | + | |
| 4005 | + | |
| 4006 | + | |
| 4007 | + | |
| 4008 | + | |
Lines changed: 13 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
65 | 65 | | |
66 | 66 | | |
67 | 67 | | |
68 | | - | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
69 | 71 | | |
70 | 72 | | |
71 | 73 | | |
| |||
1362 | 1364 | | |
1363 | 1365 | | |
1364 | 1366 | | |
1365 | | - | |
| 1367 | + | |
| 1368 | + | |
| 1369 | + | |
| 1370 | + | |
| 1371 | + | |
| 1372 | + | |
| 1373 | + | |
| 1374 | + | |
| 1375 | + | |
| 1376 | + | |
1366 | 1377 | | |
1367 | 1378 | | |
1368 | 1379 | | |
| |||
0 commit comments