Commit 5cccc1b
feat: Coalesce small batches before shuffle write for improved efficiency
This change adds batch coalescing before shuffle writes to reduce per-batch
overhead and improve vectorization efficiency. When enabled, small columnar
batches are combined until they reach the target batch size before being
processed by the shuffle writer.
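
The coalescing step described above can be sketched with a toy model. This is a minimal illustration, not Comet's or DataFusion's implementation: a "batch" here is a plain `Vec<i64>`, and the `Coalescer` type, its methods, and the `coalesce` helper are hypothetical names. The sketch assumes a combined batch is emitted as soon as the buffered row count reaches or exceeds the target, so emitted batches may be slightly larger than the target.

```rust
/// Toy stand-in for a columnar batch: a single column of i64 values.
/// (Assumption for illustration; real batches hold multiple Arrow arrays.)
type Batch = Vec<i64>;

/// Buffers small batches and emits one combined batch once at least
/// `target_rows` rows have accumulated.
struct Coalescer {
    target_rows: usize,
    buffered: Vec<Batch>,
    buffered_rows: usize,
}

impl Coalescer {
    fn new(target_rows: usize) -> Self {
        Coalescer { target_rows, buffered: Vec::new(), buffered_rows: 0 }
    }

    /// Adds a batch; returns a combined batch when the target is reached.
    fn push(&mut self, batch: Batch) -> Option<Batch> {
        if batch.is_empty() {
            return None;
        }
        self.buffered_rows += batch.len();
        self.buffered.push(batch);
        if self.buffered_rows >= self.target_rows {
            Some(self.take())
        } else {
            None
        }
    }

    /// Emits whatever is still buffered at end of input.
    fn finish(&mut self) -> Option<Batch> {
        if self.buffered_rows == 0 { None } else { Some(self.take()) }
    }

    /// Concatenates the buffered batches and resets the buffer.
    fn take(&mut self) -> Batch {
        self.buffered_rows = 0;
        self.buffered.drain(..).flatten().collect()
    }
}

/// Runs a whole input stream through the coalescer.
fn coalesce(input: Vec<Batch>, target_rows: usize) -> Vec<Batch> {
    let mut coalescer = Coalescer::new(target_rows);
    let mut out = Vec::new();
    for batch in input {
        if let Some(full) = coalescer.push(batch) {
            out.push(full);
        }
    }
    if let Some(rest) = coalescer.finish() {
        out.push(rest);
    }
    out
}

fn main() {
    // Ten 3-row batches with a target of 8 rows per output batch.
    let input: Vec<Batch> = (0..10).map(|i| vec![i; 3]).collect();
    let sizes: Vec<usize> = coalesce(input, 8).iter().map(|b| b.len()).collect();
    println!("{:?}", sizes); // prints [9, 9, 9, 3]
}
```

In this model the shuffle writer would see four batches instead of ten, which is the per-batch overhead reduction the change is aiming for.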
Benefits observed in TPC-H Q18 benchmarks:
- 10.9% overall query time improvement
- Significantly reduced GC pressure (Stage 26: 3,602ms -> 56ms GC time)
- Better vectorization efficiency for downstream operators
New configuration options:
- spark.comet.shuffle.resizeBatches.input: Coalesce batches before shuffle write (default: false)
- spark.comet.shuffle.resizeBatches.output: Coalesce batches after shuffle read (default: true)
The native planner now wraps shuffle input with DataFusion's CoalesceBatchesExec
when spark.comet.shuffle.resizeBatches.input is enabled.
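
The conditional wrapping can be pictured roughly as follows. This is a sketch under stated assumptions, not the planner code from this commit: the `Plan` enum, `resize_input_enabled`, and `plan_shuffle_writer` are hypothetical stand-ins, the config is modeled as a plain string map, and the target batch size of 8192 is an arbitrary illustrative value. Only the config key and its default of `false` come from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified plan node; not Comet's or DataFusion's types.
#[derive(Debug, PartialEq)]
enum Plan {
    Scan,
    CoalesceBatches { input: Box<Plan>, target_batch_size: usize },
    ShuffleWriter { input: Box<Plan> },
}

/// Config key named in the commit message.
const RESIZE_INPUT_KEY: &str = "spark.comet.shuffle.resizeBatches.input";

/// Reads the flag from a string map, defaulting to false when unset.
fn resize_input_enabled(conf: &HashMap<String, String>) -> bool {
    conf.get(RESIZE_INPUT_KEY)
        .map(|v| v.trim().eq_ignore_ascii_case("true"))
        .unwrap_or(false)
}

/// Builds the shuffle writer, wrapping its input in a coalescing node
/// only when the flag is enabled.
fn plan_shuffle_writer(
    child: Plan,
    conf: &HashMap<String, String>,
    target_batch_size: usize,
) -> Plan {
    let input = if resize_input_enabled(conf) {
        Plan::CoalesceBatches { input: Box::new(child), target_batch_size }
    } else {
        child
    };
    Plan::ShuffleWriter { input: Box::new(input) }
}

fn main() {
    let mut conf: HashMap<String, String> = HashMap::new();
    // Flag unset: the writer reads directly from the child plan.
    println!("{:?}", plan_shuffle_writer(Plan::Scan, &conf, 8192));
    // Flag enabled: a coalescing node sits between child and writer.
    conf.insert(RESIZE_INPUT_KEY.to_string(), "true".to_string());
    println!("{:?}", plan_shuffle_writer(Plan::Scan, &conf, 8192));
}
```

Keeping the wrapper behind a flag that defaults to off leaves existing plans unchanged unless the option is explicitly enabled.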
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent d9ea22b commit 5cccc1b
3 files changed
Lines changed: 167 additions & 2 deletions
File tree
- common/src/main/scala/org/apache/comet
- native/core/src/execution
- spark/src/main/scala/org/apache/spark/sql/comet/execution/shuffle
Lines changed: 21 additions & 0 deletions (new lines 561-581)
Lines changed: 16 additions & 2 deletions (hunks starting at original line 1152)
Lines changed: 130 additions & 0 deletions