Commit 264510c
fix: cap bloom_filter_agg numItems/numBits and skip null inputs

The Spark 4.0 BloomFilterAggregateQuerySuite CI job aborted the executor
with a multi-exabyte native allocation, and the Spark 3.4 CometExecRuleSuite
job failed analysis. Three bloom-filter issues surfaced once this branch
let bloom_filter_agg execute natively:
- Spark's BloomFilterAggregate caps numItems/numBits at maxNumItems/
maxNumBits, but CometBloomFilterAggregate forwarded the raw literals.
Comet's native aggregate stores them as i32, so an oversized Long
(e.g. the Long.MaxValue cases in BloomFilterAggregateQuerySuite) wrapped
to a negative size and triggered a 2^61-byte allocation. Apply the same
cap in the serde so the native side receives Spark-equivalent values.
- update_batch hit `unreachable!()` on a null input value. Spark's
BloomFilterAggregate.update ignores nulls; skip them, and return an
error rather than panicking on a genuinely unexpected type.
- The new CometExecRuleSuite BloomFilter cases used an int column, which
Spark 3.4's bloom_filter_agg rejects (it only accepts a long-typed
first argument); cast to bigint.
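
The first issue above can be reproduced in isolation. The sketch below is not Comet's code: `narrow_to_i32`, `bytes_for_bits`, `cap_then_narrow`, and `HYPOTHETICAL_MAX_NUM_BITS` are hypothetical names, and it assumes the narrowing behaves like a wrapping `as` cast and that the negative i32 is later reinterpreted as an unsigned bit count. Under those assumptions, Long.MaxValue narrows to -1, which sign-extends to 2^64 - 1 bits, or roughly 2^61 bytes.

```rust
/// Illustrative placeholder for a configured upper bound; the real limit
/// comes from Spark's configuration, not from this constant.
const HYPOTHETICAL_MAX_NUM_BITS: i64 = 67_108_864;

/// Narrow a 64-bit literal to i32 with a wrapping cast (keeps the low 32 bits).
fn narrow_to_i32(value: i64) -> i32 {
    value as i32
}

/// Bytes implied by a bit count once the i32 is reinterpreted as unsigned.
/// A negative input sign-extends, so -1 becomes u64::MAX bits.
fn bytes_for_bits(num_bits: i32) -> u64 {
    (num_bits as u64) / 8
}

/// Cap first, then narrow. This only helps if `max` itself fits in an i32.
fn cap_then_narrow(value: i64, max: i64) -> i32 {
    narrow_to_i32(value.min(max))
}

fn main() {
    // Without the cap: the oversized literal wraps to a negative size.
    let wrapped = narrow_to_i32(i64::MAX);
    println!("uncapped: num_bits = {}, bytes = {}", wrapped, bytes_for_bits(wrapped));

    // With the cap applied before narrowing, the size stays bounded.
    let capped = cap_then_narrow(i64::MAX, HYPOTHETICAL_MAX_NUM_BITS);
    println!("capped:   num_bits = {}, bytes = {}", capped, bytes_for_bits(capped));
}
```

Capping before the value crosses into the narrower type, rather than validating after the cast, means the native side never sees a value it cannot represent.
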
Adds a CometExec3_4PlusSuite regression test covering oversized
numItems/numBits with a null-containing input.

1 parent d51c9d6 commit 264510c
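
The null-handling change described in the commit message (skip nulls, return an error instead of panicking on an unexpected type) can be sketched without any Arrow or DataFusion dependency. Everything here is a hypothetical stand-in: `Column` and `ToyAccumulator` are invented for illustration, a `HashSet` replaces the bloom filter bits, and the real `update_batch` operates on Arrow arrays rather than on these types.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a typed, nullable input column.
enum Column {
    Int64(Vec<Option<i64>>),
    /// Represents any type the sketch does not support.
    Utf8(Vec<Option<String>>),
}

impl Column {
    fn type_name(&self) -> &'static str {
        match self {
            Column::Int64(_) => "Int64",
            Column::Utf8(_) => "Utf8",
        }
    }
}

/// Toy accumulator; the set stands in for the bloom filter's bit array.
struct ToyAccumulator {
    seen: HashSet<i64>,
}

impl ToyAccumulator {
    fn new() -> Self {
        ToyAccumulator { seen: HashSet::new() }
    }

    /// Insert every non-null value; nulls are skipped rather than treated
    /// as unreachable, and unsupported types yield an error, not a panic.
    fn update_batch(&mut self, column: &Column) -> Result<(), String> {
        match column {
            Column::Int64(values) => {
                // `flatten` drops the `None` entries, i.e. the nulls.
                for value in values.iter().flatten() {
                    self.seen.insert(*value);
                }
                Ok(())
            }
            other => Err(format!("unsupported input type: {}", other.type_name())),
        }
    }
}

fn main() {
    let mut acc = ToyAccumulator::new();
    let batch = Column::Int64(vec![Some(1), None, Some(2), None]);
    println!("nullable batch: {:?}", acc.update_batch(&batch));
    println!("distinct values inserted: {}", acc.seen.len());

    let bad = Column::Utf8(vec![Some("x".to_string())]);
    println!("unsupported batch: {:?}", acc.update_batch(&bad));
}
```

Returning a `Result` lets the caller surface a query error for a bad type, whereas a panic in native code takes down the whole executor.
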
4 files changed
Lines changed: 65 additions & 8 deletions
File tree
- native/spark-expr/src/bloom_filter
- spark/src
  - main/scala/org/apache/comet/serde
  - test/scala/org/apache/comet
    - exec
    - rules
Lines changed: 15 additions & 3 deletions
Lines changed: 31 additions & 0 deletions
Lines changed: 8 additions & 2 deletions