Commit ea90bfa
[SPARK-57766][SQL] Validate WKB element counts before allocation
### What changes were proposed in this pull request?
This PR validates WKB collection element counts before using them as `ArrayList` initial capacities in `WkbReader`.
The new `readCount` helper rejects negative counts and counts that cannot fit in the remaining WKB buffer before parsing these structures:
- LineString points
- Polygon ring points
- Polygon rings
- MultiPoint points
- MultiLineString line strings
- MultiPolygon polygons
- GeometryCollection geometries
It also adds regression coverage for negative and oversized counts, including the public `Geometry.fromWkb` path.
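
The diff for `readCount` is not reproduced here, so the following is only a minimal sketch of the kind of check described above, not the actual Spark implementation. The class name, the `WkbParseException` type, and the `minBytesPerElement` parameter are all hypothetical; the real helper may bound the count differently.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ReadCountSketch {
  /** Hypothetical stand-in for the error Spark reports as WKB_PARSE_ERROR. */
  static final class WkbParseException extends RuntimeException {
    WkbParseException(String message) { super(message); }
  }

  /**
   * Reads a 4-byte element count and validates it before any allocation.
   * minBytesPerElement is an assumed lower bound on the encoded size of one element.
   */
  static int readCount(ByteBuffer buf, int minBytesPerElement) {
    if (buf.remaining() < Integer.BYTES) {
      throw new WkbParseException("Truncated WKB: missing element count");
    }
    // WKB counts are unsigned 32-bit values, so anything >= 2^31 reads as negative here.
    int count = buf.getInt();
    if (count < 0) {
      throw new WkbParseException("Negative element count: " + count);
    }
    // Widen to long so count * minBytesPerElement cannot overflow int.
    long bytesNeeded = (long) count * minBytesPerElement;
    if (bytesNeeded > buf.remaining()) {
      throw new WkbParseException(
          "Element count " + count + " does not fit in remaining " + buf.remaining() + " bytes");
    }
    return count;
  }

  public static void main(String[] args) {
    // A count of 2 followed by room for two 16-byte XY points is accepted.
    ByteBuffer ok = ByteBuffer.allocate(4 + 32).order(ByteOrder.LITTLE_ENDIAN);
    ok.putInt(2);
    ok.rewind();
    System.out.println(readCount(ok, 16));

    // A huge count with no payload is rejected before any list is allocated.
    ByteBuffer huge = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
    huge.putInt(Integer.MAX_VALUE);
    huge.rewind();
    try {
      readCount(huge, 16);
    } catch (WkbParseException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Comparing against the bytes remaining in the buffer means the largest list that can ever be pre-sized is proportional to the input length, regardless of what the count field claims.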
### Why are the changes needed?
Malformed WKB can encode invalid collection counts. Before this change, those counts were passed directly to `new ArrayList<>(count)`, which could throw a raw `IllegalArgumentException` for a negative capacity or attempt an excessive allocation for a very large count.
Invalid WKB should be rejected consistently as a WKB parse error before allocation.
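
For illustration, the raw failure for a negative count can be reproduced with the JDK alone. This is a small standalone sketch, not code from the patch; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class RawAllocationFailure {
  /** Mimics passing an untrusted count straight to the ArrayList constructor. */
  static String tryAllocate(int count) {
    try {
      List<Object> items = new ArrayList<>(count);
      return "ok, size " + items.size();
    } catch (IllegalArgumentException e) {
      // ArrayList rejects negative initial capacities with this exception type.
      return "IllegalArgumentException";
    }
  }

  public static void main(String[] args) {
    System.out.println(tryAllocate(0));
    System.out.println(tryAllocate(-1));
  }
}
```

A very large positive count is deliberately not exercised here, since it would make the JVM try to reserve a backing array of that size.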
### Does this PR introduce _any_ user-facing change?
Yes. For malformed WKB with invalid collection counts, parsing now fails with Spark's normal `WKB_PARSE_ERROR` instead of a raw Java allocation failure. Only unreleased WKB parsing behavior is affected.
### How was this patch tested?
Added tests in `WkbErrorHandlingTest` for negative and oversized counts across all count-bearing WKB collection readers, plus a `Geometry.fromWkb` regression test.
Result: 37 tests passed, 0 failed.
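
The test bodies are not shown here. As a rough sketch of how such malformed input could be built, the helper below encodes a little-endian WKB LineString header with an arbitrary point count and no point data. The class and method names are hypothetical, and how the bytes are handed to `Geometry.fromWkb` is left out because its signature does not appear in this commit view.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MalformedWkbSketch {
  private static final byte LITTLE_ENDIAN_MARKER = 1; // WKB byte-order flag (NDR)
  private static final int WKB_LINESTRING = 2;        // WKB geometry type code

  /** Builds a 9-byte LineString header claiming numPoints points, with no payload. */
  static byte[] lineStringWithCount(int numPoints) {
    ByteBuffer buf = ByteBuffer.allocate(1 + 4 + 4).order(ByteOrder.LITTLE_ENDIAN);
    buf.put(LITTLE_ENDIAN_MARKER);
    buf.putInt(WKB_LINESTRING);
    buf.putInt(numPoints);
    return buf.array();
  }

  public static void main(String[] args) {
    byte[] negative = lineStringWithCount(-1);
    byte[] oversized = lineStringWithCount(Integer.MAX_VALUE);
    System.out.println(negative.length + " " + oversized.length);
  }
}
```

Both inputs are structurally well formed up to the count field, so a parser that rejects them is exercising the count validation rather than an earlier header check.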
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: OpenAI Codex (GPT-5)
Closes #56875 from szehon-ho/SPARK-57766-wkb-count-validation.
Authored-by: Szehon Ho <szehon.apache@gmail.com>
Signed-off-by: Szehon Ho <szehon.apache@gmail.com>
(cherry picked from commit e58192f)
Signed-off-by: Szehon Ho <szehon.apache@gmail.com>
1 parent 8b95cd3 commit ea90bfa
2 files changed
Lines changed: 70 additions & 7 deletions
File tree
- sql/catalyst/src
  - main/java/org/apache/spark/sql/catalyst/util/geo
  - test/java/org/apache/spark/sql/catalyst/util/geo
Lines changed: 24 additions & 7 deletions
Lines changed: 46 additions & 0 deletions