Commit 0add046
perf: hoist split_vec_min_alloc to datafusion-common and shrink the emitted prefix (#22416)
## Which issue does this PR close?
Related to #22164 and #22165.
## Rationale for this change
#22165 added a `split_vec_min_alloc` helper so that `EmitTo::First(n)`
allocates `min(n, len - n)` instead of always copying `n` elements. This
PR finishes and corrects that work, following review:
- The helper was `pub(super)` inside `datafusion-physical-plan`, so it
could not be reused. It moves to `datafusion_common::utils` as the
single shared copy.
- The `split_off` branch returned the original allocation as the emitted
prefix: short length, but original (larger) capacity. datafusion
accounts memory by capacity rather than length, so that prefix did not
actually release memory under pressure. The helper now calls
`shrink_to_fit` on it.
- Two further `EmitTo::First(n)` paths still used the
`drain(..n).collect()` idiom: the min/max accumulators and
`ByteViewGroupValueBuilder::take_n`. Both now route through the shared
helper.
## What changes are included in this PR?
- `datafusion_common::utils::split_vec_min_alloc`: the shared helper,
with `shrink_to_fit` on the emitted prefix.
- `datafusion-physical-plan`: the duplicate helper is removed;
`bytes.rs`, `primitive.rs` and `bytes_view.rs` use the shared one.
- `datafusion-functions-aggregate`: `MinMaxStructAccumulator` and
`MinMaxBytesAccumulator` use it in `emit_to`.
This supersedes #22205, whose `ByteViewGroupValueBuilder::take_n` change
is included here.
## Are these changes tested?
Yes. Unit tests for the helper (both branches and the boundaries) plus
the existing `take_n` and `min_max` tests pass.
## Are there any user-facing changes?
No.
---------
Co-authored-by: RyanJamesStewart <RyanJamesStewart@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>1 parent 7a6b062 commit 0add046
7 files changed
Lines changed: 144 additions & 62 deletions
File tree
- datafusion
- common/src/utils
- functions-aggregate/src/min_max
- physical-plan/src/aggregates/group_values/multi_group_by
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
395 | 395 | | |
396 | 396 | | |
397 | 397 | | |
| 398 | + | |
| 399 | + | |
| 400 | + | |
| 401 | + | |
| 402 | + | |
| 403 | + | |
| 404 | + | |
| 405 | + | |
| 406 | + | |
| 407 | + | |
| 408 | + | |
| 409 | + | |
| 410 | + | |
| 411 | + | |
| 412 | + | |
| 413 | + | |
| 414 | + | |
| 415 | + | |
| 416 | + | |
| 417 | + | |
| 418 | + | |
| 419 | + | |
| 420 | + | |
| 421 | + | |
| 422 | + | |
| 423 | + | |
| 424 | + | |
| 425 | + | |
| 426 | + | |
| 427 | + | |
| 428 | + | |
| 429 | + | |
| 430 | + | |
| 431 | + | |
| 432 | + | |
| 433 | + | |
| 434 | + | |
| 435 | + | |
| 436 | + | |
| 437 | + | |
| 438 | + | |
| 439 | + | |
| 440 | + | |
| 441 | + | |
| 442 | + | |
| 443 | + | |
| 444 | + | |
| 445 | + | |
| 446 | + | |
| 447 | + | |
| 448 | + | |
| 449 | + | |
| 450 | + | |
| 451 | + | |
| 452 | + | |
| 453 | + | |
| 454 | + | |
| 455 | + | |
| 456 | + | |
| 457 | + | |
| 458 | + | |
| 459 | + | |
| 460 | + | |
| 461 | + | |
| 462 | + | |
| 463 | + | |
| 464 | + | |
| 465 | + | |
| 466 | + | |
| 467 | + | |
| 468 | + | |
| 469 | + | |
| 470 | + | |
| 471 | + | |
| 472 | + | |
| 473 | + | |
| 474 | + | |
| 475 | + | |
| 476 | + | |
| 477 | + | |
| 478 | + | |
| 479 | + | |
| 480 | + | |
| 481 | + | |
| 482 | + | |
| 483 | + | |
| 484 | + | |
| 485 | + | |
| 486 | + | |
| 487 | + | |
| 488 | + | |
| 489 | + | |
| 490 | + | |
| 491 | + | |
| 492 | + | |
| 493 | + | |
| 494 | + | |
| 495 | + | |
| 496 | + | |
| 497 | + | |
| 498 | + | |
| 499 | + | |
| 500 | + | |
| 501 | + | |
| 502 | + | |
| 503 | + | |
| 504 | + | |
| 505 | + | |
| 506 | + | |
| 507 | + | |
| 508 | + | |
| 509 | + | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
| 513 | + | |
| 514 | + | |
| 515 | + | |
| 516 | + | |
| 517 | + | |
| 518 | + | |
| 519 | + | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
| 524 | + | |
| 525 | + | |
| 526 | + | |
| 527 | + | |
| 528 | + | |
398 | 529 | | |
399 | 530 | | |
400 | 531 | | |
| |||
Lines changed: 3 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
| 30 | + | |
| 31 | + | |
30 | 32 | | |
31 | 33 | | |
32 | 34 | | |
| |||
493 | 495 | | |
494 | 496 | | |
495 | 497 | | |
496 | | - | |
| 498 | + | |
497 | 499 | | |
498 | 500 | | |
499 | 501 | | |
| |||
Lines changed: 3 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
| 33 | + | |
| 34 | + | |
33 | 35 | | |
34 | 36 | | |
35 | 37 | | |
| |||
282 | 284 | | |
283 | 285 | | |
284 | 286 | | |
285 | | - | |
| 287 | + | |
286 | 288 | | |
287 | 289 | | |
288 | 290 | | |
| |||
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | | - | |
| 19 | + | |
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| |||
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| 29 | + | |
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
| |||
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| 29 | + | |
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
| |||
363 | 364 | | |
364 | 365 | | |
365 | 366 | | |
366 | | - | |
| 367 | + | |
367 | 368 | | |
368 | 369 | | |
369 | 370 | | |
| |||
Lines changed: 1 addition & 57 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
107 | 107 | | |
108 | 108 | | |
109 | 109 | | |
110 | | - | |
111 | | - | |
112 | | - | |
113 | | - | |
114 | | - | |
115 | | - | |
116 | | - | |
117 | | - | |
118 | | - | |
119 | | - | |
120 | | - | |
121 | | - | |
122 | | - | |
123 | 110 | | |
124 | 111 | | |
125 | 112 | | |
| |||
1285 | 1272 | | |
1286 | 1273 | | |
1287 | 1274 | | |
1288 | | - | |
1289 | | - | |
1290 | | - | |
1291 | | - | |
1292 | | - | |
1293 | | - | |
1294 | | - | |
1295 | | - | |
1296 | | - | |
1297 | | - | |
1298 | | - | |
1299 | | - | |
1300 | | - | |
1301 | | - | |
1302 | | - | |
1303 | | - | |
1304 | | - | |
1305 | | - | |
1306 | | - | |
1307 | | - | |
1308 | | - | |
1309 | | - | |
1310 | | - | |
1311 | | - | |
1312 | | - | |
1313 | | - | |
1314 | | - | |
1315 | | - | |
1316 | | - | |
1317 | | - | |
1318 | | - | |
1319 | | - | |
1320 | | - | |
1321 | | - | |
1322 | | - | |
1323 | | - | |
1324 | | - | |
1325 | | - | |
1326 | | - | |
1327 | | - | |
1328 | | - | |
1329 | | - | |
1330 | | - | |
1331 | | - | |
| 1275 | + | |
1332 | 1276 | | |
1333 | 1277 | | |
1334 | 1278 | | |
| |||
Lines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | | - | |
| 19 | + | |
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| |||
28 | 28 | | |
29 | 29 | | |
30 | 30 | | |
| 31 | + | |
31 | 32 | | |
32 | 33 | | |
33 | 34 | | |
| |||
0 commit comments