Commit 466c3ea
Dynamic work scheduling in FileStream (apache#21351)
## Which issue does this PR close?
- Closes apache#20529
- Closes apache#20820
## Rationale for this change
This PR enables dynamic work scheduling in the FileStream, so that a
stream that has finished its own work can pick up any remaining work.
This improves performance on queries that scan multiple files where the
work is not balanced evenly across partitions in the plan (e.g. when
dynamic filtering reduces the work significantly).
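To make the imbalance concrete, here is a minimal sketch (not code from this PR) that compares the finish time of a fixed per-partition assignment with a shared queue where each file goes to whichever worker frees up first. The function names, the per-file cost numbers, and the queue order are all hypothetical and chosen only for illustration.

```rust
/// Static assignment: each partition processes only its own files,
/// so the scan finishes when the slowest partition finishes.
/// (Hypothetical helper, costs are abstract time units.)
fn static_makespan(partitions: &[Vec<u64>]) -> u64 {
    partitions
        .iter()
        .map(|files| files.iter().sum::<u64>())
        .max()
        .unwrap_or(0)
}

/// Dynamic assignment: files are taken from one queue in order, and each
/// file goes to the worker that becomes free earliest.
/// (Hypothetical helper, a simple greedy model of a shared work queue.)
fn dynamic_makespan(queue: &[u64], workers: usize) -> u64 {
    let mut free_at = vec![0u64; workers.max(1)];
    for &cost in queue {
        // Pick the worker with the earliest free time (first one on ties).
        let idx = (0..free_at.len())
            .min_by_key(|&i| free_at[i])
            .unwrap();
        free_at[idx] += cost;
    }
    free_at.into_iter().max().unwrap_or(0)
}
```

With three partitions whose file costs are `[10, 10]`, `[1, 1]` and `[1, 1]`, the static model finishes at time 20, while the greedy queue model over the same six files with three workers finishes at time 10. Real scans will differ, since actual file costs are not known in advance.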
It is the last of a sequence of several PRs:
- apache#21342
- apache#21327
- apache#21340
## What changes are included in this PR?
1. Add shared state across sibling `FileStream`s and the wiring to
connect them
2. Sibling streams put their file work into a shared queue when it can
be reordered
3. Add a bunch of tests
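The shared-queue idea in the list above can be sketched roughly as follows. This is an illustrative model only, not the PR's implementation: `FileWork`, `SharedWorkQueue`, and `scan` are hypothetical names, plain threads stand in for sibling streams, and the sketch assumes all work may be reordered freely.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for one unit of file work (not a DataFusion type).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct FileWork {
    path: String,
}

/// Hypothetical shared state: one queue handle cloned into every sibling.
#[derive(Clone, Default)]
struct SharedWorkQueue {
    inner: Arc<Mutex<VecDeque<FileWork>>>,
}

impl SharedWorkQueue {
    /// Siblings donate their file work to the shared queue.
    fn push_all(&self, files: impl IntoIterator<Item = FileWork>) {
        self.inner.lock().unwrap().extend(files);
    }

    /// Any sibling that is idle takes the next unit of work, if there is one.
    fn next(&self) -> Option<FileWork> {
        self.inner.lock().unwrap().pop_front()
    }
}

/// Runs `partitions` workers against one shared queue and returns the
/// files each worker ended up processing.
fn scan(files: Vec<FileWork>, partitions: usize) -> Vec<Vec<FileWork>> {
    let queue = SharedWorkQueue::default();
    queue.push_all(files);

    let handles: Vec<_> = (0..partitions)
        .map(|_| {
            let queue = queue.clone();
            thread::spawn(move || {
                let mut done = Vec::new();
                // Keep pulling until the shared queue is empty.
                while let Some(work) = queue.next() {
                    done.push(work);
                }
                done
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

How many files each worker processes depends on thread timing, but every file is handed out exactly once because `pop_front` runs under the mutex.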
Note there are a bunch of other things that are NOT included in this PR,
including:
1. Trying to limit concurrent IO (this PR has the same properties as
main -- up to one outstanding IO per partition)
2. Trying to issue multiple IOs by the same partition (aka to interleave
IO and CPU work more)
3. Splitting files into smaller units (e.g. across row groups)
As @Dandandan proposes below, I expect we can work on those changes in
follow-on PRs.
## Are these changes tested?
Yes, by existing functional and benchmark tests, as well as new
functional tests.
## Are there any user-facing changes?
Yes, faster performance (see benchmarks):
apache#21351 (comment)
---------
Co-authored-by: Oleks V <comphead@users.noreply.github.com>
1 parent 3aaf393 commit 466c3ea
10 files changed
Lines changed: 1000 additions & 74 deletions
File tree
- datafusion/datasource
- src
- file_scan_config
- file_stream
- morsel