fix(ci): trim benchmark full grid to fit daily run under 6h timeout #5905
Ma77Ball wants to merge 2 commits into
/request-review @Yicong-Huang
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #5905 +/- ##
============================================
- Coverage 54.11% 54.10% -0.02%
+ Complexity 2819 2816 -3
============================================
Files 1103 1103
Lines 42650 42650
Branches 4588 4588
============================================
- Hits 23079 23074 -5
- Misses 18226 18230 +4
- Partials 1345 1346 +1
*This pull request uses carry forward flags.
| | config | throughput (tuples/sec) | MB/s | latency p50/p95/p99 | max Δ latest / 7d |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 459 | 0.28 | 22,231/26,678/26,678 us | 🔴 +15.5% / 🟢 -23.7% |
| 🔴 | bs=100 sw=10 sl=64 | 954 | 0.583 | 103,243/137,028/137,028 us | 🔴 +11.5% / 🟢 -8.0% |
| ⚪ | bs=1000 sw=10 sl=64 | 1,098 | 0.67 | 911,702/983,266/983,266 us | ⚪ within ±5% / 🟢 -6.3% |
Baseline details
Latest main 8803d08 from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 459 tuples/sec | 473 tuples/sec | 410.82 tuples/sec | -3.0% | +11.7% |
| bs=10 sw=10 sl=64 | MB/s | 0.28 MB/s | 0.289 MB/s | 0.251 MB/s | -3.1% | +11.7% |
| bs=10 sw=10 sl=64 | p50 | 22,231 us | 19,254 us | 23,785 us | +15.5% | -6.5% |
| bs=10 sw=10 sl=64 | p95 | 26,678 us | 30,900 us | 34,980 us | -13.7% | -23.7% |
| bs=10 sw=10 sl=64 | p99 | 26,678 us | 30,900 us | 34,980 us | -13.7% | -23.7% |
| bs=100 sw=10 sl=64 | throughput | 954 tuples/sec | 970 tuples/sec | 891.94 tuples/sec | -1.6% | +7.0% |
| bs=100 sw=10 sl=64 | MB/s | 0.583 MB/s | 0.592 MB/s | 0.544 MB/s | -1.5% | +7.1% |
| bs=100 sw=10 sl=64 | p50 | 103,243 us | 104,302 us | 112,277 us | -1.0% | -8.0% |
| bs=100 sw=10 sl=64 | p95 | 137,028 us | 122,844 us | 139,802 us | +11.5% | -2.0% |
| bs=100 sw=10 sl=64 | p99 | 137,028 us | 122,844 us | 139,802 us | +11.5% | -2.0% |
| bs=1000 sw=10 sl=64 | throughput | 1,098 tuples/sec | 1,126 tuples/sec | 1,041 tuples/sec | -2.5% | +5.5% |
| bs=1000 sw=10 sl=64 | MB/s | 0.67 MB/s | 0.687 MB/s | 0.635 MB/s | -2.5% | +5.4% |
| bs=1000 sw=10 sl=64 | p50 | 911,702 us | 887,392 us | 972,714 us | +2.7% | -6.3% |
| bs=1000 sw=10 sl=64 | p95 | 983,266 us | 942,143 us | 1,023,057 us | +4.4% | -3.9% |
| bs=1000 sw=10 sl=64 | p99 | 983,266 us | 942,143 us | 1,023,057 us | +4.4% | -3.9% |
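The Δ columns above appear to be plain relative changes of the PR value against each baseline. The following is a minimal sketch under that assumption, using the bs=10 rows copied from the table; the helper names `pct_delta` and `headline` are hypothetical, and the "largest magnitude wins" rule for the summary column is inferred from the numbers, not taken from the benchmark tooling.

```python
def pct_delta(pr: float, baseline: float) -> float:
    """Relative change of the PR value against a baseline, in percent."""
    return (pr - baseline) / baseline * 100.0


def headline(values):
    """Pick the delta with the largest magnitude (assumed summary rule)."""
    return max(values, key=abs)


# (metric, PR, latest main, 7d avg) for bs=10 sw=10 sl=64, copied from the table.
rows = [
    ("throughput", 459.0, 473.0, 410.82),
    ("p50", 22231.0, 19254.0, 23785.0),
    ("p95", 26678.0, 30900.0, 34980.0),
]

# metric -> (delta vs latest main, delta vs 7d avg), rounded to one decimal.
deltas = {
    metric: (round(pct_delta(pr, latest), 1), round(pct_delta(pr, avg), 1))
    for metric, pr, latest, avg in rows
}

max_latest = headline(d[0] for d in deltas.values())
max_7d = headline(d[1] for d in deltas.values())

for metric, (d_latest, d_7d) in deltas.items():
    print(f"{metric}: {d_latest:+.1f}% vs latest, {d_7d:+.1f}% vs 7d avg")
print(f"max: {max_latest:+.1f}% / {max_7d:+.1f}%")
```

Note that the sign means different things per metric: a positive delta is an improvement for throughput but a regression for latency, which is why the +15.5% p50 change is flagged red while the -23.7% p95 change is green.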
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,436.16,200,128000,459,0.280,22231.41,26677.54,26677.54
1,100,10,64,20,2095.47,2000,1280000,954,0.583,103242.50,137027.52,137027.52
2,1000,10,64,20,18210.89,20000,12800000,1098,0.670,911701.56,983265.63,983265.63
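The derived columns in the CSV can be rechecked from the raw totals. This is a sketch, not the benchmark's own code: it assumes `tuples_per_sec` is `total_tuples` divided by wall time and that "MB" means 2^20 bytes, which is the interpretation that reproduces the reported figures. The function name `derive` is hypothetical.

```python
import csv
import io

RAW = """
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,436.16,200,128000,459,0.280,22231.41,26677.54,26677.54
1,100,10,64,20,2095.47,2000,1280000,954,0.583,103242.50,137027.52,137027.52
2,1000,10,64,20,18210.89,20000,12800000,1098,0.670,911701.56,983265.63,983265.63
"""

MIB = 1024 * 1024  # assumption: the report's "MB" is 2**20 bytes


def derive(row):
    """Recompute throughput and bandwidth from the raw totals of one CSV row."""
    seconds = float(row["total_ms"]) / 1000.0
    tuples_per_sec = int(row["total_tuples"]) / seconds
    mb_per_sec = int(row["total_bytes"]) / seconds / MIB
    return round(tuples_per_sec), round(mb_per_sec, 3)


records = list(csv.DictReader(io.StringIO(RAW.strip())))
derived = [derive(r) for r in records]

for rec, (tps, mbps) in zip(records, derived):
    print(f"bs={rec['batch_size']}: {tps} tuples/sec, {mbps} MB/s")
```

The totals are also internally consistent: in every row `total_tuples` equals `batch_size * num_batches`, and `total_bytes` equals `total_tuples * schema_width * string_len`.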
What changes were proposed in this PR?
- Remove `batchSize=10000` from the `full`-mode benchmark grid in `ArrowFlightActorBench.scala`, taking the daily sweep from 36 configs to 27 and removing the 9 heaviest configs (30-70 min each) that pushed the run past GitHub's 6h job ceiling.
- … `benchmarks.yml`.
Any related issues, documentation, discussions?
Closes: #5904
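As a rough check of the grid arithmetic in the proposed change (36 configs down to 27, 9 removed), the sweep can be modelled as a Cartesian product. This is only a sketch: the PR states the counts and the dropped `batchSize=10000`, but the 4 x 3 x 3 shape and every schema width or string length other than 10 and 64 are hypothetical placeholders, not values read from `ArrowFlightActorBench.scala`.

```python
from itertools import product

batch_sizes = [10, 100, 1000, 10000]
schema_widths = [10, 50, 100]    # hypothetical values, except 10
string_lens = [64, 256, 1024]    # hypothetical values, except 64

# Full grid before the change, and the grid with batchSize=10000 dropped.
full_grid = list(product(batch_sizes, schema_widths, string_lens))
trimmed_grid = [cfg for cfg in full_grid if cfg[0] != 10000]
removed = len(full_grid) - len(trimmed_grid)

# The PR quotes 30-70 min per removed config; convert the total to hours.
saved_hours = (removed * 30 / 60, removed * 70 / 60)

print(len(full_grid), len(trimmed_grid), removed)
print(f"removed configs alone took {saved_hours[0]}-{saved_hours[1]} h")
```

Under the quoted per-config range, the nine removed configs alone account for 4.5 to 10.5 hours, which is consistent with the full sweep exceeding the 6h job ceiling.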
How was this PR tested?
Run the `Benchmarks` workflow via `workflow_dispatch` on this branch (the only non-schedule trigger that runs `full` mode) and confirm the `Bench` job finishes well under 6h (expected ~40-50 min including compile/setup), reaching the publish steps.
Was this PR authored or co-authored using generative AI tooling?
Co-authored with Claude Opus 4.8 in compliance with ASF