fix(ci): trim benchmark full grid to fit daily run under 6h timeout by Ma77Ball · Pull Request #5905 · apache/texera

Ma77Ball · 2026-06-23T09:51:01Z

What changes were proposed in this PR?

Drop batchSize=10000 from the full-mode benchmark grid in ArrowFlightActorBench.scala, taking the daily sweep from 36 configs to 27 and removing the 9 heaviest configs (30-70 min each) that pushed the run past GitHub's 6h job ceiling.
Update the now-stale "36-config / ~50-60 min" comments to "27-config / ~40 min" in the bench source and benchmarks.yml.
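
To make the config arithmetic concrete, here is a minimal sketch of how a cartesian benchmark grid shrinks from 36 to 27 entries when one batch size is dropped. It is not the actual code in ArrowFlightActorBench.scala: the names `BenchGridSketch`, `BenchConfig`, and `grid` are hypothetical, and the schema-width and string-length values are placeholders (only `schemaWidth=10` and `stringLen=64` appear in this PR's benchmark output). The 3 x 3 shape of the remaining dimensions is also an assumption, chosen because it matches the 9 removed configs.

```scala
// Hypothetical sketch of a full-mode benchmark grid; not the real bench source.
object BenchGridSketch {
  final case class BenchConfig(batchSize: Int, schemaWidth: Int, stringLen: Int)

  // 10, 100, and 1000 appear in this PR's benchmark output; 10000 is the value being dropped.
  val batchSizesBefore: Seq[Int] = Seq(10, 100, 1000, 10000)
  val batchSizesAfter: Seq[Int] = batchSizesBefore.filterNot(_ == 10000)

  // Assumed placeholder values: only 10 and 64 are confirmed by the PR output.
  val schemaWidths: Seq[Int] = Seq(10, 50, 100)
  val stringLens: Seq[Int] = Seq(64, 256, 1024)

  // Cartesian product of the three dimensions.
  def grid(batchSizes: Seq[Int]): Seq[BenchConfig] =
    for {
      bs <- batchSizes
      sw <- schemaWidths
      sl <- stringLens
    } yield BenchConfig(bs, sw, sl)

  def main(args: Array[String]): Unit = {
    val before = grid(batchSizesBefore)
    val after = grid(batchSizesAfter)
    // prints "before=36 after=27 removed=9"
    println(s"before=${before.size} after=${after.size} removed=${before.size - after.size}")
  }
}
```

Dropping a single value from one dimension removes every combination that includes it, which is why one deleted batch size accounts for all 9 removed configs.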

Any related issues, documentation, discussions?

Closes: #5904

How was this PR tested?

Non-functional change (benchmark harness grid + CI comments); no shipped behavior and no unit test covers the bench grid contents.
CI timing verification: trigger the Benchmarks workflow via workflow_dispatch on this branch (the only non-schedule trigger that runs full mode) and confirm the Bench job finishes well under 6h (expected ~40-50 min including compile/setup), reaching the publish steps.

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.8 in compliance with ASF

Ma77Ball · 2026-06-23T09:53:36Z

/request-review @Yicong-Huang

github-actions · 2026-06-23T10:37:50Z

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

Contributors with relevant context: @Yicong-Huang
You can notify them by mentioning @Yicong-Huang in a comment.

codecov-commenter · 2026-06-23T10:45:49Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 54.10%. Comparing base (8803d08) to head (25183bb).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #5905      +/-   ##
============================================
- Coverage     54.11%   54.10%   -0.02%     
+ Complexity     2819     2816       -3     
============================================
  Files          1103     1103              
  Lines         42650    42650              
  Branches       4588     4588              
============================================
- Hits          23079    23074       -5     
- Misses        18226    18230       +4     
- Partials       1345     1346       +1

Flag	Coverage Δ		*Carryforward flag
access-control-service	`70.44% <ø> (ø)`
agent-service	`34.36% <ø> (ø)`
amber	`55.61% <ø> (-0.04%)`	⬇️
computing-unit-managing-service	`1.65% <ø> (ø)`
config-service	`57.35% <ø> (ø)`
file-service	`58.59% <ø> (ø)`
frontend	`48.12% <ø> (ø)`
pyamber	`90.20% <ø> (ø)`
python	`90.76% <ø> (ø)`		Carriedforward from f4efde6
workflow-compiling-service	`58.69% <ø> (ø)`

*This pull request uses carry forward flags.

github-actions · 2026-06-23T11:28:46Z

⚠️ Benchmark changes need a look

🟢 2 better · 🔴 3 worse · ⚪ 10 noise (<±5%) · 0 without baseline

Compared against main 8803d08 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

	config	throughput (tuples/sec)	MB/s	latency p50/p95/p99 (us)	max Δ latest / 7d
🔴	bs=10 sw=10 sl=64	459	0.28	22,231/26,678/26,678 us	🔴 +15.5% / 🟢 -23.7%
🔴	bs=100 sw=10 sl=64	954	0.583	103,243/137,028/137,028 us	🔴 +11.5% / 🟢 -8.0%
⚪	bs=1000 sw=10 sl=64	1,098	0.67	911,702/983,266/983,266 us	⚪ within ±5% / 🟢 -6.3%

Baseline details

Latest main 8803d08 from same runner

config	metric	PR	latest main	7d avg	Δ latest	Δ 7d
bs=10 sw=10 sl=64	throughput	459 tuples/sec	473 tuples/sec	410.82 tuples/sec	-3.0%	+11.7%
bs=10 sw=10 sl=64	MB/s	0.28 MB/s	0.289 MB/s	0.251 MB/s	-3.1%	+11.7%
bs=10 sw=10 sl=64	p50	22,231 us	19,254 us	23,785 us	+15.5%	-6.5%
bs=10 sw=10 sl=64	p95	26,678 us	30,900 us	34,980 us	-13.7%	-23.7%
bs=10 sw=10 sl=64	p99	26,678 us	30,900 us	34,980 us	-13.7%	-23.7%
bs=100 sw=10 sl=64	throughput	954 tuples/sec	970 tuples/sec	891.94 tuples/sec	-1.6%	+7.0%
bs=100 sw=10 sl=64	MB/s	0.583 MB/s	0.592 MB/s	0.544 MB/s	-1.5%	+7.1%
bs=100 sw=10 sl=64	p50	103,243 us	104,302 us	112,277 us	-1.0%	-8.0%
bs=100 sw=10 sl=64	p95	137,028 us	122,844 us	139,802 us	+11.5%	-2.0%
bs=100 sw=10 sl=64	p99	137,028 us	122,844 us	139,802 us	+11.5%	-2.0%
bs=1000 sw=10 sl=64	throughput	1,098 tuples/sec	1,126 tuples/sec	1,041 tuples/sec	-2.5%	+5.5%
bs=1000 sw=10 sl=64	MB/s	0.67 MB/s	0.687 MB/s	0.635 MB/s	-2.5%	+5.4%
bs=1000 sw=10 sl=64	p50	911,702 us	887,392 us	972,714 us	+2.7%	-6.3%
bs=1000 sw=10 sl=64	p95	983,266 us	942,143 us	1,023,057 us	+4.4%	-3.9%
bs=1000 sw=10 sl=64	p99	983,266 us	942,143 us	1,023,057 us	+4.4%	-3.9%
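
The "Δ latest" column and the better/worse/noise tally can be reproduced from the table above. The sketch below is one reading of the numbers, not the workflow's actual comparison script: it assumes the delta is the relative change of the PR value against latest main, that throughput and MB/s are higher-is-better while latencies are lower-is-better, and that anything under ±5% counts as noise. The names `BenchDeltaSketch`, `deltaPct`, and `classify` are hypothetical.

```scala
// Hypothetical sketch: classify each metric row against latest main.
object BenchDeltaSketch {
  sealed trait Verdict
  case object Better extends Verdict
  case object Worse extends Verdict
  case object Noise extends Verdict

  // Relative change of the PR value against the baseline, in percent.
  def deltaPct(pr: Double, baseline: Double): Double =
    (pr - baseline) / baseline * 100.0

  def classify(pr: Double, baseline: Double, higherIsBetter: Boolean,
               noisePct: Double = 5.0): Verdict = {
    val d = deltaPct(pr, baseline)
    if (math.abs(d) < noisePct) Noise
    else if ((d > 0) == higherIsBetter) Better
    else Worse
  }

  // (PR, latest main, higherIsBetter), copied from the table above.
  // Per config: throughput, MB/s, p50, p95, p99.
  val rows: Seq[(Double, Double, Boolean)] = Seq(
    (459.0, 473.0, true), (0.28, 0.289, true),
    (22231.0, 19254.0, false), (26678.0, 30900.0, false), (26678.0, 30900.0, false),
    (954.0, 970.0, true), (0.583, 0.592, true),
    (103243.0, 104302.0, false), (137028.0, 122844.0, false), (137028.0, 122844.0, false),
    (1098.0, 1126.0, true), (0.67, 0.687, true),
    (911702.0, 887392.0, false), (983266.0, 942143.0, false), (983266.0, 942143.0, false)
  )

  val verdicts: Seq[Verdict] = rows.map { case (pr, base, hib) => classify(pr, base, hib) }

  def main(args: Array[String]): Unit = {
    // prints "better=2 worse=3 noise=10"
    println(
      s"better=${verdicts.count(_ == Better)} " +
      s"worse=${verdicts.count(_ == Worse)} " +
      s"noise=${verdicts.count(_ == Noise)}"
    )
  }
}
```

Under these assumptions the tally matches the summary line of the bot comment, and every flagged row is a latency percentile rather than throughput.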

Raw CSV

config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,436.16,200,128000,459,0.280,22231.41,26677.54,26677.54
1,100,10,64,20,2095.47,2000,1280000,954,0.583,103242.50,137027.52,137027.52
2,1000,10,64,20,18210.89,20000,12800000,1098,0.670,911701.56,983265.63,983265.63
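
The derived CSV columns appear to follow from the input columns. The sketch below is inferred from the three rows above rather than taken from the bench source: it assumes `total_tuples = batch_size * num_batches`, `total_bytes = total_tuples * schema_width * string_len`, and that `mb_per_sec` uses 1024 * 1024 bytes per MB (the reported values only match under that divisor). `BenchCsvSketch` and its helpers are hypothetical names.

```scala
// Hypothetical sketch: recompute the derived CSV columns from the inputs.
object BenchCsvSketch {
  final case class Row(batchSize: Int, schemaWidth: Int, stringLen: Int,
                       numBatches: Int, totalMs: Double)

  def totalTuples(r: Row): Long = r.batchSize.toLong * r.numBatches

  // Assumes every column holds a string of exactly stringLen bytes.
  def totalBytes(r: Row): Long = totalTuples(r) * r.schemaWidth * r.stringLen

  def tuplesPerSec(r: Row): Double = totalTuples(r) / (r.totalMs / 1000.0)

  // Assumes 1 MB = 1024 * 1024 bytes.
  def mbPerSec(r: Row): Double =
    totalBytes(r) / (r.totalMs / 1000.0) / (1024.0 * 1024.0)

  // Input columns copied from the raw CSV above.
  val rows: Seq[Row] = Seq(
    Row(10, 10, 64, 20, 436.16),
    Row(100, 10, 64, 20, 2095.47),
    Row(1000, 10, 64, 20, 18210.89)
  )

  def main(args: Array[String]): Unit =
    rows.foreach { r =>
      println(
        s"bs=${r.batchSize}: tuples=${totalTuples(r)} bytes=${totalBytes(r)} " +
        s"tuples/sec=${math.round(tuplesPerSec(r))} MB/s=${math.round(mbPerSec(r) * 1000) / 1000.0}"
      )
    }
}
```

Rounded to the CSV's precision, the recomputed values agree with the reported `total_tuples`, `total_bytes`, `tuples_per_sec`, and `mb_per_sec` for all three rows.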

removed the 10000 batch size to allow benchmark to finish on time

f4efde6

github-actions Bot added the fix and ci (changes related to CI) labels Jun 23, 2026

github-actions Bot assigned Ma77Ball Jun 23, 2026

github-actions Bot requested a review from Yicong-Huang June 23, 2026 10:37

re-run ci test

25183bb

Ma77Ball wants to merge 2 commits into apache:main from Ma77Ball:fix/benchmark-daily-timeout


2 participants
