ci: add benchmark workflow #1127
Benchmark comparison
Threshold: 10% (lower is better). 2 benchmark(s) regressed beyond the configured threshold.
2 benchmark(s) improved beyond the configured threshold.
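The classification reported above can be sketched roughly as follows. This is a minimal illustration, not the PR's `compare_benchmarks.py`: the `compare` function, the `Verdict` record, and the flat name-to-seconds mapping are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    name: str
    change: float  # relative change vs. baseline, e.g. 0.12 means +12%
    status: str    # "regressed", "improved", or "unchanged"


def compare(baseline: dict[str, float], current: dict[str, float],
            threshold: float = 0.10) -> list[Verdict]:
    """Classify each benchmark present in both runs (lower values are better)."""
    verdicts = []
    for name in sorted(baseline.keys() & current.keys()):
        base = baseline[name]
        if base <= 0:
            continue  # a non-positive baseline gives no meaningful ratio
        change = (current[name] - base) / base
        if change > threshold:
            status = "regressed"   # slower than baseline by more than the threshold
        elif change < -threshold:
            status = "improved"    # faster than baseline by more than the threshold
        else:
            status = "unchanged"
        verdicts.append(Verdict(name, change, status))
    return verdicts
```

A CI step built on this could exit non-zero whenever any verdict has status `"regressed"`, which matches the "fail on threshold regressions" behaviour described for the comparator.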
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Pull request overview
Adds a CI benchmarking pipeline to track performance metrics over time (hyperfine startup/import benchmarks + pytest-benchmark runtime benchmarks), compare results to a stored baseline, and (as intended) publish historical benchmark JSON to gh-pages.
Changes:
- Introduces a reusable `benchmark.yml` workflow plus helper scripts to run/aggregate/compare benchmarks and comment results on PRs.
- Adds initial benchmark definitions (hyperfine shell benchmarks + a pytest-benchmark dock area benchmark).
- Updates existing pytest workflows to exclude the new benchmark tests from normal unit-test runs.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `tests/unit_tests/benchmarks/test_dock_area_benchmark.py` | Adds a pytest-benchmark test for adding a widget to a BECDockArea. |
| `tests/benchmarks/hyperfine/utils/exit_bec_startup.py` | Adds an IPython helper to exit BEC startup for launch-time benchmarking. |
| `tests/benchmarks/hyperfine/benchmark_launch_bec_without_companion.sh` | Adds a hyperfine benchmark script for launching BEC (metadata currently inconsistent). |
| `tests/benchmarks/hyperfine/benchmark_launch_bec_with_companion.sh` | Adds a hyperfine benchmark script for launching BEC (metadata currently inconsistent). |
| `tests/benchmarks/hyperfine/benchmark_import_bec_widgets.sh` | Adds a hyperfine benchmark script for import-time measurement. |
| `.github/workflows/pytest.yml` | Ignores benchmark tests during coverage/unit-test workflow runs. |
| `.github/workflows/pytest-matrix.yml` | Ignores benchmark tests during python-version-matrix test runs. |
| `.github/workflows/ci.yml` | Hooks the benchmark workflow into Full CI and adds `contents: write` permission. |
| `.github/workflows/benchmark.yml` | New reusable workflow to run benchmarks (3 attempts), aggregate, compare, PR-comment, and publish to gh-pages. |
| `.github/scripts/run_with_bec_servers.py` | New helper to start Redis + BEC services and run the benchmark command under that environment. |
| `.github/scripts/run_benchmarks.sh` | New runner that executes hyperfine + pytest-benchmark suites and aggregates results. |
| `.github/scripts/compare_benchmarks.py` | New comparator to generate a Markdown summary and fail on threshold regressions. |
| `.github/scripts/aggregate_benchmarks.py` | New aggregator to normalize and median-aggregate benchmark results. |
cappel89 left a comment
LGTM. Just out of curiosity, where is the benchmark fixture coming from?
That's pytest-benchmark: https://pytest-benchmark.readthedocs.io/en/latest/
But I obviously forgot to add it to pyproject... let me fix that quickly.
Description
This PR adds a new CI workflow for checking performance metrics, either from a bash script (e.g. for benchmarking import and startup times) or from a pytest-benchmark run (e.g. for runtime benchmarks such as adding components to a dock area).
To keep the results from being swamped by CI fluctuations, a matrix job is used and the results are averaged across 3 runs.
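As a rough sketch of that aggregation step, assuming each attempt yields a flat mapping from benchmark name to seconds (the `aggregate` function and this data shape are illustrative, not the PR's `aggregate_benchmarks.py`). The review summary describes the aggregator as median-based, so the reducer is left selectable here.

```python
from statistics import mean, median


def aggregate(attempts: list[dict[str, float]], how: str = "mean") -> dict[str, float]:
    """Combine per-attempt timings into a single value per benchmark."""
    if not attempts:
        return {}
    reducer = {"mean": mean, "median": median}[how]
    # Only keep benchmarks that produced a result in every attempt.
    names = set.intersection(*(set(a) for a in attempts))
    return {name: reducer([a[name] for a in attempts]) for name in sorted(names)}
```

With only three attempts, the median discards a single outlier run entirely, whereas the mean is pulled toward it; that difference matters most on noisy shared CI runners.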
Once merged, the results are pushed to gh-pages and kept as JSON files. This only happens on main, though; PRs simply run their checks locally and compare against the reference.
Let's test it for some time in BW. If we are happy with it, I'd suggest creating a bespoke action out of it so that we can also reuse it in other repos.