Refactor to do more work in Python #382
Conversation
Force-pushed 50b4624 to 75a085b.
@diegorusso: This moves as much work as possible from the GHA yml files to Python, so it should be portable across automation systems. I realize it's a fairly large PR, but I would be interested in your feedback to confirm it is going to meet your needs, etc.
Pull Request Overview
This PR is the first step in the refactor to shift more work into Python as part of #380. The changes update tests, centralize benchmark hash computation via benchmark_definitions, remove the old should_run script, and adjust workflow and template scripts to use the new commands.
- Updated tests in tests/test_run_benchmarks.py to use the new workflow.should_run and hardcoded benchmark hash logic.
- Removed the obsolete util.get_benchmark_hash in favor of benchmark_definitions.get_benchmark_hash and updated references across the codebase.
- Revised workflow and template scripts (including workflow_bootstrap.py, _pystats.src.yml, and _benchmark.src.yml) to align with the new command structure.
Reviewed Changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/test_run_benchmarks.py | Updated tests to call hardcode_benchmark_hash and use workflow.should_run |
| bench_runner/util.py | Removed legacy benchmark hash function |
| bench_runner/templates/workflow_bootstrap.py | Added new bootstrap script for setting up the workflow |
| bench_runner/templates/benchmark.src.yml, _pystats.src.yml, _benchmark.src.yml | Adjusted steps to call workflow_bootstrap.py instead of the old commands |
| bench_runner/scripts/workflow.py | Major refactoring of workflow logic with benchmark hash updates |
| bench_runner/scripts/should_run.py | Removed in favor of updated workflow script |
| bench_runner/scripts/run_benchmarks.py | Updated to reference benchmark_definitions for hash |
| bench_runner/scripts/install.py, get_merge_base.py | Minor changes for YAML generation and hash retrieval |
| bench_runner/git.py | Updated clone function with context management changes |
| bench_runner/benchmark_definitions.py | New centralized benchmark hash computation |
| bench_runner/main.py | Updated commands reflecting the refactored workflow |
Files not reviewed (1)
- bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (1)
bench_runner/benchmark_definitions.py:34
- [nitpick] The variable name 'hash' shadows the built-in Python function hash(). Consider renaming it to something like 'sha' or 'hasher' to improve clarity.
hash = hashlib.sha256()
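A rename along the lines Copilot suggests might look like this. This is a sketch only, not the PR's actual implementation: the input list and the six-character truncation (inferred from hashes like 'dcfded' mentioned later in this thread) are assumptions.

```python
import hashlib

# Sketch of the suggested rename: "hasher" instead of "hash", so the
# built-in hash() is no longer shadowed. Inputs and truncation length
# are assumptions for illustration.
def get_benchmark_hash(commit_shas: list[str]) -> str:
    hasher = hashlib.sha256()
    for sha in commit_shas:
        hasher.update(sha.encode("ascii"))
    return hasher.hexdigest()[:6]
```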
```python
if not sys.platform.startswith("linux"):
    return

run_in_venv(venv, "pyperf", ["system", perf and "reset" or "tune"], sudo=True)
```
Does this mean that the CPU frequency scaling governors are set to "powersave" when profiling and "performance" when benchmarking?
Could this not lead to different characteristics between the profiled run and the scored run? Though I admit that the profiling will create different characteristics anyway. So I guess my question is just, what is the reason for this?
The reason for this is pretty recent. When @mpage added events to pyperf so that we can turn perf on and off tightly around benchmarking code, I discovered that the tuning made some of these events not fire all the time, probably due to a lower sampling rate. This resolved that and did not seem to have a significant effect on the overall perf results.
But you're right in general that the perf measurement makes the results slightly different anyway, which is unavoidable.
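For readers parsing the `perf and "reset" or "tune"` expression in the diff above: it is the legacy and/or conditional idiom, equivalent to a modern conditional expression. A sketch (the helper name is hypothetical; only the subcommand selection is shown):

```python
# Hypothetical helper illustrating the subcommand selection only.
# "reset" restores pyperf's default system settings when profiling;
# "tune" applies pyperf's benchmarking tuning otherwise.
def pyperf_system_subcommand(perf: bool) -> str:
    legacy = perf and "reset" or "tune"
    modern = "reset" if perf else "tune"
    assert legacy == modern  # the two spellings are equivalent here
    return modern
```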
It would be nice to have a way to locally set the nickname for the host as used here (bench_runner/bench_runner/result.py, line 527, at eeccea6). Otherwise …
Looks reasonable, in that it does everything we want it to do, thanks! Though I haven't gone through and understood everything this PR changes, just confirmed that it works for our workflow.

I'm still getting `error: use of undeclared identifier 'lastopcode'` with the current CPython main, so `python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats` fails.

If I fix that build failure for the tail-calling interpreter and I want the bytecode profiles, I can run something like `python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats` and `python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats`, but I have to separately move the pystats results files from `results` to `profiling` for `profiling_plot` to pick them up. Is there currently a script somewhere that does this movement that needs including in this, or am I missing something?
```python
if Path(".debug").exists():
    shutil.rmtree(".debug")

pystats_dir = Path("/tmp") / "pystats"
```
```python
# Now that we've installed the full bench_runner library,
# continue on in a new process...

last_arg = sys.argv.index("workflow_bootstrap.py")
```
This means that I have to copy workflow_bootstrap.py out of templates into the repository that I'm running bench_runner from, could this be changed to check the index of the argument that includes "workflow_bootstrap.py"?
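A suffix match, as the comment suggests, could look something like this (a sketch; the helper name and the error handling are assumptions, not code from the PR):

```python
# Locate the first argv entry whose path ends in workflow_bootstrap.py,
# so the lookup works whether the script is invoked by bare name or by
# a longer path such as templates/workflow_bootstrap.py.
def find_bootstrap_index(argv: list[str]) -> int:
    for i, arg in enumerate(argv):
        if arg.endswith("workflow_bootstrap.py"):
            return i
    raise ValueError("workflow_bootstrap.py not found in argv")
```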
The python -m bench_runner install script will copy the workflow_bootstrap.py script to the results repository for you (just as it already does for all of the generated GitHub workflow files).
The bootstrap script is required because we need to have a working venv to run the rest of the workflow.
Ah yes thanks, that got lost in my fiddling.
```python
args = []
if pystats:
    args.append("--with-pystats")
```
I think this should be `--enable-pystats`
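For reference, CPython's configure script accepts `--enable-pystats`; there is no `--with-pystats` option. The corrected snippet would read as below (a sketch; the wrapper function is hypothetical, added only so the fragment is self-contained):

```python
def configure_args(pystats: bool) -> list[str]:
    # --enable-pystats is the actual CPython configure flag for
    # statistics-gathering builds; the helper name is hypothetical.
    args = []
    if pystats:
        args.append("--enable-pystats")
    return args
```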
Sorry, I'm not following. Can you be more specific? The code at the second link should just sort unknown nicknames to the beginning.
Force-pushed 02c4dc1 to 6e8270f.
Copilot reviewed 15 out of 16 changed files in this pull request and generated no comments.
Files not reviewed (1)
- bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (2)
bench_runner/git.py:157
- The code uses 'contextlib.chdir', which is not a standard library function. Please verify that a valid context manager for changing the working directory is defined or imported.
with contextlib.chdir(dirname):
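Copilot's claim here is out of date: `contextlib.chdir` was added to the standard library in Python 3.11, which matches bench_runner's minimum supported version. For older interpreters, an equivalent context manager is only a few lines (a sketch):

```python
import contextlib
import os

# Equivalent of contextlib.chdir (stdlib in Python 3.11+) for older
# interpreters: change into `path` on entry, restore the previous
# working directory on exit, even if the body raises.
@contextlib.contextmanager
def chdir(path):
    old = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old)
```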
bench_runner/benchmark_definitions.py:33
- The updated benchmark hash logic now produces a different hash (e.g., 'dcfded') compared to earlier expectations (like '215d35'). Confirm that this change is intentional and that all tests and dependent logic have been updated accordingly.
def get_benchmark_hash() -> str:
I don't understand why it should do that. If the string returned by …
GeorgeWort left a comment:
Looks good to me; the only nice-to-haves left are the moving of the pystats files and the "unknown" issue.
You're right -- I'm confusing it with …
```python
BENCHMARK_REPOS = [
    BenchmarkRepo(
        "56d12a8fd7cc1432835965d374929bfa7f6f7a07",
        "https://github.com/mdboom/pyperformance.git",
```
Why are you pointing to your own fork?
Oh, this is just an artifact of needing some modifications to pyperformance early on. There's no need for this anymore, I'll change it.
```python
# the virtual environment to run the full bench_runner.

# NOTE: This file should import in Python 3.9 or later so it can at least print
```
Later we require Python 3.11 though
Good catch. I'll update this comment.
Oh, actually, this is correct. bench_runner requires 3.11 and above and is only tested on that. However, the oldest LTS Linux we use installs Python 3.9 by default. It's nice to be able to run this script and get a clear error message that Python 3.11 or later needs to be installed.
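A minimal sketch of the guard described here (the exact message text is an assumption): the module stays importable on Python 3.9 purely so it can fail with a readable error rather than a SyntaxError.

```python
import sys

# Fail early with a clear message when the interpreter is too old.
# This check itself runs on Python 3.9, even though the rest of
# bench_runner requires 3.11+.
def check_python_version() -> None:
    if sys.version_info < (3, 11):
        raise SystemExit(
            "bench_runner requires Python 3.11 or later; "
            f"found {sys.version.split()[0]}"
        )
```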
Copilot reviewed 16 out of 17 changed files in this pull request and generated no comments.
Files not reviewed (1)
- bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (4)
bench_runner/benchmark_definitions.py:34
- [nitpick] The variable name 'hash' shadows the built-in hash() function; consider renaming it to 'hasher' or a similar alternative.
hash = hashlib.sha256()
bench_runner/scripts/generate_results.py:130
- The try/except fallback for unknown runner names defaults the index to -1; please document or reconsider if this behavior is intended for proper runner ordering.
idx = order.index(val.split()[0])
bench_runner/scripts/install.py:248
- The condition now skips files with a '.py' extension; verify that legitimate template files are not unintentionally omitted from processing.
if not (ROOT_PATH / path.name).is_file() or path.suffix == ".py":
bench_runner/git.py:157
- Using 'contextlib.chdir' is nonstandard since the standard library does not provide this context manager; ensure that it is defined elsewhere or replace it with a custom context manager for changing directories.
with contextlib.chdir(dirname):
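On the generate_results.py point above, the fallback-to-minus-one behavior means unknown runner nicknames receive a key smaller than every known runner's index, so they sort to the beginning, which matches the "unknown nicknames sort to the beginning" behavior discussed earlier in this thread. A sketch (the function name is hypothetical):

```python
# Hypothetical sort key mirroring the try/except fallback flagged by
# Copilot: known runner names use their position in `order`; unknown
# names get -1 and therefore sort before all known ones.
def runner_sort_key(val: str, order: list[str]) -> int:
    try:
        return order.index(val.split()[0])
    except ValueError:
        return -1
```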
Force-pushed 8f2f945 to 8915035.
Force-pushed 19dd6bd to a8b142f.
This is the first step of #380.