Skip to content

Refactor to do more work in Python#382

Merged
mdboom merged 30 commits into
mainfrom
refactor-to-python
Apr 11, 2025
Merged

Refactor to do more work in Python#382
mdboom merged 30 commits into
mainfrom
refactor-to-python

Conversation

@mdboom
Copy link
Copy Markdown
Contributor

@mdboom mdboom commented Mar 27, 2025

This is the first step of #380.

@mdboom mdboom force-pushed the refactor-to-python branch from 50b4624 to 75a085b Compare March 28, 2025 18:16
@mdboom mdboom marked this pull request as ready for review March 28, 2025 18:16
@mdboom
Copy link
Copy Markdown
Contributor Author

mdboom commented Mar 28, 2025

@diegorusso: This moves as much work as possible from the GHA yml files to Python, so it should be portable across automation systems. I realize it's a fairly large PR, but I would be interested in your feedback to confirm it is going to meet your needs, etc.

@mdboom mdboom requested a review from Copilot March 28, 2025 18:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR is the first step in the refactor to shift more work into Python as part of #380. The changes update tests, centralize benchmark hash computation via benchmark_definitions, remove the old should_run script, and adjust workflow and template scripts to use the new commands.

  • Updated tests in tests/test_run_benchmarks.py to use the new workflow.should_run and hardcoded benchmark hash logic.
  • Removed the obsolete util.get_benchmark_hash in favor of benchmark_definitions.get_benchmark_hash and updated references across the codebase.
  • Revised workflow and template scripts (including workflow_bootstrap.py, _pystats.src.yml, and _benchmark.src.yml) to align with the new command structure.

Reviewed Changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/test_run_benchmarks.py Updated tests to call hardcode_benchmark_hash and use workflow.should_run
bench_runner/util.py Removed legacy benchmark hash function
bench_runner/templates/workflow_bootstrap.py Added new bootstrap script for setting up the workflow
bench_runner/templates/benchmark.src.yml, _pystats.src.yml, _benchmark.src.yml Adjusted steps to call workflow_bootstrap.py instead of the old commands
bench_runner/scripts/workflow.py Major refactoring of workflow logic with benchmark hash updates
bench_runner/scripts/should_run.py Removed in favor of updated workflow script
bench_runner/scripts/run_benchmarks.py Updated to reference benchmark_definitions for hash
bench_runner/scripts/install.py, get_merge_base.py Minor changes for YAML generation and hash retrieval
bench_runner/git.py Updated clone function with context management changes
bench_runner/benchmark_definitions.py New centralized benchmark hash computation
bench_runner/main.py Updated commands reflecting the refactored workflow
Files not reviewed (1)
  • bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (1)

bench_runner/benchmark_definitions.py:34

  • [nitpick] The variable name 'hash' shadows the built-in Python function hash(). Consider renaming it to something like 'sha' or 'hasher' to improve clarity.
    hash = hashlib.sha256()

Comment thread bench_runner/scripts/workflow.py Outdated
Comment thread bench_runner/scripts/workflow.py Outdated
Comment thread bench_runner/scripts/workflow.py Outdated
if not sys.platform.startswith("linux"):
return

run_in_venv(venv, "pyperf", ["system", perf and "reset" or "tune"], sudo=True)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that the CPU frequency scaling governors are set to "powersave" when profiling and "performance" when benchmarking?
Could this not lead to different characteristics between the profiled run and the scored run? Though I admit that the profiling will create different characteristics anyway. So I guess my question is just, what is the reason for this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for this is pretty recent. When @mpage added events to pyperf so that we can turn perf on and off tightly around benchmarking code, I discovered that the tuning made some of these events not fire all the time, probably due to a lower sampling rate. This resolved that and did not seem to have a significant effect on the overall perf results.

But you're right in general that the perf measurement makes the results slightly different anyway, which is unavoidable.

@GeorgeWort
Copy link
Copy Markdown

GeorgeWort commented Apr 7, 2025

It would be nice to have a way to locally set the nickname for the host as used here.

_clean(runners.get_nickname_for_hostname(socket.gethostname())),

Otherwise generate_results trips over here.

return order.index(val.split()[0]), val

Copy link
Copy Markdown

@GeorgeWort GeorgeWort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable, in that it does everything that we want it to do thanks! Though I haven't gone through and understood everything this PR changes, just that it works for our workflow.

I'm still getting error: use of undeclared identifier 'lastopcode' with the current CPython main, so python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats fails.

If I fix that build failure for the tail calling interpreter and I want the bytecode profiles, I can run something like python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats and python3.11 workflow_bootstrap.py python main linux-aarch64 all clang --pystats, but I have to separately move the pystats results files from results to profiling for profiling_plot to pick them up, is there currently a script somewhere that does this movement that needs including in this or am I missing something?

Comment thread bench_runner/scripts/workflow.py Outdated
if Path(".debug").exists():
shutil.rmtree(".debug")

pystats_dir = Path("/tmp") / "pystats"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be py_stats?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Yes.

# Now that we've installed the full bench_runner library,
# continue on in a new process...

last_arg = sys.argv.index("workflow_bootstrap.py")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that I have to copy workflow_bootstrap.py out of templates into the repository that I'm running bench_runner from, could this be changed to check the index of the argument that includes "workflow_bootstrap.py"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The python -m bench_runner install script will copy the workflow_bootstrap.py script to the results repository for you (just as it already does for all of the generated GitHub workflow files).

The bootstrap script is required because we need to have a working venv to run the rest of the workflow.

Copy link
Copy Markdown

@GeorgeWort GeorgeWort Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes thanks, that got lost in my fiddling.

Comment thread bench_runner/scripts/workflow.py Outdated

args = []
if pystats:
args.append("--with-pystats")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be --enable-pystats

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good catch.

@mdboom
Copy link
Copy Markdown
Contributor Author

mdboom commented Apr 8, 2025

It would be nice to have a way to locally set the nickname for the host as used here.

_clean(runners.get_nickname_for_hostname(socket.gethostname())),

Otherwise generate_results trips over here.

return order.index(val.split()[0]), val

Sorry, I'm not following. Can you be more specific? The code at the second link should just sort unknown nicknames to the beginning.

@mdboom mdboom force-pushed the refactor-to-python branch from 02c4dc1 to 6e8270f Compare April 8, 2025 13:04
@mdboom mdboom requested review from GeorgeWort and Copilot April 8, 2025 13:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 15 out of 16 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (2)

bench_runner/git.py:157

  • The code uses 'contextlib.chdir', which is not a standard library function. Please verify that a valid context manager for changing the working directory is defined or imported.
with contextlib.chdir(dirname):

bench_runner/benchmark_definitions.py:33

  • The updated benchmark hash logic now produces a different hash (e.g., 'dcfded') compared to earlier expectations (like '215d35'). Confirm that this change is intentional and that all tests and dependent logic have been updated accordingly.
def get_benchmark_hash() -> str:

@GeorgeWort
Copy link
Copy Markdown

Sorry, I'm not following. Can you be more specific? The code at the second link should just sort unknown nicknames to the beginning.

I don't understand why it should do that. If the string returned by val.split()[0] is not in the order list, then order.index will throw an exception as it should.

Copy link
Copy Markdown

@GeorgeWort GeorgeWort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, only nice to haves left are the moving of the pystat files and the "unknown" issue.

@mdboom
Copy link
Copy Markdown
Contributor Author

mdboom commented Apr 8, 2025

Sorry, I'm not following. Can you be more specific? The code at the second link should just sort unknown nicknames to the beginning.

I don't understand why it should do that. If the string returned by val.split()[0] is not in the order list, then order.index will throw an exception as it should.

You're right -- I'm confusing it with find. I think the thing to do here is provide a default.

Comment thread bench_runner/benchmark_definitions.py Outdated
BENCHMARK_REPOS = [
BenchmarkRepo(
"56d12a8fd7cc1432835965d374929bfa7f6f7a07",
"https://github.com/mdboom/pyperformance.git",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you pointing to your own fork?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is just an artifact of needing some modifications to pyperformance early on. There's no need for this anymore, I'll change it.

Comment thread bench_runner/scripts/workflow.py
Comment thread bench_runner/scripts/workflow.py
# the virtual environment to run the full bench_runner.


# NOTE: This file should import in Python 3.9 or later so it can at least print
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Later we require Python 3.11 though

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I'll update this comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, actually, this is correct. bench_runner requires 3.11 and above and is only tested on that. However, the oldest LTS Linux we use installs Python 3.9 as default. It's nice to be able to run this script and get a nice error message that a Python 3.11 needs to be installed.

@mdboom mdboom requested review from Copilot and diegorusso April 9, 2025 12:11
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 16 out of 17 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • bench_runner/templates/env.yml: Language not supported
Comments suppressed due to low confidence (4)

bench_runner/benchmark_definitions.py:34

  • [nitpick] The variable name 'hash' shadows the built-in hash() function; consider renaming it to 'hasher' or a similar alternative.
hash = hashlib.sha256()

bench_runner/scripts/generate_results.py:130

  • The try/except fallback for unknown runner names defaults the index to -1; please document or reconsider if this behavior is intended for proper runner ordering.
            idx = order.index(val.split()[0])

bench_runner/scripts/install.py:248

  • The condition now skips files with a '.py' extension; verify that legitimate template files are not unintentionally omitted from processing.
if not (ROOT_PATH / path.name).is_file() or path.suffix == ".py":

bench_runner/git.py:157

  • Using 'contextlib.chdir' is nonstandard since the standard library does not provide this context manager; ensure that it is defined elsewhere or replace it with a custom context manager for changing directories.
with contextlib.chdir(dirname):

Comment thread bench_runner/util.py
@mdboom mdboom force-pushed the refactor-to-python branch from 8f2f945 to 8915035 Compare April 9, 2025 21:36
@mdboom mdboom force-pushed the refactor-to-python branch from 19dd6bd to a8b142f Compare April 10, 2025 14:50
@mdboom mdboom mentioned this pull request Apr 10, 2025
@mdboom mdboom merged commit 77124f2 into main Apr 11, 2025
4 checks passed
@mdboom mdboom deleted the refactor-to-python branch April 18, 2025 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants