
Run stdlib tests #1024

Closed
indygreg wants to merge 1 commit into main from gps-stdlib-tests

Conversation

@indygreg
Collaborator

Still a draft. Assessing how much effort it is to run the stdlib test suite in CI.

@indygreg indygreg force-pushed the gps-stdlib-tests branch 30 times, most recently from 3776823 to 104fa00 on March 23, 2026 07:40
@indygreg indygreg force-pushed the gps-stdlib-tests branch 29 times, most recently from 7aaaaf2 to bb0818e on March 25, 2026 07:50
This commit overhauls our ability to run the stdlib test harness.

Previously, `testdist.py` called a `run_tests.py` script that was bundled
in the distribution. This script was simply a thin wrapper around
`python -m test --slow-ci`. And `--slow-ci` currently expands to
`--multiprocess 0 --randomize --fail-env-changed --rerun --print-slow --verbose3
-u all --timeout 1200`.

This commit effectively inlines `run_tests.py` into `testdist.py` and
greatly expands the functionality for running the test harness.
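For illustration, a hypothetical reconstruction of what the old `run_tests.py` wrapper amounted to: the actual script's contents are not shown in this PR, so the function names here are made up.

```python
# Hypothetical sketch of the old run_tests.py wrapper: it simply
# re-invoked the interpreter's own test harness with --slow-ci.
import subprocess
import sys


def build_test_command(extra_args=None):
    """Build the `python -m test --slow-ci` invocation, forwarding extra args."""
    return [sys.executable, "-m", "test", "--slow-ci", *(extra_args or [])]


def run_stdlib_tests(extra_args=None):
    """Run the stdlib test harness and return its exit code."""
    return subprocess.run(build_test_command(extra_args)).returncode
```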

When enabling the stdlib test harness in CI as part of this commit, several
test failures were encountered, especially in non-standard builds like
`static` and `debug`. Even the `freethreaded` builds encountered a
significant number of failures (many of them intermittent), implying that
the official CPython CI fails to catch a lot of legitimate test failures.

We want PBS to run stdlib tests to help us catch changes in behavior. And we
can only do that if the CI pass/fail signal is high quality: we don't want CI
"passing" if there are changes to test pass/fail behavior.

Achieving this requires annotating all tests that can potentially fail. The
test harness then needs to validate that these annotations are accurate (read:
that the annotated tests actually fail).

So this commit introduces a `stdlib-test-annotations.yml` file in the root
directory. It contains rules that filter on the build configuration and three
sections that describe specific annotations:

1. Skip running the test harness completely. This is necessary on some builds
   that are so broken that annotating individual tests wasn't worthwhile
   because so many tests failed.
2. Exclude all tests within a given Python module. This is reserved for scenarios
   where importing the test module fails and causes most/all tests to fail. Again,
   a mechanism to short-circuit having to annotate every failing test.
3. Expected test failures. The most common annotation. These annotations describe
   individual tests or glob pattern matches of tests that are "expected" to fail.
   Entries can be annotated as "intermittent" or "dont-verify" to allow the test
   to pass without failing our test harness.
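A hypothetical sketch of what entries in `stdlib-test-annotations.yml` might look like. The three sections and the `intermittent`/`dont-verify` markers come from this PR's description, but the field names and layout below are illustrative, not the actual schema:

```yaml
# Hypothetical schema sketch; field names are illustrative.
- targets:
    build: debug              # filter rule matching a build configuration
  skip-all: true              # section 1: don't run the harness at all
- targets:
    build: static
  exclude-modules:            # section 2: importing these test modules fails
    - test.test_example_module
  expected-failures:          # section 3: individual tests or glob patterns
    - test: test.test_example.ExampleCase.test_something
    - test: "test.test_example_intermittent.*"
      intermittent: true      # may pass without failing our harness
    - test: test.test_example_unverified.Case.test_other
      dont-verify: true       # pass/fail is not checked against the annotation
```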

Most of the new code is in support of reading and applying these annotations.
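One way to picture the verification logic: an annotated test that actually *passes* indicates a stale annotation, unless it carries the `intermittent` or `dont-verify` marker. The function and parameter names below are illustrative, not taken from the PR:

```python
# Hypothetical sketch of the annotation-verification step.
def find_stale_annotations(expected_failures, actual_failures):
    """Return annotated tests that passed but were expected to fail.

    expected_failures: mapping of test name -> set of flags
                       (e.g. {"intermittent"}, {"dont-verify"}, or empty).
    actual_failures:   set of test names that actually failed.
    """
    stale = []
    for name, flags in expected_failures.items():
        # A test allowed to pass (intermittent/dont-verify) is never stale.
        if name not in actual_failures and not flags & {"intermittent", "dont-verify"}:
            stale.append(name)
    return stale
```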

At build time, we read the `stdlib-test-annotations.yml` file and derive a new
`stdlib-test-annotations.json` file with only the active annotations matching the
build configuration. This file is included in the build distribution as
`python/build/stdlib-test-annotations.json`. It has to be JSON so the Python test
harness runner is able to read the file using just the stdlib.
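The build-time derivation step might be sketched as follows: filter the annotation entries down to those whose rules match the current build configuration, then serialize them as JSON so the in-distribution runner can load them with nothing but the stdlib `json` module. Entry and field names are again illustrative:

```python
# Hypothetical sketch of deriving stdlib-test-annotations.json from the
# YAML annotations at build time. The "targets" key is an assumed name.
import json


def derive_active_annotations(entries, build_config):
    """Keep only annotation entries whose filter rules match build_config."""
    active = []
    for entry in entries:
        rules = entry.get("targets", {})
        if all(build_config.get(key) == value for key, value in rules.items()):
            active.append(entry)
    return active


def write_active_annotations(entries, build_config, path):
    """Write the filtered annotations as JSON for the distribution."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(derive_active_annotations(entries, build_config), fh, indent=2)
```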

`test-distributions.py` has gained some new functionality, including the ability
to run the stdlib test harness with raw arguments and emit a JUnit XML file with
test results.