This commit overhauls our ability to run the stdlib test harness.

Previously, `testdist.py` called a `run_tests.py` script that was bundled in the distribution. This script was simply a wrapper around `python -m test --slow-ci`, and `--slow-ci` currently expands to `--multiprocess 0 --randomize --fail-env-changes --rerun --print-slow --verbose3 -u all --timeout 1200`.

This commit effectively inlines `run_tests.py` into `testdist.py` and greatly expands the functionality for running the test harness.

When enabling the stdlib test harness in CI as part of this commit, several test failures were encountered, especially in non-standard builds like `static` and `debug`. Even the `freethreaded` builds seemed to encounter a significant number of failures (many of them intermittent), implying that the official CPython CI is failing to catch a lot of legitimate test failures.

We want PBS to run stdlib tests to help us catch changes in behavior. We can only do that if the CI pass/fail signal is high quality: we don't want CI "passing" if there are changes to test pass/fail behavior.

Achieving this requires annotating all tests that can potentially fail. The test harness then needs to validate that these annotations are accurate (read: that the annotated tests actually fail).

So this commit introduces a `stdlib-test-annotations.yml` file in the root directory. It contains rules that filter on build configuration and 3 sections that describe specific annotations:

1. Skip running the test harness completely. This is necessary on some builds that are so broken that annotating individual tests wasn't worthwhile because so many tests failed.
2. Exclude all tests within a given Python module. This is reserved for scenarios where importing the test module fails and causes most/all tests to fail. Again, this is a mechanism to short-circuit having to annotate every failing test.
3. Expected test failures. The most common annotation. These annotations describe individual tests or glob pattern matches of tests that are "expected" to fail. Entries can be annotated as "intermittent" or "dont-verify" to allow the test to pass without failing our test harness.

Most of the new code is in support of reading and applying these annotations.

At build time, we read the `stdlib-test-annotations.yml` file and derive a new `stdlib-test-annotations.json` file containing only the active annotations matching the build configuration. This file is included in the build distribution as `python/build/stdlib-test-annotations.json`. It has to be JSON so the Python test harness runner is able to read the file using just the stdlib.

`test-distributions.py` has gained some new functionality, including the ability to run the stdlib test harness with raw arguments and emit a JUnit XML file with test results.
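To illustrate the verification idea described above, here is a minimal sketch of checking test results against expected-failure annotations using only the stdlib. The JSON field names (`expected-failures`, `name`, `intermittent`, `dont-verify`), the test names, and the `check_results` helper are all assumptions made for illustration; the actual schema and runner code in this commit may differ. The sketch also assumes that every test matched by a strict entry must fail.

```python
import fnmatch
import json

# Hypothetical content of a derived stdlib-test-annotations.json file.
# The real schema is not shown in this description; these keys are assumed.
ANNOTATIONS = json.loads("""
{
  "expected-failures": [
    {"name": "test.test_os.*.test_flaky", "intermittent": true},
    {"name": "test.test_ssl.BasicTests.test_version"},
    {"name": "test.test_gdb.*", "dont-verify": true}
  ]
}
""")


def check_results(annotations, results):
    """Return a list of problems; an empty list means the signal is clean.

    `results` maps a full test name to True (passed) or False (failed).
    """
    expected = annotations["expected-failures"]
    problems = []
    for test, passed in sorted(results.items()):
        # Entries whose glob pattern matches this test name.
        entries = [e for e in expected if fnmatch.fnmatchcase(test, e["name"])]
        # "intermittent" and "dont-verify" entries tolerate a passing test.
        lenient = any(
            e.get("intermittent") or e.get("dont-verify") for e in entries
        )
        if not passed and not entries:
            problems.append(f"unexpected failure: {test}")
        elif passed and entries and not lenient:
            # A strict annotation claims this test fails, but it passed.
            problems.append(f"unexpected pass: {test}")
    return problems


# Made-up results for illustration only.
results = {
    "test.test_os.FileTests.test_flaky": True,       # intermittent: pass is fine
    "test.test_ssl.BasicTests.test_version": True,   # strict: pass is a problem
    "test.test_gdb.Foo.test_bar": False,             # annotated failure: fine
    "test.test_json.TestDump.test_dump": False,      # unannotated failure
    "test.test_json.TestDump.test_ok": True,         # ordinary pass
}
for problem in check_results(ANNOTATIONS, results):
    print(problem)
```

Flagging unexpected passes as well as unexpected failures is what keeps the annotations honest: a stale entry surfaces as a CI failure instead of silently masking a behavior change.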
Still a draft. Assessing how much effort it is to run the stdlib test suite in CI.