Commit b5e8e05

Copilot and ev-br committed
Add documentation and script for verifying Hypothesis example counts
Co-authored-by: ev-br <2133832+ev-br@users.noreply.github.com>
1 parent 4173840 commit b5e8e05

2 files changed

Lines changed: 170 additions & 0 deletions

README.md

Lines changed: 57 additions & 0 deletions
@@ -291,6 +291,63 @@ values should result in more rigorous runs. For example, `--max-examples
10_000` may find bugs where default runs don't but will take much longer to
run.

##### Checking the actual number of examples

To verify the actual number of examples Hypothesis ran for each test, use the
`--hypothesis-show-statistics` flag:

```bash
$ pytest array_api_tests/test_manipulation_functions.py::test_squeeze --max-examples=100 --hypothesis-show-statistics
```

This will display detailed statistics for each test at the end of the output.
The key line is `Stopped because settings.max_examples=N`, which shows the
example limit Hypothesis actually applied; the `passing examples` count above
it shows how many examples ran. For example:

```
================================================ Hypothesis Statistics =================================================
array_api_tests/test_manipulation_functions.py::test_squeeze:

  - during generate phase (0.06 seconds):
    - Typical runtimes: ~ 1-2 ms, of which ~ 0-2 ms in data generation
    - 10 passing examples, 0 failing examples, 19 invalid examples

  - Stopped because settings.max_examples=10
```

This confirms that even though `--max-examples=100` was specified, the test
ran only 10 examples. This is because `test_squeeze` is marked with
`@pytest.mark.unvectorized`, which automatically reduces the number of examples
to one tenth of the requested value to improve performance.

To compare, a test without the `unvectorized` marker will show the full count:

```bash
$ pytest array_api_tests/test_manipulation_functions.py::TestExpandDims::test_expand_dims_tuples --max-examples=100 --hypothesis-show-statistics
```

Output:
```
- Stopped because settings.max_examples=100
```
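
The two runs above suggest a simple rule for the count to expect. The following is a minimal sketch of that arithmetic, not the suite's actual implementation: the function name `expected_examples` is hypothetical, and the use of integer division by 10 is an assumption taken from the 100 to 10 reduction shown here. How the suite treats values below 10 is not covered by this sketch.

```python
def expected_examples(max_examples, unvectorized):
    """Return the example count to expect in the statistics output.

    Assumption: tests marked `unvectorized` run max_examples // 10 examples,
    and all other tests run the full max_examples.
    """
    if unvectorized:
        return max_examples // 10
    return max_examples


# The two cases shown in the outputs above.
print(expected_examples(100, unvectorized=True))   # 10
print(expected_examples(100, unvectorized=False))  # 100
```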

This verification method is useful for:
- Confirming that the `unvectorized` marker is working correctly
- Debugging test performance issues
- Understanding how many examples Hypothesis ran versus discarded as invalid

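
For the last point, the counts can be pulled out of the statistics text programmatically. The sketch below is illustrative only: the names `summarize`, `COUNTS`, `LIMIT`, and `SAMPLE` are hypothetical, and the regular expressions assume the wording of the sample output shown earlier, which may differ between Hypothesis versions.

```python
import re

# Sample statistics block, copied from the test_squeeze run above.
SAMPLE = """\
array_api_tests/test_manipulation_functions.py::test_squeeze:

  - during generate phase (0.06 seconds):
    - Typical runtimes: ~ 1-2 ms, of which ~ 0-2 ms in data generation
    - 10 passing examples, 0 failing examples, 19 invalid examples

  - Stopped because settings.max_examples=10
"""

# Assumed line formats; adjust if your Hypothesis version words them differently.
COUNTS = re.compile(
    r"(\d+) passing examples, (\d+) failing examples, (\d+) invalid examples"
)
LIMIT = re.compile(r"Stopped because settings\.max_examples=(\d+)")


def summarize(stats_text):
    """Return example counts and the applied limit, or None if either is missing."""
    counts = COUNTS.search(stats_text)
    limit = LIMIT.search(stats_text)
    if counts is None or limit is None:
        return None
    passing, failing, invalid = (int(g) for g in counts.groups())
    return {
        "passing": passing,
        "failing": failing,
        "invalid": invalid,
        "max_examples": int(limit.group(1)),
    }


print(summarize(SAMPLE))
```

Here the 19 invalid examples were generated but did not count toward the limit of 10.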
###### Automated verification

A verification script is provided to automatically check that the `unvectorized`
marker is working correctly:

```bash
$ python3 verify_unvectorized_marker.py
```

This script runs tests with and without the marker and confirms that the number
of examples is correctly reduced.

#### Skipping Dtypes

The test suite will automatically skip testing of inessential dtypes if they

verify_unvectorized_marker.py

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
#!/usr/bin/env python3
"""
Verification script to check that the @pytest.mark.unvectorized marker
correctly reduces the number of Hypothesis examples.

This script runs tests with --hypothesis-show-statistics and parses the output
to verify the actual number of examples that were run.

Usage:
    python3 verify_unvectorized_marker.py
"""

import os
import re
import subprocess
import sys


def run_test_and_get_examples(test_path, max_examples=100):
    """
    Run a test with hypothesis statistics and extract the number of examples run.

    Args:
        test_path: Full path to the test (e.g., "array_api_tests/test_manipulation_functions.py::test_squeeze")
        max_examples: The --max-examples value to pass to pytest

    Returns:
        int: Number of examples that were actually run, or None if parsing failed
    """
    cmd = [
        # Use the running interpreter so an active virtual environment is honoured.
        sys.executable, "-m", "pytest",
        test_path,
        f"--max-examples={max_examples}",
        "--hypothesis-show-statistics",
        "-v",
    ]

    result = subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        # Extend the inherited environment; a bare dict would replace it
        # entirely and drop PATH and other variables the child needs.
        env={**os.environ, "ARRAY_API_TESTS_MODULE": "numpy"},
    )

    # Look for "Stopped because settings.max_examples=N"
    match = re.search(r"Stopped because settings\.max_examples=(\d+)", result.stdout)
    if match:
        return int(match.group(1))

    return None


def main():
    print("Verifying @pytest.mark.unvectorized marker behavior...")
    print("=" * 70)

    max_examples = 100

    # Test 1: A test WITH the unvectorized marker
    print("\n1. Testing with @pytest.mark.unvectorized marker:")
    print(" Test: test_squeeze")
    print(f" Expected: {max_examples // 10} examples (1/10th of {max_examples})")

    examples_unvectorized = run_test_and_get_examples(
        "array_api_tests/test_manipulation_functions.py::test_squeeze",
        max_examples
    )

    if examples_unvectorized is None:
        print(" ERROR: Could not parse number of examples from output")
        return 1

    print(f" Actual: {examples_unvectorized} examples")

    expected = max_examples // 10
    if examples_unvectorized == expected:
        print(" ✓ PASS: Marker correctly reduced examples")
    else:
        print(f" ✗ FAIL: Expected {expected} examples but got {examples_unvectorized}")
        return 1

    # Test 2: A test WITHOUT the unvectorized marker
    print("\n2. Testing WITHOUT @pytest.mark.unvectorized marker:")
    print(" Test: TestExpandDims::test_expand_dims_tuples")
    print(f" Expected: {max_examples} examples (full amount)")

    examples_unmarked = run_test_and_get_examples(
        "array_api_tests/test_manipulation_functions.py::TestExpandDims::test_expand_dims_tuples",
        max_examples
    )

    if examples_unmarked is None:
        print(" ERROR: Could not parse number of examples from output")
        return 1

    print(f" Actual: {examples_unmarked} examples")

    if examples_unmarked == max_examples:
        print(" ✓ PASS: Test ran with full number of examples")
    else:
        print(f" ✗ FAIL: Expected {max_examples} examples but got {examples_unmarked}")
        return 1

    print("\n" + "=" * 70)
    print("✓ All verification checks passed!")
    print("\nSummary:")
    print(f" - Unvectorized test: {examples_unvectorized} examples (10% of requested)")
    print(f" - Unmarked test: {examples_unmarked} examples (100% of requested)")

    return 0


if __name__ == "__main__":
    sys.exit(main())
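
A note on launching pytest from a script: a mapping passed as `env=` to `subprocess.run` is used instead of the inherited environment, so variables such as `PATH` are not carried over unless they are copied in. The helper below is a hypothetical sketch of one way to add `ARRAY_API_TESTS_MODULE` on top of the current environment; the name `build_env` is not part of the script.

```python
import os


def build_env(overrides, base=None):
    """Return a copy of `base` (default: os.environ) with `overrides` applied.

    The caller's environment is left untouched because a new dict is built.
    """
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


# Example with an explicit base so the result does not depend on the machine.
env = build_env({"ARRAY_API_TESTS_MODULE": "numpy"}, base={"PATH": "/usr/bin"})
print(sorted(env))
```

The resulting dict could then be passed as `env=` to `subprocess.run`.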
