Commit c18aecf

Python(feat): pytest pass fail behavior improvements (#568)
1 parent 1187fc6 commit c18aecf

10 files changed

Lines changed: 1313 additions & 289 deletions

python/docs/examples/pytest_plugin.md

Lines changed: 4 additions & 0 deletions
@@ -306,6 +306,10 @@ outcomes into `TestStatus`:
| Non-`AssertionError` exception escapes the test (e.g. `ValueError`, `TimeoutError`) | `ERROR`, with the formatted traceback (last 10 frames plus the first frame) on `step.error_info.error_message` |
| Manual `step.current_step.update({"status": ...})` | Whatever you set; the step exit handler honors a manually-resolved status |

+For the full contract, including skips, xfail/xpass, hard exits (`SystemExit`,
+`KeyboardInterrupt`), setup/teardown phase failures, and propagation rules,
+see the [Pass/Fail Behavior guide](../guides/pytest_plugin/pass_fail_behavior.md).
+
A failure or error at any depth propagates upward: the parent substep, the
function step, the class/module/package steps above it, and the session
report all get marked failed.
Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
# Pass/Fail Behavior

The pytest plugin maps every pytest outcome to a `TestStatus` on the
corresponding Sift step. Use this page to look up what a given test will
produce, and how that result rolls up to the parent steps and the report.

## `TestStatus` values

The statuses below come from `sift_client.sift_types.test_report.TestStatus`.

| Status        | Meaning                                                                                                                              |
| ------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| `PASSED`      | The step completed and every check it owns succeeded.                                                                                |
| `FAILED`      | A failed assertion, a `pytest.fail(...)`, a failed `report_outcome`, or a failing measurement marked it.                             |
| `ERROR`       | An unexpected exception escaped the test body or a fixture (setup or teardown).                                                      |
| `ABORTED`     | A hard exit (`SystemExit`, observed `KeyboardInterrupt`) interrupted the test.                                                       |
| `SKIPPED`     | The test was skipped at collection time, at runtime, or from a fixture.                                                              |
| `IN_PROGRESS` | The test is still running, or the plugin never observed a final outcome (e.g. a session-aborting interrupt killed pytest first).     |

## Normal test outcomes

| Scenario                                  | Trigger                              | Outcome  |
| ----------------------------------------- | ------------------------------------ | -------- |
| Test passes                               | function body returns cleanly        | `PASSED` |
| Assertion failure                         | `assert 1 == 2`                      | `FAILED` |
| `pytest.fail("...")` from the body        | `pytest.fail("intentional failure")` | `FAILED` |
| Uncaught non-assertion exception          | `raise ValueError("boom")`           | `ERROR`  |

A non-assertion exception gets its formatted traceback recorded on
`step.error_info.error_message`.

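
The example docs touched by this commit describe the recorded traceback as "last 10 frames plus the first frame". The sketch below shows one way such trimming could be done with the standard `traceback` module. It is not the plugin's code: `select_frames`, `format_trimmed`, and `recurse` are hypothetical names, and the exact layout of the plugin's message is an assumption.

```python
import traceback


def select_frames(frames, tail=10):
    """Keep the first frame plus the last `tail` frames (assumed policy)."""
    frames = list(frames)
    if len(frames) <= tail + 1:
        return frames
    return [frames[0]] + frames[-tail:]


def format_trimmed(exc, tail=10):
    """Render a trimmed traceback for `exc` as a single string."""
    kept = select_frames(traceback.extract_tb(exc.__traceback__), tail)
    lines = ["Traceback (most recent call last):\n"]
    lines += traceback.format_list(kept)
    lines += traceback.format_exception_only(type(exc), exc)
    return "".join(lines)


def recurse(depth):
    """Build a deep call stack, then raise a non-assertion exception."""
    if depth == 0:
        raise ValueError("boom")
    recurse(depth - 1)


try:
    recurse(30)
except ValueError as exc:
    kept_frames = select_frames(traceback.extract_tb(exc.__traceback__))
    message = format_trimmed(exc)
```

Keeping the first frame preserves the entry point into the test, while the tail shows where the exception was actually raised.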
## Hard exits

Hard exits the plugin can observe map to `ABORTED`. If pytest tears the
session down before the plugin sees the exit, the step stays at
`IN_PROGRESS` instead of resolving.

| Scenario                                       | Trigger                   | Outcome                                                              |
| ---------------------------------------------- | ------------------------- | -------------------------------------------------------------------- |
| `SystemExit` from the test body                | `sys.exit(1)`             | `ABORTED`                                                            |
| `KeyboardInterrupt` the plugin observes        | `raise KeyboardInterrupt` | `ABORTED`                                                            |
| Session-aborting `KeyboardInterrupt`           | Ctrl-C terminates pytest  | `IN_PROGRESS` (session ends before the plugin's hooks fire)          |

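
As a rough model of the two tables above, the sketch below classifies whatever escapes a test body. It is a stand-alone illustration using only the standard library: `Status`, `classify`, and `run_and_classify` are hypothetical names, not the plugin's API, and `pytest.fail(...)` and the session-aborting interrupt are deliberately left out because they depend on pytest itself.

```python
import sys
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for the real TestStatus enum."""
    PASSED = "PASSED"
    FAILED = "FAILED"
    ERROR = "ERROR"
    ABORTED = "ABORTED"


def classify(exc):
    """Map the exception that escaped a test body (or None) to a status."""
    if exc is None:
        return Status.PASSED
    if isinstance(exc, (SystemExit, KeyboardInterrupt)):
        return Status.ABORTED  # hard exit that was observed
    if isinstance(exc, AssertionError):
        return Status.FAILED
    return Status.ERROR  # any other exception type


def run_and_classify(test_body):
    """Run a zero-argument callable and classify how it ended."""
    try:
        test_body()
    except BaseException as exc:  # BaseException so SystemExit is seen too
        return classify(exc)
    return classify(None)


def body_passes():
    pass


def body_asserts():
    assert 1 == 2


def body_raises():
    raise ValueError("boom")


def body_exits():
    sys.exit(1)


results = {
    fn.__name__: run_and_classify(fn)
    for fn in (body_passes, body_asserts, body_raises, body_exits)
}
```

Catching `BaseException` matters here: `SystemExit` and `KeyboardInterrupt` do not derive from `Exception`, so a plain `except Exception` would never see them.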
### Abort propagation through nested substeps

Every step that was open when the abort fired records `ABORTED`.

```python title="test_abort.py"
import sys


def test_x(step):
    with step.substep(name="completed_sub"):
        pass  # closes as PASSED before the abort
    with step.substep(name="outer_sub") as outer_sub:
        with outer_sub.substep(name="inner_sub"):
            sys.exit(1)  # ABORTED applied to inner_sub, outer_sub, and the test step
```

The Sift report shows `completed_sub` as `PASSED` and the three steps
still open at the abort (`inner_sub`, `outer_sub`, and the test step
itself) as `ABORTED`.

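
To see why every open step ends up `ABORTED`, it can help to model steps as nested context managers that record a status on exit and re-raise the abort. The toy below uses only the standard library; `toy_step`, `statuses`, and `run_toy_test` are hypothetical and say nothing about how the plugin is actually implemented.

```python
import sys
from contextlib import contextmanager

statuses = {}  # step name -> status recorded when the step closed


@contextmanager
def toy_step(name):
    """Record PASSED on a clean exit, ABORTED when a hard exit passes through."""
    try:
        yield
    except (SystemExit, KeyboardInterrupt):
        statuses[name] = "ABORTED"
        raise  # let the abort keep travelling to the enclosing steps
    else:
        statuses[name] = "PASSED"


def run_toy_test():
    try:
        with toy_step("test_x"):
            with toy_step("completed_sub"):
                pass  # closes before the abort
            with toy_step("outer_sub"):
                with toy_step("inner_sub"):
                    sys.exit(1)
    except SystemExit:
        pass  # the toy swallows the exit so the script can continue


run_toy_test()
```

After the run, `statuses` holds `PASSED` for `completed_sub` and `ABORTED` for `inner_sub`, `outer_sub`, and `test_x`, matching the report described above.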
## Skips

| Scenario                              | Trigger                                       | Outcome   |
| ------------------------------------- | --------------------------------------------- | --------- |
| Collection-time skip                  | `@pytest.mark.skip(reason=...)`               | `SKIPPED` |
| Conditional collection-time skip      | `@pytest.mark.skipif(True, reason=...)`       | `SKIPPED` |
| Runtime skip from the test body       | `pytest.skip("...")`                          | `SKIPPED` |
| Skip raised inside a fixture          | `@pytest.fixture` calls `pytest.skip("...")`  | `SKIPPED` |

`SKIPPED` does not propagate as a failure. A skipped substep or test does
not block its parent from resolving to `PASSED`.

## Expected failures (xfail / xpass)

xfail marks declare that a test is expected to fail. The plugin follows
the same semantics pytest does.

| Scenario                                  | Trigger                                                    | Outcome                                                       |
| ----------------------------------------- | ---------------------------------------------------------- | ------------------------------------------------------------- |
| xfail-marked test that fails              | `@pytest.mark.xfail` + `assert 1 == 2`                     | `PASSED` (the test fulfilled the xfail expectation)           |
| Strict xfail that unexpectedly passes     | `@pytest.mark.xfail(strict=True)` + `assert True`          | `FAILED` (the mark no longer matches reality)                 |
| Non-strict xfail that unexpectedly passes | `@pytest.mark.xfail()` + `assert True`                     | `PASSED` (`strict=False` does not insist on the failure)      |
| `xfail(raises=...)` with wrong exception  | `@pytest.mark.xfail(raises=ValueError)` + `raise KeyError` | `FAILED` (the `raises=` mismatch is a real test failure)      |
| `xfail(run=False)`                        | `@pytest.mark.xfail(run=False)`                            | `SKIPPED` (the body never ran)                                |

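
The xfail table can be read as a small decision function. The sketch below encodes exactly the five rows above and nothing more; `resolve_xfail` is a hypothetical helper written for illustration, and its keyword names merely mirror the `strict`, `raises`, and `run` arguments of `pytest.mark.xfail`.

```python
def resolve_xfail(raised=None, *, strict=False, raises=None, run=True):
    """Return the status for an xfail-marked test.

    raised: the exception instance the body raised, or None if it passed.
    """
    if not run:
        return "SKIPPED"  # the body never ran
    if raised is None:
        # Unexpected pass: only a strict mark treats this as a failure.
        return "FAILED" if strict else "PASSED"
    if raises is not None and not isinstance(raised, raises):
        return "FAILED"  # failed, but not with the declared exception type
    return "PASSED"  # failed as expected


table = {
    "xfail that fails": resolve_xfail(AssertionError("1 != 2")),
    "strict xpass": resolve_xfail(None, strict=True),
    "non-strict xpass": resolve_xfail(None),
    "wrong exception": resolve_xfail(KeyError("k"), raises=ValueError),
    "run=False": resolve_xfail(run=False),
}
```

`KeyError` derives from `LookupError`, not `ValueError`, which is why the `raises=ValueError` row counts as a mismatch.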
## Influencing outcomes from test code

A test can also set the step's outcome directly via the helpers below.
Substeps your test opens follow the same propagation rules as the ones
the plugin opens for you.

### Manual status override

`step.current_step.update({...})` sets the status directly. The step's
exit handler does not overwrite it.

```python
from sift_client.sift_types.test_report import TestStatus


def test_manual(step):
    step.current_step.update({"status": TestStatus.FAILED})
```

### `report_outcome` for externally computed checks

`report_outcome(name, result, reason)` records a named check whose
pass/fail was computed elsewhere (a subprocess, a remote system, your own
comparison logic). A failing outcome marks the step `FAILED`.

```python
def test_external_check(step):
    # run_external_validator is a placeholder for your own check;
    # it is not provided by sift_client.
    result, reason = run_external_validator()
    step.report_outcome("ext-validator", result, reason)
```

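
The `run_external_validator` call in the example above is left undefined. A minimal sketch of what such a helper might look like, assuming the check is a child process whose exit code decides pass/fail and that `result` is a boolean (neither is confirmed by the docs in this commit):

```python
import subprocess
import sys


def run_external_validator(code="import sys; sys.exit(0)"):
    """Hypothetical validator: run a Python snippet and report (passed, reason).

    A real check would invoke your own tool; the inline snippet only keeps
    this example self-contained.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=30,
    )
    passed = proc.returncode == 0
    if passed:
        reason = "validator exited cleanly"
    else:
        # Prefer the child's stderr; fall back to the exit code.
        reason = proc.stderr.strip() or f"exit code {proc.returncode}"
    return passed, reason


ok_result = run_external_validator()
bad_result = run_external_validator("import sys; sys.exit(3)")
```

Returning the reason alongside the boolean keeps the failure explanation next to the verdict, which is the shape the `report_outcome(name, result, reason)` call expects.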
### Measurements with bounds

`step.measure(name=, value=, bounds=)` records a measurement and resolves
the step to `FAILED` if the value is out of bounds. The call returns the
pass/fail boolean and does not raise, so multiple measurements can run
without short-circuiting.

```python
def test_battery(step):
    step.measure(name="voltage", value=12.1, bounds={"min": 11.5, "max": 13.0}, unit="V")
    step.measure(name="current", value=0.42, bounds={"max": 1.0}, unit="A")
```

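
The bounds dictionaries in the example suggest a simple min/max comparison. The sketch below is one plausible reading of that check, assuming both limits are inclusive and either key may be omitted; `within_bounds` is a hypothetical helper, not the function `step.measure` uses.

```python
def within_bounds(value, bounds):
    """Return True when value respects the optional "min"/"max" limits.

    Assumption: both limits are inclusive.
    """
    lower = bounds.get("min")
    upper = bounds.get("max")
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


readings = [
    ("voltage", 12.1, {"min": 11.5, "max": 13.0}),
    ("current", 0.42, {"max": 1.0}),
    ("value", 99.0, {"min": 0.0, "max": 5.0}),
]

# Evaluate every reading first, then aggregate, so one failure does not
# hide the results of the others.
results = {name: within_bounds(value, bounds) for name, value, bounds in readings}
all_ok = all(results.values())
```

Collecting every result before aggregating mirrors the non-raising behaviour described above: a failing measurement is recorded, and the remaining ones still run.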
### Substep failures

A failed substep propagates failure to its parent step. A manually-set
`SKIPPED` on a substep does not.

```python
def test_with_substep(step):
    with step.substep(name="check") as inner:
        inner.measure(name="value", value=99.0, bounds={"min": 0.0, "max": 5.0})
    # The outer step resolves to FAILED because the substep failed.
```

## Propagation rules

Every step that resolves to anything other than `PASSED` or `SKIPPED`
marks its parent as failed. What the parent records depends on whether
its own scope had an abort and whether a child already failed:

- A hard exit (`SystemExit` or an observed `KeyboardInterrupt`) in the
  step's own scope records `ABORTED`. `ABORTED` propagates through every
  step the abort passes through on its way up.
- A child that already recorded an outcome other than `PASSED` or
  `SKIPPED` marks the parent as `FAILED`. This holds whether or not an
  exception is still propagating through the parent's scope: only the
  originating substep records `ERROR`; ancestors inherit `FAILED`. The
  traceback stays on the originating step's `error_info`.
- A step records `ERROR` only when its own scope raised a
  non-`AssertionError` exception and no child has failed.

`SKIPPED` does not propagate. A status set explicitly via
`current_step.update` is kept.
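
The rules above can be condensed into a single resolution function. The sketch below is a hypothetical model, not the plugin's code: `resolve_status` and its parameters are invented for illustration, statuses are plain strings, and the order of the checks (manual override, then abort, then failed child, then the step's own exception) is an interpretation of the text.

```python
def resolve_status(own_exc=None, child_statuses=(), manual=None):
    """Resolve a step's status from its own scope and its children.

    own_exc: exception that passed through this step's scope, or None.
    child_statuses: statuses already recorded by direct children.
    manual: status set explicitly by the test, or None.
    """
    if manual is not None:
        return manual  # an explicit status is kept
    if isinstance(own_exc, (SystemExit, KeyboardInterrupt)):
        return "ABORTED"  # hard exit passing through this scope
    if any(s not in ("PASSED", "SKIPPED") for s in child_statuses):
        return "FAILED"  # ancestors inherit FAILED, never ERROR
    if isinstance(own_exc, AssertionError):
        return "FAILED"
    if own_exc is not None:
        return "ERROR"  # own non-assertion exception, no failed child
    return "PASSED"


# A ValueError raised in a substep and still propagating through its parent:
substep = resolve_status(own_exc=ValueError("boom"))
parent = resolve_status(own_exc=ValueError("boom"), child_statuses=[substep])
```

Here `substep` resolves to `ERROR` and `parent` to `FAILED`, which is the "only the originating substep records `ERROR`" rule in action.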

python/lib/sift_client/_tests/pytest_plugin/_fakes.py

Lines changed: 0 additions & 132 deletions
This file was deleted.
