Commit 0c2c022

init commit
1 parent 51fc66a commit 0c2c022

3 files changed

Lines changed: 751 additions & 0 deletions

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
"""Shared state for the step-status characterization suite.

The outer test in ``test_step_status_states.py`` runs inner pytest sessions
via ``pytester``. The inner session installs a fake ``sift_client`` (see
``_INNER_CONFTEST_SRC`` in that file) which records every step status
write into this module's ``CAPTURED_STEPS`` dict so the outer test can
assert on what the plugin produced.

This lives in its own module (rather than inside the test file) because
the inner ``conftest.py`` runs in a fresh pytester tmp dir and needs an
importable, package-reachable handle to the same dict object.
"""

from __future__ import annotations

from dataclasses import dataclass, field
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from sift_client.sift_types.test_report import TestStatus


@dataclass
class CapturedStep:
    step_id: str
    name: str
    step_path: str
    parent_step_id: str | None
    statuses: list[TestStatus] = field(default_factory=list)


CAPTURED_STEPS: dict[str, CapturedStep] = {}


def reset() -> None:
    CAPTURED_STEPS.clear()


def steps_by_name(name: str) -> list[CapturedStep]:
    return [s for s in CAPTURED_STEPS.values() if s.name == name]


def test_step(name: str) -> CapturedStep | None:
    """The step the autouse ``step`` fixture creates for the test function.

    There can be a deeper step with the same name when the ``makereport``
    hook also records one (e.g. ``pytest.skip()`` inside the test body, or
    an ``xfail`` mark). The autouse step is the shallowest of those, so
    pick by step_path depth.
    """
    matches = [s for s in CAPTURED_STEPS.values() if s.name == name]
    if not matches:
        return None
    return min(matches, key=lambda s: s.step_path.count("."))


def child_steps(parent: CapturedStep) -> list[CapturedStep]:
    return [s for s in CAPTURED_STEPS.values() if s.parent_step_id == parent.step_id]


def final_status(name: str) -> TestStatus | None:
    step = test_step(name)
    if step is None or not step.statuses:
        return None
    return step.statuses[-1]
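
The depth-based selection in `test_step` is easiest to see with two captured steps that share a name. The following is a minimal, self-contained sketch, not part of the commit: the `Status` enum is a stand-in for `TestStatus`, `shallowest` is a hypothetical helper that restates the same rule over an explicit list, and the dot-separated paths `"1"` and `"1.1"` are assumed for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    """Stand-in for sift_client's TestStatus (assumption, for illustration)."""

    PASSED = "passed"
    ERROR = "error"
    SKIPPED = "skipped"


@dataclass
class CapturedStep:
    step_id: str
    name: str
    step_path: str
    parent_step_id: Optional[str]
    statuses: list = field(default_factory=list)


def shallowest(steps, name):
    """Same rule as test_step(): the match with the fewest dots in step_path wins."""
    matches = [s for s in steps if s.name == name]
    if not matches:
        return None
    return min(matches, key=lambda s: s.step_path.count("."))


# Hypothetical capture for a test that calls pytest.skip() in its body:
# the autouse step sits at depth 0, the makereport step is nested under it.
outer = CapturedStep("a", "test_skip", "1", None, [Status.ERROR])
nested = CapturedStep("b", "test_skip", "1.1", "a", [Status.SKIPPED])

picked = shallowest([nested, outer], "test_skip")
```

Here `picked` is the outer step regardless of the order in which the two steps were recorded, which is why `final_status` reports the autouse step's last status rather than the nested one's.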
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
# Pytest-plugin step-status: observed vs. target

Companion document to `test_step_status_states.py`. Each row corresponds to
one scenario in that suite. The **observed** column is the status the Sift
pytest plugin records for the test's outer step today; the **target** column
is what the audit recommends. Rows where the two differ are the work items
for the fix.

`TestStatus` values referenced below come from
`sift_client.sift_types.test_report.TestStatus`: `PASSED`, `FAILED`, `ERROR`,
`SKIPPED`, plus the proposed `XFAILED` / `XPASSED` / `ABORTED` additions
called out in the audit.

## Call-phase exit paths

| Scenario | Trigger | Observed today | Target | Status |
| --- | --- | --- | --- | --- |
| Test passes | function body returns cleanly | `PASSED` | `PASSED` | OK |
| Assert failure in call phase | `assert 1 == 2` | `FAILED` | `FAILED` | OK |
| Generic exception in call phase | `raise ValueError("boom")` | `ERROR` | `ERROR` | OK |
| `pytest.fail("...")` from body | `pytest.fail("intentional failure")` | `ERROR` | `FAILED` | Gap |
| `SystemExit` from the test body | `sys.exit(1)` | `ERROR` | `ABORTED` (proposed) or documented `ERROR` | Gap |
| `KeyboardInterrupt` in body | `raise KeyboardInterrupt` | `PASSED` (session aborts before the plugin sees the interrupt) | `ABORTED` (proposed) | Gap |
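
The **Target** column for this table can be read as a small classifier over the exception that ended the call phase. The sketch below is one possible shape for that rule, not the plugin's code: `Status` stands in for `TestStatus` (with the proposed `ABORTED`), `Failed` stands in for the exception that `pytest.fail()` raises, and `target_status` is a hypothetical name. For `SystemExit` it picks `ABORTED`, one of the two options the table leaves open.

```python
from enum import Enum


class Status(Enum):
    """Stand-in for TestStatus; ABORTED is the proposed addition."""

    PASSED = "passed"
    FAILED = "failed"
    ERROR = "error"
    ABORTED = "aborted"


class Failed(BaseException):
    """Stand-in for the exception pytest.fail() raises (assumption)."""


def target_status(exc):
    """Map the exception that ended the call phase to the Target column."""
    if exc is None:
        return Status.PASSED
    # pytest.fail() is an explicit failure, so it sits next to assert.
    if isinstance(exc, (AssertionError, Failed)):
        return Status.FAILED
    # Process-level exits are neither a failed check nor a bug in the test.
    if isinstance(exc, (SystemExit, KeyboardInterrupt)):
        return Status.ABORTED
    return Status.ERROR
```

Checking the explicit-failure types before the generic fallback is what closes the `pytest.fail` gap: the stand-in `Failed` is not an `AssertionError`, so a rule that only tests for `AssertionError` would send it to `ERROR`.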

## Skip paths

| Scenario | Trigger | Observed today | Target | Status |
| --- | --- | --- | --- | --- |
| Collection-time skip | `@pytest.mark.skip(reason=...)` | `SKIPPED` (only the makereport hook records a step; no autouse step ran) | `SKIPPED` | OK |
| Runtime skip in body | `pytest.skip("...")` | Outer step `ERROR`; a nested step with the same name records `SKIPPED` | Outer step `SKIPPED`; no duplicate nested step | Gap |
| Skip raised inside a fixture | `@pytest.fixture` calls `pytest.skip("...")` | Outer step `PASSED`; a nested `SKIPPED` step is created by the makereport hook | Outer step `SKIPPED` with `phase=setup`; no duplicate nested step | Gap |

## xfail / xpass

| Scenario | Trigger | Observed today | Target | Status |
| --- | --- | --- | --- | --- |
| xfail-marked test that fails | `@pytest.mark.xfail` + `assert 1 == 2` | Outer step `FAILED`; nested `SKIPPED` substep from the makereport hook | Outer step `XFAILED`; no duplicate nested step | Gap |
| Strict xfail that unexpectedly passes | `@pytest.mark.xfail(strict=True)` + `assert True` | Outer step `PASSED` (plugin never sees pytest's "strict xpass" failure attached to the report) | Outer step `XPASSED` | Gap |
| Non-strict xfail that unexpectedly passes | `@pytest.mark.xfail()` + `assert True` | Outer step `PASSED` (pytest reports outcome="passed" with `wasxfail` set; plugin ignores it) | Outer step `XPASSED` | Gap |
| `xfail(raises=...)` with wrong exception | `@pytest.mark.xfail(raises=ValueError)` + `raise KeyError` | Outer step `ERROR` (treated as a generic non-assertion exception) | `FAILED` (the `raises=` mismatch is a real test failure) | Gap |
| `xfail(run=False)` | `@pytest.mark.xfail(run=False)` (body never executed) | `SKIPPED` (only the makereport hook records a step) | `XFAILED` | Gap |
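
The xfail rows hinge on how pytest encodes the result on the report: an expected failure arrives as `outcome="skipped"` with a `wasxfail` attribute, a non-strict unexpected pass as `outcome="passed"` with `wasxfail`, and a strict unexpected pass as `outcome="failed"` whose `longrepr` starts with `[XPASS(strict)]`. A minimal sketch of a classifier that reaches the **Target** column from those fields follows. `Status` and `classify` are hypothetical names, and `SimpleNamespace` objects stand in for pytest's report objects.

```python
from enum import Enum
from types import SimpleNamespace


class Status(Enum):
    """Stand-in for TestStatus with the proposed XFAILED / XPASSED values."""

    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"
    XFAILED = "xfailed"
    XPASSED = "xpassed"


def classify(report):
    """Map a report's outcome, wasxfail and longrepr to a target status."""
    # wasxfail may be an empty string (xfail without a reason), so test
    # for the attribute's presence rather than its truthiness.
    if hasattr(report, "wasxfail"):
        return Status.XFAILED if report.outcome == "skipped" else Status.XPASSED
    if report.outcome == "failed":
        if str(report.longrepr).startswith("[XPASS(strict)]"):
            return Status.XPASSED
        # Includes the raises= mismatch case, which carries no wasxfail.
        return Status.FAILED
    return Status.SKIPPED if report.outcome == "skipped" else Status.PASSED


# Stand-in reports for three of the rows above.
xfailed = SimpleNamespace(outcome="skipped", wasxfail="", longrepr="")
xpass_loose = SimpleNamespace(outcome="passed", wasxfail="", longrepr=None)
xpass_strict = SimpleNamespace(outcome="failed", longrepr="[XPASS(strict)] ")
```

This sketch only covers the xfail and skip dimension; telling `FAILED` from `ERROR` for ordinary failures still needs the exception type.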

## Setup / teardown phases

| Scenario | Trigger | Observed today | Target | Status |
| --- | --- | --- | --- | --- |
| Setup-phase fixture failure (RuntimeError) | `@pytest.fixture` raises before `yield`; test body never runs | Outer step does not exist or lands `PASSED`; the plugin does not consult `report.when` | `ERROR` with `phase=setup` annotation | Gap |
| Teardown-phase fixture failure | `@pytest.fixture` raises after `yield`; test body passed | Outer step `PASSED`: it closes before the failing teardown runs, so the error is invisible | `FAILED` with `phase=teardown` annotation | Gap |
| Call-phase fail **plus** teardown-phase fail | `assert 1 == 2` in body AND `@pytest.fixture` raises after `yield` | Outer step `FAILED` (the call-phase failure dominates); the teardown error is silently lost | `FAILED` with a `phase=teardown` annotation so the teardown error is also visible | Gap |
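
pytest produces up to three reports per test, distinguished by `report.when` (`"setup"`, `"call"`, `"teardown"`). One way to reach the targets in this table is to fold those reports into a single status plus the list of phases that failed. The sketch below assumes that shape: `fold_phases` and `Status` are hypothetical, the reports are `SimpleNamespace` stand-ins, and the `FAILED` versus `ERROR` split inside the call phase is deliberately left out.

```python
from enum import Enum
from types import SimpleNamespace


class Status(Enum):
    """Stand-in for TestStatus (assumption, for illustration)."""

    PASSED = "passed"
    FAILED = "failed"
    ERROR = "error"


def fold_phases(reports):
    """Return (status, failed_phases) for one test's per-phase reports."""
    status = Status.PASSED
    failed_phases = []
    for rep in reports:
        if rep.outcome != "failed":
            continue
        failed_phases.append(rep.when)  # becomes the phase= annotation
        if rep.when == "setup":
            status = Status.ERROR       # the test body never ran
        elif status is Status.PASSED:
            status = Status.FAILED      # first call or teardown failure
        # A later failure never downgrades an earlier ERROR or FAILED.
    return status, failed_phases


def rep(when, outcome):
    return SimpleNamespace(when=when, outcome=outcome)


teardown_only = [rep("setup", "passed"), rep("call", "passed"), rep("teardown", "failed")]
call_and_teardown = [rep("setup", "passed"), rep("call", "failed"), rep("teardown", "failed")]
```

Keeping the failed phases in a list rather than a single field is what lets the call-plus-teardown row stay `FAILED` while still surfacing the teardown error.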

## Collection / fixture-resolution failures

| Scenario | Trigger | Observed today | Target | Status |
| --- | --- | --- | --- | --- |
| Missing fixture | `def test_x(nonexistent_fixture):` | Outer step `PASSED`: the autouse `step` fixture's setup still runs before pytest detects the missing fixture; the user sees a green step for a test that never executed | `ERROR` with `phase=setup` | Gap |

## Plugin-API exit paths (in-test mutations)

| Scenario | Trigger | Observed today | Target | Status |
| --- | --- | --- | --- | --- |
| Manual status override | `step.current_step.update({"status": TestStatus.FAILED})` | `FAILED` | `FAILED` | OK |
| `report_outcome(result=False)` | `step.report_outcome("the_check", False, "did not match")` | `FAILED` | `FAILED` | OK |
| `measure(...)` out-of-bounds | `step.measure(name="m", value=10.0, bounds={"min": 0.0, "max": 5.0})` | `FAILED` | `FAILED` | OK |

## Out of scope for this characterization run

- **Timeout**: needs `pytest-timeout` or a manual signal harness. Add as a
  follow-up once the audit picks a timeout strategy.
- **Signal (SIGKILL / SIGTERM)**: cannot be caught from inside the process;
  needs a subprocess-level harness.
- **`pytest.exit("...")`**: niche; the "aborts subsequent tests" behavior
  is hard to characterize cleanly because each `pytester` invocation is its
  own session. Document the expectation alongside `SystemExit`.
- **`os._exit()`**: bypasses Python cleanup entirely; can't be tested
  in-process because it would kill the outer pytest run. Document as a
  guaranteed data-loss case alongside `SystemExit` / `SIGKILL`.
- **Parametrize-level marks** (`pytest.param(..., marks=pytest.mark.xfail / skip)`):
  routes through a different selection path but produces the same
  `report.outcome`, so behavior should match the function-level marks
  already covered above. Add only if the plugin's eventual phase-aware
  handler diverges between the two.
- **Import error / syntax error / `conftest.py` error**: these fail
  collection entirely; no `item` is produced and no plugin hook fires.
  Document explicitly that no Sift step is recorded.
- **No-data / indeterminate**: tracked separately as part of the sibling
  status-semantics work.

## How to refresh this table

Run the suite locally:

```
pytest lib/sift_client/_tests/util/test_step_status_states.py -v
```

Every "Gap" row corresponds to a `# AUDIT:` comment in the test file naming
the target status. When the plugin fix lands, the regression edit is
mechanical: flip the assertion in each gap row to its target, then update
the **Observed today** column here to match.
