| 1 | +# Pytest-plugin step-status: observed vs. target |
| 2 | + |
| 3 | +Companion document to `test_step_status_states.py`. Each row corresponds to |
| 4 | +one scenario in that suite. The **observed** column is the status the Sift |
| 5 | +pytest plugin records for the test's outer step today; the **target** column |
| 6 | +is what the audit recommends. Rows where the two differ are the work items |
| 7 | +for the fix. |
| 8 | + |
| 9 | +`TestStatus` values referenced below come from |
| 10 | +`sift_client.sift_types.test_report.TestStatus`: `PASSED`, `FAILED`, `ERROR`, |
| 11 | +`SKIPPED`, plus the proposed `XFAILED` / `XPASSED` / `ABORTED` additions |
| 12 | +called out in the audit. |
| 13 | + |
| 14 | +## Call-phase exit paths |
| 15 | + |
| 16 | +| Scenario | Trigger | Observed today | Target | Status | |
| 17 | +| --------------------------------------- | --------------------------------------------- | --------------------------- | ------------------------------------------ | ------ | |
| 18 | +| Test passes | function body returns cleanly | `PASSED` | `PASSED` | OK | |
| 19 | +| Assert failure in call phase | `assert 1 == 2` | `FAILED` | `FAILED` | OK | |
| 20 | +| Generic exception in call phase | `raise ValueError("boom")` | `ERROR` | `ERROR` | OK | |
| 21 | +| `pytest.fail("...")` from body | `pytest.fail("intentional failure")` | `ERROR` | `FAILED` | Gap | |
| 22 | +| `SystemExit` from the test body | `sys.exit(1)` | `ERROR` | `ABORTED` (proposed) or documented `ERROR` | Gap | |
| 23 | +| `KeyboardInterrupt` in body | `raise KeyboardInterrupt` | `PASSED` (session aborts before the plugin sees the interrupt) | `ABORTED` (proposed) | Gap | |
| 24 | + |
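The call-phase targets in the table above can be read as a mapping from the exception that ends the test body to a status. The sketch below is one way to express that mapping; it is not the plugin's code. `Status` and `Failed` are hypothetical stand-ins (for `TestStatus` with the proposed `ABORTED` member, and for `pytest.fail.Exception`), and the sketch picks `ABORTED` for `SystemExit` even though the table leaves a documented `ERROR` open as an alternative.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    """Hypothetical stand-in for TestStatus; ABORTED is only a proposed member."""

    PASSED = "PASSED"
    FAILED = "FAILED"
    ERROR = "ERROR"
    ABORTED = "ABORTED"


class Failed(BaseException):
    """Stand-in for pytest.fail.Exception, which also derives from BaseException."""


def classify_call_exception(exc: Optional[BaseException]) -> Status:
    """Map the exception that ended the test body to the target status."""
    if exc is None:
        return Status.PASSED
    # Assertion failures and explicit pytest.fail() are both test failures.
    if isinstance(exc, (AssertionError, Failed)):
        return Status.FAILED
    # Interpreter-exit and interrupt paths map to the proposed ABORTED status.
    if isinstance(exc, (SystemExit, KeyboardInterrupt)):
        return Status.ABORTED
    # Anything else is a generic error in the test body.
    return Status.ERROR
```

Because `pytest.fail.Exception` derives from `BaseException` rather than from `AssertionError`, a handler that only special-cases `AssertionError` would classify it as a generic error, which is consistent with the `ERROR` observed for `pytest.fail(...)`.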
| 25 | +## Skip paths |
| 26 | + |
| 27 | +| Scenario | Trigger | Observed today | Target | Status | |
| 28 | +| --------------------------------------- | --------------------------------------------- | --------------------------------------------------------------------------- | --------------------------------------------------------------- | ------ | |
| 29 | +| Collection-time skip | `@pytest.mark.skip(reason=...)` | `SKIPPED` (only the makereport hook records a step; no autouse step ran) | `SKIPPED` | OK | |
| 30 | +| Runtime skip in body | `pytest.skip("...")` | Outer step `ERROR`; a nested step with the same name records `SKIPPED` | Outer step `SKIPPED`; no duplicate nested step | Gap | |
| 31 | +| Skip raised inside a fixture | `@pytest.fixture` calls `pytest.skip("...")` | Outer step `PASSED`; a nested `SKIPPED` step is created by the makereport hook | Outer step `SKIPPED` with `phase=setup`; no duplicate nested step | Gap | |
| 32 | + |
| 33 | +## xfail / xpass |
| 34 | + |
| 35 | +| Scenario | Trigger | Observed today | Target | Status | |
| 36 | +| --------------------------------------- | ------------------------------------------------------ | ----------------------------------------------------------------------------------------------- | ----------------------------------------------------- | ------ | |
| 37 | +| xfail-marked test that fails | `@pytest.mark.xfail` + `assert 1 == 2` | Outer step `FAILED`; nested `SKIPPED` substep from the makereport hook | Outer step `XFAILED`; no duplicate nested step | Gap | |
| 38 | +| Strict xfail that unexpectedly passes | `@pytest.mark.xfail(strict=True)` + `assert True` | Outer step `PASSED` (plugin never sees pytest's "strict xpass" failure attached to the report) | Outer step `XPASSED` | Gap | |
| 39 | +| Non-strict xfail that unexpectedly passes | `@pytest.mark.xfail()` + `assert True` | Outer step `PASSED` (pytest reports `outcome="passed"` with `wasxfail` set; plugin ignores it) | Outer step `XPASSED` | Gap | |
| 40 | +| `xfail(raises=...)` with wrong exception | `@pytest.mark.xfail(raises=ValueError)` + `raise KeyError` | Outer step `ERROR` (treated as a generic non-assertion exception) | `FAILED` (the `raises=` mismatch is a real test failure) | Gap | |
| 41 | +| `xfail(run=False)` | `@pytest.mark.xfail(run=False)` (body never executed) | `SKIPPED` (only the makereport hook records a step) | `XFAILED` | Gap | |
| 42 | + |
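Most of the xfail rows above can be told apart from the fields pytest sets on the test report. The sketch below is a minimal illustration under stated assumptions, not the plugin's implementation: the function name `xfail_status` and the string return values are hypothetical, and the `[XPASS(strict)]` prefix check relies on the text pytest writes into `longrepr` for strict xpass, which is worth re-checking against the pytest version in use. It does not cover the `xfail(raises=...)` mismatch row, which would need the marker itself.

```python
from typing import Optional


def xfail_status(
    outcome: str,
    wasxfail: Optional[str] = None,
    longrepr: object = None,
) -> Optional[str]:
    """Return "XFAILED" / "XPASSED" for xfail-related reports, else None.

    `wasxfail` should be None when the report has no such attribute,
    e.g. getattr(report, "wasxfail", None). An empty string still counts
    as "set", because pytest stores the (possibly empty) xfail reason there.
    """
    if wasxfail is not None:
        # Expected failure happened (also covers run=False, reported as skipped).
        if outcome == "skipped":
            return "XFAILED"
        # Non-strict xfail that unexpectedly passed.
        if outcome == "passed":
            return "XPASSED"
    # Strict xpass: pytest turns the report into a failure with a marker string.
    if (
        outcome == "failed"
        and isinstance(longrepr, str)
        and longrepr.startswith("[XPASS(strict)]")
    ):
        return "XPASSED"
    return None
```

A `None` result means the report is not xfail-related and the ordinary pass/fail/error/skip handling applies.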
| 43 | +## Setup / teardown phases |
| 44 | + |
| 45 | +| Scenario | Trigger | Observed today | Target | Status | |
| 46 | +| --------------------------------------- | -------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- | ------ | |
| 47 | +| Setup-phase fixture failure (RuntimeError) | `@pytest.fixture` raises before `yield`; test body never runs | Outer step is either never created or recorded as `PASSED`; the plugin does not consult `report.when` | `ERROR` with `phase=setup` annotation | Gap | |
| 48 | +| Teardown-phase fixture failure | `@pytest.fixture` raises after `yield`; test body passed | Outer step `PASSED` — it closes before the failing teardown runs, so the error is invisible | `FAILED` with `phase=teardown` annotation | Gap | |
| 49 | +| Call-phase fail **plus** teardown-phase fail | `assert 1 == 2` in body AND `@pytest.fixture` raises after `yield` | Outer step `FAILED` (the call-phase failure dominates); the teardown error is silently lost | `FAILED` with a `phase=teardown` annotation so the teardown error is also visible | Gap | |
| 50 | + |
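The setup/teardown targets above imply that the final status has to be decided after all three phase reports (`setup`, `call`, `teardown`) are known, since the outer step currently closes before teardown runs. The sketch below shows one possible phase-aware resolution. Everything in it is an assumption for illustration: the function name `resolve_step`, the dict-of-outcomes input, the `call_status` parameter (the already-classified call-phase result, `FAILED` or `ERROR`), and the representation of the phase annotation as a list of phase names.

```python
from typing import Dict, List, Tuple

PHASES = ("setup", "call", "teardown")


def resolve_step(
    outcomes: Dict[str, str], call_status: str = "FAILED"
) -> Tuple[str, List[str]]:
    """Combine per-phase report outcomes into (status, annotated phases).

    `outcomes` maps report.when -> report.outcome for a single test item.
    Phases that never ran (e.g. "call" after a setup failure) are absent.
    """
    failed = [p for p in PHASES if outcomes.get(p) == "failed"]

    if outcomes.get("setup") == "failed":
        return "ERROR", failed  # fixture raised before yield
    if outcomes.get("setup") == "skipped":
        return "SKIPPED", ["setup"]  # skip raised inside a fixture
    if outcomes.get("call") == "failed":
        return call_status, failed  # keeps a teardown failure visible too
    if outcomes.get("call") == "skipped":
        return "SKIPPED", ["call"]
    if outcomes.get("teardown") == "failed":
        return "FAILED", failed  # body passed, fixture raised after yield
    return "PASSED", []
```

For the combined call-plus-teardown failure this returns `("FAILED", ["call", "teardown"])`, so the teardown error is no longer silently lost.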
| 51 | +## Collection / fixture-resolution failures |
| 52 | + |
| 53 | +| Scenario | Trigger | Observed today | Target | Status | |
| 54 | +| --------------------------------------- | --------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- | ------ | |
| 55 | +| Missing fixture | `def test_x(nonexistent_fixture):` | Outer step `PASSED` — the autouse `step` fixture's setup still runs before pytest detects the missing fixture; the user sees a green step for a test that never executed | `ERROR` with `phase=setup` | Gap | |
| 56 | + |
| 57 | +## Plugin-API exit paths (in-test mutations) |
| 58 | + |
| 59 | +| Scenario | Trigger | Observed today | Target | Status | |
| 60 | +| --------------------------------------- | ---------------------------------------------------------------------- | -------------- | -------- | ------ | |
| 61 | +| Manual status override | `step.current_step.update({"status": TestStatus.FAILED})` | `FAILED` | `FAILED` | OK | |
| 62 | +| `report_outcome(result=False)` | `step.report_outcome("the_check", False, "did not match")` | `FAILED` | `FAILED` | OK | |
| 63 | +| `measure(...)` out-of-bounds | `step.measure(name="m", value=10.0, bounds={"min": 0.0, "max": 5.0})` | `FAILED` | `FAILED` | OK | |
| 64 | + |
| 65 | +## Out of scope for this characterization run |
| 66 | + |
| 67 | +- **Timeout** — needs `pytest-timeout` or a manual signal harness. Add as a |
| 68 | + follow-up once the audit picks a timeout strategy. |
| 69 | +- **Signal (SIGKILL / SIGTERM)** — cannot be caught from inside the process; |
| 70 | + needs a subprocess-level harness. |
| 71 | +- **`pytest.exit("...")`** — niche; the "aborts subsequent tests" behavior |
| 72 | + is hard to characterize cleanly because each `pytester` invocation is its |
| 73 | + own session. Document the expectation alongside `SystemExit`. |
| 74 | +- **`os._exit()`** — bypasses Python cleanup entirely; can't be tested |
| 75 | + in-process because it would kill the outer pytest run. Document as a |
| 76 | + guaranteed data-loss case alongside `SystemExit` / `SIGKILL`. |
| 77 | +- **Parametrize-level marks** (`pytest.param(..., marks=pytest.mark.xfail / skip)`) |
| 78 | + — routes through a different selection path but produces the same |
| 79 | + `report.outcome`, so behavior should match the function-level marks |
| 80 | + already covered above. Add only if the plugin's eventual phase-aware |
| 81 | + handler diverges between the two. |
| 82 | +- **Import error / syntax error / `conftest.py` error** — these fail |
| 83 | + collection entirely; no `item` is produced and no per-item hook fires. |
| 84 | + Document explicitly that no Sift step is recorded. |
| 85 | +- **No-data / indeterminate** — tracked separately as part of the sibling |
| 86 | + status-semantics work. |
| 87 | + |
| 88 | +## How to refresh this table |
| 89 | + |
| 90 | +Run the suite locally: |
| 91 | + |
| 92 | +``` |
| 93 | +pytest lib/sift_client/_tests/util/test_step_status_states.py -v |
| 94 | +``` |
| 95 | + |
| 96 | +Every "Gap" row corresponds to a `# AUDIT:` comment in the test file naming |
| 97 | +the target status. When the plugin fix lands, the regression edit is |
| 98 | +mechanical: flip the assertion in each gap row to its target, then update |
| 99 | +the **Observed today** column here to match. |