Skip to content

Commit 9b16a60

Browse files
committed
clean up
1 parent 4f071b1 commit 9b16a60

3 files changed

Lines changed: 219 additions & 46 deletions

File tree

python/lib/sift_client/_tests/util/step_status_states.md

Lines changed: 22 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -5,16 +5,16 @@ one scenario in that suite. The **target** column is the contract the suite
55
asserts (sourced from
66
[`docs/guides/pytest_plugin/pass_fail_behavior.md`](../../../../docs/guides/pytest_plugin/pass_fail_behavior.md));
77
the **observed today** column records what the plugin actually produces
8-
right now. Rows marked `Gap` are scenarios where the test fails today and
9-
the plugin needs to be fixed to match the contract.
8+
right now. Every row should be marked `OK`; a `Gap` indicates the plugin has
9+
regressed against the contract.
1010

1111
`TestStatus` values referenced below come from
1212
`sift_client.sift_types.test_report.TestStatus`: `PASSED`, `FAILED`, `ERROR`,
13-
`SKIPPED`. The targets below map every scenario onto these four existing
14-
statuses. An `ABORTED` status for hard process exits (`SystemExit`,
13+
`SKIPPED`, `IN_PROGRESS`. The targets below map every scenario onto these
14+
existing statuses. An `ABORTED` status for hard process exits (`SystemExit`,
1515
`KeyboardInterrupt`, signals) is a planned future addition; until it lands
16-
those cases baseline against `ERROR`. The user-facing contract these
17-
targets describe is documented in
16+
those cases baseline against `ERROR` or `IN_PROGRESS`. The user-facing
17+
contract these targets describe is documented in
1818
[`docs/guides/pytest_plugin/pass_fail_behavior.md`](../../../../docs/guides/pytest_plugin/pass_fail_behavior.md).
1919

2020
## Case ID scheme
@@ -43,42 +43,42 @@ that prefix; numbers are never reused or shifted when other sections grow.
4343
| `CALL-01` | Test passes | function body returns cleanly | `PASSED` | `PASSED` | OK |
4444
| `CALL-02` | Assert failure in call phase | `assert 1 == 2` | `FAILED` | `FAILED` | OK |
4545
| `CALL-03` | Generic exception in call phase | `raise ValueError("boom")` | `ERROR` | `ERROR` | OK |
46-
| `CALL-04` | `pytest.fail("...")` from body | `pytest.fail("intentional failure")` | `ERROR` | `FAILED` | Gap |
46+
| `CALL-04` | `pytest.fail("...")` from body | `pytest.fail("intentional failure")` | `FAILED` | `FAILED` | OK |
4747
| `CALL-05` | `SystemExit` from the test body | `sys.exit(1)` | `ERROR` | `ERROR` (baseline; `ABORTED` planned later) | OK |
48-
| `CALL-06` | `KeyboardInterrupt` in body | `raise KeyboardInterrupt` | `PASSED` (session aborts before the plugin sees the interrupt) | `ERROR` when the plugin sees the interrupt; document that a session-aborting interrupt may leave the step in `IN_PROGRESS` | Gap |
48+
| `CALL-06` | `KeyboardInterrupt` in body | `raise KeyboardInterrupt` | `IN_PROGRESS` (session aborts before the plugin sees the interrupt) | `ERROR` when the plugin sees the interrupt; a session-aborting interrupt leaves the step in `IN_PROGRESS` | OK |
4949

5050
## Skip paths
5151

5252
| Case | Scenario | Trigger | Observed today | Target | Status |
5353
| --------- | --------------------------------------- | --------------------------------------------- | --------------------------------------------------------------------------- | --------------------------------------------------------------- | ------ |
5454
| `SKIP-01` | Collection-time skip | `@pytest.mark.skip(reason=...)` | `SKIPPED` (only the makereport hook records a step; no autouse step ran) | `SKIPPED` | OK |
5555
| `SKIP-02` | Conditional collection-time skip | `@pytest.mark.skipif(True, reason=...)` | `SKIPPED` (same route as `@pytest.mark.skip`) | `SKIPPED` | OK |
56-
| `SKIP-03` | Runtime skip in body | `pytest.skip("...")` | Outer step `ERROR`; a nested step with the same name records `SKIPPED` | Outer step `SKIPPED`; no duplicate nested step | Gap |
57-
| `SKIP-04` | Skip raised inside a fixture | `@pytest.fixture` calls `pytest.skip("...")` | Outer step `PASSED`; a nested `SKIPPED` step is created by the makereport hook | Outer step `SKIPPED` (setup-phase skip); no duplicate nested step | Gap |
56+
| `SKIP-03` | Runtime skip in body | `pytest.skip("...")` | Outer step `SKIPPED`; no duplicate nested step | Outer step `SKIPPED`; no duplicate nested step | OK |
57+
| `SKIP-04` | Skip raised inside a fixture | `@pytest.fixture` calls `pytest.skip("...")` | Outer step `SKIPPED` (setup-phase skip); no duplicate nested step | Outer step `SKIPPED` (setup-phase skip); no duplicate nested step | OK |
5858

5959
## xfail / xpass
6060

6161
| Case | Scenario | Trigger | Observed today | Target | Status |
6262
| ---------- | ----------------------------------------- | ------------------------------------------------------ | ----------------------------------------------------------------------------------------------- | ----------------------------------------------------- | ------ |
63-
| `XFAIL-01` | xfail-marked test that fails | `@pytest.mark.xfail` + `assert 1 == 2` | Outer step `FAILED`; nested `SKIPPED` substep from the makereport hook | Outer step `PASSED` (test fulfilled the xfail expectation); no duplicate nested step | Gap |
64-
| `XFAIL-02` | Strict xfail that unexpectedly passes | `@pytest.mark.xfail(strict=True)` + `assert True` | Outer step `PASSED` (plugin never sees pytest's "strict xpass" failure attached to the report) | Outer step `FAILED` (mark no longer matches reality — either the bug was fixed or the test stopped testing what it claimed) | Gap |
65-
| `XFAIL-03` | Non-strict xfail that unexpectedly passes | `@pytest.mark.xfail()` + `assert True` | Outer step `PASSED` (pytest reports outcome="passed" with `wasxfail` set; plugin ignores it) | Outer step `PASSED` (`strict=False` doesn't insist on the failure) | OK |
66-
| `XFAIL-04` | `xfail(raises=...)` with wrong exception | `@pytest.mark.xfail(raises=ValueError)` + `raise KeyError` | Outer step `ERROR` (treated as a generic non-assertion exception) | `FAILED` (the `raises=` mismatch is a real test failure) | Gap |
67-
| `XFAIL-05` | `xfail(run=False)` | `@pytest.mark.xfail(run=False)` (body never executed) | `SKIPPED` (only the makereport hook records a step) | `SKIPPED` (the test never ran) | OK |
63+
| `XFAIL-01` | xfail-marked test that fails | `@pytest.mark.xfail` + `assert 1 == 2` | Outer step `PASSED` (test fulfilled the xfail expectation); no duplicate nested step | Outer step `PASSED` (test fulfilled the xfail expectation); no duplicate nested step | OK |
64+
| `XFAIL-02` | Strict xfail that unexpectedly passes | `@pytest.mark.xfail(strict=True)` + `assert True` | Outer step `FAILED` (mark no longer matches reality — either the bug was fixed or the test stopped testing what it claimed) | Outer step `FAILED` (mark no longer matches reality — either the bug was fixed or the test stopped testing what it claimed) | OK |
65+
| `XFAIL-03` | Non-strict xfail that unexpectedly passes | `@pytest.mark.xfail()` + `assert True` | Outer step `PASSED` | Outer step `PASSED` (`strict=False` doesn't insist on the failure) | OK |
66+
| `XFAIL-04` | `xfail(raises=...)` with wrong exception | `@pytest.mark.xfail(raises=ValueError)` + `raise KeyError` | `FAILED` (the `raises=` mismatch is a real test failure) | `FAILED` (the `raises=` mismatch is a real test failure) | OK |
67+
| `XFAIL-05` | `xfail(run=False)` | `@pytest.mark.xfail(run=False)` (body never executed) | `SKIPPED` (the test never ran) | `SKIPPED` (the test never ran) | OK |
6868

6969
## Setup / teardown phases
7070

7171
| Case | Scenario | Trigger | Observed today | Target | Status |
7272
| ---------- | ------------------------------------------ | -------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- | ------ |
73-
| `PHASE-01` | Setup-phase fixture failure (RuntimeError) | `@pytest.fixture` raises before `yield`; test body never runs | Outer step does not exist or lands `PASSED`; the plugin does not consult `report.when` | `ERROR` with `phase=setup` annotation | Gap |
74-
| `PHASE-02` | Teardown-phase fixture failure | `@pytest.fixture` raises after `yield`; test body passed | Outer step `PASSED` — it closes before the failing teardown runs, so the error is invisible | `FAILED` with `phase=teardown` annotation | Gap |
75-
| `PHASE-03` | Call-phase fail **plus** teardown-phase fail | `assert 1 == 2` in body AND `@pytest.fixture` raises after `yield` | Outer step `FAILED` (the call-phase failure dominates); the teardown error is silently lost | `FAILED` with a `phase=teardown` annotation so the teardown error is also visible | Gap |
73+
| `PHASE-01` | Setup-phase fixture failure (RuntimeError) | `@pytest.fixture` raises before `yield`; test body never runs | Outer step `ERROR`; the plugin reads the setup-phase report and maps `failed``ERROR` | `ERROR` (a `phase=setup` annotation is a planned follow-up) | OK |
74+
| `PHASE-02` | Teardown-phase fixture failure | `@pytest.fixture` raises after `yield`; test body passed | Outer step `FAILED`; after teardown the plugin upgrades a passed step when the teardown report shows `failed` | `FAILED` (a `phase=teardown` annotation is a planned follow-up) | OK |
75+
| `PHASE-03` | Call-phase fail **plus** teardown-phase fail | `assert 1 == 2` in body AND `@pytest.fixture` raises after `yield` | Outer step `FAILED` (the call-phase failure dominates); the teardown error is not yet surfaced separately | `FAILED`; surfacing the teardown error alongside is a planned follow-up | OK |
7676

7777
## Collection / fixture-resolution failures
7878

7979
| Case | Scenario | Trigger | Observed today | Target | Status |
8080
| --------- | --------------------------------------- | --------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- | ------ |
81-
| `COLL-01` | Missing fixture | `def test_x(nonexistent_fixture):` | Outer step `PASSED` — the autouse `step` fixture's setup still runs before pytest detects the missing fixture; the user sees a green step for a test that never executed | `ERROR` with `phase=setup` | Gap |
81+
| `COLL-01` | Missing fixture | `def test_x(nonexistent_fixture):` | Outer step `ERROR` — the missing fixture surfaces as a setup-phase failure, which the plugin now maps to `ERROR` | `ERROR` (a `phase=setup` annotation is a planned follow-up) | OK |
8282

8383
## Plugin-API exit paths (in-test mutations)
8484

@@ -121,6 +121,6 @@ Run the suite locally:
121121
pytest lib/sift_client/_tests/util/test_step_status_states.py -v
122122
```
123123

124-
Gap rows fail today. When the plugin fix for a row lands, the matching
125-
test turns green; update the **Observed today** column here to match the
126-
target and flip the row's status to **OK**.
124+
Every row should be `OK`. If a row regresses to `Gap`, the matching test
125+
fails; update the **Observed today** column here to describe the
126+
regression and flip the row's status to **Gap** until the plugin is fixed.

python/lib/sift_client/util/test_results/context_manager.py

Lines changed: 44 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,17 @@
3939
logger = logging.getLogger(__name__)
4040

4141

42+
def format_truncated_traceback(
43+
exc: type[BaseException] | None,
44+
exc_value: BaseException | None,
45+
tb: object | None,
46+
) -> ErrorInfo:
47+
"""Format an ErrorInfo from a traceback, keeping the first frame and the last 10."""
48+
stack = traceback.format_exception(exc, exc_value, tb) # type: ignore[arg-type]
49+
stack = [stack[0], *stack[-10:]] if len(stack) > 10 else stack
50+
return ErrorInfo(error_code=1, error_message="".join(stack))
51+
52+
4253
def log_replay_instructions(log_file: str | Path | None) -> None:
4354
"""Log instructions for manually replaying a test result log file.
4455
@@ -287,6 +298,17 @@ def record_step_outcome(self, outcome: bool, step: TestStep):
287298
self.open_step_results[step.step_path] = False
288299
self.any_failures = True
289300

301+
def mark_step_failed_after_close(self, step: TestStep):
302+
"""Mark a step's parent as failed after the step has already been popped from the stack.
303+
304+
Used by the pytest plugin when a teardown-phase report fires after the
305+
fixture's `__exit__` has already resolved and exited the step.
306+
"""
307+
self.any_failures = True
308+
path_parts = step.step_path.split(".")
309+
if len(path_parts) > 1:
310+
self.open_step_results[".".join(path_parts[:-1])] = False
311+
290312
def resolve_and_propagate_step_result(
291313
self,
292314
step: TestStep,
@@ -383,13 +405,7 @@ def update_step_from_result(
383405
# If we're not showing assertion errors (i.e. pytest), mark step as failed but don't set error info.
384406
self.report_context.record_step_outcome(False, self.current_step)
385407
else:
386-
stack = traceback.format_exception(exc, exc_value, tb) # type: ignore
387-
stack = [stack[0], *stack[-10:]] if len(stack) > 10 else stack
388-
trace = "".join(stack)
389-
error_info = ErrorInfo(
390-
error_code=1,
391-
error_message=trace,
392-
)
408+
error_info = format_truncated_traceback(exc, exc_value, tb)
393409

394410
# Resolve the status of this step (i.e. fail if children failed) and propagate the result to the parent step.
395411
result = self.report_context.resolve_and_propagate_step_result(
@@ -414,6 +430,27 @@ def update_step_from_result(
414430
return result
415431

416432
def __exit__(self, exc, exc_value, tb):
433+
if getattr(self, "_sift_managed_externally", False):
434+
# The pytest fixture already resolved status from phase reports.
435+
# Run the standard propagation so the parent step sees this step's
436+
# pass/fail, emit one update_step with the resolved values, and pop
437+
# from the stack without re-classifying.
438+
assert self.current_step is not None
439+
result = self.report_context.resolve_and_propagate_step_result(
440+
self.current_step, self.current_step.error_info
441+
)
442+
self.current_step.update(
443+
{
444+
"status": self.current_step.status,
445+
"end_time": datetime.now(timezone.utc),
446+
"error_info": self.current_step.error_info,
447+
},
448+
)
449+
self.report_context.exit_step(self.current_step)
450+
if hasattr(self, "force_result"):
451+
result = self.force_result
452+
return result
453+
417454
result = self.update_step_from_result(exc, exc_value, tb)
418455

419456
# Now that the step is updated. Let the report context handle removing it from the stack and updating the report context.

0 commit comments

Comments
 (0)