Skip to content

Commit da53bb0

Browse files
authored
fix: restore example collection during directory traversal (#794) (#795)
* fix: restore example collection during directory traversal (#794) pytest_pycollect_makemodule only fires for files matching python_files (test_*.py) — return ExampleModule from pytest_collect_file so directory traversal collects examples. * fix: require # pytest: marker for example collection (#796) Examples without a marker are now silently ignored instead of being collected as tests. Removes spurious marker from helper/helpers.py and documents opt-in behaviour in AGENTS.md and MARKERS_GUIDE.md. * docs: expand required Ollama models list in CONTRIBUTING.md The previous list only covered 4 models (CI subset). Expanded to include all models needed by examples and tests, grouped by context (CI, examples, test suite) with a one-liner to pull the lot. * fix: harden example collection hooks against silent failures - pytest_pycollect_makemodule: return SkippedFile (not None) for markerless files, closing the direct-specification bypass - pytest_collect_file: narrow except to OSError, return None on failure (exclude) instead of falling through to collect - Fix stale docstring and comments from the old collection model * fix: harden collection hooks and add regression test Address code review findings from PR #795: - Fix duplicate collection: guard pytest_collect_file with isinitpath() so directly-specified files defer to pytest_pycollect_makemodule - Remove dead try/except OSError (_extract_markers_from_file is self-contained) - Key examples_to_skip by full path instead of basename to prevent collisions across subdirectories - Use file_path.parts instead of str() substring match in pytest_pycollect_makemodule for correct path filtering - Add test/test_example_collection.py regression test verifying support files are excluded, examples are collected, and no duplicates occur Signed-off-by: Nigel Jones <nigelgj@ie.ibm.com> Signed-off-by: Nigel Jones <jonesn@uk.ibm.com> --------- Signed-off-by: Nigel Jones <nigelgj@ie.ibm.com> Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
1 parent f885a3c commit da53bb0

6 files changed

Lines changed: 129 additions & 49 deletions

File tree

AGENTS.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -51,7 +51,7 @@ Tests use a four-tier granularity system (`unit`, `integration`, `e2e`, `qualita
5151

5252
See **[test/MARKERS_GUIDE.md](test/MARKERS_GUIDE.md)** for the full marker reference (tier definitions, backend markers, resource gates, auto-skip logic, common patterns).
5353

54-
**Examples in `docs/examples/`** use comment-based markers:
54+
**Examples in `docs/examples/`** are opt-in — unlike `test/` files (auto-collected, default `unit`), examples require an explicit `# pytest:` comment to be collected. Files without this comment are silently ignored (they won't appear in skip summaries either). This is because examples have variable dependencies and limited setup:
5555
```python
5656
# pytest: e2e, ollama, qualitative
5757
"""Example description..."""

CONTRIBUTING.md

Lines changed: 39 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -354,12 +354,46 @@ uv run ruff check .
354354
### Required Models
355355

356356
#### Ollama
357-
- `granite4:micro-h`
358-
- `granite3.2-vision`
359-
- `granite4:micro`
360-
- `qwen2.5vl:7b`
361357

362-
_Note: ollama models can be obtained by running `ollama pull <model>`_
358+
HuggingFace and cloud backends download or host models automatically. Ollama
359+
models must be pulled locally before running the tests that need them.
360+
361+
**CI (unit + integration tests):**
362+
363+
- `granite4:micro` — default model for `start_session()` and most examples
364+
- `granite4:micro-h` — hybrid variant used by conftest fixtures
365+
366+
**Examples (`docs/examples/`):**
367+
368+
- `deepseek-r1:8b` — safety / guardian examples
369+
- `granite3-guardian:2b` — mini-researcher guardian backend
370+
- `granite3.2-vision` — vision (Ollama chat) example
371+
- `granite3.3:8b` — m\_decompose example
372+
- `granite4:latest` — melp examples
373+
- `llama3.2` — repair-with-guardian example
374+
- `llama3.2:3b` — tutorial / mify examples (via `META_LLAMA_3_2_3B`)
375+
- `phi:2.7b` — SOFAI graph-colouring example
376+
- `pielee/qwen3-4b-thinking-2507_q8:latest` — SOFAI S2 solver
377+
- `qwen2.5vl:7b` — vision (OpenAI-via-Ollama) example
378+
379+
**Additional test models (`test/`):**
380+
381+
- `granite4:small-h` — hybrid-small tests
382+
- `llama3.2:1b` — lightweight inference tests
383+
- `llama3:8b` — legacy Llama 3 tests
384+
- `llava` — multimodal tests
385+
- `mistral:7b` — Mistral backend tests
386+
- `smollm2:1.7b` — SmolLM tests
387+
388+
Pull everything:
389+
390+
```bash
391+
for m in granite4:micro granite4:micro-h deepseek-r1:8b \
392+
granite3-guardian:2b granite3.2-vision granite3.3:8b granite4:latest \
393+
llama3.2 llama3.2:3b phi:2.7b pielee/qwen3-4b-thinking-2507_q8:latest \
394+
qwen2.5vl:7b granite4:small-h llama3.2:1b llama3:8b llava mistral:7b \
395+
smollm2:1.7b; do ollama pull "$m"; done
396+
```
363397

364398
### Test Markers
365399

docs/examples/conftest.py

Lines changed: 40 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -282,22 +282,22 @@ def pytest_collection_finish(session):
282282

283283

284284
def pytest_terminal_summary(terminalreporter, exitstatus, config):
285-
# Append the skipped examples if needed.
286-
if len(examples_to_skip) == 0:
287-
return
288-
289-
terminalreporter.ensure_newline()
290-
terminalreporter.section("Skipped Examples", sep="=", blue=True, bold=True)
291-
terminalreporter.line("The following examples were skipped during collection:\n")
292-
for filename, reason in examples_to_skip.items():
293-
terminalreporter.line(f" • {filename}: {reason}")
285+
if examples_to_skip:
286+
terminalreporter.ensure_newline()
287+
terminalreporter.section("Skipped Examples", sep="=", blue=True, bold=True)
288+
terminalreporter.line(
289+
"The following examples were skipped during collection:\n"
290+
)
291+
for filepath, reason in examples_to_skip.items():
292+
terminalreporter.line(f" • {pathlib.Path(filepath).name}: {reason}")
294293

295294

296295
def pytest_pycollect_makemodule(module_path, parent):
297296
"""Intercepts Module creation to skip files before import.
298297
299-
Runs for both directory traversal and direct file specification.
300-
Returning a SkippedFile prevents pytest from importing the file,
298+
Only fires for files matching python_files (default test_*.py) during
299+
directory traversal, or for any file specified directly on the command
300+
line. Returning a SkippedFile prevents pytest from importing the file,
301301
which is necessary when files contain unavailable dependencies.
302302
303303
Args:
@@ -307,7 +307,7 @@ def pytest_pycollect_makemodule(module_path, parent):
307307
file_path = module_path
308308

309309
# Limit scope to docs/examples directory
310-
if "docs" not in str(file_path) or "examples" not in str(file_path):
310+
if "docs" not in file_path.parts or "examples" not in file_path.parts:
311311
return None
312312

313313
if file_path.name == "conftest.py":
@@ -319,14 +319,14 @@ def pytest_pycollect_makemodule(module_path, parent):
319319
config._example_capabilities = get_system_capabilities()
320320

321321
# Check manual skip list
322-
if file_path.name in examples_to_skip:
322+
if str(file_path) in examples_to_skip:
323323
return SkippedFile.from_parent(parent, path=file_path)
324324

325325
# Extract and evaluate markers
326326
markers = _extract_markers_from_file(file_path)
327327

328328
if not markers:
329-
return None
329+
return SkippedFile.from_parent(parent, path=file_path)
330330

331331
should_skip, _reason = _should_skip_collection(markers)
332332

@@ -365,16 +365,19 @@ def pytest_ignore_collect(collection_path, config):
365365
and "examples" in abs_path.parts
366366
):
367367
# Skip files in the manual skip list
368-
if collection_path.name in examples_to_skip:
368+
if str(collection_path) in examples_to_skip:
369369
return True
370370

371371
# Extract markers and check if we should skip
372372
try:
373373
markers = _extract_markers_from_file(collection_path)
374+
# No markers → not a runnable example (e.g. __init__.py, helpers)
375+
if not markers:
376+
return True
374377
should_skip, reason = _should_skip_collection(markers)
375378
if should_skip and reason:
376379
# Add to skip list with reason for terminal summary
377-
examples_to_skip[collection_path.name] = reason
380+
examples_to_skip[str(collection_path)] = reason
378381
# Return True to ignore this file completely
379382
return True
380383
except Exception as e:
@@ -389,36 +392,35 @@ def pytest_ignore_collect(collection_path, config):
389392
return False
390393

391394

392-
# This doesn't replace the existing pytest file collection behavior.
393395
def pytest_collect_file(parent: pytest.Dir, file_path: pathlib.PosixPath):
394-
# Do a quick check that it's a .py file in the expected `docs/examples` folder. We can make
395-
# this more exact if needed.
396+
"""Provide an explicit collector for example files in docs/examples/."""
396397
if (
397398
file_path.suffix == ".py"
398399
and "docs" in file_path.parts
399400
and "examples" in file_path.parts
400401
):
401-
# Skip this test. It requires additional setup.
402-
if file_path.name in examples_to_skip:
403-
return
402+
# Directly-specified files are handled by pytest_pycollect_makemodule —
403+
# only provide an explicit collector during directory traversal.
404+
if parent.session.isinitpath(file_path):
405+
return None
404406

405-
# Check markers first - if file has skip marker, return SkippedFile
406-
try:
407-
markers = _extract_markers_from_file(file_path)
408-
should_skip, _reason = _should_skip_collection(markers)
409-
if should_skip:
410-
# FIX: Return a dummy collector instead of None.
411-
# This prevents pytest from falling back to the default Module collector
412-
# which would try to import the file.
413-
return SkippedFile.from_parent(parent, path=file_path)
414-
except Exception:
415-
# If we can't read markers, continue with other checks
416-
pass
407+
# Already flagged for skipping (missing system capability)
408+
if str(file_path) in examples_to_skip:
409+
return
417410

418-
# ExampleModule (returned by pytest_pycollect_makemodule) handles
419-
# collection for files that should run — return None here to avoid
420-
# creating a duplicate collector from this hook.
421-
return None
411+
# Check markers — no markers means not a runnable example.
412+
# _extract_markers_from_file is self-contained (returns [] on error),
413+
# so no try/except needed here.
414+
markers = _extract_markers_from_file(file_path)
415+
if not markers:
416+
return None
417+
should_skip, _reason = _should_skip_collection(markers)
418+
if should_skip:
419+
return SkippedFile.from_parent(parent, path=file_path)
420+
421+
# pytest_pycollect_makemodule only fires for files matching python_files
422+
# (test_*.py) — examples need an explicit collector for directory traversal.
423+
return ExampleModule.from_parent(parent, path=file_path)
422424

423425

424426
class SkippedFile(pytest.File):

docs/examples/helper/helpers.py

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
1-
# pytest: ollama, e2e
2-
31
from textwrap import fill
42
from typing import Any
53

test/MARKERS_GUIDE.md

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -270,16 +270,20 @@ pytestmark = [pytest.mark.e2e, pytest.mark.huggingface,
270270

271271
## Example Files (`docs/examples/`)
272272

273-
Examples use a comment-based marker format instead of `pytestmark`:
273+
Unlike `test/` files (which are auto-collected and default to `unit`), examples
274+
require an explicit `# pytest:` comment to be collected. This opt-in approach
275+
reflects that examples often have variable dependencies and limited setup, so
276+
only files that declare themselves runnable should be executed.
274277

275278
```python
276279
# pytest: e2e, ollama, qualitative
277280
"""Example description..."""
278281
```
279282

280283
Same classification rules apply. The comment must appear in the first few
281-
lines before non-comment code. Parser: `docs/examples/conftest.py`
282-
(`_extract_markers_from_file`).
284+
lines before non-comment code. Files without this comment are silently
285+
ignored — they won't appear in skip summaries or collection output.
286+
Parser: `docs/examples/conftest.py` (`_extract_markers_from_file`).
283287

284288
## Adding Markers to New Tests
285289

test/test_example_collection.py

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
"""Regression tests for docs/examples/ collection hooks.
2+
3+
These hooks have regressed twice (#794, #796). This test ensures:
4+
- Support files (__init__.py, helpers.py, conftest.py) are never collected
5+
- Real examples with markers ARE collected
6+
- No example is collected twice (duplicate guard)
7+
"""
8+
9+
import subprocess
10+
11+
12+
def test_example_collection_sanity():
13+
"""Verify example collection excludes support files and avoids duplicates."""
14+
result = subprocess.run(
15+
["uv", "run", "pytest", "docs/examples/", "--collect-only", "-q"],
16+
capture_output=True,
17+
text=True,
18+
timeout=120,
19+
)
20+
21+
lines = result.stdout.splitlines()
22+
# Collected test IDs are lines before the blank/summary lines
23+
collected = [line for line in lines if "::" in line]
24+
25+
# Support files must never appear as collected tests
26+
for item in collected:
27+
filename = item.split("::")[0].rsplit("/", 1)[-1]
28+
assert filename != "__init__.py", f"__init__.py collected as test: {item}"
29+
assert filename != "helpers.py", f"helpers.py collected as test: {item}"
30+
assert filename != "conftest.py", f"conftest.py collected as test: {item}"
31+
32+
# Sanity floor — we have ~79 examples today; 50 is a safe lower bound
33+
assert len(collected) >= 50, (
34+
f"Only {len(collected)} examples collected — expected at least 50. "
35+
"Collection hooks may be broken."
36+
)
37+
38+
# No duplicates — each test ID should appear exactly once
39+
seen = set()
40+
for item in collected:
41+
assert item not in seen, f"Duplicate collection detected: {item}"
42+
seen.add(item)

0 commit comments

Comments
 (0)