Skip to content

fix(ci): run full non-e2e suite, drop -m v2 blind spot (#669)#697

Merged
frankbria merged 3 commits into
mainfrom
fix/669-ci-v2-blind-spot
Jun 19, 2026
Merged

fix(ci): run full non-e2e suite, drop -m v2 blind spot (#669)#697
frankbria merged 3 commits into
mainfrom
fix/669-ci-v2-blind-spot

Conversation

@frankbria

@frankbria frankbria commented Jun 19, 2026

Copy link
Copy Markdown
Owner

Closes #669.

Problem

The backend CI gate ran pytest tests/ --ignore=tests/e2e -m v2, but the root
tests/conftest.py never auto-marks v2. So -m v2 silently deselected ~514
tests (~40% of files)
— they never gated a PR. The v1→v2 refactor the filter
existed for is finished.

What I did

Categorized the 514 deselected tests (-m "not v2"): 497 pass, 16 skip, 1
fails
— and that 1 is a real-LLM lifecycle test, excluded from CI by design.
=> No dead tests remain (the v1 /api/projects pair was already removed in
#667). The blind spot was ~500 legit tests that simply never ran in CI.

Changes

  • CI (.github/workflows/test.yml): -m v2-m "not lifecycle". Every
    non-e2e test now gates PRs; only real-LLM lifecycle tests are excluded (they
    need a key + cost money — run locally via scripts/lifecycle before the PR).
    timeout-minutes 15 → 20 for the ~500 extra tests under coverage.
  • pytest.ini: document the two-tier gate. The v2 marker is kept for
    back-compat but no longer gates — anything non-e2e/non-lifecycle runs by default,
    so the suite can't silently shrink again.
  • tests/auth/test_query_param_token.py: sign JWTs with the live
    manager.SECRET instead of the import-time binding.

Two-tier test strategy (explicit going forward)

  • CI (every PR): -m "not lifecycle" — fast, no API key, environment-correctness.
  • Local pre-PR: scripts/lifecycle --mode cli|api|web|all — the real-LLM
    end-to-end tests. Not skipped, just a separate tier.

Why the test change was needed

Un-hiding the suite surfaced a real order-dependent flake. test_query_param_token
bound SECRET at import and signed tokens with it, but get_current_user verifies
against the live manager.SECRET. Any test that starts the app via TestClient
refreshes the secret from .env (lifespan → refresh_secret), making import-bound
tokens unverifiable → 3 happy-path tests 401. CI's backend job has no .env, so it
was a local-only landmine, but it's now fixed and deterministic.

Verification

  • Full non-e2e run (-m "not lifecycle"): 3830 passed, 11 skipped, 6 deselected
    (lifecycle only), 0 failed
    in 8m06s (was 3333 gated / 514 blind).
  • Leak repro (adapters + health + query_param): 3 failed → 44 passed after the fix.
  • ruff check clean.

Acceptance criteria

  • Deselected tests categorized (dead vs legit); dead ones removed (0 remained).
  • Legit tests run in CI (filter relaxed to not lifecycle).
  • CI no longer silently excludes ~40% of files; selection is explicit and green.

Known limitations

  • lifecycle tests still don't run in CI by design (real LLM). They're covered by
    the local scripts/lifecycle tier, not this gate.

Summary by CodeRabbit

  • Tests
    • Updated CI to run all non-e2e tests except real-LLM lifecycle tests.
    • Increased the pytest step timeout to 20 minutes.
    • Improved JWT query-param token generation to use the current shared secret at runtime.
  • Documentation
    • Refreshed test command examples to match the new non-lifecycle CI selection behavior.
  • Chores
    • Updated CI configuration guidance/comments to reflect the updated test selection strategy.

The backend CI gate ran `pytest -m v2`, but the root conftest never
auto-marked v2, so ~514 legit tests (40% of files) were silently
deselected and never gated a PR. The v1->v2 refactor that the filter
existed for is done.

Categorized the 514 deselected tests: 497 pass, 16 skip, 1 is a
real-LLM `lifecycle` test (correctly excluded). No dead tests remain
(the v1 /api/projects pair was removed in #667).

Changes:
- CI: `-m v2` -> `-m "not lifecycle"` so every non-e2e test gates PRs;
  only real-LLM lifecycle tests (run locally via scripts/lifecycle) are
  excluded. Timeout 15->20m for the ~500 extra tests under coverage.
- pytest.ini: document the new two-tier gate; v2 marker kept for
  back-compat but no longer gates.
- tests/auth/test_query_param_token.py: sign JWTs with the LIVE
  manager.SECRET instead of the import-time binding. Un-hiding the
  suite surfaced this order-dependent flake: any test that starts the
  app via TestClient refreshes SECRET from .env (lifespan), which made
  import-bound tokens unverifiable.

Full non-e2e run: 3830 passed, 11 skipped, 6 deselected (lifecycle), 0 failed.
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: adb57dfd-ad54-4422-b39d-c108d033a662

📥 Commits

Reviewing files that changed from the base of the PR and between 6ee3c56 and 683cae9.

📒 Files selected for processing (1)
  • CLAUDE.md
✅ Files skipped from review due to trivial changes (1)
  • CLAUDE.md

Walkthrough

The CI backend-tests job switches its pytest marker from -m v2 to -m "not lifecycle", increasing step timeout from 15 to 20 minutes. pytest.ini and CLAUDE.md documentation are updated to reflect the new test selection. _make_token in the auth query-param token test now reads manager.SECRET at call time instead of at import time.

Changes

CI Test Marker Switch and Auth Token Fix

Layer / File(s) Summary
CI marker switch and test documentation
.github/workflows/test.yml, pytest.ini, CLAUDE.md
The backend-tests job changes from -m v2 to -m "not lifecycle" with a 20-minute timeout. pytest.ini inline comments and CLAUDE.md examples are updated to describe CI running all non-e2e tests except lifecycle, and obsolete v2-only run instructions are removed.
Auth token helper runtime secret
tests/auth/test_query_param_token.py
_make_token drops the import-time SECRET default and instead resolves manager.SECRET at call time when secret is None, ensuring the token helper always uses the current live secret.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • frankbria/codeframe#464: Directly modifies the same CI backend-tests pytest selection in .github/workflows/test.yml, enforcing -m v2 — the inverse of this PR's change.
  • frankbria/codeframe#592: Works on unblocking the lifecycle-marked test workflow, which aligns with this PR's decision to exclude lifecycle tests from the main CI gate.

Poem

🐇 Hoppity hop through the test suite I go,
No more -m v2 leaving secrets below!
Now "not lifecycle" opens the gate,
And secrets are read at the runtime, not late.
Twenty whole minutes — that's all that I need,
To hop through the tests at full bunny speed! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main change: replacing the -m v2 blind spot with a full non-e2e suite run, directly addressing issue #669.
Linked Issues check ✅ Passed The PR fully addresses all three acceptance criteria: deselected tests categorized (497 pass, 16 skip, 1 lifecycle intentionally excluded), legit tests now run in CI via -m not lifecycle, and CI selection is explicit without silent exclusions.
Out of Scope Changes check ✅ Passed All changes are directly scoped to resolving the CI blind spot: workflow config updates, pytest guidance docs, test helper fix for expanded suite, and documentation refresh—no extraneous modifications present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/669-ci-v2-blind-spot

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude

claude Bot commented Jun 19, 2026

Copy link
Copy Markdown

Code Review — fix(ci): run full non-e2e suite, drop -m v2 blind spot

Overview

This PR fixes a significant CI blind spot: the -m v2 filter was silently deselecting ~514 tests (~40% of files) that lacked the v2 marker. The fix broadens the gate to -m "not lifecycle" and documents the new two-tier strategy. A real order-dependent flake in test_query_param_token that the broader suite surfaced is also fixed.

What is good

  • Root cause is well-diagnosed. The PR description clearly explains the problem, shows the categorization work (497 pass, 16 skip, 1 fails — and the 1 is by design), and documents the pre/post test counts. That investigation work is exactly what makes this safe to merge.
  • CI marker strategy is correct. -m "not lifecycle" is a stable negative filter — new tests run by default unless explicitly opted out. The previous -m v2 was opt-in, creating a silent gap whenever a test file was added without the marker.
  • Secret fix is sound. The manager.SECRET late-binding fix in _make_token correctly addresses the order-dependent flake. The comment explains why clearly, which is appropriate here (a non-obvious timing dependency between lifespan startup and import-time binding).
  • Timeout increase is proportionate. 15m → 20m for ~500 extra tests is conservative but reasonable. The PR reports 8m06s actual wall time in the full run, so there is headroom.
  • pytest.ini documentation is accurate and removes the stale -m v2 example that would now mislead contributors.

Minor observations

1. Sentinel default in _make_token

The type annotation says str but the default is None. For clean typing it should be str | None = None. Not a blocker.

2. 126 files still carry the v2 marker

They will continue to run (they were always gated). But now that the marker is no longer a CI gate, pytestmark = pytest.mark.v2 is dead weight in those files. The pytest.ini comment correctly documents this. Cleaning those up is a separate low-priority task.

3. Watch the first few CI runs post-merge

The PR reports 497 pass for the newly un-hidden tests. Worth watching whether any have latent fixture or env-var sensitivities in CI that do not show locally. The PR description implies this was verified, but the first live CI run will confirm.

Verdict

This is a correctness fix to the CI gate with clear pre-merge verification. The code changes are minimal and tightly scoped. The type annotation nit is optional cleanup.

Approve with minor notes. Nothing here should block merge.

@claude

claude Bot commented Jun 19, 2026

Copy link
Copy Markdown

Follow-up — CLAUDE.md documentation update

Picking up from my prior review (which approved with minor notes).

The latest commit updates CLAUDE.md to document the new two-tier test strategy:

  • Replaces the stale uv run pytest -m v2 example with the actual CI gate command: uv run pytest --ignore=tests/e2e -m "not lifecycle"
  • Replaces the misleading uv run pytest tests/lifecycle/ entry (it implied no API key was needed — but lifecycle tests do use a real LLM) with the correct scripts/lifecycle --mode cli|api|web|all entry

This is exactly the right kind of update to make alongside a CI gate change. The developer guide now matches what CI actually runs, so contributors won't be confused about what to verify before opening a PR.

My prior approval stands. Nothing in this update changes the analysis — the CLAUDE.md delta is correct and consistent with the rest of the PR.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CLAUDE.md`:
- Line 156: Update line 156 in CLAUDE.md to align the documented CI gate command
with the configuration in pyproject.toml. Change the command from `uv run pytest
--ignore=tests/e2e -m "not lifecycle"` to `uv run pytest tests/
--ignore=tests/e2e -m "not lifecycle"` by adding the explicit `tests/` directory
argument to match the exact command specified in pyproject.toml for consistency
and clarity.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fc6b9a6e-6aa3-4123-8b19-568711a91cc3

📥 Commits

Reviewing files that changed from the base of the PR and between f39ca5b and 6ee3c56.

📒 Files selected for processing (1)
  • CLAUDE.md

Comment thread CLAUDE.md Outdated
@claude

claude Bot commented Jun 19, 2026

Copy link
Copy Markdown

Follow-up Review — No new concerns

Consistent with my prior two reviews on this PR.

The diff now shows all four files changed:

  • CI gate (.github/workflows/test.yml) — correct
  • CLAUDE.md — documented correctly (covered in my second comment)
  • pytest.ini — accurate, stale -m v2 example removed
  • test_query_param_token.py — late-binding fix is sound

The one minor nit from my first review still stands: secret: str = None should be secret: str | None = None for clean typing. It does not affect runtime behavior.

Nothing new to flag. This is ready to merge.

@frankbria frankbria merged commit a80d527 into main Jun 19, 2026
16 checks passed
@frankbria frankbria deleted the fix/669-ci-v2-blind-spot branch June 19, 2026 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P6.8.3] Reconcile CI -m v2 blind spot: ~535 tests (40% of files) never run in CI

1 participant