test(conftest): isolate polish cache per test, harden API-key delete #32
Merged
Two related test-isolation gaps that surfaced as snapshot flakes:
1. The autouse polish fixture set ``ATTUNE_AUTHOR_STRICT_POLISH=false``
and deleted ``ANTHROPIC_API_KEY``, but did not redirect
``ATTUNE_AUTHOR_POLISH_CACHE``. Every test shared the dev machine's
real ``~/.attune/polish_cache``. A prior live ``regenerate`` run
would populate the cache with polished output; subsequent
golden-snapshot tests would then observe LLM-rewritten content
instead of the deterministic Jinja fallback, depending on which
tests had run before — flaky between machines and between
sessions on the same machine. Point the cache at a per-session
tmp directory via ``tmp_path_factory``.
2. ``monkeypatch.delenv`` raised ``KeyError`` when the var was not
already set — fragile across environments. Use
``raising=False``.
Repro of the flake (on a dev machine with ``.env`` carrying a live
key and ``~/.attune/polish_cache`` populated from a real run):
    pytest tests/test_generated_templates_golden.py
    # FAILED test_task_template_matches_snapshot: snapshot shows raw
    # Jinja output, observed value is the LLM-polished rewrite
After this commit, the test suite is hermetic w.r.t. polish state
regardless of the host's cache or env.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
silversurfer562 added a commit that referenced this pull request on May 15, 2026:
…eck Phase 1 (#34)

Bumps pyproject.toml from 0.11.1 → 0.13.0 and converts the CHANGELOG Unreleased section into the 0.13.0 release notes.

Skipping 0.12.0: the internal release/v0.12.0 branch carried the polish fact-check Phase 1 work but was never published to PyPI. That work now ships in 0.13.0 alongside the regenerator fixes prompted by attune-rag d39e39d.

Headline changes:
- #31: reference templates carry typed Parameters/Returns columns without depending on the LLM polish pass (closes #30)
- #32: test fixture isolates the polish cache per session to prevent shared-state flakes in golden snapshot tests
- #33: polish bypass surfaced in YAML frontmatter (polish: skipped)
- #28: polish fact-check Phase 1 (already on main from prior PR)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Two related test-isolation gaps that surfaced as snapshot flakes during the work on #30:
1. The autouse polish fixture set `ATTUNE_AUTHOR_STRICT_POLISH=false` and deleted `ANTHROPIC_API_KEY`, but did not redirect `ATTUNE_AUTHOR_POLISH_CACHE`. Every test shared the dev machine's real `~/.attune/polish_cache`. A prior live `regenerate` run would populate that cache with polished output; subsequent golden-snapshot tests would then silently observe LLM-rewritten content instead of the deterministic Jinja fallback. The repro is environment-dependent: it passes in CI (no cache, no live key in `.env`) and fails locally if a real run primed the cache.
2. `monkeypatch.delenv("ANTHROPIC_API_KEY")` raised `KeyError` when the var was not already set. Use `raising=False`.
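
For readers without the diff open, here is a minimal sketch of what a fixture with both fixes could look like. It is not the actual `conftest.py` from this PR: the helper `isolate_polish_env` and the fixture names `_polish_cache_dir` and `_isolated_polish` are hypothetical, and only the three environment variable names come from the description above.

```python
import pytest


def isolate_polish_env(monkeypatch, cache_dir):
    """Apply the polish-related env overrides (hypothetical helper)."""
    # Setting the fixture already applied before this change.
    monkeypatch.setenv("ATTUNE_AUTHOR_STRICT_POLISH", "false")
    # raising=False makes the delete a no-op when the key is absent,
    # instead of raising KeyError.
    monkeypatch.delenv("ANTHROPIC_API_KEY", raising=False)
    # Redirect the cache away from the real ~/.attune/polish_cache.
    monkeypatch.setenv("ATTUNE_AUTHOR_POLISH_CACHE", str(cache_dir))


@pytest.fixture(scope="session")
def _polish_cache_dir(tmp_path_factory):
    # One throwaway cache directory per pytest session.
    return tmp_path_factory.mktemp("polish_cache")


@pytest.fixture(autouse=True)
def _isolated_polish(monkeypatch, _polish_cache_dir):
    # monkeypatch is function-scoped, so the overrides are re-applied
    # for every test and undone automatically afterwards.
    isolate_polish_env(monkeypatch, _polish_cache_dir)
```

Splitting the directory creation into a session-scoped fixture keeps a single temporary cache for the whole run, while the function-scoped autouse fixture can still use `monkeypatch`.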
What changed
Why now
While verifying #31 I hit `test_task_template_matches_snapshot` flaking. The root cause was the shared real polish cache, not the snapshot itself; until this is fixed, future PRs will keep hitting the same spurious failure locally.
Test plan
🤖 Generated with Claude Code