fix: prevent LLM polish from laundering source_hash frontmatter #48

Merged: silversurfer562 merged 2 commits into main from fix/source-hash-llm-laundering on May 27, 2026
@silversurfer562

Summary

Fixes the regen/staleness source_hash mismatch bug — features stayed permanently "stale" after a successful regen because the polished file's frontmatter source_hash didn't match what compute_source_hash recomputed.

Root cause

apply_polish_results wrote the LLM's polished output verbatim, including any frontmatter the LLM emitted. The LLM is given the rendered template (with frontmatter) as input context; it polishes the body but ALSO echoes the frontmatter — sometimes with single-character transcription errors in deterministic fields.

Concrete evidence (attune-ai spec-engine, 2026-05-27):

frontmatter: f8ced22b02899aa25ff409636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
computed:    f8ced22b02899aa25ff709636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
                                ^
                                position 19: LLM wrote f4 instead of f7

compute_source_hash is called once at generator.py:355 and is fully deterministic (verified by calling it twice in a row — identical output). The divergence is entirely in the polish step's output mutation. The original spec hypothesis (budget truncation of hash inputs) was wrong; the polish layer never re-hashes the source.
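The mismatch above can be located mechanically. The following is a minimal sketch, not code from the PR: `first_mismatch` is a hypothetical helper, and the two digests are copied from the evidence block.

```python
from typing import Optional


def first_mismatch(a: str, b: str) -> Optional[int]:
    """Return the 0-based index of the first differing character, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # Identical over the shared prefix: a length difference is still a mismatch.
    return None if len(a) == len(b) else min(len(a), len(b))


# Digests copied from the evidence block in this PR description.
frontmatter_hash = "f8ced22b02899aa25ff409636e659830c6ba856d70de6ddd1a9bf1cbe37a1337"
computed_hash = "f8ced22b02899aa25ff709636e659830c6ba856d70de6ddd1a9bf1cbe37a1337"

idx = first_mismatch(frontmatter_hash, computed_hash)
print(idx, frontmatter_hash[idx], computed_hash[idx])
```

The reported position 19 is a 0-based index: the digit there is `4` in the frontmatter and `7` in the recomputed digest, and every other character agrees.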

Fix

_replace_polished_frontmatter(polished, canonical_source) strips whatever frontmatter the LLM emitted from the polished content and re-injects the canonical frontmatter from entry.rendered_content. apply_polish_results calls it for every depth with a polished result. Lenient-mode failures fall through to the raw rendered template (which already has correct frontmatter).
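A rough sketch of the strip-and-re-inject idea described above, under stated assumptions: frontmatter is a leading block fenced by `---` lines, and `replace_frontmatter_sketch` and its regex are illustrative stand-ins rather than the actual `_replace_polished_frontmatter` in `generator.py`.

```python
import re

# Assumption: frontmatter is a leading block delimited by '---' lines.
_FRONTMATTER_RE = re.compile(r"\A---\n.*?\n---\n", re.DOTALL)


def replace_frontmatter_sketch(polished: str, canonical_source: str) -> str:
    """Swap whatever frontmatter the LLM emitted for the canonical block."""
    canonical = _FRONTMATTER_RE.match(canonical_source)
    if canonical is None:
        # Defensive path: nothing canonical to inject, return polished as-is.
        return polished
    # Drop the LLM's frontmatter if present; a body without one passes through.
    body = _FRONTMATTER_RE.sub("", polished, count=1)
    return canonical.group(0) + body


canonical_source = "---\nsource_hash: abc7\nname: x\n---\nRaw body\n"
polished = "---\nsource_hash: abc4\nname: x\n---\nPolished body\n"
print(replace_frontmatter_sketch(polished, canonical_source))
```

In this sketch the body always comes from the polished text and the frontmatter always comes from the rendered template, so a perturbed `source_hash` cannot reach disk.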

Tests

7 new regression tests in tests/unit/test_polished_frontmatter_reinjection.py:

  • test_polished_with_perturbed_source_hash_is_corrected — exactly reproduces the f4/f7 bug, asserts canonical hash wins.
  • test_polished_with_missing_frontmatter_gets_canonical_prepended — LLM strips frontmatter → still gets canonical prepended.
  • test_polished_with_correct_frontmatter_unchanged_semantically — non-perturbed case still works.
  • test_canonical_without_frontmatter_returns_polished_unchanged — defensive: no canonical frontmatter → polished returned as-is.
  • test_extra_blank_lines_in_polished_body_preserved — body whitespace passes through.
  • test_polished_template_keeps_canonical_source_hash — end-to-end via apply_polish_results.
  • test_no_polish_result_uses_rendered_content_directly — lenient-mode fallthrough path.

Local: 173/173 unit tests pass.

Spec doc

docs/specs/regen-staleness-hash-mismatch/decisions.md updated with the verified diagnosis, replacing the original (wrong) budget-truncation hypothesis.

Downstream impact

Unblocks attune-gui Phase 2 (living-docs-regen-automation), which needed attune-author status --dry-run to reach a fixed point after regen. Once this lands and is released (likely as the 0.14.2 patch), attune-gui can pin the new version and turn on its CI fail-if-stale gate.

After this lands: cut attune-author 0.14.2 → update attune-ai's attune-author dep cap if needed → regenerate spec-engine in attune-ai (the feature that remains stale in the current ecosystem until it is regenerated with the fix) and confirm staleness clears.

Test plan

  • Local: 173/173 unit tests pass
  • CI: full matrix green
  • After merge: cut 0.14.2 release; verify on PyPI; confirm in attune-ai that attune-author generate spec-engine --all-kinds clears the stale flag

🤖 Generated with Claude Code

silversurfer562 and others added 2 commits on May 27, 2026, 07:30

Root cause: `apply_polish_results` wrote the LLM's polished output
verbatim, including any frontmatter the LLM emitted. The LLM is
given the rendered template (with frontmatter) as input context;
it polishes the body but ALSO echoes the frontmatter — sometimes
with single-character transcription errors in deterministic fields
like `source_hash`. That broke staleness detection: the
frontmatter `source_hash` written into the polished file didn't
match what `compute_source_hash` recomputed on the same source,
leaving the feature permanently "stale" after a successful regen.

Concrete evidence (attune-ai spec-engine, 2026-05-27):

  frontmatter: f8ced22b02899aa25ff409636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
  computed:    f8ced22b02899aa25ff709636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
                                  ^
                                  position 19: LLM wrote f4 instead of f7

Single-character difference at one hex digit of a 64-char SHA-256
hex digest. The same `compute_source_hash` function called twice in
the same Python process returns identical values, so the mismatch is
purely the LLM mis-transcribing a value it was supposed to echo verbatim.

Fix: strip whatever frontmatter the LLM emitted from the polished
content and re-inject the canonical frontmatter from
`entry.rendered_content`. The LLM polishes the BODY; the
frontmatter (especially `source_hash`, `generated_at`, `feature`,
`depth`, `name`, `status`, `type`) is non-negotiable deterministic
metadata that must survive the polish step exactly.

Implementation:

- New `_replace_polished_frontmatter(polished, canonical_source)`
  helper in `generator.py`. Uses a frontmatter regex to extract
  the canonical block, strip whatever the LLM emitted, and
  re-assemble.
- `apply_polish_results` now calls the helper for every depth
  with a polished result. Lenient-mode failures (depth missing
  from `polished_by_depth`) fall through to the raw rendered
  template, which already has correct frontmatter.

Tests: 7 new regression tests covering the corrupted-hash case,
LLM-stripped frontmatter, LLM-correct frontmatter, no-canonical-
frontmatter defensive path, and body-whitespace preservation —
plus two behavioral tests asserting `apply_polish_results` writes
the canonical hash to disk regardless of LLM perturbation.

Spec doc (`docs/specs/regen-staleness-hash-mismatch/decisions.md`)
updated with the verified root cause replacing the original
budget-truncation hypothesis (which was wrong — `compute_source_hash`
runs only once and is fully deterministic; the divergence lives
entirely in the polish step's output mutation).

Unblocks attune-gui Phase 2 (`living-docs-regen-automation`) which
needed `attune-author status --dry-run` to reach a fixed point
after regen.

Local: 173/173 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The previous whole-block replacement broke 3 golden snapshot tests
that asserted on the `polish: skipped` frontmatter marker which
the lenient-mode polish failure path adds via
`_mark_polish_skipped`. My whole-block replace discarded that
marker.

Switching to field-level merge:
- DETERMINISTIC fields (type, name, feature, depth, generated_at,
  source_hash, status): canonical from rendered template wins.
- All OTHER fields (polish: skipped, future markers): polished
  output preserved as-emitted.

Implementation: parse both frontmatter blocks line-by-line, walk
polished's lines, swap deterministic-keyed lines with canonical's
version, keep everything else. Append any deterministic canonical
fields the polished output dropped entirely.
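
The line-by-line merge described above might look roughly like the following. This is a sketch under assumptions, not the code in `generator.py`: `merge_frontmatter_sketch` is a hypothetical name, frontmatter is assumed to be flat `key: value` lines between `---` fences, and the field values in the example are made up.

```python
import re

# Assumption: flat 'key: value' frontmatter between '---' fences.
_FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)

# Field names taken from the commit message above.
DETERMINISTIC_KEYS = frozenset(
    {"type", "name", "feature", "depth", "generated_at", "source_hash", "status"}
)


def _key_of(line: str) -> str:
    return line.split(":", 1)[0].rstrip()


def merge_frontmatter_sketch(polished: str, canonical_source: str) -> str:
    canon = _FRONTMATTER_RE.match(canonical_source)
    if canon is None:
        return polished
    canonical_by_key = {
        _key_of(line): line
        for line in canon.group(1).split("\n")
        if _key_of(line) in DETERMINISTIC_KEYS
    }
    emitted = _FRONTMATTER_RE.match(polished)
    polished_lines = emitted.group(1).split("\n") if emitted else []
    body = polished[emitted.end():] if emitted else polished

    merged, seen = [], set()
    for line in polished_lines:
        key = _key_of(line)
        if key in canonical_by_key:
            merged.append(canonical_by_key[key])  # canonical value wins
            seen.add(key)
        else:
            merged.append(line)  # e.g. 'polish: skipped' is kept as emitted
    # Append deterministic fields the polished output dropped entirely.
    merged.extend(line for key, line in canonical_by_key.items() if key not in seen)
    return "---\n" + "\n".join(merged) + "\n---\n" + body


canonical_source = (
    "---\ntype: concept\nname: spec-engine\nsource_hash: abc7\n"
    "status: generated\n---\nRaw body\n"
)
polished = "---\ntype: concept\nsource_hash: abc4\npolish: skipped\n---\nPolished body\n"
print(merge_frontmatter_sketch(polished, canonical_source))
```

Walking the polished lines rather than the canonical ones keeps the polished field order and any non-deterministic markers, which is what the golden snapshot tests relied on.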

Tests added:
- test_polish_skipped_marker_preserved — regression on the lenient
  failure path. Exercises both the deterministic-field override
  AND the marker preservation in one assertion.
- test_unknown_non_deterministic_field_preserved — forward-compat
  for future polish-layer fields.

Local: 979/979 unit tests pass; 15/15 in this file's slice +
3/3 golden snapshots restored. End-to-end re-verified on attune-ai
spec-engine: 11/11 regenerated templates have canonical
source_hash matching compute_source_hash.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@codecov

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 81.39535% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/attune_author/generator.py | 81.39% | 4 Missing and 4 partials ⚠️ |

@silversurfer562 merged commit 1b1c7c5 into main on May 27, 2026
12 of 13 checks passed
silversurfer562 added a commit that referenced this pull request May 27, 2026
Patch release fixing the source_hash LLM-laundering bug (PR #48).

After successful regen, features stayed permanently "stale"
because the polish LLM transcribed the source_hash field with
single-character errors. Field-level frontmatter merge restores
the canonical deterministic fields after polish while preserving
non-deterministic markers like `polish: skipped`.

Unblocks attune-gui Phase 2 (living-docs-regen-automation) once
attune-gui pins this version. attune-ai's dashboard stale-count
also stops mis-flagging just-regenerated features.

Also includes the `publish.yml` trigger swap that was queued in
Unreleased since 0.14.1.

Local: 983/983 unit tests pass. Dist builds clean.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
silversurfer562 added a commit that referenced this pull request Jun 7, 2026
* docs(specs): reconcile regen-pipeline — mark satisfied-by-different-means

Audit found regen-pipeline was marked "complete" with all 24 tasks
checked, but none of its named symbols ever shipped in either repo
(attune-author: _regen, regen_template(corpus_root=...),
_resolve_corpus_root, atomic_write, _patch_summaries_json;
attune-gui: /api/config, /api/templates/refresh-all,
/api/browse/directory, CorpusSetup, App.jsx). A bogus "Shipped" note
had conflated it with the unrelated hash-mismatch regenerate CLI.

The 3 user stories are all satisfied by a more evolved architecture:
- regen: POST /api/living-docs/docs/{id}/regenerate (Jobs +
  generate_feature_templates)
- corpus config: multi-corpus registry + workspace config
- bulk: make regen-all

No genuine product gaps remain. This commit corrects the spec docs:
- requirements.md: status -> reconciled, with user-story->reality map
- design.md: marked obsolete (assumes React/JSX + single corpus_root)
- tasks.md: flags the false done-marks and corrects the Shipped note

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(specs): mark regen-staleness-hash-mismatch DONE (shipped in #48/0.14.2)

Status said "Implementation TBD" but the fix shipped in PR #48
(commit 1b1c7c5), released in 0.14.2: apply_polish_results now
re-injects deterministic frontmatter via _replace_polished_frontmatter,
with regression test tests/unit/test_polished_frontmatter_reinjection.py
and a CHANGELOG entry. Status corrected to reflect shipped reality.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(polish): prompt-cache hit-rate telemetry (spec polish-cache-hit-metrics)

Each polish run now tracks Anthropic prompt-cache token usage and logs
a one-line summary at the end of `attune-author regenerate`:

  Polish cache hit: 87% (1241 read / 1421 total tokens, 6 call(s))

A WARNING is appended when the run's hit rate is below 50% (with >=1
cacheable token), surfacing silent cache regressions (prompt edits, model
alias drift). Hit rate = read / (read + creation) cacheable input tokens.
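
The hit-rate arithmetic and summary line above can be sketched as follows. The class and function names are hypothetical stand-ins (not the real `PolishCacheStats` or `format_polish_cache_summary`), and the wording of the warning suffix is an assumption; only the formula, the 50% threshold, and the summary format come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class CacheStatsSketch:
    creation_tokens: int = 0  # cacheable input tokens written to the cache
    read_tokens: int = 0      # cacheable input tokens served from the cache
    calls: int = 0

    def record(self, creation: int, read: int) -> None:
        self.creation_tokens += creation
        self.read_tokens += read
        self.calls += 1

    @property
    def total(self) -> int:
        return self.creation_tokens + self.read_tokens

    @property
    def hit_rate(self) -> float:
        # Hit rate = read / (read + creation); defined as 0.0 when nothing was cacheable.
        return self.read_tokens / self.total if self.total else 0.0


def format_summary_sketch(stats: CacheStatsSketch, warn_below: float = 0.5) -> str:
    line = (
        f"Polish cache hit: {stats.hit_rate:.0%} "
        f"({stats.read_tokens} read / {stats.total} total tokens, {stats.calls} call(s))"
    )
    # Warn only when there was at least one cacheable token to judge by.
    if stats.total >= 1 and stats.hit_rate < warn_below:
        line += " WARNING: hit rate below 50%"  # suffix wording is an assumption
    return line


stats = CacheStatsSketch(creation_tokens=180, read_tokens=1241, calls=6)
print(format_summary_sketch(stats))
```

With 1241 read tokens out of 1421 cacheable tokens, the rate is about 0.873, which matches the 87% in the example summary line.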

Implementation:
- doc_gen/_anthropic.call_anthropic gains an optional
  on_cache_usage(creation, read, model) callback; _log_cache_usage now
  returns (creation, read). Backward compatible — doc-gen passes nothing.
- polish.py: PolishCacheStats dataclass, in-process accumulator
  (_polish_cache_telemetry / reset_polish_cache_telemetry, mirroring
  generator._faithfulness_telemetry), polish_cache_stats(), and
  format_polish_cache_summary(). _call_llm wires the callback.
- maintenance.py: reset at run start, log summary at run end alongside
  the faithfulness summary.

Deviation from the written spec: attune-author has no telemetry JSONL,
so the metric follows the existing in-process faithfulness-counter
pattern instead of a new JSONL subsystem; the threshold warning is
current-run, not cross-run. Acceptance criteria in decisions.md all met.

Tests: 16 new in tests/unit/test_polish_cache_metrics.py (callback
firing incl. zero case, hit-rate math, accumulator, summary, warning).
Docs: README "Cache hit rate" subsection; CHANGELOG [Unreleased].
Spec docs updated to DONE with the deviation noted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore(help): regenerate polish/staleness templates after cache-metrics change

Auto-regenerated by the pre-commit help-freshness hook following the
polish prompt-cache telemetry work (d4af5a3).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore(specs): archive completed/superseded specs

Move terminal specs into docs/specs/archive/ so they stop inflating the
active count: polish-fact-check (v0.14.0), polish-cache-hit-metrics
(done), regen-staleness-hash-mismatch (#48/0.14.2), regen-pipeline
(superseded). skill-export-evangelism kept active (open).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Revert "chore(help): regenerate polish/staleness templates after cache-metrics change"

This reverts commit b9edfc4.

* test(batch): make batch-state fixture date relative to now

The _state() helper hardcoded submitted_at=2026-05-08, which silently
expired past the 29-day retention window on 2026-06-06 and broke the
status/cancel tests (they read batch state without an injected now=).
Default to now-1day so the fixture stays inside the window.
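
A small sketch of the date-bomb and its fix, under assumptions: `make_state` and `is_expired` are hypothetical stand-ins for the real `_state()` helper and retention check, and treating exactly 29 days as expired is a guess at the boundary.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=29)  # window length taken from the commit message


def is_expired(submitted_at: datetime, now: datetime) -> bool:
    # Assumption: a batch counts as expired once it is 29 days old or older.
    return now - submitted_at >= RETENTION


def make_state(submitted_at: Optional[datetime] = None) -> dict:
    """Fixture helper: default to one day ago so it never ages out of the window."""
    if submitted_at is None:
        submitted_at = datetime.now(timezone.utc) - timedelta(days=1)
    return {"submitted_at": submitted_at}


hardcoded = datetime(2026, 5, 8, tzinfo=timezone.utc)
print(is_expired(hardcoded, datetime(2026, 6, 5, tzinfo=timezone.utc)))
print(is_expired(hardcoded, datetime(2026, 6, 6, tzinfo=timezone.utc)))
print(is_expired(make_state()["submitted_at"], datetime.now(timezone.utc)))
```

The hardcoded date passes on 2026-06-05 and fails from 2026-06-06 onward, while the relative default stays inside the window whenever the tests run.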

Fixes the 3 date-bomb failures in test_maintenance_batch.py
(TestStatus/TestCancel).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>