fix: prevent LLM polish from laundering source_hash frontmatter (#48)

silversurfer562 · claude · web-flow · commit 1b1c7c5ec56e · 2026-05-27T09:29:44.000-04:00
* fix: prevent LLM polish from laundering source_hash frontmatter

Root cause: `apply_polish_results` wrote the LLM's polished output
verbatim, including any frontmatter the LLM emitted. The LLM is
given the rendered template (with frontmatter) as input context;
it polishes the body but ALSO echoes the frontmatter — sometimes
with single-character transcription errors in deterministic fields
like `source_hash`. That broke staleness detection: the
frontmatter `source_hash` written into the polished file didn't
match what `compute_source_hash` recomputed on the same source,
leaving the feature permanently "stale" after a successful regen.

Concrete evidence (attune-ai spec-engine, 2026-05-27):

  frontmatter: f8ced22b02899aa25ff409636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
  computed:    f8ced22b02899aa25ff709636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
                                  ^
                                  position 19: LLM wrote f4 instead of f7

The two values differ in a single hex character (0-based position
19) of a 64-char SHA-256 hex digest. The same `compute_source_hash`
function called twice in the same Python process returns identical
values, so the mismatch is pure LLM hallucination of a value it was
supposed to echo verbatim.
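
The mismatch position can be checked mechanically. A minimal
sketch using the two digests quoted above; `first_diff` is a
hypothetical helper for illustration, not part of attune-author:

```python
from typing import Optional


def first_diff(a: str, b: str) -> Optional[int]:
    """Return the 0-based index of the first differing character, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # Equal over the shared prefix: they differ only if the lengths differ.
    return None if len(a) == len(b) else min(len(a), len(b))


# Digests copied from the evidence block in this commit message.
frontmatter = "f8ced22b02899aa25ff409636e659830c6ba856d70de6ddd1a9bf1cbe37a1337"
computed = "f8ced22b02899aa25ff709636e659830c6ba856d70de6ddd1a9bf1cbe37a1337"

pos = first_diff(frontmatter, computed)
mismatches = sum(x != y for x, y in zip(frontmatter, computed))
print(pos, frontmatter[pos], computed[pos], mismatches)
```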

Fix: strip whatever frontmatter the LLM emitted from the polished
content and re-inject the canonical frontmatter from
`entry.rendered_content`. The LLM polishes the BODY; the
frontmatter (especially `source_hash`, `generated_at`, `feature`,
`depth`, `name`, `status`, `type`) is non-negotiable deterministic
metadata that must survive the polish step exactly.

Implementation:

- New `_replace_polished_frontmatter(polished, canonical_source)`
  helper in `generator.py`. Uses a frontmatter regex to extract
  the canonical block, strip whatever the LLM emitted, and
  re-assemble.
- `apply_polish_results` now calls the helper for every depth
  with a polished result. Lenient-mode failures (depth missing
  from `polished_by_depth`) fall through to the raw rendered
  template, which already has correct frontmatter.
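
The whole-block approach described above can be sketched roughly
as follows. This is an illustration under assumptions, not the
shipped helper: `replace_frontmatter_whole` is a hypothetical
name, the sample documents are made up, and the pattern is the
same as `_FRONTMATTER_RE` in the `generator.py` diff.

```python
import re

# Frontmatter block at the very start of the document, delimiters included.
FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)


def replace_frontmatter_whole(polished: str, canonical_source: str) -> str:
    """Swap whatever frontmatter `polished` carries for the canonical block."""
    canonical = FRONTMATTER_RE.match(canonical_source)
    if canonical is None:
        # No canonical block to re-inject: leave the polished text alone.
        return polished
    emitted = FRONTMATTER_RE.match(polished)
    # Drop the LLM's block if it emitted one; otherwise keep the whole text.
    body = polished[emitted.end():] if emitted else polished
    return canonical.group(0) + body


# Made-up sample data: the polished copy has a perturbed source_hash.
rendered = "---\nfeature: spec-engine\nsource_hash: abc7\n---\n# Title\n\nraw body\n"
polished = "---\nfeature: spec-engine\nsource_hash: abc4\n---\n# Title\n\npolished body\n"

result = replace_frontmatter_whole(polished, rendered)
print(result)
```

In this form any extra field that only the polished output carries
is discarded along with the rest of its frontmatter block.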

Tests: 7 new regression tests covering the corrupted-hash case,
LLM-stripped frontmatter, LLM-correct frontmatter, no-canonical-
frontmatter defensive path, and body-whitespace preservation —
plus two behavioral tests asserting `apply_polish_results` writes
the canonical hash to disk regardless of LLM perturbation.

Spec doc (`docs/specs/regen-staleness-hash-mismatch/decisions.md`)
updated with the verified root cause replacing the original
budget-truncation hypothesis (which was wrong — `compute_source_hash`
runs only once and is fully deterministic; the divergence lives
entirely in the polish step's output mutation).

Unblocks attune-gui Phase 2 (`living-docs-regen-automation`) which
needed `attune-author status --dry-run` to reach a fixed point
after regen.

Local: 173/173 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fixup: field-level frontmatter merge to preserve polish: skipped

The previous whole-block replacement broke 3 golden snapshot tests
that asserted on the `polish: skipped` frontmatter marker which
the lenient-mode polish failure path adds via
`_mark_polish_skipped`. My whole-block replace discarded that
marker.

Switching to field-level merge:
- DETERMINISTIC fields (type, name, feature, depth, generated_at,
  source_hash, status): canonical from rendered template wins.
- All OTHER fields (polish: skipped, future markers): polished
  output preserved as-emitted.

Implementation: parse both frontmatter blocks line-by-line, walk
polished's lines, swap deterministic-keyed lines with canonical's
version, keep everything else. Append any deterministic canonical
fields the polished output dropped entirely.
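
A compact sketch of that merge, for illustration only.
`key_of` and `merge_frontmatter` are hypothetical names, the
sample values (including `depth: concept`) are made up, and this
version assumes flat `key: value` lines: anything indented or
containing whitespace before the colon is treated as a non-key
line and preserved as-is.

```python
import re

FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
DETERMINISTIC = frozenset(
    {"type", "name", "feature", "depth", "generated_at", "source_hash", "status"}
)


def key_of(line: str) -> str:
    """Return the key of a flat `key: value` line, or "" for anything else."""
    key, sep, _ = line.partition(":")
    if sep and key and not any(c.isspace() for c in key):
        return key
    return ""


def merge_frontmatter(polished: str, canonical_source: str) -> str:
    canonical = FRONTMATTER_RE.match(canonical_source)
    if canonical is None:
        return polished
    emitted = FRONTMATTER_RE.match(polished)
    if emitted is None:
        # The polished text has no frontmatter: prepend the canonical block.
        return canonical.group(0) + polished

    canonical_lines = canonical.group(1).splitlines()
    canonical_by_key = {key_of(ln): ln for ln in canonical_lines if key_of(ln)}

    merged, seen = [], set()
    for line in emitted.group(1).splitlines():
        key = key_of(line)
        if key in DETERMINISTIC:
            # Canonical wins; drop the line if canonical lacks the key.
            if key in canonical_by_key:
                merged.append(canonical_by_key[key])
                seen.add(key)
        else:
            merged.append(line)  # e.g. `polish: skipped`
    # Restore deterministic fields the polished output dropped.
    for line in canonical_lines:
        key = key_of(line)
        if key in DETERMINISTIC and key not in seen:
            merged.append(line)
    return "---\n" + "\n".join(merged) + "\n---\n" + polished[emitted.end():]


rendered = "---\nfeature: spec-engine\ndepth: concept\nsource_hash: abc7\n---\nraw body\n"
polished = "---\nfeature: spec-engine\nsource_hash: abc4\npolish: skipped\n---\npolished body\n"
result = merge_frontmatter(polished, rendered)
print(result)
```

Here the perturbed `source_hash` is overwritten, `polish: skipped`
survives, and the dropped `depth` line is appended at the end of
the block.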

Tests added:
- test_polish_skipped_marker_preserved — regression on the lenient
  failure path. Exercises both the deterministic-field override
  AND the marker preservation in one assertion.
- test_unknown_non_deterministic_field_preserved — forward-compat
  for future polish-layer fields.

Local: 979/979 unit tests pass; 15/15 in this file's slice +
3/3 golden snapshots restored. End-to-end re-verified on attune-ai
spec-engine: 11/11 regenerated templates have canonical
source_hash matching compute_source_hash.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
diff --git a/docs/specs/regen-staleness-hash-mismatch/decisions.md b/docs/specs/regen-staleness-hash-mismatch/decisions.md
@@ -1,9 +1,86 @@
 # Decisions — Regen / staleness hash mismatch
 
-**Status:** draft — bug confirmed externally, fix not yet scoped.
+**Status:** root cause confirmed 2026-05-27 — original hypothesis (budget truncation of hash inputs) was wrong; actual cause is LLM-polished frontmatter laundering. Fix direction concrete. Implementation TBD.
 **Owner:** Patrick
 **Filed:** 2026-05-25 (handoff from attune-gui Phase 2 blockers; see [attune-gui docs/specs/living-docs-regen-automation/decisions.md](https://github.com/Smart-AI-Memory/attune-gui/blob/main/docs/specs/living-docs-regen-automation/decisions.md#phase-2-blockers-discovered-2026-05-23))
 
+## Root cause (verified 2026-05-27)
+
+The original hypothesis (budget-truncated hash input) was wrong.
+`compute_source_hash` is called exactly ONCE at
+`generator.py:355` (inside `prepare_polish_phase`) and produces
+a deterministic value off the FULL source set. Verified by
+calling it twice in a row — idempotent. The `source_hash`
+variable flows through to `_render_template` at line 1452
+which writes it into the rendered template's frontmatter
+correctly.
+
+**The actual bug is in `apply_polish_results` at
+`generator.py:468`:**
+
+```python
+final_content = polished_by_depth.get(entry.depth, entry.rendered_content)
+...
+entry.out_path.write_text(final_content, encoding="utf-8")
+```
+
+When LLM polish ran, the polished content REPLACES the rendered
+template **including the frontmatter the LLM regenerated as part
+of its output**. The LLM is given the rendered template (with
+correct frontmatter) as input context, polishes the body, and
+returns the whole document — but its emitted frontmatter has a
+single-character transcription error in the `source_hash` field.
+
+**Reproducible evidence (attune-ai spec-engine, 2026-05-27):**
+
+```
+frontmatter source_hash: f8ced22b02899aa25ff409636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
+computed source_hash:    f8ced22b02899aa25ff709636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
+                                            ^
+                                            position 19: f4 vs f7
+```
+
+Single-char difference at hex character 19 of a 64-char SHA-256 hex
+digest. Pure LLM hallucination of the value it was supposed to
+echo verbatim. Same `compute_source_hash` function called twice
+in the same Python process returns identical values; the
+divergence is solely between "what was hashed and written into
+the prompt" and "what the LLM emitted as its frontmatter copy."
+
+## Confirmed fix direction
+
+Strip frontmatter from `final_content` after polish and
+re-inject the canonical frontmatter from
+`entry.rendered_content`. The LLM polishes the BODY; the
+frontmatter (especially `source_hash`, `generated_at`,
+`feature`, `depth`, `name`) is non-negotiable deterministic
+metadata that must survive the polish step exactly.
+
+**Sketch (in `apply_polish_results` around line 468):**
+
+```python
+final_content = polished_by_depth.get(entry.depth, entry.rendered_content)
+if entry.depth in polished_by_depth:
+    # The LLM may have perturbed the frontmatter — re-inject
+    # the canonical one from the rendered template.
+    final_content = _replace_frontmatter(
+        polished_body=final_content,
+        canonical_frontmatter=_extract_frontmatter(entry.rendered_content),
+    )
+```
+
+Where `_extract_frontmatter` returns the `---\n...\n---\n`
+prefix from `entry.rendered_content`, and `_replace_frontmatter`
+strips whatever frontmatter the LLM produced and prepends the
+canonical one. Both can use `_FRONTMATTER_RE` from
+`staleness.py` (or a local equivalent).
+
+Even better long-term: send the LLM the body only (strip
+frontmatter from its input context), have it return the body
+only, and assemble the final document deterministically. Bigger
+refactor but eliminates the "did the LLM accidentally edit
+metadata" failure mode entirely.
+
 ## Problem
 
 Running `attune-author regenerate` writes a new `source_hash` value to a
diff --git a/src/attune_author/generator.py b/src/attune_author/generator.py
@@ -15,6 +15,7 @@
 import ast
 import logging
 import os
+import re
 from concurrent.futures import ThreadPoolExecutor, as_completed
 from dataclasses import dataclass, field
 from datetime import datetime, timezone
@@ -431,6 +432,131 @@ def prepare_polish_phase(
     )
 
 
+_FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
+"""Matches a YAML frontmatter block at the start of a markdown
+document, capturing the body between the ``---`` delimiters.
+Includes the closing ``---\\n`` in the match so the body starts
+at the next character after the match end."""
+
+
+#: Frontmatter fields that are DETERMINISTIC — computed from
+#: source and not for the LLM (or polish-layer) to mutate. These
+#: come from the rendered template and override whatever the
+#: polish output contains.
+_DETERMINISTIC_FRONTMATTER_FIELDS = frozenset(
+    {
+        "type",
+        "name",
+        "feature",
+        "depth",
+        "generated_at",
+        "source_hash",
+        "status",
+    }
+)
+
+
+def _parse_frontmatter_lines(block: str) -> list[tuple[str, str]]:
+    """Parse a YAML frontmatter block into (key, line) pairs in order.
+
+    The block is the captured group from ``_FRONTMATTER_RE``,
+    i.e. the YAML body without the ``---`` delimiters. Each line
+    is returned as the (key, whole-line) tuple. Lines that don't
+    match the ``key: ...`` shape (e.g. multi-line YAML values, or
+    structural lines) are returned with key ``""`` so the caller
+    can decide whether to include them.
+    """
+    out: list[tuple[str, str]] = []
+    for line in block.splitlines():
+        stripped = line.lstrip()
+        if not stripped or stripped.startswith("#"):
+            out.append(("", line))
+            continue
+        key, sep, _ = stripped.partition(":")
+        if sep and " " not in key and "\t" not in key:
+            out.append((key.strip(), line))
+        else:
+            out.append(("", line))
+    return out
+
+
+def _replace_polished_frontmatter(polished: str, canonical_source: str) -> str:
+    """Re-inject deterministic frontmatter fields from canonical source.
+
+    The polish LLM is given the rendered template (with frontmatter)
+    as input context and asked to improve the body. Empirically, the
+    LLM also echoes the frontmatter in its output — sometimes with
+    single-character transcription errors in deterministic fields
+    like ``source_hash``. That broke staleness detection: the
+    frontmatter ``source_hash`` written into the polished file
+    didn't match what ``compute_source_hash`` recomputed on the
+    same source, leaving the feature permanently "stale" after a
+    successful regen.
+
+    Approach: field-level merge. For deterministic fields
+    (:data:`_DETERMINISTIC_FRONTMATTER_FIELDS`), the canonical
+    value from the rendered template wins. For any other field
+    (e.g. ``polish: skipped`` added by the lenient-mode polish
+    failure path in :func:`attune_author.polish._mark_polish_skipped`),
+    the polish output's value is preserved.
+
+    Edge cases:
+
+    - Polished has no frontmatter (LLM stripped it): prepend the
+      canonical block as-is.
+    - Canonical has no frontmatter (shouldn't happen in practice
+      since rendered templates always have one): return polished
+      untouched.
+
+    See ``docs/specs/regen-staleness-hash-mismatch/decisions.md``
+    for the full diagnosis.
+    """
+    canonical_match = _FRONTMATTER_RE.match(canonical_source)
+    if canonical_match is None:
+        # Defensive: rendered templates always have frontmatter.
+        return polished
+
+    polished_match = _FRONTMATTER_RE.match(polished)
+    if polished_match is None:
+        # LLM stripped the frontmatter entirely. Prepend canonical
+        # block and return.
+        return canonical_match.group(0) + polished
+
+    canonical_lines = _parse_frontmatter_lines(canonical_match.group(1))
+    polished_lines = _parse_frontmatter_lines(polished_match.group(1))
+
+    canonical_by_key: dict[str, str] = {k: line for k, line in canonical_lines if k}
+
+    merged: list[str] = []
+    seen_deterministic: set[str] = set()
+    for key, line in polished_lines:
+        if key in _DETERMINISTIC_FRONTMATTER_FIELDS:
+            # Override with canonical's line for this deterministic
+            # field. If canonical lacks the key (very unusual),
+            # drop the polished version too — better silence than
+            # propagating a possibly-perturbed value.
+            canonical_line = canonical_by_key.get(key)
+            if canonical_line is not None:
+                merged.append(canonical_line)
+                seen_deterministic.add(key)
+        else:
+            # Non-deterministic field (e.g. polish: skipped marker)
+            # OR a structural / comment line. Preserve as the polish
+            # layer emitted it.
+            merged.append(line)
+
+    # Append any deterministic canonical fields the polish output
+    # was missing (e.g. LLM dropped a line entirely). Preserves the
+    # invariant that the canonical's deterministic fields are
+    # always present in the result.
+    for key, line in canonical_lines:
+        if key and key in _DETERMINISTIC_FRONTMATTER_FIELDS and key not in seen_deterministic:
+            merged.append(line)
+
+    body = polished[polished_match.end() :]
+    return "---\n" + "\n".join(merged) + "\n---\n" + body
+
+
 def apply_polish_results(
     prep: PolishPreparation,
     polished_by_depth: dict[str, str],
@@ -465,7 +591,21 @@ def apply_polish_results(
     project_root = Path.cwd()
     absolute_sources = [project_root / rel_path for rel_path in prep.matched_files]
     for entry in prep.pending:
-        final_content = polished_by_depth.get(entry.depth, entry.rendered_content)
+        if entry.depth in polished_by_depth:
+            # Polish ran: take the polished body but re-inject the
+            # canonical frontmatter. The LLM occasionally transcribes
+            # deterministic fields (notably source_hash) with single-
+            # character errors, which permanently breaks staleness
+            # detection. See _replace_polished_frontmatter docstring.
+            final_content = _replace_polished_frontmatter(
+                polished=polished_by_depth[entry.depth],
+                canonical_source=entry.rendered_content,
+            )
+        else:
+            # Polish skipped (e.g. lenient-mode failure) — use the
+            # raw rendered template, which already has correct
+            # frontmatter.
+            final_content = entry.rendered_content
         # Phase 4: strip `# attune-author: skip-mypy` directives from
         # tutorial code fences so they don't ship to readers. Other
         # template kinds are untouched.
diff --git a/tests/unit/test_polished_frontmatter_reinjection.py b/tests/unit/test_polished_frontmatter_reinjection.py