Commit 1b1c7c5
fix: prevent LLM polish from laundering source_hash frontmatter (#48)
* fix: prevent LLM polish from laundering source_hash frontmatter
Root cause: `apply_polish_results` wrote the LLM's polished output
verbatim, including any frontmatter the LLM emitted. The LLM is
given the rendered template (with frontmatter) as input context;
it polishes the body but ALSO echoes the frontmatter — sometimes
with single-character transcription errors in deterministic fields
like `source_hash`. That broke staleness detection: the
frontmatter `source_hash` written into the polished file didn't
match what `compute_source_hash` recomputed on the same source,
leaving the feature permanently "stale" after a successful regen.
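A minimal sketch of why one wrong hex character keeps a file
"stale", assuming the check compares the frontmatter `source_hash`
with a freshly computed digest. `sketch_source_hash` and `is_stale`
are hypothetical names, and SHA-256 over the raw source text is an
assumption, not the actual `compute_source_hash` implementation:

```python
import hashlib
import re

# Hypothetical stand-in for compute_source_hash: SHA-256 over the source text.
def sketch_source_hash(source_text: str) -> str:
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()

# Matches a `source_hash: <64 hex chars>` line anywhere in the document.
_HASH_LINE = re.compile(r"^source_hash:\s*([0-9a-f]{64})\s*$", re.MULTILINE)

def is_stale(generated_doc: str, source_text: str) -> bool:
    """True when the recorded hash is missing or differs from the recomputed one."""
    match = _HASH_LINE.search(generated_doc)
    if match is None:
        return True
    return match.group(1) != sketch_source_hash(source_text)

source = "def f():\n    return 1\n"
good = sketch_source_hash(source)
# Flip one hex character, as a transcription error in the polish step would.
bad = good[:19] + ("0" if good[19] != "0" else "1") + good[20:]

fresh_doc = f"---\nsource_hash: {good}\n---\nBody\n"
laundered_doc = f"---\nsource_hash: {bad}\n---\nBody\n"
```

Under these assumptions `is_stale(fresh_doc, source)` is false and
`is_stale(laundered_doc, source)` is true, and no amount of
regenerating fixes the second case while the polish step keeps
rewriting the hash.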
Concrete evidence (attune-ai spec-engine, 2026-05-27):
frontmatter: f8ced22b02899aa25ff409636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
computed:    f8ced22b02899aa25ff709636e659830c6ba856d70de6ddd1a9bf1cbe37a1337
                                ^
position 19 (0-based): LLM wrote f4 instead of f7
A single hex character differs in the 64-char SHA-256 hex digest.
`compute_source_hash` returns identical values when called twice in
the same Python process, so the hash computation is not the source
of the divergence: the LLM mis-transcribed a value it was supposed
to echo verbatim.
Fix: strip whatever frontmatter the LLM emitted from the polished
content and re-inject the canonical frontmatter from
`entry.rendered_content`. The LLM polishes the BODY; the
frontmatter (especially `source_hash`, `generated_at`, `feature`,
`depth`, `name`, `status`, `type`) is non-negotiable deterministic
metadata that must survive the polish step exactly.
Implementation:
- New `_replace_polished_frontmatter(polished, canonical_source)`
helper in `generator.py`. Uses a frontmatter regex to extract
the canonical block, strip whatever the LLM emitted, and
re-assemble.
- `apply_polish_results` now calls the helper for every depth
with a polished result. Lenient-mode failures (depth missing
from `polished_by_depth`) fall through to the raw rendered
template, which already has correct frontmatter.
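A rough sketch of the whole-block replacement described in this
list. The regex, the `---` fence shape, and the function name
`replace_polished_frontmatter_sketch` are assumptions for
illustration; the real helper in `generator.py` may differ:

```python
import re

# Assumed frontmatter shape: a leading block fenced by '---' lines.
_FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)

def replace_polished_frontmatter_sketch(polished: str, canonical_source: str) -> str:
    """Swap whatever frontmatter the LLM emitted for the canonical block."""
    canonical = _FRONTMATTER_RE.match(canonical_source)
    if canonical is None:
        # Defensive path: nothing canonical to re-inject, leave output alone.
        return polished
    emitted = _FRONTMATTER_RE.match(polished)
    # Keep the polished body byte-for-byte, including leading whitespace.
    body = polished[emitted.end():] if emitted else polished
    return canonical.group(0) + body

canonical_doc = "---\nsource_hash: abc7\ndepth: concept\n---\n\n# Title\nraw body\n"
polished_doc = "---\nsource_hash: abc4\ndepth: concept\n---\n\n# Title\npolished body\n"
result = replace_polished_frontmatter_sketch(polished_doc, canonical_doc)
```

Here `result` carries the canonical `source_hash: abc7` above the
polished body, whether the LLM corrupted, echoed, or dropped the
frontmatter.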
Tests: 7 new regression tests covering the corrupted-hash case,
LLM-stripped frontmatter, LLM-correct frontmatter, no-canonical-
frontmatter defensive path, and body-whitespace preservation —
plus two behavioral tests asserting `apply_polish_results` writes
the canonical hash to disk regardless of LLM perturbation.
Spec doc (`docs/specs/regen-staleness-hash-mismatch/decisions.md`)
updated with the verified root cause replacing the original
budget-truncation hypothesis (which was wrong — `compute_source_hash`
runs only once and is fully deterministic; the divergence lives
entirely in the polish step's output mutation).
Unblocks attune-gui Phase 2 (`living-docs-regen-automation`) which
needed `attune-author status --dry-run` to reach a fixed point
after regen.
Local: 173/173 unit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fixup: field-level frontmatter merge to preserve polish: skipped
The previous whole-block replacement broke 3 golden snapshot tests
that asserted on the `polish: skipped` frontmatter marker which
the lenient-mode polish failure path adds via
`_mark_polish_skipped`. My whole-block replace discarded that
marker.
Switching to field-level merge:
- DETERMINISTIC fields (type, name, feature, depth, generated_at,
source_hash, status): canonical from rendered template wins.
- All OTHER fields (polish: skipped, future markers): polished
output preserved as-emitted.
Implementation: parse both frontmatter blocks line-by-line, walk
polished's lines, swap deterministic-keyed lines with canonical's
version, keep everything else. Append any deterministic canonical
fields the polished output dropped entirely.
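The field-level merge above can be sketched as follows. The key
list comes from this message; the regex, the simple `key: value`
line parsing, and the name `merge_frontmatter_sketch` are
assumptions, as is keeping a deterministic-keyed polished line when
the canonical block has no value for it:

```python
import re

DETERMINISTIC_KEYS = (
    "type", "name", "feature", "depth", "generated_at", "source_hash", "status",
)
# Assumed frontmatter shape: a leading block fenced by '---' lines.
_FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)

def _key_of(line: str) -> str:
    """Text before the first ':' (the whole line if there is no ':')."""
    return line.split(":", 1)[0].strip()

def merge_frontmatter_sketch(polished: str, canonical_source: str) -> str:
    canonical = _FRONTMATTER_RE.match(canonical_source)
    if canonical is None:
        return polished  # nothing canonical to enforce
    canonical_lines = {
        _key_of(line): line
        for line in canonical.group(1).splitlines()
        if _key_of(line) in DETERMINISTIC_KEYS
    }
    emitted = _FRONTMATTER_RE.match(polished)
    emitted_lines = emitted.group(1).splitlines() if emitted else []
    body = polished[emitted.end():] if emitted else polished

    merged, seen = [], set()
    for line in emitted_lines:
        key = _key_of(line)
        if key in DETERMINISTIC_KEYS:
            if key in seen:
                continue  # drop duplicated deterministic keys
            seen.add(key)
            # Canonical wins; fall back to the emitted line (assumption).
            merged.append(canonical_lines.get(key, line))
        else:
            merged.append(line)  # e.g. `polish: skipped` survives as emitted
    # Append deterministic fields the polished output dropped entirely.
    merged.extend(l for k, l in canonical_lines.items() if k not in seen)
    return "---\n" + "\n".join(merged) + "\n---\n" + body

canonical_doc = "---\ntype: concept\nsource_hash: abc7\ndepth: concept\n---\nraw\n"
polished_doc = "---\ntype: concept\nsource_hash: abc4\npolish: skipped\n---\nbody\n"
merged_doc = merge_frontmatter_sketch(polished_doc, canonical_doc)
```

In this example `merged_doc` keeps `polish: skipped`, restores
`source_hash: abc7`, and re-appends the dropped `depth` field.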
Tests added:
- test_polish_skipped_marker_preserved — regression on the lenient
failure path. Exercises both the deterministic-field override
AND the marker preservation in one assertion.
- test_unknown_non_deterministic_field_preserved — forward-compat
for future polish-layer fields.
Local: 979/979 unit tests pass; 15/15 in this file's slice +
3/3 golden snapshots restored. End-to-end re-verified on attune-ai
spec-engine: 11/11 regenerated templates have canonical
source_hash matching compute_source_hash.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 1cc27f8 commit 1b1c7c5
3 files changed
Lines changed: 491 additions & 2 deletions
File tree
- docs/specs/regen-staleness-hash-mismatch
- src/attune_author
- tests/unit