Commit e2a043a

docs(specs): handoff for regen/staleness source_hash mismatch (#41)
Phase 2 of attune-gui's living-docs-regen-automation (Smart-AI-Memory/attune-gui#62) surfaced an attune-author bug: regenerate writes a source_hash that the immediately-following status check disagrees with. Likely cause: regen hashes a budget-truncated source view while staleness hashes the full set.

This decisions.md captures the handoff (repro, code pointers, fix directions to choose between) so the next session picking this up starts with the architecture context already laid out. No code fix yet; that needs Phase 1 (failing test) first.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 29efed9 commit e2a043a

1 file changed

Lines changed: 104 additions & 0 deletions

@@ -0,0 +1,104 @@
# Decisions: Regen / staleness hash mismatch

**Status:** draft (bug confirmed externally, fix not yet scoped).
**Owner:** Patrick
**Filed:** 2026-05-25 (handoff from attune-gui Phase 2 blockers; see [attune-gui docs/specs/living-docs-regen-automation/decisions.md](https://github.com/Smart-AI-Memory/attune-gui/blob/main/docs/specs/living-docs-regen-automation/decisions.md#phase-2-blockers-discovered-2026-05-23))

## Problem

Running `attune-author regenerate` writes a new `source_hash` value to a
template's YAML frontmatter, but immediately running
`attune-author regenerate --dry-run` (or `status`) on the same feature
**still reports it as stale**. The loop never reaches a fixed point, so
every PR using the new attune-gui CI `fail-if-stale` gate would fail forever.

Discovered while implementing Phase 2 of attune-gui's
`living-docs-regen-automation` spec. Phase 2 (CI fail-if-stale)
is **parked** until this bug is fixed and attune-gui can pin a new
attune-author release.

## Hypothesis

Two different hash *inputs* (not algorithms) are being compared:

- The **staleness check** path (e.g. `attune-author status`, `compute_source_hash` /
  `compute_semantic_hash` in [src/attune_author/staleness.py](../../../src/attune_author/staleness.py))
  hashes the *full* set of source files matched by the feature glob.
- The **regenerate-write** path appears to hash a **budget-truncated** view
  of the same source, as evidenced by
  `ground_truth.budget: dropped X to fit budget` log lines emitted during
  regen. Whatever ends up in the frontmatter's `source_hash` field is
  therefore not the same value the staleness check later recomputes from disk.

Two consequences:

1. For any feature whose source set exceeds the budget, the frontmatter hash
   *cannot* match what `status` computes → permanent staleness.
2. Even if a contributor regenerates, the `source_hash` in the artifact
   is a hash of "what fit in the LLM context window," which is not a
   semantically useful fingerprint for "is this artifact aligned with
   the source."
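
The mismatch described above can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in: `hash_sources`, `truncate_to_budget`, the character-based budget, and the SHA-256 choice are assumptions for illustration, not attune-author's actual code or hashing scheme.

```python
import hashlib


def hash_sources(sources: dict[str, str]) -> str:
    """Hash a path -> content mapping in a stable (sorted) order.

    Hypothetical stand-in for a source fingerprint; not attune-author's code.
    """
    digest = hashlib.sha256()
    for path in sorted(sources):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(sources[path].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def truncate_to_budget(sources: dict[str, str], budget_chars: int) -> dict[str, str]:
    """Keep files (in sorted order) that still fit the character budget.

    Assumed behaviour, loosely modelled on the "dropped X to fit budget" logs.
    """
    kept: dict[str, str] = {}
    used = 0
    for path in sorted(sources):
        size = len(sources[path])
        if used + size > budget_chars:
            continue  # this file is dropped from the budgeted view
        kept[path] = sources[path]
        used += size
    return kept


sources = {
    "pkg/a.py": "def a():\n    return 1\n",
    "pkg/b.py": "def b():\n    return 2\n" * 50,
}

full_hash = hash_sources(sources)  # what a staleness check would see
truncated_hash = hash_sources(truncate_to_budget(sources, budget_chars=100))
roomy_hash = hash_sources(truncate_to_budget(sources, budget_chars=10_000))

print(full_hash == truncated_hash)  # False: a file was dropped, inputs differ
print(full_hash == roomy_hash)      # True: nothing dropped, inputs identical
```

If the hypothesis holds, this would also explain why the bug only shows up for features large enough to trigger the budget: small features hash identical inputs on both paths.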

## Verification needed

Before designing the fix, confirm the hypothesis:

1. **Reproduce.** On any attune-author-managed corpus with at least one
   feature whose source set exceeds the ground-truth budget:
   ```bash
   attune-author regenerate <feature>
   attune-author status <feature>  # or: attune-author regenerate --dry-run <feature>
   ```
   Expected (after fix): `fresh`. Actual: `stale`.

2. **Trace which hash is written.** Find the code path that produces the
   `source_hash` value written to frontmatter during regen. The hash is
   referenced at:
   - [src/attune_author/generator.py:355](../../../src/attune_author/generator.py#L355)
     `compute_source_hash(feature, root)` call
   - [src/attune_author/generator.py:1452](../../../src/attune_author/generator.py#L1452)
     where `source_hash:` is written into the frontmatter string
   - [src/attune_author/staleness.py:211](../../../src/attune_author/staleness.py#L211)
     `compute_source_hash` definition (delegates to `compute_semantic_hash`
     for pure-Python features)

   Look for any *other* hash computation happening after the budget step
   that might be writing to the same field.

3. **Trace which hash is read.** The status/dry-run path reads from
   frontmatter via `_read_frontmatter_value(text, "source_hash")` in
   [src/attune_author/staleness.py](../../../src/attune_author/staleness.py#L249),
   then compares it against a freshly computed `compute_source_hash` of the
   on-disk source. Confirm both sides use the same definition of "source."
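
For orientation while tracing step 3, the read-and-compare shape might look roughly like the sketch below. It is an assumption-laden illustration: `read_frontmatter_value` and `is_stale` are hypothetical helpers written for this note, and the real `_read_frontmatter_value` in `staleness.py` may parse frontmatter differently.

```python
import re
from typing import Optional


def read_frontmatter_value(text: str, key: str) -> Optional[str]:
    """Return `key` from a leading '---' frontmatter block, or None.

    Hypothetical helper; handles only flat `key: value` lines.
    """
    match = re.match(r"\A---\n(.*?)\n---\n", text, flags=re.DOTALL)
    if match is None:
        return None
    for line in match.group(1).splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            return value.strip().strip("'\"")
    return None


def is_stale(template_text: str, current_hash: str) -> bool:
    """A template is stale when its stored hash differs from the fresh one."""
    stored = read_frontmatter_value(template_text, "source_hash")
    return stored != current_hash


template = "---\nfeature: demo\nsource_hash: abc123\n---\n# Body\n"

print(read_frontmatter_value(template, "source_hash"))  # abc123
print(is_stale(template, "abc123"))  # False
print(is_stale(template, "def456"))  # True
```

The comparison itself is trivial, which is why the interesting question in this step is what `current_hash` was computed from on each side.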

## Fix directions (not yet chosen)

| Option | Pro | Con |
|---|---|---|
| **Always write the full-set hash** to frontmatter (regardless of what the LLM sees). | Single source of truth. Staleness check works against the fingerprint that actually represents the source. | Requires touching whichever step currently overwrites `source_hash` with a budget-truncated value. |
| **Always read the budget-truncated view on the status side** too. | Symmetric. | The budget can change between runs (model swap, prompt edits). Yesterday's hash matches today's source only by coincidence. Worse semantics. |
| **Stop hashing the source in regen entirely**; let the staleness check own the hash. After regen, run staleness check to compute and write. | Conceptually clean. One hash, one writer. | Two-pass write; second pass mutates the just-written file. |

The first option is most likely correct, but verification step 2 above must
confirm *where* the wrong hash gets written before choosing.
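
As a rough illustration of the first option, the sketch below fingerprints the full source set before any budgeting happens, so the budget can only affect what the LLM sees and never the stored hash. All names (`fingerprint`, `regenerate`, `status`), the per-file size filter, and the frontmatter layout are hypothetical and chosen for brevity; this is not the attune-author implementation.

```python
import hashlib


def fingerprint(sources: dict[str, str]) -> str:
    """Stable hash over the full path -> content mapping (assumed scheme)."""
    digest = hashlib.sha256()
    for path in sorted(sources):
        digest.update(f"{path}\0{sources[path]}\0".encode("utf-8"))
    return digest.hexdigest()


def regenerate(sources: dict[str, str], budget_chars: int) -> str:
    """Write a template whose source_hash ignores the budget entirely."""
    # Fingerprint first, from the full set, so truncation cannot influence it.
    source_hash = fingerprint(sources)
    # Hypothetical budgeting: only files small enough reach the prompt.
    prompt_view = {p: s for p, s in sources.items() if len(s) <= budget_chars}
    body = f"(generated from {len(prompt_view)} of {len(sources)} files)"
    return f"---\nsource_hash: {source_hash}\n---\n{body}\n"


def status(template: str, sources: dict[str, str]) -> str:
    """Compare the stored hash with a fresh full-set fingerprint."""
    # Assumes the template contains a source_hash line; a sketch, not a parser.
    stored = template.split("source_hash: ", 1)[1].split("\n", 1)[0]
    return "fresh" if stored == fingerprint(sources) else "stale"


sources = {"a.py": "x = 1\n", "b.py": "y = 2\n" * 500}

# Regenerate-then-status reaches a fixed point regardless of the budget.
tight = regenerate(sources, budget_chars=10)
roomy = regenerate(sources, budget_chars=10_000)
print(status(tight, sources), status(roomy, sources))  # fresh fresh

# A real source edit still flips the result to stale.
edited = dict(sources, **{"a.py": "x = 2\n"})
print(status(tight, edited))  # stale
```

The property worth carrying into a Phase 1 test is the fixed point: regenerate followed immediately by status should report `fresh` for any budget.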

## Out of scope (for this spec)

- Redesigning the ground-truth budget itself.
- Changing what frontmatter fields are written (`source_hash` stays; the
  semantics of its value change).
- attune-gui Phase 2 design refresh. That lives in
  [attune-gui's spec](https://github.com/Smart-AI-Memory/attune-gui/blob/main/docs/specs/living-docs-regen-automation/decisions.md).
  Once this fix lands and attune-gui can pin a new attune-author release,
  Phase 2 will likely switch from `make regen-all && git diff --exit-code`
  to `attune-author status --dry-run` (no `ANTHROPIC_API_KEY` needed in CI,
  which resolves the policy conflict noted in PR
  [Smart-AI-Memory/attune-gui#62](https://github.com/Smart-AI-Memory/attune-gui/pull/62)).

## Phase outline (when this spec is approved)

- **Phase 1:** Reproduce the bug in a failing unit test (verification step 1
  above, plus a test in `tests/test_staleness.py` or a new file). State the
  expected vs. actual hash values.
- **Phase 2:** Trace and fix. Pick a fix direction based on Phase 1 findings.
- **Phase 3:** Cut release. Bump attune-author (likely 0.14.1 patch). Update
  attune-gui's dependency pin in `pyproject.toml`. Unblock attune-gui Phase 2.
