Commit 97bfcb8

docs(memory): first real curated-memory dogfood — friction entries + find_similar spot-check (#1210)
First real add_finding() captures through the 4 curated NodeTypes (PR #1207), written to ~/.attune/memory/curated_graph.json and receipt-verified from a fresh instance. Logs three friction points (repo-tracked cwd-relative default path; RELATED_TO direction defaults dropping the symmetric read; find_similar signature + threshold=0.5 muting realistic queries) plus the clean fits, and closes the session-starter's find_similar sanity-check item in the recall-eval spec (alive, not query()-dead — default tuning issue).
1 parent 225a756 commit 97bfcb8

2 files changed

Lines changed: 104 additions & 2 deletions

docs/specs/memory-nodetype-friction-log/decisions.md

Lines changed: 80 additions & 2 deletions
@@ -36,5 +36,83 @@ mistake a pre-existing implementation gap for a taxonomy-fit problem.
3636

3737
---
3838

39-
_(no friction entries yet — log starts on first real `add_finding()`
40-
call using one of the 4 new types)_
39+
## 2026-07-01 — First real captures: 4 nodes, one per type
40+
41+
**What was recorded** (real session findings, written to
42+
`~/.attune/memory/curated_graph.json` via the real public API, receipt
43+
verified by reloading from a brand-new `MemoryGraph` instance):
44+
45+
- `PROJECT_CONTEXT` — "PersonalMemory recall is file-backed and
46+
survives process death" (the Run 3 verdict, PR #1209)
47+
- `REFERENCE` — pointer to the canonical benchmark numbers
48+
(`docs/specs/memory-recall-eval/decisions.md`)
49+
- `FEEDBACK` — "prefer real round-trip tests over mocks when touching
50+
attune.memory" (Patrick-endorsed, both #1208 bugs were mock-blind)
51+
- `USER_CONTEXT` — Patrick's stated active priority (2026-07-01): make
52+
the memory system genuinely work for the agent
53+
- Two `RELATED_TO` edges (`REFERENCE → PROJECT_CONTEXT`,
54+
`FEEDBACK → PROJECT_CONTEXT`)
55+
56+
**Clean fits:** the 4-type taxonomy matched all four captures with no
57+
forcing — nothing needed a fifth type, nothing straddled two types.
58+
`status="active"` persisted and reloaded correctly (the #1208 fix
59+
working live). `severity` staying unset read naturally. Tags carried
60+
fine. `workflow=""` for curated nodes worked as documented.
61+
62+
**Friction A — storage location (the biggest one).** `MemoryGraph`'s
63+
default path is `patterns/memory_graph.json`: **cwd-relative and
64+
git-tracked**. Curated *cross-session* memory written through the
65+
default would either get committed into the repo or stranded inside
66+
whichever worktree the session happened to run in (this repo runs many
67+
parallel worktrees). Workaround: explicit
68+
`path=~/.attune/memory/curated_graph.json`. Signal: curated memory
69+
needs a durable default home outside the repo, distinct from
70+
workflow-findings graphs — the current default is shaped for
71+
per-project workflow state, not cross-session memory.
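
The difference between the two locations in Friction A can be sketched with the standard library alone. This is a minimal illustration, not attune's code: `curated_graph_path` and `is_cwd_relative` are hypothetical helper names, and only the two path strings come from the entry above.

```python
from pathlib import Path
from typing import Optional

# Default path quoted in the friction entry; it has no root, so it is
# resolved against the process's current working directory.
REPO_DEFAULT = "patterns/memory_graph.json"


def curated_graph_path(home: Optional[Path] = None) -> Path:
    """Build the curated-graph path under a home directory.

    Hypothetical helper: falls back to the current user's home directory
    when no explicit home is given.
    """
    base = home if home is not None else Path.home()
    return base / ".attune" / "memory" / "curated_graph.json"


def is_cwd_relative(path_str: str) -> bool:
    """True when the path depends on the directory the process runs in."""
    return not Path(path_str).is_absolute()
```

The log does not say whether `MemoryGraph` expands `~` on its own, so passing an already-expanded path like the one built here avoids depending on that behavior.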
72+
73+
**Friction B — `RELATED_TO` direction ergonomics.** The natural
74+
`[[link]]` usage — add an edge, then ask the *target* node for related
75+
nodes — silently returns `[]`: `RELATED_TO` is declared symmetric
76+
(`REVERSE_EDGE_TYPES` maps it to itself) but `add_edge` defaults
77+
`bidirectional=False` and `find_related` defaults
78+
`direction="outgoing"`, so symmetry exists only if the writer or
79+
reader remembers to ask for it. Workaround: `direction="both"` at read
80+
time (verified: returns both linked nodes). Signal: for curated
81+
memory, symmetric edge types should traverse both directions by
82+
default; the current defaults quietly drop half the link graph.
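
The failure mode in Friction B can be reproduced with a toy directed graph. The sketch below is an assumption-laden stand-in, not attune's `MemoryGraph`: `ToyGraph`, `SYMMETRIC_EDGE_TYPES`, and `default_direction` are hypothetical names, and only the parameter names `bidirectional` and `direction` (with their defaults) mirror what the entry describes.

```python
from collections import defaultdict

# Hypothetical stand-in for "edge types declared symmetric".
SYMMETRIC_EDGE_TYPES = {"RELATED_TO"}


class ToyGraph:
    """Toy directed graph that stores each edge once unless told otherwise."""

    def __init__(self):
        self._out = defaultdict(list)  # source -> [target, ...]
        self._in = defaultdict(list)   # target -> [source, ...]

    def add_edge(self, source, target, edge_type, bidirectional=False):
        # edge_type is accepted but not stored; the toy has one edge kind.
        self._out[source].append(target)
        self._in[target].append(source)
        if bidirectional:
            # Also store the reverse edge so outgoing reads see it.
            self._out[target].append(source)
            self._in[source].append(target)

    def find_related(self, node, direction="outgoing"):
        found = []
        if direction in ("outgoing", "both"):
            found += self._out[node]
        if direction in ("incoming", "both"):
            found += self._in[node]
        # Drop duplicates while keeping first-seen order.
        return list(dict.fromkeys(found))


def default_direction(edge_type):
    """One possible default: read symmetric edge types in both directions."""
    return "both" if edge_type in SYMMETRIC_EDGE_TYPES else "outgoing"


g = ToyGraph()
g.add_edge("reference", "project_context", "RELATED_TO")
g.add_edge("feedback", "project_context", "RELATED_TO")

outgoing_only = g.find_related("project_context")
both_ways = g.find_related(
    "project_context", direction=default_direction("RELATED_TO")
)
```

Here `outgoing_only` is empty because both edges point at the target, while `both_ways` contains the two linked nodes. Choosing the direction from the edge type at read time is only one way to act on the signal; storing the reverse edge at write time would be another.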
83+
84+
**Friction C — `find_similar` signature + threshold (also closes the
85+
session-starter's #3 sanity-check).** Two parts. (a) Signature
86+
inconsistency: `find_related` takes a node id, `find_similar` takes a
87+
finding *dict* — first call attempt passed an id and died with
88+
`AttributeError: 'str' object has no attribute 'get'`. (b) The default
89+
`threshold=0.5` over Jaccard word-overlap mutes realistic queries:
90+
"recall benchmark persistence numbers" scored 0.301 against the
91+
project-context node and 0.125 against the reference node — both
92+
filtered out at the default; only a near-verbatim name clears 0.5
93+
(verbatim self-match = 1.0, so this is NOT the `PersonalMemory.query()`
94+
dead-path class — the mechanism works, the default tuning makes it
95+
effectively silent). Workaround: `threshold≈0.25`, or use
96+
`PersonalMemory.query()` for text recall. Cross-referenced in
97+
`docs/specs/memory-recall-eval/decisions.md`.
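
Part (b) of Friction C follows from how Jaccard word overlap behaves on short texts. The sketch below is a simplified toy under stated assumptions: it scores a single text field, applies no type or file bonuses, and takes strings rather than finding dicts, so its scores will not match the 0.301 and 0.125 reported above. The function names and the node text are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase, whitespace-separated words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def find_similar_toy(query, texts, threshold=0.5):
    """Return (text, score) pairs whose score reaches the threshold."""
    scored = [(text, jaccard(query, text)) for text in texts]
    return [(text, score) for text, score in scored if score >= threshold]


query = "recall benchmark persistence numbers"
# Hypothetical node text: shares 3 of the query's 4 words.
node_text = "recall benchmark results and persistence verdict"

score = jaccard(query, node_text)  # 3 shared words / 7 distinct words
at_default = find_similar_toy(query, [node_text])
at_lowered = find_similar_toy(query, [node_text], threshold=0.25)
```

Even with three of four query words present, the score is 3/7 (about 0.43), so `at_default` is empty while `at_lowered` contains the node. Every extra word on either side enlarges the union, which is why only near-verbatim text clears 0.5.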
98+
99+
**Minor:** `EdgeType` lives in `attune.memory.edges`, not
100+
`attune.memory.nodes` (where the curated-memory docstring pointing at
101+
it lives) — cost one failed import.
102+
103+
---
104+
105+
## Adjacent observations (not R1-scope — different subsystem)
106+
107+
- **2026-07-01 — cross-project recall noise in the stash/recall
108+
surface.** A session resume in this repo surfaced "recent findings
109+
from this project" that included precious-metals-conversation
110+
findings ("silver volatility over 20 years", "consider reinvesting
111+
dividends") plus one entry too context-free to act on ("Identity
112+
rewrite may be more done than the memory note implies").
113+
Recency-ranked recall with no topical/project gate pulls whatever
114+
was stashed last, and project attribution leaked across sessions.
115+
This is the stash/recall-hook subsystem, not `add_finding()`, so it
116+
is logged here as adjacent evidence only — but it is the same
117+
product question this spec exists to answer (does recalled memory
118+
read as trustworthy?), and today the answer on that surface was no.

docs/specs/memory-recall-eval/decisions.md

Lines changed: 24 additions & 0 deletions
@@ -130,3 +130,27 @@ a measured fact. The single-process default (`--phase all`) reproduces
130130
the same numbers, so the two methodologies are interchangeable for
131131
future runs; use `--phase persistence` when the change under test
132132
touches serialization or file layout.
133+
134+
## 2026-07-01 — Sanity check: `MemoryGraph.find_similar` (deferred non-goal, spot-checked)
135+
136+
The non-goals deferred `MemoryGraph` recall to a future experiment;
137+
given Run 1 found `PersonalMemory.query()` silently dead, a cheap
138+
spot-check was worth it. Done during the first real curated-memory
139+
captures (see
140+
`docs/specs/memory-nodetype-friction-log/decisions.md`, Friction C):
141+
142+
- **Not the dead-path class.** A verbatim node name self-matches at
143+
score 1.0 — the mechanism (Jaccard word-overlap over
144+
name/description, type/file bonuses) works.
145+
- **But the default `threshold=0.5` mutes realistic queries.** A
146+
natural paraphrase ("recall benchmark persistence numbers") scored
147+
0.301 / 0.125 against nodes it should plausibly find — both filtered
148+
at the default. Callers get `[]` for anything short of near-verbatim
149+
names, which *reads* like the query() bug from the outside.
150+
- **Workaround:** pass `threshold≈0.25`, or use
151+
`PersonalMemory.query()` for text recall.
152+
153+
No full ground-truthed benchmark run for `find_similar` yet — this was
154+
a spot-check, not Run 4. If curated-graph usage grows (friction-log
155+
spec), rerun this corpus's methodology against `find_similar` with a
156+
tuned threshold.
