Commit dca820a

Improve recall cue quality
Adds cue quality lint warnings, gist-first summaries, boundary-aware chunking, and v0.3.1 release metadata.
1 parent 3db80ae, commit dca820a

15 files changed

Lines changed: 327 additions & 44 deletions

CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -2,6 +2,14 @@

 All notable changes to this project will be documented in this file.

+## [0.3.1] - 2026-04-23
+
+- Added impression cue quality warnings to `deja-vu-lint-memory` for sparse, oversized, duplicate, generic, and repeated keyword sets.
+- Updated the default summary generator to preserve decision, rationale, and trigger gist cues instead of only truncating source content.
+- Updated the default chunker to preserve Markdown heading and paragraph boundaries before falling back to hard splitting.
+- Added source tests for cue lint warnings, gist summaries, and boundary-aware chunking.
+- Updated package metadata for the 0.3.1 patch release.
+
 ## [0.3.0] - 2026-04-22

 - Repositioned Deja Vu as a cue-first memory protocol centered on `task cue -> familiarity score -> minimal recall -> durable writeback`.
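
The summary-generator entry in this changelog can be illustrated with a minimal TypeScript sketch. It assumes gist cues appear in the source as `Decision:`, `Rationale:`, and `Trigger:` labeled lines, and that content without those labels falls back to plain truncation. The function names, the label syntax, the ` | ` separator, and the 160-character limit are all hypothetical; none of them are taken from `src/memory/default-summary-generator.ts`, which this page does not show.

```typescript
// Hypothetical sketch of a gist-first summary; not the project's actual implementation.
const GIST_LABELS = ["decision", "rationale", "trigger"];

// Cut text to maxLength characters, marking the cut with "...".
function truncateText(text: string, maxLength: number): string {
  if (text.length <= maxLength) {
    return text;
  }
  return text.slice(0, Math.max(0, maxLength - 3)).replace(/\s+$/, "") + "...";
}

function summarizeGistFirst(content: string, maxLength: number = 160): string {
  const found: { [label: string]: string } = {};

  for (const line of content.split(/\r?\n/)) {
    // Assumed label syntax: "Decision: ...", optionally behind a list marker.
    const match = /^\s*(?:[-*]\s*)?(decision|rationale|trigger)\s*:\s*(.+)$/i.exec(line);
    if (match === null) {
      continue;
    }
    const label = (match[1] || "").toLowerCase();
    if (found[label] === undefined) {
      found[label] = (match[2] || "").trim(); // keep the first value per label
    }
  }

  const parts: string[] = [];
  for (const label of GIST_LABELS) {
    if (found[label] !== undefined) {
      parts.push(label.charAt(0).toUpperCase() + label.slice(1) + ": " + found[label]);
    }
  }

  // Fall back to whitespace-collapsed source text when no labels are present.
  const summary = parts.length > 0 ? parts.join(" | ") : content.replace(/\s+/g, " ").trim();
  return truncateText(summary, maxLength);
}
```

Under these assumptions, a record that contains all three labels yields a summary built only from those lines, so the gist survives even when the surrounding prose is long. Truncation is still applied last, so an overlong rationale can be cut.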

README.md

Lines changed: 6 additions & 0 deletions
@@ -16,6 +16,8 @@ The protocol is packaged as three project-local assets:

 The goal is not to give every agent a heavy runtime. The goal is to give any agent a repeatable discipline for spending almost no tokens until the task proves that deeper memory is useful.

+The current patch line also emphasizes memory quality control: compact cue linting, gist-first summaries, and boundary-aware chunks keep recall routes small instead of merely adding more stored text.
+
 ## What Deja Vu Is

 Deja Vu defines a shared memory behavior for agents working inside one project.
@@ -86,6 +88,7 @@ The canonical layout and field rules are specified in [docs/storage-markdown.md]
 - Use a single-project scope only in MVP: `project:<project-id>`.
 - Recall before substantial work, but follow a strict recall budget.
 - Prefer scripted impression scans first; open summary or detailed records only when needed.
+- Keep impression cues sparse, specific, and linted so the first recall step stays cheap.
 - Write back only durable memory:
 - decisions
 - architecture intent
@@ -163,6 +166,8 @@ The public TypeScript exports remain intact for hosts that want semantic recall.

 `scanImpressions()` performs token-only familiarity scanning and does not load summaries or chunks.

+The default engine helpers preserve low-token recall quality by generating decision/rationale/trigger summaries and by chunking Markdown or paragraph boundaries before falling back to character splits.
+
 ## Examples

 - Protocol-first example: [examples/protocol-project](./examples/protocol-project)
@@ -209,4 +214,5 @@ npm run lint:memory
 - [docs/scripted-recall.md](./docs/scripted-recall.md)
 - [docs/bootstrap-instructions.md](./docs/bootstrap-instructions.md)
 - [docs/project-rules-template.md](./docs/project-rules-template.md)
+- [docs/release-v0.3.1.md](./docs/release-v0.3.1.md)
 - [llms.txt](./llms.txt)
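
The README change describes chunking on Markdown heading and paragraph boundaries before falling back to character splits. The sketch below is one possible reading of that behavior, not the code in `src/memory/default-chunker.ts`. The names `splitIntoBlocks` and `chunkMarkdown`, the 400-character default, and the rule that a heading always closes the current chunk are assumptions made for illustration.

```typescript
// Hypothetical sketch of boundary-aware chunking; not the project's actual chunker.
const HEADING_PATTERN = /^#{1,6}\s/;

// Split Markdown into blocks at blank lines; a heading line always starts a new block.
function splitIntoBlocks(markdown: string): string[] {
  const blocks: string[] = [];
  let current: string[] = [];
  for (const line of markdown.split(/\r?\n/)) {
    const isBlank = line.trim() === "";
    if ((isBlank || HEADING_PATTERN.test(line)) && current.length > 0) {
      blocks.push(current.join("\n"));
      current = [];
    }
    if (!isBlank) {
      current.push(line);
    }
  }
  if (current.length > 0) {
    blocks.push(current.join("\n"));
  }
  return blocks;
}

function chunkMarkdown(markdown: string, maxChars: number = 400): string[] {
  const chunks: string[] = [];
  let buffer = "";
  for (const block of splitIntoBlocks(markdown)) {
    const candidate = buffer === "" ? block : buffer + "\n\n" + block;
    // Close the current chunk at a heading or when the next block would overflow it.
    if (buffer !== "" && (HEADING_PATTERN.test(block) || candidate.length > maxChars)) {
      chunks.push(buffer);
      buffer = "";
    }
    if (block.length > maxChars) {
      // Last resort: hard character split for a single oversized block.
      for (let start = 0; start < block.length; start += maxChars) {
        chunks.push(block.slice(start, start + maxChars));
      }
      continue;
    }
    buffer = buffer === "" ? block : buffer + "\n\n" + block;
  }
  if (buffer !== "") {
    chunks.push(buffer);
  }
  return chunks;
}
```

In this sketch the hard split only runs when a single paragraph exceeds `maxChars`, so ordinary sections are never cut mid-sentence. A real chunker would likely also need to treat fenced code blocks as atomic, which this example does not attempt.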

docs/engine/semantic-engine.md

Lines changed: 2 additions & 0 deletions
@@ -20,6 +20,8 @@ The engine provides:
 - threshold gating
 - scoring and ranking
 - in-memory demo adapters
+- gist-first default summaries
+- Markdown and paragraph boundary-aware default chunking

 ## What it does not do

docs/impression-layer.md

Lines changed: 1 addition & 0 deletions
@@ -77,6 +77,7 @@ Default keyword discipline:
 - prefer nouns, feature names, decision names, and project-specific phrases
 - avoid full sentences
 - use `aliases` only for stable alternate names
+- run the memory linter to catch sparse, oversized, generic, duplicate, or repeated cue sets

 ## Retention Behavior

docs/release-v0.3.1.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Deja Vu v0.3.1
+
+Deja Vu v0.3.1 is a recall-quality patch for the cue-first protocol release.
+
+## Highlights
+
+- `deja-vu-lint-memory` now warns when impression cues are too sparse, too large, duplicated, too generic, or repeated across records.
+- Default engine summaries now preserve gist cues as decision, rationale, and trigger fields when those labels are available.
+- Default engine chunking now respects Markdown headings and paragraph boundaries before using hard character splits.
+- The patch keeps the v0.3 protocol surface unchanged while making low-token recall routes cleaner and less noisy.
+
+## Validation
+
+- `npm run test:src`
+- `npm run lint:memory`
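
The first highlight lists five cue-quality warnings. A rough TypeScript sketch of checks in that spirit follows. The record shape, the thresholds, the generic-word list, and the warning codes are all assumptions for illustration; this commit page does not show the rules in `scripts/dejavu-lint-memory.mjs` or the `impressions.jsonl` schema.

```typescript
// Hypothetical record shape; the real impressions.jsonl schema is not shown in this commit.
interface ImpressionCue {
  id: string;
  keywords: string[];
}

interface CueWarning {
  id: string;
  code: "sparse" | "oversized" | "duplicate" | "generic" | "repeated-set";
  message: string;
}

// Illustrative thresholds and word list (assumptions, not the linter's actual values).
const MIN_KEYWORDS = 3;
const MAX_KEYWORDS = 8;
const MAX_GENERIC_RATIO = 0.5;
const GENERIC_WORDS = ["code", "data", "fix", "misc", "project", "stuff", "update"];

function normalizeKeyword(keyword: string): string {
  return keyword.trim().toLowerCase();
}

function lintCues(records: ImpressionCue[]): CueWarning[] {
  const warnings: CueWarning[] = [];
  // Maps a sorted keyword-set signature to the first record id that used it.
  const firstOwner: { [signature: string]: string } = {};

  for (const record of records) {
    const normalized = record.keywords.map(normalizeKeyword).filter((k) => k.length > 0);
    // Keep only the first occurrence of each keyword.
    const unique = normalized.filter((k, i) => normalized.indexOf(k) === i);

    if (unique.length < MIN_KEYWORDS) {
      warnings.push({ id: record.id, code: "sparse", message: "fewer than " + MIN_KEYWORDS + " keywords" });
    }
    if (unique.length > MAX_KEYWORDS) {
      warnings.push({ id: record.id, code: "oversized", message: "more than " + MAX_KEYWORDS + " keywords" });
    }
    if (unique.length < normalized.length) {
      warnings.push({ id: record.id, code: "duplicate", message: "duplicate keywords inside one record" });
    }
    const genericCount = unique.filter((k) => GENERIC_WORDS.indexOf(k) !== -1).length;
    if (unique.length > 0 && genericCount / unique.length > MAX_GENERIC_RATIO) {
      warnings.push({ id: record.id, code: "generic", message: "too many generic keywords" });
    }

    const signature = unique.slice().sort().join("|");
    if (signature.length > 0 && Object.prototype.hasOwnProperty.call(firstOwner, signature)) {
      warnings.push({ id: record.id, code: "repeated-set", message: "same keyword set as " + firstOwner[signature] });
    } else if (signature.length > 0) {
      firstOwner[signature] = record.id;
    }
  }
  return warnings;
}
```

The checks return warnings rather than throwing, matching the release note's description of these as warnings; whether the real linter treats any of them as errors is not stated here.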

docs/scripted-recall.md

Lines changed: 7 additions & 0 deletions
@@ -20,6 +20,13 @@ The companion linter checks whether the impression index is structurally usable:
 deja-vu-lint-memory
 ```

+The linter also warns about low-quality cue routes that make future recall more expensive:
+
+- too few or too many keywords
+- duplicate keywords inside one record
+- too many generic keywords
+- duplicate keyword sets across records
+
 ## Inputs

 The script reads:

encoding-status.md

Lines changed: 14 additions & 13 deletions
@@ -4,13 +4,13 @@
 | --- | --- | --- |
 | `encoding-status.md` | 編碼正常(新建,UTF-8) | Project-local registry created by agent. |
 | `.gitignore` | 編碼正常(新建,UTF-8) | New text file. |
-| `package.json` | 編碼正常(已檢查) | Rewritten in UTF-8; version bumped to 0.3.0 and package description now uses cue-first positioning. |
-| `package-lock.json` | 編碼正常(已檢查) | Rewritten in UTF-8; package version metadata now matches the 0.3.0 release. |
+| `package.json` | 編碼正常(已檢查) | Updated in UTF-8; version bumped to 0.3.1 with cue-quality package description and keywords. |
+| `package-lock.json` | 編碼正常(已檢查) | Updated in UTF-8; package version metadata now matches the 0.3.1 release. |
 | `tsconfig.json` | 編碼正常(新建,UTF-8) | New text file. |
 | `LICENSE` | 編碼正常(新建,UTF-8) | New text file. |
-| `README.md` | 編碼正常(已檢查) | Rewritten in UTF-8; repo entrypoint now centers cue-first recall, minimum memory files, and recall budget. |
-| `CHANGELOG.md` | 編碼正常(已檢查) | Rewritten in UTF-8; now includes the 0.3.0 cue-first protocol release notes. |
-| `llms.txt` | 編碼正常(已檢查) | Rewritten in UTF-8; AI-readable index now points to cue-first recall and recall budget concepts. |
+| `README.md` | 編碼正常(已檢查) | Updated in UTF-8; overview now includes cue quality control, gist summaries, and boundary-aware chunks. |
+| `CHANGELOG.md` | 編碼正常(已檢查) | Updated in UTF-8; now includes the 0.3.1 recall-quality patch release notes. |
+| `llms.txt` | 編碼正常(已檢查) | Updated in UTF-8; AI-readable index now includes cue quality linting, gist summaries, and boundary-aware chunking. |
 | `docs/architecture.md` | 編碼正常(已檢查) | Rewritten in UTF-8; architecture doc now describes the engine as the optional layer inside a protocol-first product. |
 | `docs/agent-handshake.md` | 編碼正常(已檢查) | Rewritten in UTF-8; handshake now starts from cue-first adoption and recall-budget discipline. |
 | `docs/project-rules-template.md` | 編碼正常(已檢查) | Rewritten in UTF-8; points to canonical AGENTS and memory templates. |
@@ -20,7 +20,7 @@
 | `docs/protocol.md` | 編碼正常(已檢查) | Rewritten in UTF-8; protocol now defines cue-first v0.3, minimum artifacts, and recall budget. |
 | `docs/workflow.md` | 編碼正常(已檢查) | Rewritten in UTF-8; workflow now uses cue-first recall budget and lower-priority event writeback. |
 | `docs/storage-markdown.md` | 編碼正常(已檢查) | Rewritten in UTF-8; storage contract now separates required, recommended, and optional layouts. |
-| `docs/engine/semantic-engine.md` | 編碼正常(新建,UTF-8| New optional engine overview. |
+| `docs/engine/semantic-engine.md` | 編碼正常(已檢查| Updated in UTF-8; engine overview now mentions gist-first summaries and boundary-aware chunking. |
 | `docs/engine/protocol-to-engine.md` | 編碼正常(新建,UTF-8) | New mapping from protocol workflow to semantic engine usage. |
 | `docs/templates/AGENTS.template.md` | 編碼正常(已檢查) | Rewritten in UTF-8; template now uses protocol v0.3, recall budget, and optional index/events rules. |
 | `docs/templates/memory/index.md` | 編碼正常(新建,UTF-8) | New memory index template. |
@@ -35,8 +35,8 @@
 | `src/utils/id.ts` | 編碼正常(新建,UTF-8) | New text file. |
 | `src/utils/math.ts` | 編碼正常(新建,UTF-8) | New text file. |
 | `src/utils/text.ts` | 編碼正常(新建,UTF-8) | New text file. |
-| `src/memory/default-chunker.ts` | 編碼正常(新建,UTF-8| New text file. |
-| `src/memory/default-summary-generator.ts` | 編碼正常(新建,UTF-8| New text file. |
+| `src/memory/default-chunker.ts` | 編碼正常(已檢查) | Updated in UTF-8; default chunking now preserves Markdown and paragraph boundaries before hard splitting. |
+| `src/memory/default-summary-generator.ts` | 編碼正常(已檢查) | Updated in UTF-8; default summaries now preserve decision/rationale/trigger gist cues. |
 | `src/scoring/hybrid-scoring-strategy.ts` | 編碼正常(新建,UTF-8) | New text file. |
 | `src/plugins/mock-embedding-provider.ts` | 編碼正常(新建,UTF-8) | Updated in UTF-8; hybrid token and trigram demo embeddings. |
 | `src/plugins/create-in-memory-engine.ts` | 編碼正常(新建,UTF-8) | New text file. |
@@ -58,15 +58,16 @@
 | `examples/protocol-project/memory/context/project-context.md` | 編碼正常(新建,UTF-8) | New example project context. |
 | `examples/protocol-project/memory/decisions/protocol-first-positioning.md` | 編碼正常(新建,UTF-8) | New example decision record. |
 | `examples/protocol-project/memory/open-loops/add-engine-later.md` | 編碼正常(新建,UTF-8) | New example open-loop record. |
-| `tests/semantic-recall-engine.test.ts` | 編碼正常(已檢查) | Checked in UTF-8; source tests now run through the updated Node 24-compatible test script. |
-| `docs/impression-layer.md` | 編碼正常(已檢查) | Rewritten in UTF-8; impression layer now anchors cue-first token spending and keyword discipline. |
-| `docs/scripted-recall.md` | 編碼正常(已檢查) | Rewritten in UTF-8; script contract now requires only summary, impressions, and scanner for bootstrap. |
+| `tests/semantic-recall-engine.test.ts` | 編碼正常(已檢查) | Updated in UTF-8; source tests now cover gist summaries and boundary-aware chunking. |
+| `docs/impression-layer.md` | 編碼正常(已檢查) | Updated in UTF-8; keyword discipline now points to linter checks for low-quality cue routes. |
+| `docs/scripted-recall.md` | 編碼正常(已檢查) | Updated in UTF-8; linter docs now describe low-quality cue warnings. |
 | `docs/release-v0.2.1.md` | 編碼正常(新建,UTF-8) | New release note for scripted impression-first recall. |
 | `docs/release-v0.3.0.md` | 編碼正常(新建,UTF-8) | New release note for cue-first protocol and recall budget release. |
+| `docs/release-v0.3.1.md` | 編碼正常(新建,UTF-8) | New release note for cue quality, gist summaries, and boundary-aware chunking. |
 | `docs/templates/memory/impressions.jsonl` | 編碼正常(新建,UTF-8) | New impression index template. |
 | `docs/templates/memory/events/YYYY-MM.md` | 編碼正常(新建,UTF-8) | New event ledger template. |
 | `scripts/dejavu-scan-memory.mjs` | 編碼正常(新建,UTF-8) | New default memory impression scanner. |
 | `examples/protocol-project/memory/impressions.jsonl` | 編碼正常(新建,UTF-8) | New example impression index. |
 | `examples/protocol-project/memory/events/2026-04.md` | 編碼正常(新建,UTF-8) | New example event ledger. |
-| `scripts/dejavu-lint-memory.mjs` | 編碼正常(新建,UTF-8| New memory impression index linter. |
-| `tests/memory-cli.test.ts` | 編碼正常(新建,UTF-8| New CLI and package smoke tests. |
+| `scripts/dejavu-lint-memory.mjs` | 編碼正常(已檢查) | Updated in UTF-8; linter now warns on low-quality impression cues and duplicate keyword sets. |
+| `tests/memory-cli.test.ts` | 編碼正常(已檢查| Updated in UTF-8; CLI tests now cover low-quality cue warnings. |

llms.txt

Lines changed: 4 additions & 0 deletions
@@ -7,6 +7,7 @@ Deja Vu helps agents preserve useful project memory through:
 - explicit project rules
 - a repeatable cue-scan, minimal-recall, and writeback workflow
 - tiny Markdown and JSONL memory files inside the repository
+- cue quality checks that keep the first recall step sparse and specific

 The minimum adoption path does not require npm, embeddings, vector search, or a dedicated memory service.

@@ -36,6 +37,9 @@ The minimum adoption path does not require npm, embeddings, vector search, or a
 - project-scoped continuity
 - recall before substantial work
 - impression-first scripted recall
+- cue quality linting
+- gist-first summaries
+- boundary-aware chunking
 - recall budget
 - selective writeback
 - event ledger continuity

package-lock.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

package.json

Lines changed: 6 additions & 2 deletions
@@ -1,7 +1,7 @@
 {
   "name": "@focaxisdev/deja-vu",
-  "version": "0.3.0",
-  "description": "Deja Vu: a cue-first memory protocol for AI agents with an optional semantic recall engine.",
+  "version": "0.3.1",
+  "description": "Deja Vu: a cue-first memory protocol with quality-gated recall cues and an optional semantic engine.",
   "type": "module",
   "main": "./dist/src/index.js",
   "types": "./dist/src/index.d.ts",
@@ -43,6 +43,10 @@
     "memory-protocol",
     "project-memory",
     "markdown-memory",
+    "cue-first-recall",
+    "cue-quality",
+    "gist-summary",
+    "boundary-aware-chunking",
     "semantic-recall",
     "memory-engine",
     "ai-agent",
