Commit ed33435

cdeust and claude committed
release: v3.15.0 — E1 v3 verification campaign + arXiv-ready papers + BEAM-10M harness
Single coherent docs + metadata commit cutting v3.15.0. 64 commits since v3.14.12 grouped thematically in CHANGELOG.

README updated minimally:
- Version badge 3.14.2 -> 3.15.0; citation badge 41 -> 45 papers.
- Lead intro updated: 20 -> 26 biological mechanisms (matches 26-enum ablation campaign in tasks/e1-v3-results.md).
- New v3.15.0 lead-line summarising the verification campaign + production fixes + arXiv-ready papers + BEAM-10M harness.
- LoCoMo headline R@10 94.2% -> 94.3% and MRR 0.8278 -> 0.8279 to match the post-plasticity-fix BASELINE_NO_CONSOLIDATION JSON (benchmarks/results/ablation/locomo_v3_post_plasticity_fix/).
- Science section: 41 -> 45 papers; new direct links to both PDF artefacts (arxiv-thermodynamic 30 pages, arxiv-context-assembly 37 pages).
- Verification section: now references both papers + 26 enum mechanisms.
- Citation block: adds @unpublished entries for both PDFs with "arXiv ID forthcoming, endorsement in progress".

CHANGELOG.md: full v3.15.0 entry covering:
- verification campaign (LongMemEval-S 17 rows n=500, LoCoMo 14 rows × 2 sweeps n=1986, blend calibration, per-category specialization),
- production fixes (consolidation cadence ingest_at, plasticity result-shape contract),
- wiring (HOPFIELD/HDC/SPREADING_ACTIVATION/DENDRITIC_CLUSTERS/EMOTIONAL_RETRIEVAL/MOOD_CONGRUENT_RERANK/RECONSOLIDATION + 23-mechanism CORTEX_ABLATE_<MECH> hooks at production hot-path),
- benchmark infrastructure (BEAM-10M LLM head-to-head harness, LoCoMo --ablate flags, blend calibration),
- papers (recompile with all 45 citations resolved, §6.3 three-pass integration),
- issue fixes (#15 Nitjsefnie session-layout discovery),
- repo housekeeping.

Headline numbers verified against on-disk JSONs:
- LongMemEval-S: MRR=0.9124, R@10=98.4% (n=500)
- LoCoMo (post-plasticity-fix): MRR=0.8279, R@10=94.3% (n=1986)
- BEAM Overall: 0.591 (unchanged since 5812342)

No production code touched in this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 79f0b20 commit ed33435

6 files changed

Lines changed: 378 additions & 355 deletions

.claude-plugin/marketplace.json

Lines changed: 2 additions & 2 deletions
@@ -6,14 +6,14 @@
   },
   "metadata": {
     "description": "Persistent memory and cognitive profiling plugins for Claude Code",
-    "version": "3.14.5"
+    "version": "3.15.0"
   },
   "plugins": [
     {
       "name": "cortex",
       "source": "./",
       "description": "Persistent memory and cognitive profiling for Claude Code — thermodynamic memory with heat/decay, intent-aware retrieval, biological plasticity, codebase intelligence, and cognitive profiling. 47 MCP tools with enriched schemas. PostgreSQL + pgvector in CLI mode; automatic SQLite fallback in Cowork/sandboxed mode. Curated wiki (ADRs, specs, lessons) with audit-artefact filtering. Consolidate is set-based SQL batched — decay/plasticity/pruning run 100-500× faster on large stores. Workflow graph with caller-qualified CALLS chains rendering full method-to-method dependencies (native tree-sitter, no AP required). Side panel humanized for non-technical users. Ingests codebase analysis (ai-automatised-pipeline) and PRDs (prd-spec-generator) into wiki + memory + knowledge graph. Docker image available.",
-      "version": "3.14.5",
+      "version": "3.15.0",
       "author": {
         "name": "Clement Deust",
         "email": "admin@ai-architect.tools"

CHANGELOG.md

Lines changed: 147 additions & 0 deletions
@@ -6,6 +6,153 @@ adheres to [Semantic Versioning](https://semver.org/).

 ## [Unreleased]

+## [3.15.0] — E1 v3 verification campaign + arXiv-ready papers + BEAM-10M harness
+
+A single coherent release covering 64 commits since v3.14.12. The headline is
+verification: every benchmark number on the README is now backed by a
+per-mechanism ablation row with code SHAs, dirty flags, manifests, and
+per-row JSON outputs preserved alongside the writeups. Two production fixes
+were surfaced by the campaign and ship inside the same release. Both
+companion papers (thermodynamic memory + structured context assembly) are
+arXiv-ready.
+
+### Verification campaign (paper-claim-bearing)
+
+- **E1 v3 LongMemEval-S — 17-row per-mechanism ablation, n=500.** Headline
+  `MRR = 0.9124`, `R@10 = 98.4%` (vs. published baselines `MRR = 0.882`,
+  `R@10 = 97.8%`: **+0.030 MRR (+3.4% relative), +0.6 pp R@10**). Driver:
+  `benchmarks/lib/run_e1_v3_lme.py`. Per-row JSONs:
+  `benchmarks/results/ablation/longmemeval-s_v3/`. Writeup:
+  `tasks/e1-v3-results.md`.
+- **E1 v3 LoCoMo — 14-row two-baseline ablation, n=1986.** Headline
+  `MRR = 0.8279`, `R@10 = 94.3%` (`BASELINE_NO_CONSOLIDATION`,
+  longitudinal-read-path anchor) — vs. CLAUDE.md baseline
+  (`MRR = 0.794`, `R@10 = 0.926`): **+0.034 MRR (+4.3% relative), +1.7 pp R@10**. Re-run on
+  plasticity-fixed bytes (commit `2f45bcb`, descendant of `5f737fe`).
+  Cadence-fix anchor agreement re-validated identically
+  (`ΔvsNO = +0.0014`); two consolidation-only rows
+  (`HOMEOSTATIC_PLASTICITY`, `SCHEMA_ENGINE`) recover positive
+  contributions previously masked by the contract bug.
+  `benchmarks/results/ablation/locomo_v3_post_plasticity_fix/`.
+  Writeup: `tasks/e1-v3-locomo-results-post-fix.md`. The pre-fix sweep is
+  preserved at `tasks/e1-v3-locomo-results.md`.
+- **Phase A + B blend-weight calibration.** Central composite design + 5×5
+  grid search; all six post-WRRF rerank constants confirmed near-optimum at
+  the engineering defaults shipped today. `tasks/e1-v3-blend-calibration.md`.
+- **Per-category delta analysis (LME-S).** Mechanism specialization
+  surfaced: HDC specializes for multi-session reasoning, HOPFIELD for
+  knowledge updates, ADAPTIVE_DECAY against stable preferences.
+  `tasks/e1-v3-per-category.md`.
+
+Total: **45 per-mechanism evidence rows** across 26 enum mechanisms
+(17 read-path on LongMemEval-S + 9 consolidation-only routed to LoCoMo).
+
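For readers checking the headline figures above, here is a minimal sketch of how MRR and R@10 are conventionally computed from per-question ranks, and how an absolute delta differs from a relative one. The function names and the toy rank list are illustrative assumptions; this is not code from `benchmarks/lib/`.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all questions; a miss (None) contributes 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def recall_at_k(ranks: Sequence[Optional[int]], k: int = 10) -> float:
    """Fraction of questions whose first relevant hit is ranked within the top k."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)


def deltas(new: float, old: float) -> tuple[float, float]:
    """Return (absolute difference, relative change in percent)."""
    return new - old, 100.0 * (new - old) / old


# Toy data: 1-based rank of the first relevant memory per question, None = not retrieved.
toy_ranks = [1, 1, 2, None, 12]
print(mean_reciprocal_rank(toy_ranks))  # (1 + 1 + 0.5 + 0 + 1/12) / 5
print(recall_at_k(toy_ranks, k=10))     # 3 of the 5 questions hit in the top 10

# LoCoMo headline MRR from this entry against the CLAUDE.md baseline.
abs_delta, rel_delta = deltas(0.8279, 0.794)
print(round(abs_delta, 4), round(rel_delta, 1))
```

Reporting both forms avoids ambiguity: the same MRR gain reads as roughly 0.034 in absolute terms and roughly 4.3% relative to the baseline.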
+### Fixed (production fixes surfaced during verification)
+
+- **`6c51bce` — consolidation cadence is now ingest-relative.**
+  `consolidation_engine` migrated from wall-clock `created_at` to
+  ingest-relative `ingested_at`. Recovers `MRR 0.222 → 0.8264` on
+  backdated corpora; affects every production backfill scenario where
+  memories carry old timestamps but were written today.
+- **`5f737fe` — plasticity result-shape contract preserved on ablation.**
+  `apply_hebbian_update` no-op (when `CORTEX_ABLATE_SYNAPTIC_PLASTICITY=1`)
+  now returns dicts with `action="none"` instead of raw edge tuples,
+  fixing a silent `KeyError` downstream in consolidation/plasticity. This
+  is what was masking the two consolidation-only contributions in the
+  pre-fix LoCoMo sweep.
+
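The result-shape contract described in the `5f737fe` entry can be illustrated with a small sketch. The signature, the dict keys other than `action`, and the placeholder learning rule are all assumptions made for illustration; only the function name, the env var, and `action="none"` come from the changelog.

```python
import os
from typing import Iterable


def apply_hebbian_update(edges: Iterable[tuple[str, str, float]]) -> list[dict]:
    """Hypothetical stand-in: always return one dict per edge, ablated or not."""
    ablated = os.environ.get("CORTEX_ABLATE_SYNAPTIC_PLASTICITY") == "1"
    results = []
    for source, target, weight in edges:
        if ablated:
            # No-op branch keeps the same keys, so callers never see a raw tuple.
            results.append({"source": source, "target": target,
                            "weight": weight, "action": "none"})
        else:
            # Placeholder learning rule, for illustration only.
            results.append({"source": source, "target": target,
                            "weight": min(1.0, weight + 0.1), "action": "strengthen"})
    return results


def count_changed(results: list[dict]) -> int:
    """Downstream consumer that indexes by key; a raw tuple here would raise."""
    return sum(1 for r in results if r["action"] != "none")


print(count_changed(apply_hebbian_update([("a", "b", 0.5)])))
```

Because the ablated branch returns the same shape as the live branch, a consumer such as the hypothetical `count_changed` needs no special case for ablation runs.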
+### Added (read-path mechanisms now wired end-to-end)
+
+- **`ddb5b58` / `024ea1a` / `bc0ae4f`** — `HOPFIELD`, `HDC`,
+  `SPREADING_ACTIVATION`, `DENDRITIC_CLUSTERS` wired into the `pg_recall`
+  pipeline. Batch Hopfield embeddings and real entity-set Jaccard for the
+  dendritic stage. Query-entity resolution extended to natural-language
+  tokens.
+- **`81e8d90`** — `EMOTIONAL_RETRIEVAL` + `MOOD_CONGRUENT_RERANK` are now
+  live read-path stages (not test-only).
+- **`9d6bc96`** — `RECONSOLIDATION` post-retrieval stage wired
+  (Nader 2000); retrieved memories become labile and may be updated
+  against the retrieval context.
+- **`c5ade6b`** — VADER → `user_mood` EMA hook in `remember()`; closes
+  the `MOOD_CONGRUENT` signal gap end-to-end.
+- **`b4b23e7`** — `PgMemoryStore.get_user_mood` / `set_user_mood` +
+  `user_mood` DDL; the column the read-path stage was reading didn't
+  exist before this.
+- **`099ba1e` / `54f8501`** — 23 mechanisms now have
+  `CORTEX_ABLATE_<MECH>=1` env-var hooks reading at the production
+  hot-path (not just at test wiring), so ablation studies exercise the
+  same code path as production.
+
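The `CORTEX_ABLATE_<MECH>=1` hooks mentioned above could look roughly like the following. This is a sketch under assumptions: `is_ablated`, `rerank`, the `mood:` id prefix, and the 0.05 boost are hypothetical names and values chosen for illustration, not the project's actual stage code.

```python
import os


def is_ablated(mechanism: str) -> bool:
    """True when CORTEX_ABLATE_<MECH>=1 is set; read on every call, not cached at import."""
    return os.environ.get(f"CORTEX_ABLATE_{mechanism.upper()}") == "1"


def rerank(candidates: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Hypothetical read-path stage: skip the mood boost when the mechanism is ablated."""
    if not is_ablated("MOOD_CONGRUENT_RERANK"):
        # Placeholder boost standing in for the real stage.
        candidates = [(mid, score + (0.05 if mid.startswith("mood:") else 0.0))
                      for mid, score in candidates]
    return sorted(candidates, key=lambda c: c[1], reverse=True)


print(rerank([("a", 0.50), ("mood:b", 0.48)]))
```

Checking the variable inside the stage itself, rather than in test wiring, is what lets an ablation run exercise the same function that production calls.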
+### Added (benchmark + verification infrastructure)
+
+- **`3201cc3` / `0a53996`** — BEAM-10M LLM head-to-head harness scaffold
+  + live mode wiring at `benchmarks/llm_head_to_head/`; smoke pending
+  API keys.
+- **`0e1f90d`** — LongMemEval-S `--with-consolidation` flag.
+- **`b68c5ac` / `ef178da`** — LoCoMo `--ablate` + `--with-consolidation`
+  + `--results-out` flags + 14-row driver `run_e1_v3_locomo.py`.
+- **`f09485d`** — Blend-weight calibration infrastructure with
+  pre-registration; harness dirty-check matched to pre-reg
+  (`39ab694` ignores submodule internal state).
+- **`5a5d8d3` / `3eab1ed`** — E2 N-scan rebuilt as real-benchmark
+  subsample + Zipf synthetic; ablation env vars wired into the
+  production code path.
+- **DB snapshot + restore + HNSW determinism infrastructure** (E2 / E3 /
+  E4 / E5 internal harnesses).
+
+### Added (papers + endorsement materials)
+
+- **`6b80760` / `3ace1fb` / `3eaeaf6`** — `docs/arxiv-thermodynamic/main.pdf`
+  compiled, 30 pages. Ported to LaTeX matching `arxiv/main.tex` style.
+- **`9e6ddf6`** — Recompile with bibtex pass; **all 45 citations now
+  resolve** (vs. the previous 4 unresolved `??` markers).
+- **`bce4840` / `db4fe0a` / `6f75221`** — §6.3 three-pass integration:
+  LME-S evidence + LoCoMo subsection + post-fix re-run + cadence-fix
+  narrative + plasticity-fix narrative.
+- **`fa9c101` / `fb6f67f`** — §6.4 Operating Regime added; full E2b Zipf
+  curve integrated; falsifications reframed as predicted boundaries with
+  the `N=100k` datapoint landed.
+- **`a787fe6`** — Refresh `linkedin-endorser-post.md`; new
+  `arxiv-endorsement-email.md` template with pre-submission checklist.
+- **`974c364` / `2152946`** — Prose polish; `BEAM Overall 0.543 → 0.591`
+  number fix in CLAUDE.md and the markdown source.
+- **`ffcad91`** — Repo reorg: `arxiv/` → `arxiv-context-assembly/` +
+  paper-md moved into `docs/papers/`.
+- **`docs/arxiv-context-assembly/main.pdf`** — 37 pages, pre-existing
+  verbatim + argmax bugs fixed, arXiv-ready.
+
+### Fixed (issue fixes from contributors)
+
+- **`5398745`** — issue #15 (Nitjsefnie). `discover_files` walks all four
+  session layouts (subagent + teammate transcripts), recovers ~89% of
+  session content during backfill that was previously dropped.
+
+### Fixed (CI + plumbing)
+
+- **`df14e16`** — DDL comment semicolon broke `ddl.split(';')` extractor.
+- **`9f94bd3`** — `user_mood` DDL comment semicolon + test uses dominant
+  beta.
+- **`34aa452`** — Repair docstring boundary in `cls.run_cls_cycle`
+  (broken in `3eab1ed`).
+- **`51ce608` / `c4253cc` / `5271828` / `fd51f6f` / `4918638` / `79f0b20`**
+  ruff format + drop unused imports in verification harnesses;
+  bump tool count to 47.
+- **`18b4be4`** — ruff format on `memories_page` + `memories_facets`.
+
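The `ddl.split(';')` failure noted in `df14e16` is easy to reproduce. The sketch below shows the failure mode and one possible workaround (dropping `--` line comments before splitting). The table definition and the `split_ddl` helper are hypothetical; the changelog does not say how the project's extractor was actually repaired.

```python
import re


def split_ddl(ddl: str) -> list[str]:
    """Split a DDL script into statements after dropping `--` line comments.

    Simplification: assumes no `--` or `;` inside string literals.
    """
    without_comments = re.sub(r"--[^\n]*", "", ddl)
    return [stmt.strip() for stmt in without_comments.split(";") if stmt.strip()]


# Hypothetical schema; the semicolon inside the comment is the trap.
ddl = """
CREATE TABLE user_mood (
    user_id TEXT PRIMARY KEY,  -- one row per user; updated by an EMA
    valence REAL
);
CREATE INDEX idx_mood ON user_mood (valence);
"""

naive = [s.strip() for s in ddl.split(";") if s.strip()]
print(len(naive))           # the comment's semicolon yields an extra fragment
print(len(split_ddl(ddl)))  # only the real statements remain
```

The simpler alternative, and the one the commit titles suggest, is to keep semicolons out of DDL comments altogether.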
+### Changed (visualization, repo housekeeping)
+
+- **`63bacca` / `2953bae` / `b7a8f97`** — Paged Knowledge + Board with
+  filter chips, lazy-load; default landing reverted to Knowledge; Graph
+  view restored to pre-d3-removal state with a warning banner.
+- **22 stale public repos archived; `ai-prd-mcp` deleted** — security
+  hardening (legacy build artefacts had embedded keys at one point) +
+  portfolio cleanup.
+- **`551a411` / `30d80fe`** — Profile README draft for `cdeust/cdeust`
+  (controls AI Overview narrative); profile draft points
+  `AI Architect` to website not archived repo.
+- **Cortex repo description + topics refreshed** for AI-search
+  discovery.
+
 ## [3.14.12] — fix MCP client deadlock on long upstream responses

 ### Fixed

README.md

Lines changed: 28 additions & 7 deletions
@@ -7,8 +7,8 @@
 <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="MIT License"></a>
 <img src="https://img.shields.io/badge/python-3.10+-blue.svg" alt="Python 3.10+">
 <img src="https://img.shields.io/badge/tests-2500_passing-brightgreen.svg" alt="Tests">
-<img src="https://img.shields.io/badge/citations-41_papers-orange.svg" alt="Citations">
-<img src="https://img.shields.io/badge/version-3.14.2-brightgreen.svg" alt="Version 3.14.2">
+<img src="https://img.shields.io/badge/citations-45_papers-orange.svg" alt="Citations">
+<img src="https://img.shields.io/badge/version-3.15.0-brightgreen.svg" alt="Version 3.15.0">
 <a href="https://glama.ai/mcp/servers/cdeust/Cortex"><img src="https://glama.ai/mcp/servers/cdeust/Cortex/badges/score.svg" alt="Glama score: security A, license A"></a>
 </p>

@@ -30,7 +30,9 @@ Claude Code forgets you every time you close the tab. Every architecture decisio

 Cortex is a persistent memory engine for Claude Code built on computational neuroscience. It remembers what you worked on, how you think, what you decided and why. Not as a dumb text dump shoved into context, but as a living memory system that consolidates, forgets intelligently, and reconstructs the right context at the right time.

-**20 biological mechanisms. 47 MCP tools. 9 automatic hooks. Runs entirely on your machine. PostgreSQL + pgvector.**
+**26 biological mechanisms. 47 MCP tools. 9 automatic hooks. Runs entirely on your machine. PostgreSQL + pgvector.**
+
+**v3.15.0 — verification campaign + arXiv-ready papers**: 45 per-mechanism ablation rows across LongMemEval-S (17 rows, n=500) and LoCoMo (14 rows × 2 sweeps, n=1986). Headline numbers stay verified — LongMemEval R@10 = 98.4% / MRR = 0.9124, LoCoMo R@10 = 94.3% / MRR = 0.8279 — and every figure now traces to a JSON in `benchmarks/results/ablation/` with code SHAs, dirty flags, and per-row category breakdowns. The thermodynamic memory paper (`docs/arxiv-thermodynamic/main.pdf`, 30 pages, all 45 citations resolved) and the structured context assembly paper (`docs/arxiv-context-assembly/main.pdf`, 37 pages) are arXiv-ready. Two production fixes surfaced during verification: consolidation cadence is now ingest-relative instead of wall-clock (recovers MRR 0.222 → 0.8264 on backdated corpora), and the plasticity ablation no-op preserves the result-shape contract (no more silent KeyError). HOPFIELD, HDC, SPREADING_ACTIVATION, DENDRITIC_CLUSTERS, EMOTIONAL_RETRIEVAL, MOOD_CONGRUENT_RERANK, and RECONSOLIDATION are now wired end-to-end on the production read path; 23 mechanisms have CORTEX_ABLATE_<MECH> hooks reading at the hot path. BEAM-10M LLM head-to-head harness scaffolded at `benchmarks/llm_head_to_head/`. [Release notes →](https://github.com/cdeust/Cortex/releases/tag/v3.15.0)

 **v3.14.2 — call graph lit + queryable**: the workflow graph now renders the actual call and import edges between symbols — not just the AST shells. Every edge carries a *confidence* (0.0–1.0) and a *reason* tag (`direct-ast`, `import-scope-lookup`, `memory-entities-link`, …) so you can tell a resolved call from a same-name guess at a glance. Knowledge-graph entities ship as a first-class layer: ~10k entities extracted from memory text land between the memory ring and the file shell, heat-weighted centroid-placed near the memories that mention them. And a new `query_workflow_graph` MCP tool returns typed subgraphs on demand — filter by `node_kind`, `edge_kind`, `neighbour_of <id> + depth`, or `domain`, so downstream agents can reason over graph slices without rebuilding from scratch.

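One plausible reading of the ingest-relative cadence fix mentioned in the v3.15.0 summary is sketched below. The `Memory` dataclass, `due_for_consolidation`, and the one-hour threshold are hypothetical and only illustrate why a backfilled memory behaves differently when its age is measured from `created_at` rather than `ingested_at`; the actual `consolidation_engine` logic is not shown in this commit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Memory:
    created_at: datetime   # when the event happened (may be backdated)
    ingested_at: datetime  # when the row was written to the store


def due_for_consolidation(mem: Memory, now: datetime,
                          min_age: timedelta = timedelta(hours=1),
                          ingest_relative: bool = True) -> bool:
    """Hypothetical cadence check: a memory is due once it is older than min_age."""
    reference = mem.ingested_at if ingest_relative else mem.created_at
    return now - reference >= min_age


now = datetime(2026, 1, 10, 12, 0, tzinfo=timezone.utc)
# A backfilled memory: the event is 400 days old, but the row was written 5 minutes ago.
backfilled = Memory(created_at=now - timedelta(days=400),
                    ingested_at=now - timedelta(minutes=5))

print(due_for_consolidation(backfilled, now, ingest_relative=False))  # keyed on created_at
print(due_for_consolidation(backfilled, now, ingest_relative=True))   # keyed on ingested_at
```

Under the `created_at` rule the freshly written memory is treated as long overdue, which is the backfill scenario the changelog describes; the `ingested_at` rule gives it the same grace period as any new memory.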
@@ -152,8 +154,8 @@ LoCoMo (Maharana et al., ACL 2024): 1,986 questions across 10 conversations, inc

 | | Cortex | What it means |
 |---|---|---|
-| Recall@10 | **94.2%** | Right memory in top 10 over 9 times out of 10 (n=1986, BASELINE_NO_CONSOLIDATION) |
-| MRR | **0.8278** | Correct answer is typically the first result |
+| Recall@10 | **94.3%** | Right memory in top 10 over 9 times out of 10 (n=1986, BASELINE_NO_CONSOLIDATION, post-plasticity-fix) |
+| MRR | **0.8279** | Correct answer is typically the first result |

 | Category | MRR | R@10 | Why this score |
 |---|---|---|---|
@@ -234,7 +236,7 @@ Cortex doesn't store memories the way a database stores rows. It treats them mor

 **Similar memories stay distinct.** Pattern separation — modeled on the dentate gyrus, which keeps "Tuesday's standup" separate from "Wednesday's standup" even though they're almost identical. Without this, retrieval returns the same generic match for every similar query. *(Leutgeb et al. 2007; Yassa & Stark 2011)*

-**41 papers total.** Every algorithm, constant, and threshold traces to a published source. Full citations, equations, ablation data, and per-module implementation audit: **[docs/papers/science.md](docs/papers/science.md)** | **[Research post on structured context assembly](docs/research-post-context-assembly.md)**
+**45 papers total.** Every algorithm, constant, and threshold traces to a published source. Full citations, equations, ablation data, and per-module implementation audit: **[docs/papers/science.md](docs/papers/science.md)** | **[Thermodynamic memory paper (PDF, 30 pages)](docs/arxiv-thermodynamic/main.pdf)** | **[Structured context assembly paper (PDF, 37 pages)](docs/arxiv-context-assembly/main.pdf)** | **[Research post on structured context assembly](docs/research-post-context-assembly.md)**

 ---

@@ -391,19 +393,38 @@ Every benchmark headline number above is backed by a per-mechanism ablation camp
 - **LoCoMo, 14 rows, n=1986 (pre-plasticity-fix bytes)** — `tasks/e1-v3-locomo-results.md`. Two-baseline (NO_CONSOLIDATION / WITH_CONSOLIDATION) design; empirical resolution of the architectural-mismatch hypothesis (RECONSOLIDATION ΔMRR = +0.0076, ADAPTIVE_DECAY ΔMRR = -0.0163).
 - **LoCoMo, 14 rows, n=1986 (post-plasticity-fix bytes)** — `tasks/e1-v3-locomo-results-post-fix.md`. Re-run on commit `2f45bcb` (descendant of plasticity result-shape fix `5f737fe`); cadence-fix anchor agreement re-validated identically (ΔvsNO = +0.0014); two consolidation-only rows (HOMEOSTATIC_PLASTICITY, SCHEMA_ENGINE) recover positive contributions previously masked by the contract bug.

-Total: 45 per-mechanism evidence rows. The full paper, including the §6.3 per-mechanism evidence section and §6.3.4.1 plasticity-fix re-run subsection, is at `docs/arxiv-thermodynamic/main.pdf`.
+Total: 45 per-mechanism evidence rows across 26 enum mechanisms (17 read-path + 9 consolidation-only routed to LoCoMo). The full thermodynamic memory paper, including §6.3 per-mechanism evidence and §6.3.4.1 plasticity-fix re-run subsection, is at `docs/arxiv-thermodynamic/main.pdf` (30 pages, all 45 citations resolved). The companion structured context assembly paper is at `docs/arxiv-context-assembly/main.pdf` (37 pages).

 ## License

 MIT

 ## Citation

+If you reference the system, the paper PDFs on `main` are the canonical artefacts (arXiv IDs forthcoming, endorsement in progress):
+
 ```bibtex
 @software{cortex2026,
   title={Cortex: Persistent Memory for Claude Code},
   author={Deust, Clement},
   year={2026},
   url={https://github.com/cdeust/Cortex}
 }
+
+@unpublished{deust2026thermodynamic,
+  title={Thermodynamic Memory for Conversational Agents:
+         A Per-Mechanism Ablation Study on LongMemEval and LoCoMo},
+  author={Deust, Clement},
+  year={2026},
+  note={arXiv ID forthcoming, endorsement in progress},
+  url={https://github.com/cdeust/Cortex/blob/main/docs/arxiv-thermodynamic/main.pdf}
+}
+
+@unpublished{deust2026context,
+  title={Structured Context Assembly for Long-Horizon Conversational Memory},
+  author={Deust, Clement},
+  year={2026},
+  note={arXiv ID forthcoming, endorsement in progress},
+  url={https://github.com/cdeust/Cortex/blob/main/docs/arxiv-context-assembly/main.pdf}
+}
 ```
