Recall is a SQLite-backed persistent memory layer for coding agents. Stop-hook extraction captures sessions as you work, MCP tools expose them mid-session, hybrid search (FTS5 + embeddings) retrieves them, and a tiered L0/L1 recall block injects identity + top-ranked records at every session start. Works across Claude Code, OpenCode, and Pi from one local database.
All coding agents forget when a session ends. Recall doesn't: it extracts, indexes, and recalls what matters across every session and every agent you use.
Built on the Model Context Protocol. One SQLite file. No phone-home. No vendor lock-in.
Stable on Claude Code, beta on Pi, and alpha on OpenCode (MCP works; lifecycle extensions are early). Codex CLI and Gemini CLI are on the roadmap. See Roadmap.
AI agents have no memory between sessions. Context is lost. You repeat yourself. Decisions made last week are forgotten today. Every new session re-learns the basics.
Install once, then forget about it. Recall runs silently in the background:
```text
You work -> Stop hook fires (end of turn) -> Auto-Extract -> SQLite + FTS5 -> Next session
   ^                                                                               |
   +------------------------------ Memory Available -------------------------------+
```
- Auto-extraction: sessions are parsed into structured summaries incrementally as you work (the Stop hook fires at the end of every turn, not only when you exit)
- Full-text + semantic search: find anything from any past session
- Tiered session-start context: L0 identity (who you are) + L1 importance-ranked top records load automatically
- Zero friction: no workflow changes, no manual steps
- MCP integration: your agent searches memory automatically through standard MCP tools
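A hook that fires on every turn only stays cheap if most firings do nothing. Here is a minimal sketch of such an incremental check, written in Python for illustration (Recall's own hooks are TypeScript files). The tracker layout, the function names, and the 2,000-character threshold are all assumptions, not Recall's actual values.

```python
import json
from pathlib import Path

# Hypothetical threshold: how much a transcript must grow before re-extracting.
MIN_GROWTH_CHARS = 2_000


def _load(tracker_path: Path) -> dict:
    """Read the tracker file, treating a missing file as an empty tracker."""
    return json.loads(tracker_path.read_text()) if tracker_path.exists() else {}


def should_extract(tracker_path: Path, session_id: str, transcript_chars: int) -> bool:
    """Return True when the session grew enough since the last extraction."""
    last_seen = _load(tracker_path).get(session_id, 0)
    return transcript_chars - last_seen >= MIN_GROWTH_CHARS


def mark_extracted(tracker_path: Path, session_id: str, transcript_chars: int) -> None:
    """Record the transcript size that was just extracted."""
    tracker = _load(tracker_path)
    tracker[session_id] = transcript_chars
    tracker_path.write_text(json.dumps(tracker))
```

With this shape, a turn that adds a few hundred characters is skipped, and the next extraction happens only once enough new conversation has accumulated.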
Four things that set Recall apart from cloud-hosted memory layers and from agent-specific scratch files:
- Local-first, zero infrastructure. One SQLite file at `~/.claude/memory.db`. WAL mode, `0600` perms. No vector database, no graph database, no agent server, no API keys for retrieval. Nothing leaves your machine: no telemetry, no phone-home. Optional Ollama for embeddings (also local).
- Multi-agent native. One memory layer across the agents you actually use. Stable on Claude Code today; Pi and OpenCode connect via MCP; Codex CLI and Gemini CLI on the way. Memories captured by one agent are searchable from any other agent on the same machine.
- Structured taxonomy, not a flat blob. Decisions (with supersede/revert lifecycle and confidence scoring), learnings, breadcrumbs, and curated Library of Alexandria entries: each has a purpose and a query path. Importance scoring (1-10) surfaces what matters first.
- Hybrid search that works offline. FTS5 keyword search ships with SQLite, so no embedding infrastructure is required to find anything. Optional Ollama embeddings layer on top for semantic queries. Both are merged via Reciprocal Rank Fusion. Lose Ollama, lose nothing: the keyword path keeps working.
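For readers unfamiliar with the "WAL mode, 0600 perms" setup, the following sketch shows what those two settings amount to on a SQLite file. It uses Python's standard `sqlite3` module purely for illustration; `open_private_db` is a hypothetical helper, not part of Recall.

```python
import os
import sqlite3
import stat
from pathlib import Path


def open_private_db(path: Path) -> sqlite3.Connection:
    """Open (or create) a SQLite file in WAL mode, accessible only to its owner."""
    conn = sqlite3.connect(path)
    # Write-ahead logging lets readers proceed while a writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    # 0600: read/write for the owning user, nothing for group or others.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return conn
```

The permission bits apply on POSIX systems; on Windows `os.chmod` only toggles the read-only flag.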
```bash
git clone https://github.com/edheltzel/Recall.git
cd Recall
./install.sh
```

Verify it works:

```bash
mem stats    # Database overview
mem doctor   # Health check
```

Restart your agent (Claude Code, Pi, or OpenCode) to load the MCP server and hooks.
Recall's tiered SessionRecall injects a small identity file at the top of every session (the L0 tier β your role, projects, tools, and working preferences). Without it, L0 is empty and every new session has to re-learn the basics.
```bash
mem onboard
```

A 7-question interview that writes `~/.claude/MEMORY/identity.md`. Run it once, and re-run whenever your role, active projects, or working preferences change. Use `|` (not `,`) to separate values so a phrase like `no force-push, ever` survives as a single entry.
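The reason for the `|` separator is easy to see in code. This is a minimal sketch of pipe-separated parsing; `split_values` is a hypothetical name and not Recall's actual parser.

```python
def split_values(answer: str) -> list[str]:
    """Split an interview answer on '|' and drop empty entries."""
    return [part.strip() for part in answer.split("|") if part.strip()]


print(split_values("no force-push, ever | small PRs"))
# ['no force-push, ever', 'small PRs']
```

Splitting on commas instead would break the first entry into `no force-push` and `ever`.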
From inside Claude Code, `/Recall:update` prints the current vs. latest release and the exact command to run. From a shell:

```bash
./update.sh --check   # version check only
./update.sh           # full update: pull, build, migrate, re-register hooks
```

To uninstall:

```bash
./uninstall.sh --dry-run   # preview, touch nothing
./uninstall.sh             # surgical remove; preserves memory.db + backups
./uninstall.sh --purge     # also destroy memory.db + backup tree (confirmed)
```

Full installation guide: prerequisites, platform support, session extraction setup, uninstalling
Recall sits between your agent and a single SQLite database. A WRITE path captures sessions as you work; a READ path injects memory back into every new session. The diagram below shows both flows side-by-side, with the line styles in the legend distinguishing capture (solid), recall (dashed purple), and the write-only markdown mirror (dashed gray).
Text-only architecture diagram (for terminal viewers)
```text
+----------------------------------------------------------------------+
|                          DATA ENTRY POINTS                           |
|                                                                      |
|  +------------+  +------------+  +--------------+  +------------+    |
|  | CLI Direct |  | MCP Server |  |  Stop Hook   |  |   Batch    |    |
|  | mem add    |  | (Claude    |  | SessionExt-  |  |  Extract   |    |
|  | mem dump   |  |  Code)     |  | ract.ts      |  |  (cron)    |    |
|  +-----+------+  +-----+------+  +------+-------+  +-----+------+    |
+--------|---------------|----------------|----------------|-----------+
         |               |                |                |
         v               v                v                v
+----------------------------------------------------------------------+
|                           PROCESSING LAYER                           |
|                                                                      |
|  Direct Inserts:          Session Extraction Pipeline:               |
|  mem add breadcrumb --+   Read JSONL                                 |
|  mem add decision   --+   -> Filter noise (tool results)             |
|  mem add learning   --+   -> Dedup check (.extraction_tracker)       |
|  memory_add (MCP)   --+   -> Acquire lock                            |
|                       |   -> Claude Haiku extract                    |
|                       |      (>120K? chunk -> meta-extract)          |
|                       |      (fallback: Ollama)                      |
|                       |   -> Quality gate                            |
|                       |      (requires SUMMARY + MAIN IDEAS)         |
|                       |                  |                           |
+-----------------------|------------------|---------------------------+
                        |                  |
                        v                  v
+----------------------------------------------------------------------+
|                      STORAGE LAYER (Dual-Write)                      |
|                                                                      |
|  SQLite (~/.claude/memory.db)     Memory Files (~/.claude/MEMORY/)   |
|  +----------------------------+   +------------------------------+   |
|  | sessions --- messages      |   | DISTILLED.md (archive)       |   |
|  | decisions    learnings     |   | HOT_RECALL.md (last 10)      |   |
|  | breadcrumbs  loa_entries   |   | SESSION_INDEX.json           |   |
|  | embeddings (768-dim vecs)  |   | DECISIONS.log                |   |
|  |                            |   | REJECTIONS.log               |   |
|  | FTS5 indexes (auto-sync)   |   | ERROR_PATTERNS.json          |   |
|  | WAL mode · 0600 perms      |   +------------------------------+   |
|  +----------------------------+                                      |
+----------------------------------------------------------------------+
                                   |
                                   v
+----------------------------------------------------------------------+
|                           RETRIEVAL LAYER                            |
|                                                                      |
|  +----------------+ +-----------------+ +-------------------------+  |
|  | Keyword (FTS5) | | Semantic (Embed)| | Hybrid (RRF Fusion)     |  |
|  | mem search     | | mem semantic    | | mem hybrid (DEFAULT)    |  |
|  | memory_search  | | embed -> Ollama | | FTS5 rank  --+          |  |
|  |                | | cosine sim      | | Embed rank --+-> merged |  |
|  +----------------+ +-----------------+ | RRF(k=60)  --+          |  |
|                                         +-------------------------+  |
|  Direct: mem recent · mem show · memory_recall · context_for_agent   |
+----------------------------------------------------------------------+
                                   |
                                   v
+----------------------------------------------------------------------+
|  CONSUMERS: Coding agents (MCP) · CLI user (mem) · Sub-agents        |
+----------------------------------------------------------------------+
```
The source .excalidraw file lives at `assets/how-recall-works.excalidraw`; drop it onto excalidraw.com to edit.
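Two steps in the extraction pipeline above, the ">120K? chunk" branch and the quality gate, are simple enough to sketch. The version below is illustrative Python, not Recall's code: the fixed-width split and the plain substring check are assumptions, since the diagram does not say where chunks are cut or how section headers are matched.

```python
CHUNK_LIMIT = 120_000  # characters, per the ">120K? chunk" step in the diagram
REQUIRED_SECTIONS = ("SUMMARY", "MAIN IDEAS")  # named by the quality gate step


def chunk_transcript(text: str, limit: int = CHUNK_LIMIT) -> list[str]:
    """Split an oversized transcript into pieces of at most `limit` characters."""
    if len(text) <= limit:
        return [text]
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def passes_quality_gate(extraction: str) -> bool:
    """Accept an extraction only if every required section name appears in it."""
    return all(section in extraction for section in REQUIRED_SECTIONS)
```

In a chunked run, each piece would be extracted separately and the partial results summarized again (the "meta-extract" step) before the gate is applied.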
- Session starts: a `SessionStart` hook injects two tiers of context, L0 identity (your `~/.claude/MEMORY/identity.md`, always on) and L1 top records (top 12 by importance score, with 4 slots reserved for curated Library of Alexandria entries). L2/L3 stay on disk and are pulled on demand via MCP search.
- During the session: your agent searches memory via MCP tools (`memory_search`, `memory_hybrid_search`, `memory_recall`, `context_for_agent`) before falling back to git history. Decisions, learnings, and breadcrumbs are recorded in real time with `memory_add`.
- End of every turn: a `Stop` hook fires `SessionExtract.ts`, which self-spawns a background process (non-blocking). It checks `.extraction_tracker.json` and only re-extracts if the conversation has grown meaningfully since last time, so capture is incremental, not just an "on exit" event.
- Extraction pipeline: the conversation JSONL is filtered, deduplicated, and sent to the `claude` CLI running Haiku (with chunking for large sessions >120K chars). An optional Ollama fallback runs if the CLI fails. A quality gate rejects low-quality extractions before they're stored.
- PreCompact flush: when Claude Code is about to compact its context, a `PreCompact` hook (`SessionPreCompact.ts`) flushes the in-flight messages first, so the squashed window is never lost.
- Dual-write storage: results are written to SQLite (the only query surface; every CLI/MCP read hits this) and to markdown artifacts (`DISTILLED.md`, `HOT_RECALL.md`, etc., write-only, human-readable).
- Batch catchup (optional): a cron job (`BatchExtract.ts`) sweeps any sessions the Stop hook missed during crashes or interruptions, and ingests sessions dropped by the OpenCode plugin and Pi extension into `~/.claude/MEMORY/{opencode,pi}-sessions/`. `install.sh` prints the registration command at the end; opt in by running it once. Nothing is auto-scheduled.
- TELOS auto-sync (PAI users): if you use Personal AI Infrastructure (PAI), Recall ships a `TelosSync.ts` SessionStart hook that watches `~/.claude/skills/PAI/USER/TELOS/` for changes and silently runs `mem telos import --update` when any file is newer than the last import. This is automatic: no action is required once Recall is installed and PAI's TELOS directory exists. You can also import manually at any time with `mem telos import --yes`. If you don't use PAI, the hook checks for the directory, finds nothing, and exits in under 1ms.
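The L1 selection rule from the first step (top 12 by importance, 4 slots reserved for Library of Alexandria entries) can be sketched as follows. This is one plausible reading in illustrative Python: the `Record` shape, the `"loa"` kind label, and the choice to hand unused reserved slots to other records are assumptions, not confirmed behavior.

```python
from dataclasses import dataclass


@dataclass
class Record:
    id: int
    kind: str        # e.g. "decision", "learning", "loa" (labels assumed)
    importance: int  # 1-10


def select_l1(records: list[Record], total: int = 12, reserved_loa: int = 4) -> list[Record]:
    """Pick up to `total` records by importance, holding `reserved_loa` slots for LoA entries."""
    ranked = sorted(records, key=lambda r: r.importance, reverse=True)
    # Fill the reserved slots first with the highest-ranked LoA entries.
    loa = [r for r in ranked if r.kind == "loa"][:reserved_loa]
    chosen_ids = {r.id for r in loa}
    # Everything else competes for the remaining slots on importance alone.
    rest = [r for r in ranked if r.id not in chosen_ids][: total - len(loa)]
    return sorted(loa + rest, key=lambda r: r.importance, reverse=True)
```

The effect of the reservation is that curated entries still appear even when a burst of high-importance decisions would otherwise fill all 12 slots.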
| Strategy | Command | How it works |
|---|---|---|
| Keyword | `mem search "query"` | FTS5 full-text search across all tables |
| Semantic | `mem embed semantic "query"` | Ollama embeddings, compared by cosine similarity (requires Ollama) |
| Hybrid (default) | `mem "query"` | Both keyword + semantic, merged with Reciprocal Rank Fusion (k=60). Falls back to keyword-only if Ollama is unavailable |
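Reciprocal Rank Fusion itself is only a few lines. The sketch below shows the standard formula, where each ranked list contributes `1 / (k + rank)` to an item's score, with `k=60` as in the table. It is illustrative Python; `rrf_merge` and the tie-breaking rule are assumptions, not Recall's implementation.

```python
from collections import defaultdict


def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists; each list adds 1 / (k + rank) to an item's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Higher fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


keyword = ["a", "b", "c"]
semantic = ["b", "d", "a"]
print(rrf_merge([keyword, semantic]))  # ['b', 'a', 'd', 'c']
print(rrf_merge([keyword]))            # ['a', 'b', 'c']
```

The second call mirrors the keyword-only fallback: with a single input list, fusion returns that list's order unchanged.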
Architecture deep-dive: database tables, FTS5 indexes, extraction pipeline details
- Auto-captured session memory: extracted incrementally (Stop hook on every turn) via Claude Haiku, with the `BatchExtract.ts` cron sweeper as a crash-recovery safety net
- MCP server (`mem-mcp`): `memory_search`, `memory_hybrid_search`, `memory_recall`, `memory_add`, `memory_dump`, `context_for_agent` exposed to your agent mid-session
- Hybrid search: FTS5 keyword search + optional Ollama embeddings, fused via Reciprocal Rank Fusion. Lose Ollama, lose nothing; the keyword path keeps working
- Tiered SessionRecall (v0.7.0+): L0 identity (`~/.claude/MEMORY/identity.md`) + L1 top 12 records ranked by importance, with 4 reserved slots for curated Library of Alexandria entries. L2/L3 fetched on demand
- Importance scoring (1-10): every record carries an importance score that drives what surfaces in L1. Manage with `mem pin` / `mem unpin` / `mem importance backfill`
- PreCompact flush: `SessionPreCompact.ts` writes in-flight messages to SQLite before Claude compacts its context window, so the squashed chunk is never lost
- Decision lifecycle: `mem decision supersede/revert` tracks when a decision was replaced or rolled back; confidence scoring (high/medium/low) on every decision and learning
- Cross-host ingestion: OpenCode plugin and Pi extension drop sessions into `~/.claude/MEMORY/{opencode,pi}-sessions/`; BatchExtract pulls them into the same SQLite DB. One memory layer across agents
- Library of Alexandria: curated knowledge entries (session distillations, imported docs, telos goals, quotes) with Fabric `extract_wisdom` analysis. Default importance 8; these get reserved L1 slots
- TELOS integration (PAI users): `TelosSync.ts` auto-imports your TELOS framework files (goals, mission, projects, strategies) from PAI's `USER/TELOS/` directory on every session start. Changes are detected by mtime; unchanged files are skipped. Manual import: `mem telos import --yes`
- Breadcrumbs, decisions, learnings: three structured record types for non-session memory, addable from CLI (`mem add`), MCP (`memory_add`), or slash commands (`/Recall:add`)
- Benchmark harness: `mem benchmark run B` measures wake-up context efficiency against locked baselines so regressions are visible
- Onboarding: `mem onboard` runs a 7-question interview that writes your L0 identity file
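The mtime-based change detection mentioned for TELOS sync follows a common pattern: compare each file's modification time against the timestamp of the last import. A minimal sketch in illustrative Python, where `files_newer_than` is a hypothetical helper and the non-recursive directory scan is an assumption:

```python
from pathlib import Path


def files_newer_than(directory: Path, last_import_ts: float) -> list[Path]:
    """Return files in `directory` modified after `last_import_ts` (epoch seconds)."""
    if not directory.is_dir():
        return []  # nothing to sync, e.g. the directory does not exist
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.stat().st_mtime > last_import_ts
    )
```

An empty result means every file is unchanged and the import can be skipped, which is also what happens when the directory is missing.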
Suite B measures the byte cost of session-start memory injection. Latest tracked run (2026-04-18, scope atlas-recall):
| Variant | Chars | Tokens (est, 4 ch/tok) |
|---|---|---|
| v2 tiered SessionRecall (L0 + L1 top 12) | 5,306 | ~1,327 |
| v1 flat-blob SessionRecall (simulated) | 8,020 | ~2,005 |
| CLAUDE.md static baseline | 8,760 | ~2,190 |
v2 is about 34% smaller than v1 on this corpus (put the other way, v1 is about 51% larger than v2). CLAUDE.md is hand-written static context; Recall is auto-extracted dynamic memory. The two are complementary, not competitors. Numbers scale with your own DB and L0 identity; reproduce with `mem benchmark run B`. Methodology and caveats live in `benchmarks/README.md`.
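The table's derived figures can be rechecked from the character counts alone. The sketch below is illustrative Python, not part of the benchmark harness; rounding up is an assumption that happens to match the table's token column.

```python
import math


def estimate_tokens(chars: int, chars_per_token: int = 4) -> int:
    """Rough token estimate, rounded up (the table's 4 chars/token heuristic)."""
    return math.ceil(chars / chars_per_token)


def percent_smaller(new: int, old: int) -> float:
    """How much smaller `new` is than `old`, as a percentage of `old`."""
    return (old - new) / old * 100


v2, v1, claude_md = 5_306, 8_020, 8_760
print(estimate_tokens(v2))                       # 1327
print(round(percent_smaller(v2, v1), 1))         # 33.8
print(round(percent_smaller(v2, claude_md), 1))  # 39.4
```

Measured from the other side, v1 is about 51% larger than v2; both figures describe the same 2,714-character gap.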
```bash
mem "kubernetes auth"          # Search your memory
mem onboard                    # Seed your L0 identity tier (one-time)
mem dump "Session Title"       # Save this session
mem add decision "Use X" ...   # Record a decision
mem decision list              # List decisions with status and confidence
mem pin decisions 42           # Pin a record to high importance
mem benchmark run B            # Measure wake-up context efficiency
mem prune                      # Preview stale records for removal
mem stats                      # See what's stored
mem doctor                     # Health check
```

If you're an AI agent reading this repository:
| What you need | Where to find it |
|---|---|
| Using Recall from Claude Code (MCP tools, CLI, core rules) | FOR_CLAUDE.md |
| Using Recall from OpenCode | FOR_OPENCODE.md |
| Using Recall from Pi | FOR_PI.md |
| Developing Recall (build, test, conventions) | CLAUDE.md |
Recall is built around two integration surfaces: MCP (memory search and add, available from inside the agent) and lifecycle hooks (auto-extraction, session-start context injection, pre-compact flushes). Different agents support different surfaces; the table below tracks where each one stands.
| Agent | MCP | Lifecycle hooks | Status |
|---|---|---|---|
| Claude Code | Yes | Yes: Stop · SessionStart · PreCompact | Stable (reference implementation) |
| Pi | Yes | Beta: recall-compaction + recall-extract extensions | In progress |
| OpenCode | Yes | Alpha: recall-extract plugin | In progress |
| Codex CLI | Not yet | Not yet | Coming soon |
| Gemini CLI | Not yet | Not yet | Coming soon |
Candidate: Cursor. Both `.cursor/hooks.json` and MCP are first-class, and the integration model maps cleanly onto Recall's existing hook architecture. Tracked but not started.
Have an agent you'd like to see supported? Open an issue. Recall is designed to be agent-agnostic, and any host that speaks MCP is a candidate.
| Guide | Description |
|---|---|
| Installation | Prerequisites, install, verify, session extraction |
| CLI Reference | All commands and options |
| MCP Tools | Tools available to AI agents |
| Architecture | Database, search, extraction pipeline |
| Slash Commands | /Recall:* commands for Claude Code |
| Upgrading | Update, backup, migration system |
| Troubleshooting | Common issues and fixes |
| Changelog | Release notes and breaking changes |
Ideas graciously borrowed from, and features inspired by:
- MemPalace: tiered session-start context, PreCompact hook, importance scoring
- Personal AI Infrastructure (PAI): TELOS framework integration
MIT





