A memory database for AI agents. Stores four record types (Memory, Entity, Statement, Relation) with explicit provenance, confidence, and bi-temporal validity. Hybrid retrieval (semantic + lexical + entity-graph + temporal) fused with weighted reciprocal rank fusion (RRF). One Rust core, one wire protocol, one schema. Apache 2.0.
    $ brain
    ─────────────────────────────────────────────────────────────────────────────
    ◉ brain-shell v0.1.0 · connected to 127.0.0.1:9090
      agent agent-019e433e (auto-minted)
      Type `help` for commands, `quit` to exit.
    ─────────────────────────────────────────────────────────────────────────────
    brain> encode "Had a difficult conversation with Alex about the project"
    ✓ ENCODED LSN 1 · s1/m1/v1 · 9 ms

    brain> recall "conflicts with Alex" --top-k 5
    # → ranked by semantic similarity, edge proximity, recency, and salience,
    #   not just vector distance.

    # With a schema declared, RECALL routes through the hybrid path:
    # semantic + lexical + entity-graph, fused via RRF.
    brain> recall "what's Priya working on?" --include-graph

- Why Brain
- What Brain stores
- Schemaless vs schema-declared
- Quickstart
- Cognitive operations
- Architecture in 30 seconds
- Performance targets
- Status
- Documentation
- Repository layout
- Tech stack
- Platform support
- Contributing
- License
## Why Brain

Today's agent stacks duct-tape four or five storage systems: a vector database for similarity, a graph database for relationships, a full-text store for keyword matching, an LLM extraction pipeline, plus an orchestration layer that pretends to keep them consistent. Half of that orchestration is reinventing transaction semantics across systems that don't agree on what "committed" means.
Brain collapses the stack into one Rust core with one wire protocol and one schema:
- Cognitive verbs, not CRUD. `encode` / `recall` / `plan` / `reason` / `forget` are the primitive operations.
- Hybrid retrieval out of the box. Three retrievers (semantic / lexical / graph) fused via weighted RRF, not "top-k by cosine."
- Provenance is first-class. Every typed claim carries an evidence list back to source memories, plus four bi-temporal timestamps.
- Predictable tail latency. Thread-per-core (Glommio + `io_uring`), single-writer-per-shard, lock-free reads, group-commit WAL.
- Apache 2.0, end to end. No premium edition, no SaaS lock-in. The typed graph, the extractor pipeline, the reranker, the schema DSL, and the SDKs are all in the open repo.
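
The weighted RRF fusion named in the list above can be sketched as follows. This is a minimal illustration that assumes the textbook reciprocal rank fusion formula with a per-retriever weight; the names `RankedList` and `weighted_rrf`, the constant `k`, and the weights are hypothetical and are not taken from Brain's code or spec.

```rust
use std::collections::HashMap;

/// One retriever's output: ids in rank order, best first (hypothetical type).
pub struct RankedList<'a> {
    pub weight: f64,
    pub ids: &'a [&'a str],
}

/// Weighted reciprocal rank fusion:
/// score(id) = sum over lists of weight / (k + rank), with rank starting at 1.
/// A larger `k` flattens the advantage of top-ranked items.
pub fn weighted_rrf(lists: &[RankedList<'_>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (i, id) in list.ids.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += list.weight / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

An item that appears in several lists accumulates score from each one, so a memory ranked second by two retrievers can outrank one ranked first by a single retriever. That is the practical difference from plain "top-k by cosine."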
The architectural justification, the five design wedges, and the comparison with adjacent systems (Pinecone, Qdrant, Neo4j, Mem0, Letta, Zep) are in spec/01_architecture/.
## What Brain stores

Four record types, one database:
| Record | What it is | Example |
|---|---|---|
| Memory | Raw experience — text + 384-dim embedding + salience + edges + provenance | "Alex pushed the deadline to next Friday" |
| Entity | Canonical noun with a stable UUIDv7 identity, alias list, and typed attributes | Person(canonical_name="Alex Chen", aliases=["Alex"]) |
| Statement | Typed claim about entities (Fact / Preference / Event) with confidence and bi-temporal validity | Event(subject=alex, predicate=pushed, object="deadline to Friday", valid_from=t0, confidence=0.92) |
| Relation | Typed binary edge between entities, with cardinality and evidence | reports_to(alex, priya) |
Entities, Statements, and Relations are derived from Memories by a three-tier extractor pipeline (pattern → GLiNER classifier → LLM with prompt cache) when a schema is declared.
The full data model is in spec/02_data_model/.
## Schemaless vs schema-declared

The same Brain binary serves both modes; the runtime gate is whether the per-shard `SCHEMA_ACTIVE_VERSIONS_TABLE` is empty.
| Mode | What's active | Use it when |
|---|---|---|
| Schemaless (default) | Memory record only. ENCODE, RECALL, PLAN, REASON, FORGET, SUBSCRIBE, TXN_* over vector memory. | Prototyping; semantic memory without typed-graph overhead; small agents. |
| Schema-declared | All of the above + typed extraction + entity/statement/relation writes + hybrid retrieval over the typed graph. | Production agents that need provenance, temporal reasoning, supersession, or entity-anchored queries. |
A deployment can move in either direction. Declaring a schema after months of schemaless use kicks off a backfill. Schemaless is not a legacy or minimal mode; both postures are first-class.
The DSL is documented in spec/03_schema/. Example:
    namespace acme

    define entity_type Person {
        attributes {
            email: text optional unique
            team: text optional
            timezone: text optional
        }
    }

    define predicate prefers {
        kind: Preference
        object: Value<text>
    }

    define relation_type reports_to {
        from: Person
        to: Person
        cardinality: many-to-one
    }

    define extractor preferences {
        kind: llm
        target: statement Preference
        trigger: on encode where memory.kind = episodic
        model: "claude-haiku-4-5"
        confidence_threshold: 0.7
        cache: enabled
    }

## Quickstart

Requires: Docker, `@devcontainers/cli` (`npm install -g @devcontainers/cli`).

    git clone https://github.com/brain-db-io/brain-db
    cd brain-db
    just docker-up      # builds image, starts container, runs post-create
    just docker-shell   # bash inside the dev container

Inside the container:

    just verify                                                # fmt + build + clippy + test
    cargo run --bin brain-server -- --config config/dev.toml   # the daemon
    cargo run --bin brain                                      # the interactive shell
    cargo run --bin brain-cli -- stats                         # admin HTTP CLI

Three binaries:

- `brain-server`: the daemon (TCP + optional HTTP/WS/SSE).
- `brain`: interactive shell (REPL + one-shot wire-protocol verbs; the `psql` / `redis-cli` equivalent).
- `brain-cli`: admin CLI over HTTP (snapshots, audit, worker control).

A persistent agent identity is opt-in; by default each connection mints an ephemeral one:

    brain agent create work            # named agent persisted to ~/.config/brain/
    brain --agent work encode "..."    # use it

Full setup walkthrough: docs/tutorials/01-quickstart-docker.md.

## Cognitive operations

The verbs that drive Brain. Full semantics are in spec/05_operations/.
| Verb | What it does |
|---|---|
| ENCODE | Store an experience. Embeds the text, picks a slot, writes the WAL record, updates metadata, inserts into HNSW. With a schema declared, queues extractors. |
| RECALL | Find memories relevant to a cue. Blends similarity with salience, recency, and edge proximity. Routes through the hybrid retriever when a schema is declared. |
| PLAN | Construct a path from one cognitive state to another. Pull-based executor with budgets (steps, wall time, branches). |
| REASON | Multi-hop traversal explaining why X is connected to Y. Returns the path, evidence memories, and confidence. |
| FORGET | Soft (mark + grace period) or hard (zero the slot) tombstoning. Cascades to derived typed-graph records when a schema is active. |
| LINK / UNLINK | Manually assert / retract a typed edge between two memories. |
| SUBSCRIBE | Stream events: memory created, statement created, extractor failed, schema updated, etc. |
| TXN_BEGIN / TXN_COMMIT / TXN_ABORT | Group multiple operations into one atomic unit. |
One-shot mode (each invocation runs a single verb and exits):

    brain encode "Alex pushed the deadline to next Friday"
    brain recall "when did Alex change the deadline?" --top-k 5 --include-text
    brain plan "current sprint state" "feature shipped" --max-steps 8
    brain forget s1/m18/v1 --mode soft

Or the same inside the REPL, with no `brain` prefix:

    brain> encode "Alex pushed the deadline to next Friday"
    brain> recall "when did Alex change the deadline?" --top-k 5
    brain> reason "Alex changed the deadline" --depth 3
    brain> subscribe --kind episodic --collect 10

Encoding the same content twice is a no-op by default — pass --allow-duplicate to write a fresh copy.
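
The soft and hard FORGET modes in the verb table can be pictured with a small state sketch. It assumes a 7-day grace period (the default quoted in this README's invariants) and treats a zeroed slot as immediately reclaimable; the `Slot`, `SlotState`, and `ForgetMode` types are hypothetical and simplify away the WAL record and the cascade to typed-graph records.

```rust
use std::time::Duration;

/// Default tombstone grace period: 7 days.
pub const DEFAULT_GRACE: Duration = Duration::from_secs(7 * 24 * 60 * 60);

#[derive(Debug, PartialEq)]
pub enum SlotState {
    Live,
    Tombstoned { at_secs: u64 },
    Zeroed,
}

pub enum ForgetMode {
    Soft,
    Hard,
}

/// Hypothetical in-memory stand-in for one arena slot.
pub struct Slot {
    pub bytes: Vec<u8>,
    pub state: SlotState,
}

impl Slot {
    pub fn new(bytes: Vec<u8>) -> Self {
        Slot { bytes, state: SlotState::Live }
    }

    /// Soft: mark the slot and keep its bytes. Hard: zero the bytes right away.
    pub fn forget(&mut self, mode: ForgetMode, now_secs: u64) {
        match mode {
            ForgetMode::Soft => self.state = SlotState::Tombstoned { at_secs: now_secs },
            ForgetMode::Hard => {
                self.bytes.fill(0);
                self.state = SlotState::Zeroed;
            }
        }
    }

    /// A soft tombstone becomes reclaimable only once the grace period has elapsed.
    pub fn reclaimable(&self, now_secs: u64, grace: Duration) -> bool {
        match &self.state {
            SlotState::Live => false,
            SlotState::Tombstoned { at_secs } => {
                now_secs.saturating_sub(*at_secs) >= grace.as_secs()
            }
            SlotState::Zeroed => true,
        }
    }
}
```

With these assumptions, a slot soft-forgotten at time 0 still holds its bytes on day 6 and is eligible for reclamation from day 7 onward, while a hard-forgotten slot is all zeroes as soon as `forget` returns.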
## Architecture in 30 seconds

    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                  CLIENTS · SDKs (Rust · Python · TS · Go)                   │
    └────────────────────────────────────┬────────────────────────────────────────┘
                                         │  custom binary protocol over TCP
                                         │  rkyv structured + bytemuck raw vectors
    ┌────────────────────────────────────▼────────────────────────────────────────┐
    │  CONNECTION LAYER · Tokio · accept · TLS · frame validate · shard dispatch  │
    └────────────────────────────────────┬────────────────────────────────────────┘
                                         │  message channels, one per shard
                    ┌────────────────────┼────────────────────┐
                    ▼                    ▼                    ▼
             ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
             │   Shard 0    │     │   Shard 1    │     │   Shard N    │
             │  Glommio +   │     │  Glommio +   │     │  Glommio +   │
             │  io_uring    │     │  io_uring    │     │  io_uring    │
             │              │     │              │     │              │
             │  ┌────────┐  │     │  ┌────────┐  │     │  ┌────────┐  │
             │  │ arena  │  │     │  │ arena  │  │     │  │ arena  │  │
             │  │ WAL    │  │     │  │ WAL    │  │     │  │ WAL    │  │
             │  │ redb   │  │     │  │ redb   │  │     │  │ redb   │  │
             │  │ HNSW×3 │  │     │  │ HNSW×3 │  │     │  │ HNSW×3 │  │
             │  │ tantivy│  │     │  │ tantivy│  │     │  │ tantivy│  │
             │  └────────┘  │     │  └────────┘  │     │  └────────┘  │
             │              │     │              │     │              │
       │ Single writer per shard. Lock-free reads via ArcSwap + crossbeam. │
             └──────┬───────┘     └──────┬───────┘     └──────┬───────┘
                    │                    │                    │
                    └────────────────────┼────────────────────┘
                                         ▼
                  BACKGROUND WORKERS (per-shard, dedicated cores)
                   decay · consolidation · HNSW maintenance · GC
               schema-declared: extractors · text indexer · sweepers

Two runtimes, one host. Connection layer on Tokio (many tasks, accept TCP, decode 32-byte frame, dispatch). Shard layer on Glommio (thread-per-core, io_uring, single writer per shard). The two communicate via channels carrying messages — per-shard data never crosses the boundary.
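
The channel boundary between the two layers can be illustrated without either runtime. The sketch below uses plain `std::thread` and `std::sync::mpsc` as stand-ins for Tokio and Glommio, and routes by `agent_id % shard_count`; the routing rule, the `ShardMsg` type, and the reply shape are assumptions made for illustration, not Brain's actual dispatch logic.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Messages crossing the boundary. Shard state itself never leaves the shard thread.
pub enum ShardMsg {
    Encode { text: String, reply: Sender<(usize, usize)> },
    Shutdown,
}

/// Spawn one thread per shard, each owning its data and draining its own channel.
pub fn spawn_shards(n: usize) -> (Vec<Sender<ShardMsg>>, Vec<JoinHandle<()>>) {
    let mut senders = Vec::new();
    let mut handles = Vec::new();
    for shard_id in 0..n {
        let (tx, rx) = channel::<ShardMsg>();
        handles.push(thread::spawn(move || {
            // Single writer: only this thread ever touches `memories`.
            let mut memories: Vec<String> = Vec::new();
            for msg in rx {
                match msg {
                    ShardMsg::Encode { text, reply } => {
                        memories.push(text);
                        // Reply with (shard id, number of memories held).
                        let _ = reply.send((shard_id, memories.len()));
                    }
                    ShardMsg::Shutdown => break,
                }
            }
        }));
        senders.push(tx);
    }
    (senders, handles)
}

/// Hypothetical routing rule: pick a shard from the agent id.
pub fn route(agent_id: u64, shard_count: usize) -> usize {
    (agent_id % shard_count as u64) as usize
}
```

Because each shard's data lives inside its thread's closure, the connection side can only reach it by sending a message, which is the property the paragraph above describes.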
Six data structures per shard:
| Structure | Role | Spec |
|---|---|---|
| Arena | mmap'd file of 1600-byte slots (1536 vector + 64 metadata/padding) | spec/08_storage/01_arena.md |
| WAL | Per-shard append-only log; `O_DIRECT` + `pwritev2(RWF_DSYNC)` group commit | spec/08_storage/02_wal.md |
| redb | Embedded ACID B-tree for metadata + typed-graph tables | spec/10_metadata/02_table_layout.md |
| HNSW × 3 | Memory M=16, ef_c=200, ef_s=64; Entity M=16, ef_c=100, ef_s=64; Statement M=32, ef_c=200, ef_s=128 | spec/09_indexing/01_hnsw_basics.md |
| tantivy × 2 | BM25 over memory text + statement text | spec/10_metadata/06_tantivy_layout.md |
| LLM cache | Separate redb for extractor responses with TTL | spec/11_extractors/06_prompt_caching.md |
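
The arena slot size in the table follows from the embedding width: 384 dimensions × 4 bytes = 1536 bytes, plus 64 bytes of metadata and padding, gives 1600. The sketch below works through that arithmetic. It assumes `f32` components stored little-endian, the vector placed at the start of the slot, and no file header before slot 0; none of those layout details are confirmed by this README, and the function names are hypothetical.

```rust
pub const EMBEDDING_DIM: usize = 384;
pub const VECTOR_BYTES: usize = EMBEDDING_DIM * std::mem::size_of::<f32>(); // 1536
pub const METADATA_BYTES: usize = 64;
pub const SLOT_BYTES: usize = VECTOR_BYTES + METADATA_BYTES; // 1600

/// Byte offset of a slot, assuming slots are packed from offset 0.
pub fn slot_offset(slot_index: usize) -> Option<usize> {
    slot_index.checked_mul(SLOT_BYTES)
}

/// Decode the vector portion of a slot; `None` if the slot lies outside the arena.
pub fn read_vector(arena: &[u8], slot_index: usize) -> Option<Vec<f32>> {
    let start = slot_offset(slot_index)?;
    let end = start.checked_add(VECTOR_BYTES)?;
    let bytes = arena.get(start..end)?;
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Fixed-size slots make the address of any memory a single multiplication, so a lookup by slot index needs no secondary index.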
Seven non-negotiable invariants (from spec/08_storage/00_purpose.md and CLAUDE.md §5):
- WAL-before-acknowledge: no operation returns success until its WAL record is fsynced.
- Single writer per shard: no locks needed; the discipline enforces it.
- CRC everywhere: every WAL record, every arena slot. Reads verify; mismatches halt.
- Slot version on `MemoryId`: encoded in the ID; stale references → `NotFound`.
- Idempotency by `RequestId`: same params → cached response; different params → `Conflict`.
- Tombstone grace before reclamation: default 7 days. Hard FORGET zeroes immediately.
- No silent corruption: fail-stop and alert. Never return wrong data.
Tested per spec/19_benchmarks/01_correctness_and_durability.md.
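
The idempotency invariant (same `RequestId` and same params returns the cached response; same `RequestId` with different params is a conflict) can be sketched as a small lookup table. The `IdempotencyCache` and `Outcome` types, the `u128` request id, and the string encoding of params are hypothetical; a real implementation would also need eviction and persistence, which are omitted here.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Outcome {
    Executed(String),
    Cached(String),
    Conflict,
}

/// Maps a request id to the params it was first seen with and the response produced.
#[derive(Default)]
pub struct IdempotencyCache {
    seen: HashMap<u128, (String, String)>,
}

impl IdempotencyCache {
    /// Runs `run` at most once per request id.
    pub fn handle<F>(&mut self, request_id: u128, params: &str, run: F) -> Outcome
    where
        F: FnOnce() -> String,
    {
        match self.seen.get(&request_id) {
            // Same id, same params: replay the stored response without re-executing.
            Some((seen_params, response)) if seen_params.as_str() == params => {
                Outcome::Cached(response.clone())
            }
            // Same id, different params: refuse rather than guess.
            Some(_) => Outcome::Conflict,
            None => {
                let response = run();
                self.seen
                    .insert(request_id, (params.to_string(), response.clone()));
                Outcome::Executed(response)
            }
        }
    }
}
```

A client that retries after a dropped connection can resend the same request id and receive the original response, and the operation is not applied twice.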
For the layered architecture diagram (seven internal layers from L1 connection through L7 sharding) and the full design wedges, see spec/01_architecture/04_layers.md and spec/01_architecture/07_wedges_and_roadmap.md.
## Performance targets

Hard targets from spec/01_architecture/05_hardware_and_targets.md §7 and spec/19_benchmarks/02_performance_targets.md. Single shard, warm, reference hardware (16-core x86_64 / 64 GB RAM / NVMe SSD):
| Operation | p50 | p99 |
|---|---|---|
| `ENCODE` (text, CPU embedding) | ≤ 12 ms | ≤ 25 ms |
| `ENCODE` (text, GPU embedding) | ≤ 3 ms | ≤ 8 ms |
| `ENCODE_VECTOR_DIRECT` (pre-supplied vector) | ≤ 1 ms | ≤ 5 ms |
| `RECALL` (top-k = 10, schemaless) | ≤ 8 ms | ≤ 20 ms |
| `RECALL` (top-k = 10, hybrid path) | ≤ 10 ms | ≤ 50 ms |
| `FORGET` | ≤ 3 ms | ≤ 10 ms |
| `PLAN` (simple) | ≤ 50 ms | ≤ 200 ms |
| `REASON` | ≤ 100 ms | ≤ 500 ms |
Brain optimizes for predictable tails, not minimum averages. The combined acceptance suite at spec/19_benchmarks/06_complete_acceptance.md is the v1.0 release gate.
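
Checking a latency sample against a p50/p99 pair from the table can be sketched as below. The sketch uses the nearest-rank percentile definition, which is an assumption; the benchmark spec may define percentiles differently, and the function names are hypothetical.

```rust
/// Nearest-rank percentile: the smallest sample such that at least p percent
/// of all samples are less than or equal to it. Returns `None` for empty
/// input or a percentile outside 0..=100.
pub fn percentile(samples_ms: &[f64], p: f64) -> Option<f64> {
    if samples_ms.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}

/// True when both the median and the tail are within their targets.
pub fn meets_target(samples_ms: &[f64], p50_ms: f64, p99_ms: f64) -> bool {
    match (percentile(samples_ms, 50.0), percentile(samples_ms, 99.0)) {
        (Some(p50), Some(p99)) => p50 <= p50_ms && p99 <= p99_ms,
        _ => false,
    }
}
```

Under this definition, 100 schemaless RECALL samples with two 30 ms outliers fail the 20 ms p99 target even though the median is 5 ms, which is the sense in which the targets constrain tails and not averages.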
## Status

Pre-release (v0.1.0). No external users. The wire protocol, redb tables, and schema model are still in flux. Until v1.0 ships, breaking changes happen in place without back-compat shims.
The v1.0 release ships when the combined acceptance suite passes — functional, performance, storage, operational, and schemaless mode tests, end-to-end.
For the high-level phase index, see ROADMAP.md. For per-phase sub-task breakdowns, see docs/development/phases/.
## Documentation

| Topic | Location |
|---|---|
| Specification (153 files, 20 sections, normative) | spec/ |
| Spec entry point + glossary + doc map | spec/00_overview/ |
| System architecture + design wedges | spec/01_architecture/ |
| Data model (Memory / Entity / Statement / Relation) | spec/02_data_model/ |
| Wire protocol (frames + opcodes + handshake) | spec/04_wire_protocol/ |
| Schema DSL grammar | spec/03_schema/ |
| Acceptance gate for v1.0 | spec/19_benchmarks/06_complete_acceptance.md |
| Tutorials | docs/tutorials/ |
| Setup, CLI tour, SDK tour, troubleshooting | docs/development/usage/ |
| CLI reference (`brain` shell + `brain-cli` admin) | docs/reference/ |
| Roadmap (phase index) | ROADMAP.md |
| Project context for AI-assisted dev | CLAUDE.md |
| Autonomous-mode operating rules | AUTONOMY.md |
## Repository layout

    brain/
    ├── crates/
    │   ├── brain-core/        Shared types: MemoryId, EdgeKind, Error, EntityId, ...
    │   ├── brain-protocol/    Wire protocol: frame, opcodes, codec, schema DSL parser
    │   ├── brain-storage/     Arena + WAL + recovery
    │   ├── brain-metadata/    redb wrapper: memory + entity + statement + relation tables
    │   ├── brain-index/       HNSW × 3 + tantivy
    │   ├── brain-embed/       BGE embedding service
    │   ├── brain-planner/     Query planner + executor
    │   ├── brain-ops/         One write path + retrievers + extractor writes
    │   ├── brain-workers/     Background workers (decay, consolidation, extractors, …)
    │   ├── brain-extractors/  Pattern + classifier extractors
    │   ├── brain-llm/         LLM client + cache + budget
    │   ├── brain-http/        HTTP/WS/SSE transport
    │   ├── brain-server/      Server binary
    │   ├── brain-sdk-rust/    Rust SDK
    │   └── brain-cli/         Admin CLI
    ├── spec/                  The 153-file specification (authoritative)
    ├── docs/                  Tutorials, references, development guides
    ├── ROADMAP.md             Implementation phase index
    ├── CLAUDE.md              Project context
    └── AUTONOMY.md            Autonomous-mode operating rules

## Tech stack

Pinned in the workspace `Cargo.toml`. New dependencies require commit-message justification.
| Component | Crate |
|---|---|
| Async runtime (shards) | glommio — thread-per-core, io_uring |
| Async runtime (connection layer) | tokio |
| Wire encoding | rkyv + bytemuck |
| Metadata store | redb |
| ANN index | hnsw_rs |
| Lexical index | tantivy |
| Embedding inference | candle + tokenizers |
| SIMD math | matrixmultiply + wide |
| Lock-free swap | arc-swap |
| Epoch GC | crossbeam-epoch |
| CRC | crc32c |
| UUIDs (v7) | uuid |
| Errors | thiserror + anyhow |
| Telemetry | tracing + opentelemetry |
## Platform support

Linux only. Kernel ≥ 5.15 (for stable io_uring). macOS and Windows are not supported; use the supplied dev container for local development on those platforms.
Brain depends on Linux-specific I/O facilities: `io_uring`, `O_DIRECT`, `madvise` (`MADV_RANDOM`, `MADV_DONTDUMP`), and `fallocate(FALLOC_FL_KEEP_SIZE)`. Abstracting these would either leak platform differences in tail latency or bloat the codebase with multiple backends. For a system whose value proposition is latency, one optimized backend wins.
CPU: x86_64 with SSE 4.2 or ARM64 with the CRC32 extension. AVX2 / NEON used opportunistically. Full hardware envelope in spec/01_architecture/05_hardware_and_targets.md.
## Contributing

Brain is pre-release. The wire protocol, on-disk formats, and schema model still change without back-compat shims. Until v1.0:

- Spec changes go through the project owner. Code disagreements with the spec are fixed by changing the code.
- The seven invariants in `spec/08_storage/00_purpose.md` and CLAUDE.md §5 are non-negotiable.
- New dependencies require commit-message justification; the approved set is in `Cargo.toml`.

CI (`.github/workflows/ci.yml`) is the authoritative test gate. Run `just verify` locally before pushing; it runs fmt + build + clippy -D warnings + test.
By submitting a pull request, you agree your contribution is licensed under the Apache-2.0 terms (per Apache-2.0 §5).
## License

Apache-2.0. Source code, spec, and documentation are all under the same license.
Repository: https://github.com/brain-db-io/brain-db