Commit 64178d8

phernandez and claude committed
docs: add semantic search guide
Covers configuration, embedding providers (FastEmbed/OpenAI), search modes (text/vector/hybrid), the reindex command, chunking strategy, RRF hybrid fusion, and database backend details (sqlite-vec/pgvector).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
1 parent 5414de5 commit 64178d8

1 file changed

Lines changed: 215 additions & 0 deletions

docs/semantic-search.md

@@ -0,0 +1,215 @@
# Semantic Search

This guide covers Basic Memory's optional semantic (vector) search feature, which adds meaning-based retrieval alongside the existing full-text search.

## Overview

Basic Memory's default search uses full-text search (FTS): keyword matching with boolean operators. Semantic search adds vector embeddings that capture the *meaning* of your content, enabling:

- **Paraphrase matching**: Find "authentication flow" when searching for "login process"
- **Conceptual queries**: Search for "ways to improve performance" and find notes about caching, indexing, and optimization
- **Hybrid retrieval**: Combine the precision of keyword search with the recall of semantic similarity

Semantic search is **opt-in**: existing behavior is unchanged unless you enable it. It works on both SQLite (local) and Postgres (cloud) backends.

## Quick Start

1. Enable semantic search:

```bash
export BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED=true
```

2. Build vector embeddings for your existing content:

```bash
bm reindex --embeddings
```

3. Search using semantic modes:

```python
# Pure vector similarity
search_notes("login process", search_type="vector")

# Hybrid: combines FTS precision with vector recall (recommended)
search_notes("login process", search_type="hybrid")

# Traditional full-text search (still the default)
search_notes("login process", search_type="text")
```

## Configuration Reference

All settings are fields on `BasicMemoryConfig` and can be set via environment variables (prefixed with `BASIC_MEMORY_`).

| Config Field | Env Var | Default | Description |
|---|---|---|---|
| `semantic_search_enabled` | `BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED` | `false` | Enable semantic search. Required before vector/hybrid modes work. |
| `semantic_embedding_provider` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER` | `"fastembed"` | Embedding provider: `"fastembed"` (local) or `"openai"` (API). |
| `semantic_embedding_model` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_MODEL` | `"bge-small-en-v1.5"` | Model identifier. Auto-adjusted per provider if left at default. |
| `semantic_embedding_dimensions` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_DIMENSIONS` | Auto-detected | Vector dimensions: 384 for the default FastEmbed model, 1536 for the default OpenAI model. Override only if using a non-default model. |
| `semantic_embedding_batch_size` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_BATCH_SIZE` | `64` | Number of texts to embed per batch. |
| `semantic_vector_k` | `BASIC_MEMORY_SEMANTIC_VECTOR_K` | `100` | Candidate count for vector nearest-neighbour retrieval. Higher values improve recall at the cost of latency. |

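
The naming rule in the table (field name, upper-cased, with the `BASIC_MEMORY_` prefix) and the per-provider dimension defaults can be sketched in a few lines. This is only an illustration, not Basic Memory's actual config loader: `env_var_for` and `resolve_dimensions` are hypothetical helpers, and the fallback logic is an assumption based on the "Auto-detected" default described above.

```python
import os

ENV_PREFIX = "BASIC_MEMORY_"

# Default dimensions per provider, taken from the table above.
DEFAULT_DIMENSIONS = {"fastembed": 384, "openai": 1536}


def env_var_for(field_name: str) -> str:
    """Map a config field name to its environment variable name (hypothetical helper)."""
    return ENV_PREFIX + field_name.upper()


def resolve_dimensions(environ=os.environ) -> int:
    """Return the explicit dimension override if set, else the provider default.

    Assumption: this mirrors the "Auto-detected" behaviour only approximately.
    """
    override = environ.get(env_var_for("semantic_embedding_dimensions"))
    if override:
        return int(override)
    provider = environ.get(env_var_for("semantic_embedding_provider"), "fastembed")
    return DEFAULT_DIMENSIONS[provider]
```

For example, `resolve_dimensions({"BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER": "openai"})` falls through to the OpenAI default, while an explicit `BASIC_MEMORY_SEMANTIC_EMBEDDING_DIMENSIONS` value always wins.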
## Embedding Providers

### FastEmbed (default)

FastEmbed runs entirely locally using ONNX models: no API key, no network calls, no cost.

- **Model**: `BAAI/bge-small-en-v1.5`
- **Dimensions**: 384
- **Tradeoff**: Smaller model, fast inference, good quality for most use cases

```bash
# FastEmbed is the default, so enabling semantic search is enough
export BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED=true
```

### OpenAI

Uses OpenAI's embeddings API for higher-dimensional vectors. Requires an API key.

- **Model**: `text-embedding-3-small`
- **Dimensions**: 1536
- **Tradeoff**: Higher quality embeddings, requires API calls and an OpenAI key

```bash
export BASIC_MEMORY_SEMANTIC_SEARCH_ENABLED=true
export BASIC_MEMORY_SEMANTIC_EMBEDDING_PROVIDER=openai
export OPENAI_API_KEY=sk-...
```

When switching from FastEmbed to OpenAI (or vice versa), you must rebuild embeddings because the vector dimensions differ (384 vs. 1536):

```bash
bm reindex --embeddings
```

## Search Modes

### `text` (default)

Full-text keyword search using FTS5 (SQLite) or tsvector (Postgres). Supports boolean operators (`AND`, `OR`, `NOT`), phrase matching, and prefix wildcards.

```python
search_notes("project AND planning", search_type="text")
```

This is the existing default and does not require semantic search to be enabled.

### `vector`

Pure semantic similarity search. Embeds your query and finds the nearest content vectors. Good for conceptual or paraphrase queries where exact keywords may not appear in the content.

```python
search_notes("how to speed up the app", search_type="vector")
```

Returns results ranked by cosine similarity. Individual observations and relations surface as first-class results, not collapsed into parent entities.

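
To make "ranked by cosine similarity" concrete, here is a minimal brute-force sketch in plain Python. It is not how Basic Memory performs the lookup (the vector tables in sqlite-vec or pgvector handle that); `cosine_similarity` and `top_k` are hypothetical helpers, and the three-dimensional vectors are toy data standing in for real 384- or 1536-dimensional embeddings. The `k` parameter plays the role of `semantic_vector_k`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=100):
    """Return up to k (chunk_id, similarity) pairs, most similar first."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings: "auth-flow" and "login" point in nearly the same direction as the query.
chunks = {
    "auth-flow": [0.9, 0.1, 0.0],
    "caching": [0.0, 0.2, 0.9],
    "login": [0.8, 0.3, 0.1],
}
results = top_k([1.0, 0.2, 0.0], chunks, k=2)
```

With this toy data the two closest chunks are `auth-flow` and `login`, in that order, and `caching` is cut off by `k=2`.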
### `hybrid`

Combines FTS and vector results using reciprocal rank fusion (RRF). This is generally the best mode when you want both keyword precision and semantic recall.

```python
search_notes("authentication security", search_type="hybrid")
```

RRF merges the two ranked lists so that items appearing in both get a score boost, while items found by only one method still appear.

### When to Use Which

| Mode | Best For |
|---|---|
| `text` | Exact keyword matching, boolean queries, tag/category searches |
| `vector` | Conceptual queries, paraphrase matching, exploratory searches |
| `hybrid` | General-purpose search combining precision and recall |

## The Reindex Command

The `bm reindex` command rebuilds search indexes without dropping the database.

```bash
# Rebuild everything (FTS + embeddings if semantic is enabled)
bm reindex

# Only rebuild vector embeddings
bm reindex --embeddings

# Only rebuild the full-text search index
bm reindex --search

# Target a specific project
bm reindex -p my-project
```

### When You Need to Reindex

- **First enable**: After turning on `semantic_search_enabled` for the first time
- **Provider change**: After switching between `fastembed` and `openai`
- **Model change**: After changing `semantic_embedding_model`
- **Dimension change**: After changing `semantic_embedding_dimensions`

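
The four triggers above reduce to one check: either no embeddings exist yet, or one of provider, model, or dimensions has changed since they were built. A minimal sketch of that rule, for use in your own tooling, might look like the following. `EmbeddingSettings` and `needs_reindex` are hypothetical names; nothing here implies that Basic Memory tracks or enforces this automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSettings:
    """The settings that determine whether stored vectors are still valid."""
    provider: str
    model: str
    dimensions: int


def needs_reindex(previous, current):
    """True when no embeddings were built yet or any embedding setting changed."""
    return previous is None or previous != current


local = EmbeddingSettings("fastembed", "bge-small-en-v1.5", 384)
remote = EmbeddingSettings("openai", "text-embedding-3-small", 1536)
```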
The reindex command shows progress with embedded/skipped/error counts:

```
Project: main
Building vector embeddings...
✓ Embeddings complete: 142 entities embedded, 0 skipped, 0 errors

Reindex complete!
```

## How It Works

### Chunking

Each entity in the search index is split into semantic chunks before embedding:

- **Headers**: Markdown headers (`#`, `##`, etc.) start new chunks
- **Bullets**: Each bullet item (`-`, `*`) becomes its own chunk for granular fact retrieval
- **Prose sections**: Non-bullet text is merged up to ~900 characters per chunk
- **Long sections**: Oversized content is split with ~120-character overlap to preserve context at boundaries

Each search index item type (entity, observation, relation) is chunked independently, so observations and relations are embeddable as discrete facts.

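
The four chunking rules can be sketched as a small line-oriented splitter. This is a simplified illustration under stated assumptions, not Basic Memory's chunker: the function names are hypothetical, the 900 and 120 values are the approximate figures from the list above, and details such as joining prose lines with a single space are choices made for brevity.

```python
import re

MAX_CHARS = 900  # approximate prose chunk size from the list above
OVERLAP = 120    # approximate overlap for oversized sections

HEADER = re.compile(r"^#{1,6}\s")
BULLET = re.compile(r"^[-*]\s+")


def split_long(text, max_chars=MAX_CHARS, overlap=OVERLAP):
    """Split text into windows of max_chars that overlap by `overlap` characters."""
    if len(text) <= max_chars:
        return [text]
    step = max_chars - overlap
    return [text[i:i + max_chars] for i in range(0, len(text) - overlap, step)]


def chunk_markdown(text):
    """Return a list of chunk strings following the header/bullet/prose rules."""
    chunks, prose = [], []

    def flush():
        # Emit the accumulated prose (splitting it if oversized) and reset the buffer.
        if prose:
            chunks.extend(split_long(" ".join(prose)))
            prose.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if HEADER.match(stripped):
            flush()
            prose.append(stripped)  # the header opens the next chunk
        elif BULLET.match(stripped):
            flush()
            chunks.extend(split_long(stripped))  # each bullet is its own chunk
        else:
            # Start a new chunk if adding this line would exceed the size limit.
            if prose and len(" ".join(prose)) + len(stripped) + 1 > MAX_CHARS:
                flush()
            prose.append(stripped)
    flush()
    return chunks


doc = "# Coffee\nBrewing notes.\n- Water at 205F\n- Grind medium\n" + "x" * 2000
chunks = chunk_markdown(doc)
```

Here the header and its prose line form one chunk, each bullet becomes its own chunk, and the 2000-character run is split into three overlapping windows.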
### Deduplication

Each chunk has a `source_hash` (SHA-256 of the chunk text). On re-sync, unchanged chunks skip re-embedding entirely. This makes incremental updates fast: only modified content triggers API calls or model inference.

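
The skip logic amounts to comparing a fresh SHA-256 digest against the one stored at the last sync. A minimal sketch, assuming chunks are keyed by some stable identifier (the dictionary shapes and the `chunks_to_embed` helper are hypothetical; only the SHA-256 `source_hash` idea comes from the description above):

```python
import hashlib


def source_hash(chunk_text):
    """SHA-256 hex digest of the chunk text."""
    return hashlib.sha256(chunk_text.encode("utf-8")).hexdigest()


def chunks_to_embed(chunks, stored_hashes):
    """Return only the chunks whose text changed since the last sync.

    chunks: {chunk_key: text}; stored_hashes: {chunk_key: hash from last sync}.
    """
    pending = {}
    for key, text in chunks.items():
        if stored_hashes.get(key) != source_hash(text):
            pending[key] = text
    return pending


stored = {"note-1#0": source_hash("Water at 205F")}
current = {"note-1#0": "Water at 205F", "note-1#1": "Grind medium"}
pending = chunks_to_embed(current, stored)
```

Only `note-1#1` is returned for embedding; the unchanged chunk is skipped without touching the model or the API.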
### Hybrid Fusion

Hybrid search uses reciprocal rank fusion (RRF) to merge FTS and vector results:

1. Run FTS search to get keyword-ranked results
2. Run vector search to get similarity-ranked results
3. For each result, compute: `score = 1/(k + fts_rank) + 1/(k + vector_rank)` where `k = 60`; a result missing from one list contributes no term for that list
4. Sort by fused score

Items found by both methods get a natural score boost. Items found by only one method still appear but rank lower.

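
The four steps translate almost directly into code. The sketch below is a generic RRF implementation using the formula and `k = 60` from the list above; `rrf_fuse` is a hypothetical name, and treating ranks as 1-based positions is an assumption, since the guide does not say where ranks start.

```python
def rrf_fuse(fts_ids, vector_ids, k=60):
    """Fuse two ranked ID lists with reciprocal rank fusion.

    Ranks are 1-based list positions (an assumption). An ID missing from
    one list simply gets no contribution from that list.
    """
    scores = {}
    for ranked in (fts_ids, vector_ids):
        for rank, item_id in enumerate(ranked, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


fts = ["a", "b", "c"]     # keyword-ranked results
vector = ["b", "d", "a"]  # similarity-ranked results
fused = rrf_fuse(fts, vector)
```

Working the arithmetic by hand: `b` scores 1/62 + 1/61 ≈ 0.0325 and `a` scores 1/61 + 1/63 ≈ 0.0323, so both outrank `d` (1/62 ≈ 0.0161) and `c` (1/63 ≈ 0.0159), which each appear in only one list.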
### Observation-Level Results

Vector and hybrid modes return individual observations and relations as first-class search results, not just parent entities. This means a search for "water temperature for brewing" can surface the specific observation about 205°F without returning the entire "Coffee Brewing Methods" entity.

## Database Backends

### SQLite (local)

- **Vector storage**: [sqlite-vec](https://github.com/asg017/sqlite-vec) virtual table
- **Table creation**: At runtime when semantic search is first used; no migration needed
- **Embedding table**: `search_vector_embeddings` using `vec0(embedding float[N])` where N is the configured dimensions
- **Chunk metadata**: `search_vector_chunks` table stores chunk text, keys, and source hashes

The sqlite-vec extension is loaded per-connection. Vector tables are created lazily on first use.

### Postgres (cloud)

- **Vector storage**: [pgvector](https://github.com/pgvector/pgvector) with HNSW indexing
- **Chunk metadata table**: Created via Alembic migration (`search_vector_chunks` with `BIGSERIAL` primary key)
- **Embedding table**: `search_vector_embeddings` created at runtime (dimension-dependent, same pattern as SQLite)
- **Index**: HNSW index on the embedding column for fast approximate nearest-neighbour queries

The Alembic migration creates the dimension-independent chunks table. The embeddings table and HNSW index are deferred to runtime because they depend on the configured vector dimensions.
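
To illustrate why the embeddings table cannot live in a static migration, the sketch below builds the DDL strings for a given dimension count. It only generates SQL text and does not connect to a database. The `vec0(embedding float[N])` form comes from the SQLite section above, and `vector(N)` with `USING hnsw` is standard pgvector syntax. The `chunk_id` column, the index name, and the `vector_cosine_ops` operator class are assumptions for illustration, not Basic Memory's actual schema.

```python
def sqlite_embeddings_ddl(dimensions):
    """DDL for a sqlite-vec virtual table sized to the configured dimensions."""
    return (
        "CREATE VIRTUAL TABLE IF NOT EXISTS search_vector_embeddings "
        f"USING vec0(embedding float[{int(dimensions)}])"
    )


def postgres_embeddings_ddl(dimensions):
    """DDL statements for a pgvector table plus an HNSW index."""
    dims = int(dimensions)
    return [
        "CREATE TABLE IF NOT EXISTS search_vector_embeddings ("
        "chunk_id BIGINT PRIMARY KEY, "  # hypothetical column
        f"embedding vector({dims}))",
        "CREATE INDEX IF NOT EXISTS search_vector_embeddings_hnsw "  # hypothetical name
        "ON search_vector_embeddings USING hnsw (embedding vector_cosine_ops)",
    ]


fastembed_sqlite = sqlite_embeddings_ddl(384)
openai_postgres = postgres_embeddings_ddl(1536)
```

Because the dimension is baked into the column type, switching providers means the table has to be recreated with a new size, which is consistent with the reindex requirement described earlier.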
