@@ -407,39 +408,164 @@ A virtual table for performing hybrid semantic search.
407
408
SELECT * FROM memory_search WHERE query = 'search text';
408
409
```
409
410
410
-
**Columns:**
411
+
**Hidden filter columns (used in WHERE):**
412
+
| Column | Type | Required | Description |
413
+
|--------|------|----------|-------------|
414
+
|`query`| TEXT | Yes | The search query |
415
+
|`max_entries`| INTEGER | No | Override `max_results` setting for this query only |
416
+
|`context`| TEXT | No | Restrict results to a specific context label |
417
+
418
+
**Output columns:**
411
419
| Column | Type | Description |
412
420
|--------|------|-------------|
413
-
|`query`| TEXT (HIDDEN) | Search query (required in WHERE clause) |
414
421
|`hash`| INTEGER | Content hash identifier |
422
+
|`seq`| INTEGER | Chunk sequence number within the document (0-based) |
423
+
|`ranking`| REAL | Combined similarity score (0.0 - 1.0) |
415
424
|`path`| TEXT | Source file path or generated UUID for text content |
416
-
|`context`| TEXT | Context label (NULL if not set) |
417
425
|`snippet`| TEXT | Text snippet from the matching chunk |
418
-
|`ranking`| REAL | Combined similarity score (0.0 - 1.0) |
419
426
420
427
**Notes:**
421
428
- Requires sqlite-vector extension loaded first
422
429
- Performs hybrid search combining vector similarity and FTS5
423
430
- Results are ranked by combined score
424
-
- Limited by `max_results` setting (default: 20)
431
+
- Limited by `max_results` setting (default: 20), overridable per-query with `max_entries`
425
432
- Filtered by `min_score` setting (default: 0.7)
426
433
- Updates `last_accessed` timestamp if `update_access` is enabled
427
434
428
435
**Example:**
429
436
```sql
430
437
-- Basic search
431
-
SELECT * FROM memory_search WHERE query = 'database indexing strategies';
438
+
SELECT path, snippet, ranking FROM memory_search WHERE query = 'database indexing strategies';
432
439
433
440
-- Search with ranking filter
434
441
SELECT path, snippet, ranking
435
442
FROM memory_search
436
443
WHERE query = 'how to optimize queries'
437
444
AND ranking > 0.8;
438
445
439
-
-- Search within a specific context
440
-
SELECT * FROM memory_search
446
+
-- Restrict to a specific context
447
+
SELECT path, snippet, ranking
448
+
FROM memory_search
441
449
WHERE query = 'meeting action items'
442
450
AND context = 'meetings';
451
+
452
+
-- Override result limit for this query only
453
+
SELECT path, snippet, ranking
454
+
FROM memory_search
455
+
WHERE query = 'architecture overview'
456
+
AND max_entries = 5;
457
+
458
+
-- Get the chunk sequence number (useful for reconstructing document order)
459
+
SELECT path, seq, snippet, ranking
460
+
FROM memory_search
461
+
WHERE query = 'installation steps';
462
+
```
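
Because results come back ordered by `ranking`, rebuilding document order from `path` and `seq` needs a second sort on the application side. The following is a minimal sketch of that step in C, assuming the rows have already been copied out of `memory_search` into an array. The `search_hit_t` struct and both function names are hypothetical and not part of sqlite-memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the columns selected from memory_search. */
typedef struct {
    const char *path;    /* source file path */
    int seq;             /* 0-based chunk sequence within the document */
    double ranking;      /* combined similarity score */
    const char *snippet; /* text of the matching chunk */
} search_hit_t;

/* qsort comparator: group by path, then ascending chunk sequence. */
static int cmp_hit_doc_order(const void *a, const void *b) {
    const search_hit_t *x = a;
    const search_hit_t *y = b;
    int c = strcmp(x->path, y->path);
    if (c != 0) return c;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Reorder hits so that chunks of the same document appear in reading order. */
static void sort_hits_doc_order(search_hit_t *hits, size_t n) {
    assert(hits != NULL || n == 0);
    qsort(hits, n, sizeof *hits, cmp_hit_doc_order);
}
```

Sorting in application code keeps the SQL query unchanged; whether an `ORDER BY path, seq` clause on the virtual table would be preferable depends on how the results are consumed.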
463
+
464
+
---
465
+
466
+
## C API
467
+
468
+
In addition to the SQL interface, sqlite-memory exposes a C API for registering custom embedding providers directly from application code.
469
+
470
+
### `sqlite3_memory_register_provider`
471
+
472
+
```c
473
+
int sqlite3_memory_register_provider(
474
+
    sqlite3 *db,
475
+
    const char *provider_name,
476
+
    const dbmem_provider_t *provider
477
+
);
478
+
```
479
+
480
+
Registers a custom embedding engine for a specific database connection. Once registered, calling `memory_set_model(provider_name, model)` from SQL will use your engine instead of the built-in local or remote engines.
481
+
482
+
**Parameters:**
483
+
| Parameter | Type | Description |
484
+
|-----------|------|-------------|
485
+
|`db`|`sqlite3 *`| The database connection to register the provider on |
486
+
|`provider_name`|`const char *`| Name used to activate the provider via `memory_set_model()`|
487
+
|`provider`|`const dbmem_provider_t *`| Pointer to a struct containing the engine callbacks |
488
+
489
+
**Returns:** `SQLITE_OK` on success, or a SQLite error code.
490
+
491
+
**`dbmem_provider_t` struct:**
492
+
```c
493
+
typedef struct {
494
+
// Called when memory_set_model(provider_name, model) is executed.
495
+
// api_key is the value set via memory_set_apikey() (may be NULL).
496
+
// xdata is the user pointer from this struct.
497
+
// Return an opaque engine pointer on success, or NULL on error (fill err_msg).
int (*compute)(void *engine, const char *text, int text_len, void *xdata, dbmem_embedding_result_t *result);
503
+
504
+
// Free the engine. Called on context teardown or when the model changes.
505
+
// May be NULL if no cleanup is needed.
506
+
void (*free)(void *engine, void *xdata);
507
+
508
+
// Optional user-supplied pointer passed to all three callbacks.
509
+
void *xdata;
510
+
} dbmem_provider_t;
511
+
```
512
+
513
+
**`dbmem_embedding_result_t` struct:**
514
+
```c
515
+
typedef struct {
516
+
int n_tokens; // Number of tokens processed
517
+
int n_tokens_truncated; // Tokens that were truncated (0 if none)
518
+
int n_embd; // Embedding dimension
519
+
float *embedding; // Embedding vector (engine-owned, valid until next call or free)
520
+
} dbmem_embedding_result_t;
521
+
```
522
+
523
+
**Notes:**
524
+
- Works regardless of `DBMEM_OMIT_LOCAL_ENGINE` / `DBMEM_OMIT_REMOTE_ENGINE` compile flags
525
+
- The `embedding` buffer in `dbmem_embedding_result_t` must remain valid until the next `compute` call or `free` — it is engine-owned, not copied by the caller
526
+
- Only one custom provider can be registered per connection at a time; registering again replaces the previous one
527
+
- The provider struct is copied by value; the caller does not need to keep it alive after registration
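
As a rough illustration of the engine-owned buffer rule in the notes above, the sketch below implements a toy `compute` and `free` pair. Several things here are assumptions: `dbmem_embedding_result_t` is mirrored locally so the block compiles without the sqlite-memory header; `toy_engine_t`, `toy_engine_new`, `toy_compute`, `toy_free`, and the byte-histogram "embedding" are hypothetical; and returning 0 for success is a guess, since the return convention of `compute` is not stated here. The init callback is left out because its declaration does not appear in this excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local mirror of the result struct documented above, so the sketch
   compiles standalone. Real code would include the library header. */
typedef struct {
    int n_tokens;
    int n_tokens_truncated;
    int n_embd;
    float *embedding;
} dbmem_embedding_result_t;

#define TOY_EMBD 8 /* hypothetical embedding dimension */

/* Hypothetical engine state: it owns the output buffer. */
typedef struct {
    int n_embd;
    float *buffer;
} toy_engine_t;

/* Hypothetical constructor standing in for whatever init would return. */
static toy_engine_t *toy_engine_new(void) {
    toy_engine_t *e = malloc(sizeof *e);
    if (e == NULL) return NULL;
    e->n_embd = TOY_EMBD;
    e->buffer = calloc(TOY_EMBD, sizeof(float));
    if (e->buffer == NULL) { free(e); return NULL; }
    return e;
}

/* Same parameter list as the compute callback. Returns 0 on success
   (assumed convention). The result points into the engine's buffer,
   which is overwritten on the next call. */
static int toy_compute(void *engine, const char *text, int text_len,
                       void *xdata, dbmem_embedding_result_t *result) {
    toy_engine_t *e = engine;
    (void)xdata;
    assert(e != NULL && result != NULL);
    for (int i = 0; i < e->n_embd; i++) e->buffer[i] = 0.0f;
    /* Toy "embedding": count bytes per bucket. Not a real model. */
    for (int i = 0; i < text_len; i++)
        e->buffer[(unsigned char)text[i] % e->n_embd] += 1.0f;
    result->n_tokens = text_len; /* one byte treated as one token */
    result->n_tokens_truncated = 0;
    result->n_embd = e->n_embd;
    result->embedding = e->buffer; /* engine-owned, not copied */
    return 0;
}

/* Same parameter list as the free callback: releases the buffer and engine. */
static void toy_free(void *engine, void *xdata) {
    toy_engine_t *e = engine;
    (void)xdata;
    if (e == NULL) return;
    free(e->buffer);
    free(e);
}
```

In a real provider, functions like these would be assigned to the `compute` and `free` members of `dbmem_provider_t` before calling `sqlite3_memory_register_provider`.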
README.md
+33 -23 (33 additions & 23 deletions)
@@ -1,6 +1,8 @@
1
1
# SQLite Memory
2
2
3
-
A SQLite extension that gives AI agents persistent, searchable memory. Features hybrid semantic search (vector similarity + FTS5), markdown-aware chunking, and local embedding via llama.cpp. Memory databases can be synced between agents using **offline first technology** each agent works independently and syncs when connected, making it ideal for distributed AI systems, edge deployments, and collaborative agent architectures.
3
+
A SQLite extension that gives AI agents persistent, searchable memory, optimized for markdown content. Features hybrid semantic search (vector similarity + FTS5), markdown-aware chunking, and local embedding via llama.cpp.
4
+
5
+
Agent memory databases can be synchronized between agents using **offline-first technology** via [sqlite-sync](https://github.com/sqliteai/sqlite-sync). Each agent works independently and syncs when connected, making it ideal for distributed AI systems, edge deployments, and collaborative agent architectures.
4
6
5
7
## The Future of AI Agent Memory
6
8
@@ -33,10 +35,10 @@ sqlite-memory bridges these concepts, allowing any SQLite-powered application to
33
35
34
36
- **Hybrid Search**: Combines vector similarity (cosine distance) with FTS5 full-text search for superior retrieval
- **Intelligent Sync**: Content-hash change detection, unchanged files are skipped, modified files are atomically replaced, deleted files are cleaned up
37
-
- **Transactional Safety**: Every sync operation runs inside a SAVEPOINT transaction, either fully succeeds or fully rolls back, no partially-indexed content
38
+
- **Intelligent Sync**: Content-hash change detection skips unchanged files, atomically replaces modified ones, and cleans up deleted ones
39
+
- **Transactional Safety**: Every sync operation runs inside a SAVEPOINT transaction - either fully succeeds or fully rolls back, no partially-indexed content
38
40
- **Efficient Storage**: Binary embeddings with configurable dimensions
39
-
- **Embedding Cache**: Automatically caches computed embeddings so re-indexing the same text skips redundant API calls and computation
41
+
- **Embedding Cache**: Automatically caches computed embeddings, so re-indexing the same text skips redundant API calls and computation
40
42
- **Flexible Embedding**: Use local models (llama.cpp) or [vectors.space](https://vectors.space) remote API
41
43
42
44
## Architecture
@@ -63,14 +65,16 @@ sqlite-memory bridges these concepts, allowing any SQLite-powered application to
@@ -149,15 +154,21 @@ memories = recall("what's the project timeline")
149
154
150
155
All `memory_add_*` functions use content-hash change detection to avoid redundant work:
151
156
152
-
-**`memory_add_text`** — Computes a hash of the content. If the same content was already indexed, it is skipped entirely. No duplicate embeddings are ever created.
153
-
-**`memory_add_file`** — Reads the file and hashes its content. If the file was previously indexed with different content, the old entry (chunks, embeddings, FTS) is atomically replaced. Unchanged files are skipped.
154
-
-**`memory_add_directory`** — Performs a full two-phase sync:
157
+
- **`memory_add_text`**: Computes a hash of the content. If the same content was already indexed, it is skipped entirely. No duplicate embeddings are ever created.
158
+
- **`memory_add_file`**: Reads the file and hashes its content. If the file was previously indexed with different content, the old entry (chunks, embeddings, FTS) is atomically replaced. Unchanged files are skipped.
159
+
- **`memory_add_directory`**: Performs a full two-phase sync:
155
160
  1. **Cleanup**: Removes database entries for files that no longer exist on disk
156
-
2.**Scan**: Recursively processes all matching files — adding new ones, replacing modified ones, and skipping unchanged ones
161
+
  2. **Scan**: Recursively processes all matching files - adding new ones, replacing modified ones, and skipping unchanged ones
157
162
158
163
Every sync operation is wrapped in a SQLite SAVEPOINT transaction. If anything fails mid-sync (embedding error, disk issue, etc.), the entire operation rolls back cleanly. There is no risk of partially-indexed files or orphaned entries.
159
164
160
-
This makes all sync functions safe to call repeatedly — for example, on a cron schedule or at agent startup — with minimal overhead.
165
+
This makes all sync functions safe to call repeatedly - for example, on a cron schedule or at agent startup - with minimal overhead.
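
The add / replace / skip decision described above can be sketched in a few lines of C. This is an illustration under stated assumptions, not the extension's actual code: the source does not say which hash function sqlite-memory uses, so 64-bit FNV-1a is swapped in here, and `sync_action_t`, `fnv1a64`, and `decide_sync` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of comparing new content with the stored entry. */
typedef enum { SYNC_ADD, SYNC_REPLACE, SYNC_SKIP } sync_action_t;

/* 64-bit FNV-1a, used here only as a stand-in content hash. */
static uint64_t fnv1a64(const void *data, size_t len) {
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL; /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;            /* FNV prime */
    }
    return h;
}

/* stored_hash is NULL when the path has never been indexed.
   Otherwise it points at the hash recorded during the previous sync. */
static sync_action_t decide_sync(const uint64_t *stored_hash,
                                 const char *content, size_t len) {
    assert(content != NULL || len == 0);
    if (stored_hash == NULL) return SYNC_ADD;
    return (*stored_hash == fnv1a64(content, len)) ? SYNC_SKIP
                                                   : SYNC_REPLACE;
}
```

Since unchanged content resolves to `SYNC_SKIP` after a single hash pass, repeated calls cost little more than reading the files, which is what makes the cron and startup patterns cheap.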
166
+
167
+
## AI Agents Offline Syncing
168
+
169
+
Thanks to sqlite-sync, agents can share knowledge. Each markdown file added to the database is intelligently parsed and subdivided into chunks, and a [block-based LWW CRDT algorithm](https://github.com/sqliteai/sqlite-sync?tab=readme-ov-file#block-level-lww) keeps everything in sync. All memory, or just a specific memory context, can be kept in sync between agents.
@@ -259,17 +269,17 @@ MIT License - see [LICENSE](LICENSE) for details.
259
269
260
270
## Part of the SQLite AI Ecosystem
261
271
262
-
This project is part of the **SQLite AI** ecosystem, a collection of extensions that bring modern AI capabilities to the world’s most widely deployed database. The goal is to make SQLite the default data and inference engine for Edge AI applications.
272
+
This project is part of the **SQLite AI** ecosystem, a collection of extensions that bring modern AI capabilities to the world's most widely deployed database. The goal is to make SQLite the default data and inference engine for Edge AI applications.
263
273
264
274
Other projects in the ecosystem include:
265
275
266
-
-**[SQLite-AI](https://github.com/sqliteai/sqlite-ai)**— On-device inference and embedding generation directly inside SQLite.
267
-
-**[SQLite-Memory](https://github.com/sqliteai/sqlite-memory)**— Markdown-based AI agent memory with semantic search.
268
-
-**[SQLite-Vector](https://github.com/sqliteai/sqlite-vector)**— Ultra-efficient vector search for embeddings stored as BLOBs in standard SQLite tables.
269
-
-**[SQLite-Sync](https://github.com/sqliteai/sqlite-sync)**— Local-first CRDT-based synchronization for seamless, conflict-free data sync and real-time collaboration across devices.
270
-
-**[SQLite-Agent](https://github.com/sqliteai/sqlite-agent)**— Run autonomous AI agents directly from within SQLite databases.
271
-
-**[SQLite-MCP](https://github.com/sqliteai/sqlite-mcp)**— Connect SQLite databases to MCP servers and invoke their tools.
272
-
-**[SQLite-JS](https://github.com/sqliteai/sqlite-js)**— Create custom SQLite functions using JavaScript.
273
-
-**[Liteparser](https://github.com/sqliteai/liteparser)**— A highly efficient and fully compliant SQLite SQL parser.
276
+
- **[SQLite-AI](https://github.com/sqliteai/sqlite-ai)** - On-device inference and embedding generation directly inside SQLite.
277
+
- **[SQLite-Memory](https://github.com/sqliteai/sqlite-memory)** - Markdown-based AI agent memory with semantic search.
278
+
- **[SQLite-Vector](https://github.com/sqliteai/sqlite-vector)** - Ultra-efficient vector search for embeddings stored as BLOBs in standard SQLite tables.
279
+
- **[SQLite-Sync](https://github.com/sqliteai/sqlite-sync)** - Local-first CRDT-based synchronization for seamless, conflict-free data sync and real-time collaboration across devices.
280
+
- **[SQLite-Agent](https://github.com/sqliteai/sqlite-agent)** - Run autonomous AI agents directly from within SQLite databases.
281
+
- **[SQLite-MCP](https://github.com/sqliteai/sqlite-mcp)** - Connect SQLite databases to MCP servers and invoke their tools.
282
+
- **[SQLite-JS](https://github.com/sqliteai/sqlite-js)** - Create custom SQLite functions using JavaScript.
283
+
- **[Liteparser](https://github.com/sqliteai/liteparser)** - A highly efficient and fully compliant SQLite SQL parser.