API.md
Lines changed: 94 additions & 22 deletions
@@ -5,6 +5,7 @@ A SQLite extension that provides semantic memory capabilities with hybrid search
 ## Table of Contents

 - [Overview](#overview)
+- [Sync Behavior](#sync-behavior)
 - [Loading the Extension](#loading-the-extension)
 - [SQL Functions](#sql-functions)
 - [General Functions](#general-functions)
@@ -29,6 +30,31 @@ sqlite-memory enables semantic search over text content stored in SQLite. It:

 ---

+## Sync Behavior
+
+All `memory_sync_*` functions use **content-hash change detection** to avoid redundant embedding computation. Each piece of content is hashed before processing; if the hash already exists in the database, the content is skipped.
+
+### Change Detection
+
+| Scenario | Behavior |
+|----------|----------|
+| New content | Chunked, embedded, and indexed |
+| Unchanged content | Skipped (hash match) |
+| Modified file | Old entry atomically deleted, new content reindexed |
+| Deleted file | Entry removed during directory sync |
+
+### Transactional Safety
+
+Every sync operation is wrapped in a SQLite **SAVEPOINT** transaction. If any step fails (embedding error, disk issue, constraint violation), the entire operation rolls back. This guarantees:
+
+- **No partially-indexed files**: content is either fully indexed or not at all
+- **No orphaned chunks**: embeddings and FTS entries are always consistent with `dbmem_content`
+- **Safe to retry**: a failed sync leaves the database in its previous valid state
+
+This makes all sync functions idempotent and safe to call repeatedly (e.g., on a schedule or at application startup).
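Reviewer sketch (not part of the diff): the hash check and SAVEPOINT rollback described in this hunk can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch under stated assumptions, not the extension's implementation: the `content` and `chunks` tables, the `sync_text` helper, the SHA-256 hash, and the paragraph-based chunking are all hypothetical stand-ins.

```python
import hashlib
import sqlite3


def sync_text(conn, content, embed):
    """Index `content` once; return True if indexed, False if skipped."""
    # Assumption: SHA-256 stands in for whatever hash the extension uses.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    # Change detection: a known hash means the content is already indexed.
    if conn.execute("SELECT 1 FROM content WHERE hash = ?", (digest,)).fetchone():
        return False
    conn.execute("SAVEPOINT sync_op")
    try:
        conn.execute("INSERT INTO content (hash) VALUES (?)", (digest,))
        for chunk in content.split("\n\n"):  # naive stand-in for token chunking
            conn.execute(
                "INSERT INTO chunks (hash, body, embedding) VALUES (?, ?, ?)",
                (digest, chunk, embed(chunk)),
            )
    except Exception:
        # Undo every write made since the savepoint, then drop it.
        conn.execute("ROLLBACK TO sync_op")
        conn.execute("RELEASE sync_op")
        raise
    conn.execute("RELEASE sync_op")
    return True


def fake_embed(chunk):
    return str(len(chunk))  # placeholder "embedding"


def failing_embed(chunk):
    raise RuntimeError("embedding backend unavailable")


# isolation_level=None gives autocommit mode, so savepoints are managed by hand.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE content (hash TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE chunks (hash TEXT, body TEXT, embedding TEXT)")

first = sync_text(conn, "alpha\n\nbeta", fake_embed)   # new content
second = sync_text(conn, "alpha\n\nbeta", fake_embed)  # same hash, skipped
try:
    sync_text(conn, "gamma", failing_embed)            # fails mid-sync
except RuntimeError:
    pass
rows = conn.execute("SELECT COUNT(*) FROM content").fetchone()[0]
chunks = conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
```

In this sketch the failed third call leaves no trace: the `content` row inserted before the embedding error is rolled back along with everything else, which is the "no partially-indexed files" property the hunk describes.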
-#### `memory_add_text(content TEXT [, context TEXT])`
+#### `memory_sync_text(content TEXT [, context TEXT])`

-Adds text content to memory.
+Syncs text content to memory. Duplicate content (same hash) is skipped automatically.

 **Parameters:**
 | Parameter | Type | Required | Description |
@@ -189,23 +215,24 @@ Adds text content to memory.
 **Notes:**
 - Content is chunked based on `max_tokens` and `overlay_tokens` settings
 - Each chunk is embedded and stored in `dbmem_vault`
-- Content hash prevents duplicate storage
+- Content hash prevents duplicate storage; calling with the same content is a no-op
+- Runs inside a SAVEPOINT transaction (see [Sync Behavior](#sync-behavior))
 - Sets `created_at` timestamp automatically

 **Example:**
 ```sql
 -- Add text without context
-SELECT memory_add_text('SQLite is a C-language library that implements a small, fast, self-contained SQL database engine.');
+SELECT memory_sync_text('SQLite is a C-language library that implements a small, fast, self-contained SQL database engine.');

 -- Add text with context
-SELECT memory_add_text('Important meeting notes from 2024-01-15...', 'meetings');
+SELECT memory_sync_text('Important meeting notes from 2024-01-15...', 'meetings');
 ```

 ---

-#### `memory_add_file(path TEXT [, context TEXT])`
+#### `memory_sync_file(path TEXT [, context TEXT])`

-Adds a file to memory.
+Syncs a file to memory. Unchanged files are skipped; modified files are atomically replaced.

 **Parameters:**
 | Parameter | Type | Required | Description |
@@ -218,39 +245,51 @@ Adds a file to memory.
 **Notes:**
 - Only processes files matching configured extensions (default: `md,mdx`)
 - File path is stored in `dbmem_content.path`
+- If the file was previously indexed with different content, the old entry (chunks, embeddings, FTS) is deleted and new content is reindexed, all within a single SAVEPOINT transaction (see [Sync Behavior](#sync-behavior))
 - Not available when compiled with `DBMEM_OMIT_IO`
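Reviewer sketch (not part of the diff): the "modified file" path in the note above, where the stale entry is deleted and the new content is indexed inside one savepoint, might look roughly like this. The `sync_file` helper, its return values, the two-table schema, and the SHA-256 hash are assumptions made for illustration; file reading is left out, so the text is passed in directly.

```python
import hashlib
import sqlite3


def sync_file(conn, path, text):
    """Return 'skipped', 'indexed', or 'replaced' for the given path."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    row = conn.execute("SELECT hash FROM content WHERE path = ?", (path,)).fetchone()
    if row and row[0] == digest:
        return "skipped"  # unchanged file: hash match
    conn.execute("SAVEPOINT file_sync")
    try:
        if row:
            # Modified file: remove the stale entry and its chunks first.
            conn.execute("DELETE FROM chunks WHERE hash = ?", (row[0],))
            conn.execute("DELETE FROM content WHERE path = ?", (path,))
        conn.execute("INSERT INTO content (path, hash) VALUES (?, ?)", (path, digest))
        conn.execute("INSERT INTO chunks (hash, body) VALUES (?, ?)", (digest, text))
    except Exception:
        # Restore the old entry if anything above failed.
        conn.execute("ROLLBACK TO file_sync")
        conn.execute("RELEASE file_sync")
        raise
    conn.execute("RELEASE file_sync")
    return "replaced" if row else "indexed"


conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE content (path TEXT PRIMARY KEY, hash TEXT)")
conn.execute("CREATE TABLE chunks (hash TEXT, body TEXT)")

results = [
    sync_file(conn, "notes.md", "v1"),  # first sight of the file
    sync_file(conn, "notes.md", "v1"),  # same content again
    sync_file(conn, "notes.md", "v2"),  # content changed
]
remaining = [r[0] for r in conn.execute("SELECT body FROM chunks")]
```

Because the delete and the insert share one savepoint, a reader of the database never observes a state where the old chunks are gone but the new ones are missing.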
 | `cache_max_entries` | INTEGER | 0 | Max cache entries (0 = no limit). When exceeded, oldest entries are evicted |
+| `search_oversample` | INTEGER | 0 | Search oversampling multiplier (0 = no oversampling). When set, retrieves N * multiplier candidates from each index before merging down to N final results |

 ---

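Reviewer sketch (not part of the diff): the effect of `search_oversample` can be seen in a small merge example. The diff does not say how candidates from the two indexes are merged, so reciprocal rank fusion with a constant of 60 is used here purely as an assumed merge rule; `hybrid_search` and the sample ID lists are hypothetical.

```python
def hybrid_search(vector_hits, fts_hits, n, oversample=0):
    """Merge two ranked ID lists down to n results."""
    # 0 means no oversampling: take exactly n candidates per index.
    k = n * oversample if oversample > 0 else n
    scores = {}
    for hits in (vector_hits[:k], fts_hits[:k]):
        for rank, doc_id in enumerate(hits):
            # Reciprocal rank fusion (assumed merge rule, constant 60):
            # earlier positions contribute a larger share of the score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (60 + rank + 1)
    # Highest fused score first; ties broken by ID for determinism.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:n]


vector_hits = ["a", "b", "c", "d"]  # ranked IDs from the vector index
fts_hits = ["c", "e", "b", "f"]     # ranked IDs from the full-text index

without = hybrid_search(vector_hits, fts_hits, n=2)
with_oversample = hybrid_search(vector_hits, fts_hits, n=2, oversample=2)
```

With no oversampling each index contributes only its top 2, so `b` and `c` are each seen once and the result is `["a", "c"]`. With a multiplier of 2 both indexes contribute 4 candidates, `b` and `c` are found in both lists, and the result becomes `["c", "b"]`.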
@@ -404,7 +476,7 @@ The extension tracks two timestamps for each memory:

 ### `created_at`

-- Set automatically when content is added via `memory_add_text`, `memory_add_file`, or `memory_add_directory`
+- Set automatically when content is added via `memory_sync_text`, `memory_sync_file`, or `memory_sync_directory`
 - Stored as Unix timestamp (seconds since 1970-01-01 00:00:00 UTC)
+- State your assumptions explicitly. If uncertain, ask.
+- If multiple interpretations exist, present them - don't pick silently.
+- If a simpler approach exists, say so. Push back when warranted.
+- If something is unclear, stop. Name what's confusing. Ask.
+
+## 2. Simplicity First
+
+**Minimum code that solves the problem. Nothing speculative.**
+
+- No features beyond what was asked.
+- No abstractions for single-use code.
+- No "flexibility" or "configurability" that wasn't requested.
+- No error handling for impossible scenarios.
+- If you write 200 lines and it could be 50, rewrite it.
+
+Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
+
+## 3. Surgical Changes
+
+**Touch only what you must. Clean up only your own mess.**
+
+When editing existing code:
+- Don't "improve" adjacent code, comments, or formatting.
+- Don't refactor things that aren't broken.
+- Match existing style, even if you'd do it differently.
+- If you notice unrelated dead code, mention it - don't delete it.
+
+When your changes create orphans:
+- Remove imports/variables/functions that YOUR changes made unused.
+- Don't remove pre-existing dead code unless asked.
+
+The test: Every changed line should trace directly to the user's request.
+
+## 4. Goal-Driven Execution
+
+**Define success criteria. Loop until verified.**
+
+Transform tasks into verifiable goals:
+- "Add validation" → "Write tests for invalid inputs, then make them pass"
+- "Fix the bug" → "Write a test that reproduces it, then make it pass"
+- "Refactor X" → "Ensure tests pass before and after"
+
+For multi-step tasks, state a brief plan:
+```
+1. [Step] → verify: [check]
+2. [Step] → verify: [check]
+3. [Step] → verify: [check]
+```
+
+Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
+
+---
+
+**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.