Skip to content
This repository was archived by the owner on Feb 18, 2026. It is now read-only.

Commit 1df89f3

Browse files
committed
feat(batch_insert): implement batch mode for multiple inserts with persistent BlobCache
- Added diskann_begin_batch() and diskann_end_batch() functions to manage batch insert operations. - Introduced BlobCache ownership mode to optimize BLOB I/O during batch inserts. - Updated diskann_insert() to utilize the batch cache when in batch mode. - Enhanced BlobSpot structure to track cache ownership with is_cached flag. - Implemented tests for batch insert functionality, including validation of cache behavior and recall metrics. - Adjusted profiling tests to accommodate increased base index size for better performance analysis.
1 parent ad404e1 commit 1df89f3

14 files changed

Lines changed: 724 additions & 13 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -50,6 +50,9 @@ temp/
5050
*.db-shm
5151
*.db-wal
5252

53+
# Profiling output
54+
profiling_timing.csv
55+
5356
# From clangd
5457
compile_commands.events.json
5558
# From bear
Lines changed: 202 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,202 @@
1+
# Serial Batch Insert Optimization
2+
3+
## Summary
4+
5+
Optimize DiskANN insert throughput for batch workloads using single-connection serial techniques: transaction batching, persistent cache, lazy back-edges, neighbor deduplication, and intra-batch candidates. Target: 4-12x speedup over serial `diskann_insert()` for 500-vector batches. No threading — all target DB drivers are single-connection.
6+
7+
## Current Phase
8+
9+
- [x] Research & Planning
10+
- [x] Test Design
11+
- [x] Implementation Design
12+
- [x] Test-First Development
13+
- [x] Implementation
14+
- [x] Integration
15+
- [ ] Cleanup & Documentation — pending Phase 1b, 2, 3
16+
- [ ] Final Review — pending benchmarks + remaining phases
17+
18+
## Required Reading
19+
20+
- `CLAUDE.md` — Project conventions
21+
- `TDD.md` — Testing methodology
22+
- `DESIGN-PRINCIPLES.md` — C coding standards
23+
- `src/diskann_insert.c` — Current insert (lines 177-377)
24+
- `src/diskann_search.c` — Beam search internals
25+
- `src/diskann_cache.c` — BlobCache implementation
26+
- `_research/write-performance-analysis.md` — Full options analysis
27+
- `_todo/20260211-insert-profiling.md` — Phase 0 profiling results (prerequisite)
28+
29+
## Description
30+
31+
**Problem:** Serial `diskann_insert()` is too slow for batch workloads. PhotoStructure imports 100-1000 photos at a time.
32+
33+
**Constraint:** Single SQLite connection only. No parallel WAL readers. All optimization must be serial.
34+
35+
**Success Criteria:** 500-vector batch into 10k+ index at >= 300 inserts/sec (vs current 189/sec) with recall >= 85%.
36+
37+
**Prerequisite complete:** `_todo/20260211-insert-profiling.md` — profiling data collected at 10k scale, decision gates evaluated below.
38+
39+
## Tribal Knowledge
40+
41+
- SAVEPOINT per insert is required because writable BLOBs need transaction context
42+
- BlobCache is created/destroyed per insert (lines 231-236, 369-371) — huge waste for batches
43+
- `insert_shadow_row()` calls `sqlite3_mprintf` + `sqlite3_prepare_v2` every time — should be prepared once
44+
- Phase 2 (back-edges, lines 311-330) flushes each neighbor BLOB individually
45+
- FreshDiskANN paper: collect back-edges in delta structure, apply in single pass
46+
- Reference impl: intra-batch candidates use CPU-only distance computation (no I/O)
47+
48+
### Profiling Results (10k scale, post random_start fix)
49+
50+
| Phase | Avg (us) | % of Total | Notes |
51+
| ------------------- | --------- | ---------- | ------------------------------------- |
52+
| search | 2,970 | 56.1% | Beam search — inherent cost, dominant |
53+
| phase2 (back-edges) | 1,624 | 30.7% | Acceptance rate 13% at 10k |
54+
| phase1 (fwd edges) | 505 | 9.5% | Low |
55+
| random_start | 46 | 0.9% | **Fixed** (was 26% / 1.5ms) |
56+
| savepoint | ~1 | 0.02% | Negligible |
57+
| **Total** | **5,291** | **100%** | **189 inserts/sec** |
58+
59+
- **Cache hits: 0%** — BlobCache created/destroyed per-insert, ~130 nodes visited per insert
60+
- **Similar vectors 15% faster** — search converges faster (977us vs 2,970us), but phase2 33% more expensive
61+
- Source: `_todo/20260211-insert-profiling.md` Session 2026-02-11
62+
63+
## Solutions
64+
65+
### Phase 1: Transaction Batching + Persistent Cache
66+
67+
Wrap N inserts in a single BEGIN/COMMIT. Persist BlobCache across all inserts in the batch. Cache the prepared INSERT statement.
68+
69+
**API:**
70+
71+
```c
72+
int diskann_begin_batch(DiskAnnIndex *idx);
73+
/* call diskann_insert() N times */
74+
int diskann_end_batch(DiskAnnIndex *idx);
75+
```
76+
77+
**Decision gate:** SAVEPOINT overhead = 0.02% → **SKIP transaction batching**. Cache hit rate = 0% → **DO persistent cache** (top priority).
78+
79+
### Phase 2: Lazy Back-Edges + Batch Repair
80+
81+
Skip Phase 2 (lines 311-330) during insert — store forward edges only. After all inserts, run a repair pass that scans modified neighbors and applies all missing back-edges in one sequential pass with deduplication.
82+
83+
**Decision gate:** Phase 2 = 31% of insert time → just above 30% threshold → **DO this** (second priority). Expected -30% total time but high complexity.
84+
85+
### Phase 3: Intra-Batch Candidates
86+
87+
For similar vectors (same event photos), batch members are likely neighbors of each other. Compute all-pairs distances within batch (CPU-only, no I/O), add close batch members as candidate neighbors. Reduces the number of nodes that must be found via expensive beam search.
88+
89+
**Decision gate:** Search = 56% of insert time → above 50% threshold → **CONSIDER this** (third priority). Only benefits similar-vector batches (PhotoStructure's typical case). Deferred until Phase 1 + 2 measured.
90+
91+
## Tasks
92+
93+
Based on profiling (10k scale, 189 inserts/sec baseline):
94+
95+
- [x] **Phase 1a: Persistent BlobCache across batch** — 0% cache hits on ~130 visited nodes/insert. Expected -10-20% total time. Medium effort.
96+
- [ ] Phase 1b: Prepared statement caching for `insert_shadow_row()` — minor optimization, low effort
97+
- [ ] ~~Phase 1c: Transaction batching~~ — SKIPPED, SAVEPOINT = 0.02% of insert time
98+
- [ ] **Phase 2: Lazy back-edges + batch repair** — 31% of insert time. Expected -30% total time. High effort/complexity.
99+
- [ ] Phase 3: Intra-batch candidates — deferred until Phase 1+2 measured
100+
- [ ] Benchmark: 500-vector batch into 10k index, compare serial vs batch
101+
- [ ] Validate recall >= 85% for batch inserts
102+
103+
**Verification:**
104+
105+
```bash
106+
make test # All existing tests pass + new batch tests
107+
make asan # No memory errors
108+
make clean && make valgrind # No leaks
109+
```
110+
111+
## Notes
112+
113+
### 2026-02-11: Profiling Complete, Priorities Established
114+
115+
Phase 0 profiling (`_todo/20260211-insert-profiling.md`) is complete. Key findings:
116+
117+
- **random_start fixed** (commit `ad404e1`) — was 26% of insert time, now 0.9%
118+
- **Search dominates** at 56% — inherent beam search cost, hard to optimize
119+
- **Phase 2 = 31%** — borderline for lazy back-edges, but a real win for batch workloads
120+
- **Cache hits = 0%** — BlobCache per-insert is pure waste; persistent cache is the top optimization
121+
- **SAVEPOINT negligible** — 0.02%, no need for transaction batching
122+
123+
**Recommended implementation order:** Phase 1a (persistent cache) first — medium effort, immediate 10-20% win. Phase 2 (lazy back-edges) second — high effort, 30% win but affects graph quality (needs careful recall testing).
124+
125+
### 2026-02-11: Research & Planning for Phase 1a (Persistent BlobCache)
126+
127+
**Ownership Model — The Core Problem:**
128+
129+
Current flow: `diskann_node_free()``blob_spot_free()`. Cache holds borrowed pointers. For persistent cache, cache must own BlobSpots across inserts, but nodes still need valid pointers during Phase 1 + Phase 2.
130+
131+
**Solution: `is_cached` flag on BlobSpot + `owns_blobs` flag on BlobCache**
132+
133+
1. `BlobSpot.is_cached` (default 0): When a cache with `owns_blobs=1` stores a BlobSpot, it sets `is_cached=1`.
134+
2. `diskann_node_free()`: If `blob_spot->is_cached`, skip `blob_spot_free()`. Cache owns it.
135+
3. On cache eviction: Set `evicted->is_cached=0`. Node still references it, `diskann_node_free()` will free it.
136+
4. `blob_cache_deinit()` with `owns_blobs=1`: Iterates LRU chain, `blob_spot_free()` all remaining entries.
137+
5. Per-insert caches (`owns_blobs=0`): No behavioral change. `is_cached` stays 0 throughout.
138+
139+
**Why this works:**
140+
141+
- During a single insert: nodes + cache both reference same BlobSpots. `is_cached=1` prevents node teardown from freeing them.
142+
- On eviction during search (capacity exceeded): `is_cached=0` on evicted entry. Node still has valid pointer, frees it normally during context deinit.
143+
- Between inserts: All nodes freed via `diskann_search_ctx_deinit()`. Cached BlobSpots survive (is_cached=1). Evicted ones freed by node teardown (is_cached=0). No dangling pointers.
144+
- On batch end: `blob_cache_deinit()` frees all remaining BlobSpots.
145+
146+
**Cache data freshness after Phase 2:**
147+
148+
- Phase 2 modifies a node's in-memory buffer (adds back-edge) then flushes to disk.
149+
- Cached copy IS the latest version — in-memory buffer is authoritative.
150+
- `blob_spot_reload()` with same rowid + `is_initialized=1` returns immediately (no stale read).
151+
- This means the persistent cache is BETTER than re-reading from disk.
152+
153+
**SAVEPOINT rollback safety:**
154+
155+
- If `diskann_insert()` fails and SAVEPOINT is rolled back, cached BlobSpots may have stale data (Phase 2 mods rolled back on disk but still in memory).
156+
- Solution: On any insert error during a batch, caller should call `diskann_end_batch()` which clears the cache. The `diskann_end_batch()` function should handle this gracefully.
157+
- `diskann_begin_batch()` could also wrap everything in a single outer transaction (BEGIN/COMMIT) for atomicity.
158+
159+
**New node caching:**
160+
161+
- After flushing the newly inserted node's BlobSpot, put it in the batch cache. Future inserts visiting this node get a cache hit instead of re-reading from disk.
162+
163+
**Files to modify:**
164+
165+
- `src/diskann_blob.h` — Add `is_cached` field to BlobSpot
166+
- `src/diskann_blob.c` — Initialize `is_cached=0` in `blob_spot_create()`
167+
- `src/diskann_cache.h` — Add `owns_blobs` field to BlobCache
168+
- `src/diskann_cache.c` — Eviction frees with `is_cached=0`; deinit frees remaining with `owns_blobs=1`
169+
- `src/diskann_node.c``diskann_node_free()` checks `is_cached`
170+
- `src/diskann_internal.h` — Add `BlobCache *batch_cache` to DiskAnnIndex
171+
- `src/diskann.h` — Declare `diskann_begin_batch()` / `diskann_end_batch()`
172+
- `src/diskann_api.c` — Implement begin/end batch
173+
- `src/diskann_insert.c` — Use `idx->batch_cache` when available; cache new node's BlobSpot
174+
- `tests/c/test_cache.c` — New tests for owning cache eviction + deinit
175+
- `tests/c/test_insert.c` — New tests for batch insert (begin/end API, recall, error handling)
176+
177+
### 2026-02-11: Phase 1a Implementation Complete
178+
179+
**What was done:**
180+
181+
1. Added `is_cached` flag to `BlobSpot` (diskann_blob.h)
182+
2. Added `owns_blobs` flag to `BlobCache` (diskann_cache.h)
183+
3. Modified `blob_cache_put()`: sets `is_cached=1` in owning mode
184+
4. Modified `get_free_slot()`: clears `is_cached=0` on eviction in owning mode
185+
5. Modified `blob_cache_deinit()`: frees all BlobSpots via LRU chain in owning mode
186+
6. Modified `diskann_node_free()`: skips `blob_spot_free()` when `is_cached=1`
187+
7. Added `batch_cache` field to `DiskAnnIndex`
188+
8. Implemented `diskann_begin_batch()` / `diskann_end_batch()` in diskann_api.c
189+
9. Modified `diskann_insert()`: uses `idx->batch_cache` when available via `active_cache` pointer
190+
10. New node BlobSpot cached in batch mode after flush (future inserts get cache hit)
191+
11. `diskann_close_index()` cleans up active batch cache (safety net)
192+
12. 12 new tests: 4 cache ownership + 8 batch API
193+
13. Forward declaration for `BlobCache` in diskann_internal.h to avoid circular include
194+
195+
**Verification:**
196+
197+
- `make test`: 204/204 pass (was 192)
198+
- `make asan`: clean (no errors)
199+
- `make clean && make valgrind`: 0 errors, 0 leaks
200+
- Extension builds: `make all` produces diskann.so
201+
202+
**Next: benchmark to measure actual speedup (expected 10-20%)**

src/diskann.h

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -186,6 +186,36 @@ int diskann_search_filtered(DiskAnnIndex *idx, const float *query,
186186
uint32_t dims, int k, DiskAnnResult *results,
187187
DiskAnnFilterFn filter_fn, void *filter_ctx);
188188

189+
/*
190+
** Begin batch mode for multiple inserts.
191+
**
192+
** Creates a persistent BlobCache that is shared across inserts. Reduces
193+
** redundant BLOB I/O when nodes are visited by multiple consecutive inserts.
194+
**
195+
** Parameters:
196+
** idx - Index handle (must not be NULL, must not already be in batch mode)
197+
**
198+
** Returns:
199+
** DISKANN_OK on success
200+
** DISKANN_ERROR_INVALID if idx is NULL or already in batch mode
201+
** DISKANN_ERROR_NOMEM if cache allocation fails
202+
**
203+
** Caller must call diskann_end_batch() when done inserting.
204+
*/
205+
int diskann_begin_batch(DiskAnnIndex *idx);
206+
207+
/*
208+
** End batch mode and free the persistent cache.
209+
**
210+
** Parameters:
211+
** idx - Index handle (must not be NULL, must be in batch mode)
212+
**
213+
** Returns:
214+
** DISKANN_OK on success
215+
** DISKANN_ERROR_INVALID if idx is NULL or not in batch mode
216+
*/
217+
int diskann_end_batch(DiskAnnIndex *idx);
218+
189219
/*
190220
** Delete a vector from the index.
191221
**

src/diskann_api.c

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@
88
*/
99
#include "diskann.h"
1010
#include "diskann_blob.h"
11+
#include "diskann_cache.h"
1112
#include "diskann_internal.h"
1213
#include "diskann_node.h"
1314
#include "diskann_util.h"
@@ -477,6 +478,13 @@ void diskann_close_index(DiskAnnIndex *idx) {
477478
return; /* Safe to call with NULL */
478479
}
479480

481+
/* Clean up batch cache if still active */
482+
if (idx->batch_cache) {
483+
blob_cache_deinit(idx->batch_cache);
484+
sqlite3_free(idx->batch_cache);
485+
idx->batch_cache = NULL;
486+
}
487+
480488
/* Free malloc'd strings */
481489
if (idx->db_name) {
482490
sqlite3_free(idx->db_name);
@@ -499,6 +507,47 @@ void diskann_close_index(DiskAnnIndex *idx) {
499507
free(idx);
500508
}
501509

510+
int diskann_begin_batch(DiskAnnIndex *idx) {
511+
if (!idx) {
512+
return DISKANN_ERROR_INVALID;
513+
}
514+
if (idx->batch_cache != NULL) {
515+
return DISKANN_ERROR_INVALID; /* Already in batch mode */
516+
}
517+
518+
/* Allocate batch cache on heap (persists across inserts) */
519+
BlobCache *cache = (BlobCache *)sqlite3_malloc(sizeof(BlobCache));
520+
if (!cache) {
521+
return DISKANN_ERROR_NOMEM;
522+
}
523+
524+
int rc = blob_cache_init(cache, 200);
525+
if (rc != DISKANN_OK) {
526+
sqlite3_free(cache);
527+
return rc;
528+
}
529+
cache->owns_blobs = 1; /* Cache owns BlobSpots in batch mode */
530+
531+
idx->batch_cache = cache;
532+
return DISKANN_OK;
533+
}
534+
535+
int diskann_end_batch(DiskAnnIndex *idx) {
536+
if (!idx) {
537+
return DISKANN_ERROR_INVALID;
538+
}
539+
if (idx->batch_cache == NULL) {
540+
return DISKANN_ERROR_INVALID; /* Not in batch mode */
541+
}
542+
543+
/* Free cache (frees all owned BlobSpots in owning mode) */
544+
blob_cache_deinit(idx->batch_cache);
545+
sqlite3_free(idx->batch_cache);
546+
idx->batch_cache = NULL;
547+
548+
return DISKANN_OK;
549+
}
550+
502551
/*
503552
** Delete a vector from the index.
504553
**

src/diskann_blob.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,7 @@ typedef struct BlobSpot {
3939
int is_writable; /* 1 if opened for writing, 0 for reading */
4040
int is_initialized; /* 1 if buffer contains valid data */
4141
int is_aborted; /* 1 if BLOB operations have been aborted */
42+
int is_cached; /* 1 if owned by an owning BlobCache (do not free via node) */
4243
} BlobSpot;
4344

4445
/*

0 commit comments

Comments
 (0)