[BUG] bm reindex --embeddings fails with "no such module: vec0" on Windows #735

@mrhoga

basic-memory version: 0.20.3
sqlite-vec version: 0.1.9
OS: Windows 11
Python: 3.12 (via uv tool install basic-memory --with sqlite-vec --python 3.12)

Description

Running bm reindex or bm reindex --embeddings fails immediately with:

OperationalError: (sqlite3.OperationalError) no such module: vec0
[SQL: DELETE FROM search_vector_embeddings WHERE rowid IN (SELECT id FROM
search_vector_chunks WHERE project_id = ? AND entity_id NOT IN
(SELECT id FROM entity WHERE project_id = ?))]

The sqlite-vec extension itself works correctly — the following succeeds:

& "$env:APPDATA\uv\tools\basic-memory\Scripts\python.exe" -c "
import sqlite3, sqlite_vec
c = sqlite3.connect(':memory:')
c.enable_load_extension(True)
sqlite_vec.load(c)
print(c.execute('select vec_version()').fetchone()[0])
"
Output: v0.1.9

Root Cause

reindex_vectors in search_service.py calls _purge_stale_search_rows() before any sqlite-vec loading has occurred. _purge_stale_search_rows runs its DELETE through self.repository.execute_query(...), which opens a new session (and so checks out a connection from the SQLAlchemy pool) without calling _ensure_sqlite_vec_loaded first.

The key insight: SQLite's load_extension() (via sqlite3_load_extension) registers virtual table modules per-connection, not process-globally. Every new connection from the pool starts without the vec0 module, regardless of whether it was loaded on another connection previously.
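The per-connection behaviour can be reproduced without sqlite-vec installed. The sketch below is only an analogy: it uses the standard library's Connection.create_function, which (like a loaded extension's modules) registers on a single connection. The function name vec_version_standin is made up for illustration and is not part of sqlite-vec or basic-memory.

```python
import sqlite3


def probe(conn: sqlite3.Connection) -> str:
    """Call the stand-in function and report what this connection sees."""
    try:
        return conn.execute("SELECT vec_version_standin()").fetchone()[0]
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


first = sqlite3.connect(":memory:")
second = sqlite3.connect(":memory:")

# Register on the first connection only, much like loading an extension
# on one pooled connection and then running a query on a different one.
first.create_function("vec_version_standin", 0, lambda: "stand-in")

result_first = probe(first)    # the registration is visible here
result_second = probe(second)  # this connection never saw the registration

print(result_first)
print(result_second)

first.close()
second.close()
```

The second connection fails in the same way the DELETE against search_vector_embeddings does: the registration exists, but on a different connection.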

The codebase already has the right abstraction (_ensure_sqlite_vec_loaded) and applies it correctly in several places — but _purge_stale_search_rows bypasses it by using execute_query which opens an anonymous connection.

This is related to #658, which describes the same pattern in project info.

Fix

Replace the execute_query call for search_vector_embeddings in _purge_stale_search_rows with an explicit session that loads the extension first:

search_service.py — _purge_stale_search_rows

Before (broken):

if isinstance(self.repository, SQLiteSearchRepository):
    await self.repository.execute_query(
        text("DELETE FROM search_vector_embeddings WHERE rowid IN (...)"),
        params,
    )

After (fixed):

if isinstance(self.repository, SQLiteSearchRepository):
    from basic_memory import db as bm_db
    async with bm_db.scoped_session(self.repository.session_maker) as _vec_session:
        # Load sqlite-vec on this session's connection before touching the vec0 table.
        await self.repository._ensure_sqlite_vec_loaded(_vec_session)
        await _vec_session.execute(
            text("DELETE FROM search_vector_embeddings WHERE rowid IN (...)"),
            params,
        )

This guarantees that the extension is loaded on the exact same connection that executes the vec0 virtual table query.

Notes

A broader fix would be to load sqlite-vec in the SQLAlchemy connect event listener in db.py so every new pool connection gets the extension automatically. However, since enable_load_extension is an async method in aiosqlite and the connect event is synchronous, this requires a more involved change to the engine setup.
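To illustrate the "prepare every connection when it is created" idea, here is a minimal sketch using only the standard library's synchronous sqlite3 module. It does not use SQLAlchemy or aiosqlite and sidesteps the async problem described above. PreparedConnection, prepare_connection, open_connection and vec_version_standin are hypothetical names, and the create_function call stands in for enabling extension loading and loading sqlite-vec.

```python
import sqlite3


def prepare_connection(conn: sqlite3.Connection) -> None:
    """Per-connection setup; a stand-in for loading sqlite-vec."""
    conn.create_function("vec_version_standin", 0, lambda: "stand-in")


class PreparedConnection(sqlite3.Connection):
    """Connection subclass that runs the setup hook as soon as it is opened."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        prepare_connection(self)


def open_connection(path: str = ":memory:") -> sqlite3.Connection:
    # Every connection created through this helper has been prepared.
    return sqlite3.connect(path, factory=PreparedConnection)


versions = []
for _ in range(3):
    conn = open_connection()
    versions.append(conn.execute("SELECT vec_version_standin()").fetchone()[0])
    conn.close()

print(versions)
```

With setup tied to connection creation, callers such as execute_query would not need to remember to load the extension themselves; the trade-off is that every connection pays the setup cost, including those that never touch a vec0 table.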

Labels: bug