fix(store): embed searchVec query with the store's pinned model by Ciel2142 · Pull Request #690 · tobi/qmd

Ciel2142 · 2026-05-30T19:46:19Z

Problem

store.searchVector() (and the internal multi-query vector path) embed the
search-time query with the global QMD_EMBED_MODEL (getDefaultLlamaCpp())
instead of the store's pinned embed model. When a store is created with
config.models.embed different from QMD_EMBED_MODEL, the query vector width
differs from the stored vectors and sqlite-vec rejects it before any comparison:

SqliteError: Dimension mismatch for query vector for the "embedding" column. Expected 768 dimensions but received 4096.

(store embedded with embeddinggemma-300M → 768-dim stored vectors; env
QMD_EMBED_MODEL = Qwen3-Embedding-8B → query embedded at 4096-dim.) It also
loads the wrong, often much larger model at query time.

Root cause

searchVec() calls getEmbedding(query, model, true, session) without an
llmOverride, so getEmbedding falls back to getDefaultLlamaCpp() (the env
model). The store's own llm (store.llm) is never threaded into the query-embed
step. getEmbedding already accepts an llmOverride parameter — it simply isn't
passed.

Hybrid search (store.search) is unaffected: it pre-computes query embeddings
via the store session/llm and passes them as precomputedEmbedding, which
short-circuits before getEmbedding. Only the inline-embed paths hit the bug.

Fix

Thread getLlm(store) (which returns store.llm ?? getDefaultLlamaCpp())
through the bound store.searchVec into getEmbedding, mirroring the existing
expandQuery/rerank llm threading. Three small edits in src/store.ts:

the bound store.searchVec passes getLlm(store) as the new last arg
searchVec() gains an optional llm?: LlamaCpp parameter
it forwards that into getEmbedding(query, model, true, session, llm)

Backward compatible — the new param is optional; session- and
precomputedEmbedding-based callers are unaffected (session is preferred over
the override, and precomputed skips getEmbedding entirely).

Test

Adds a regression test in test/store.test.ts: it pins a store to a
non-default (3-dim) embed model, sets the global default to a different (4-dim)
model, embeds a doc, then calls store.searchVec(). It asserts hits are returned
with no dimension mismatch and that the global default llm is never used to embed
the query. The test fails on the current code with Dimension mismatch ... Expected 3 ... received 4 and passes with the fix.

store.searchVector() and the internal multi-query vector path embedded the query with the global QMD_EMBED_MODEL (getDefaultLlamaCpp) instead of the store's pinned embed model. A store whose config.models.embed differs from QMD_EMBED_MODEL hit a sqlite-vec dimension mismatch (e.g. stored 768d vs query 4096d) and loaded the wrong, larger model at query time. Thread getLlm(store) through the bound store.searchVec into getEmbedding, which already accepts an llmOverride. Backward compatible: the new param is optional; session and precomputedEmbedding callers are unaffected, so hybrid/precomputed search paths are unchanged. Adds a regression test that pins a non-default store embed model and asserts searchVec embeds the query with it (hits returned, no dim mismatch, global default llm untouched). Refs: qmd-prm, qmd-se8, qmd-mg2

Ciel2142 · 2026-06-01T07:33:38Z

Prior-art note for reviewers: this is a follow-up to #497 (closed), not a duplicate.

#497 fixed the query-time embed model name — store.js hardcoded DEFAULT_EMBED_MODEL = "embeddinggemma". On current main that's export const DEFAULT_EMBED_MODEL = DEFAULT_EMBED_MODEL_URI (src/store.ts:45), and #501 made a dimension mismatch throw instead of silently rebuilding. Those address the wrong-model-string path.

This PR fixes a different mechanism in the same searchVec path: the llm instance that runs the query embed. On current main, the inline-embed path still embeds with getDefaultLlamaCpp() (env QMD_EMBED_MODEL) and never threads store.llm:

src/store.ts:1890  searchVec(..., session, precomputedEmbedding)                              // store.llm never passed
src/store.ts:3471  const embedding = precomputedEmbedding ?? await getEmbedding(query, model, true, session);   // no llmOverride
src/store.ts:3562  await (llmOverride ?? getDefaultLlamaCpp()).embed(formattedText, { model, isQuery });        // env fallback

So #497's DEFAULT_EMBED_MODEL_URI change does not cover a store pinned via config.models.embed that differs from env QMD_EMBED_MODEL: the inline-embed path (no session, no precomputedEmbedding) still loads the env model and fails with the same Dimension mismatch ... Expected N ... received M symptom. Hybrid / precomputed / session callers are unaffected — they short-circuit before getEmbedding, which is why this slipped past the #497 fix.

The fix threads getLlm(store) (store.llm ?? getDefaultLlamaCpp()) into that one path, mirroring the existing expandQuery/rerank llm threading. The added regression test fails on current main (dim mismatch) and passes with the fix.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(store): embed searchVec query with the store's pinned model#690

fix(store): embed searchVec query with the store's pinned model#690
Ciel2142 wants to merge 1 commit into
tobi:mainfrom
Ciel2142:fix/searchvec-pinned-embed-model

Ciel2142 commented May 30, 2026

Uh oh!

Ciel2142 commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Ciel2142 commented May 30, 2026

Problem

Root cause

Fix

Test

Uh oh!

Ciel2142 commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant