fix(sqlite): inline JSON paths for expression indexes by lalitkapoor · Pull Request #1487 · TanStack/db

lalitkapoor · 2026-04-21T01:24:34Z

🎯 Changes

Fixes a SQLite planner mismatch in @tanstack/db-sqlite-persistence-core.

This came from a real app query against persisted threadMessages data:

q
  .from({ m: threadMessages })
  .where(({ m }) => eq(m.threadId, threadId))
  .orderBy(({ m }) => m.createdAt, 'desc')
  .orderBy(({ m }) => m.id, 'desc')

In the app, we first added the missing persisted filter index:

collection.createIndex((row) => row.threadId)

At that point, we expected SQLite to stop doing a full table scan for the filter.

The important expectation was not just “there is now an index on threadId”. It was:

the persisted expression index for threadId is stored as literal-path SQL
the runtime query for WHERE threadId = ? should compile to the same expression shape
if those shapes match, SQLite should be able to plan SEARCH ... USING INDEX ... for the filter

The expected planner outcome after adding that index was:

SEARCH ... USING INDEX <threadId expression index> for the WHERE threadId = ? filter
possibly still USE TEMP B-TREE FOR ORDER BY, because the app query also sorts by createdAt DESC, id DESC and there is no composite persisted index for (threadId, createdAt, id)
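The expected outcome above can be checked mechanically from the detail column of EXPLAIN QUERY PLAN. The following is a minimal sketch, not part of this PR: summarizeQueryPlan is a hypothetical helper, and the index name in the sample rows is made up for illustration.

```typescript
// Hypothetical helper (not part of TanStack DB): summarizes the `detail`
// strings returned by SQLite's EXPLAIN QUERY PLAN.
type PlanSummary = {
  usesIndexSearch: boolean
  hasFullScan: boolean
  needsTempSort: boolean
}

function summarizeQueryPlan(details: string[]): PlanSummary {
  const usesIndex = (d: string) => /\bUSING (COVERING )?INDEX\b/.test(d)
  return {
    // e.g. "SEARCH <table> USING INDEX <name> (<expr>=?)"
    usesIndexSearch: details.some((d) => /^SEARCH\b/.test(d) && usesIndex(d)),
    // e.g. "SCAN <table>" with no index mentioned
    hasFullScan: details.some((d) => /^SCAN\b/.test(d) && !usesIndex(d)),
    // the ORDER BY could not be satisfied by an index
    needsTempSort: details.some((d) =>
      d.includes('USE TEMP B-TREE FOR ORDER BY'),
    ),
  }
}

// Illustrative rows only; `idx_thread_id` is an assumed index name.
const expectedPlan = summarizeQueryPlan([
  'SEARCH c_z6sj6b_e USING INDEX idx_thread_id (<expr>=?)',
  'USE TEMP B-TREE FOR ORDER BY',
])
```

With these sample rows, the summary reports an index search, no full scan, and a temp sort, which is the combination described above.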

The runtime query actually compiled to the following SQL:

SELECT key, value, metadata, row_version
FROM "c_z6sj6b_e"
WHERE ((CASE json_extract(value, ?)
  WHEN 'bigint' THEN CAST(json_extract(value, ?) AS NUMERIC)
  WHEN 'date' THEN json_extract(value, ?)
  WHEN 'nan' THEN NULL
  WHEN 'infinity' THEN NULL
  WHEN '-infinity' THEN NULL
  ELSE json_extract(value, ?)
END) = ?)
ORDER BY (CASE json_extract(value, ?)
  WHEN 'bigint' THEN CAST(json_extract(value, ?) AS NUMERIC)
  WHEN 'date' THEN json_extract(value, ?)
  WHEN 'nan' THEN NULL
  WHEN 'infinity' THEN NULL
  WHEN '-infinity' THEN NULL
  ELSE json_extract(value, ?)
END) DESC NULLS FIRST,
(CASE json_extract(value, ?)
  WHEN 'bigint' THEN CAST(json_extract(value, ?) AS NUMERIC)
  WHEN 'date' THEN json_extract(value, ?)
  WHEN 'nan' THEN NULL
  WHEN 'infinity' THEN NULL
  WHEN '-infinity' THEN NULL
  ELSE json_extract(value, ?)
END) DESC NULLS FIRST,
key ASC

with params shaped like:

[
  '$.threadId.__tanstack_db_persisted_type__',
  '$.threadId.value',
  '$.threadId.value',
  '$.threadId',
  '92ebe40d-e545-40f2-b0fd-bba86c41b86e',
  '$.createdAt.__tanstack_db_persisted_type__',
  '$.createdAt.value',
  '$.createdAt.value',
  '$.createdAt',
  '$.id.__tanstack_db_persisted_type__',
  '$.id.value',
  '$.id.value',
  '$.id',
]

The persisted expression index for threadId had already been normalized into literal-path SQL such as:

json_extract(value, '$.threadId')

So the runtime query and the persisted index were logically equivalent but not structurally identical from SQLite's perspective. SQLite expression-index matching is shape-sensitive here: json_extract(value, ?) is not the same expression as json_extract(value, '$.threadId') for planner matching purposes.

That is why adding the missing threadId index was necessary but not sufficient: the index existed, but the framework was still compiling the runtime predicate into a form that could not match that index.
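The mismatch can be approximated with a plain text comparison. This is only a rough sketch: SQLite compares parsed expressions rather than strings, and sameExpressionShape is a hypothetical helper introduced here for illustration.

```typescript
// Hypothetical illustration: compare two SQL expressions after collapsing
// whitespace. This mimics the visible effect of the planner's matching,
// it is not how SQLite implements it.
function normalizeSqlText(expr: string): string {
  return expr.replace(/\s+/g, ' ').trim()
}

function sameExpressionShape(a: string, b: string): boolean {
  return normalizeSqlText(a) === normalizeSqlText(b)
}

// The persisted index expression uses a literal JSON path.
const indexExpr = "json_extract(value, '$.threadId')"

// Before the fix: the path is a bound parameter.
const boundRuntimeExpr = 'json_extract(value, ?)'

// After the fix: the path is inlined as a literal.
const inlinedRuntimeExpr = "json_extract(value,  '$.threadId')"

const matchesBeforeFix = sameExpressionShape(indexExpr, boundRuntimeExpr)
const matchesAfterFix = sameExpressionShape(indexExpr, inlinedRuntimeExpr)
```

Here matchesBeforeFix is false and matchesAfterFix is true, which mirrors why the bound-path form could not use the index.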

To use the threadId expression index, the runtime query needs the shape below. The key difference is that the threadId expression is compiled with literal JSON paths, while the actual thread ID remains bound:

SELECT key, value, metadata, row_version
FROM "c_z6sj6b_e"
WHERE ((CASE json_extract(value, '$.threadId.__tanstack_db_persisted_type__')
  WHEN 'bigint' THEN CAST(json_extract(value, '$.threadId.value') AS NUMERIC)
  WHEN 'date' THEN json_extract(value, '$.threadId.value')
  WHEN 'nan' THEN NULL
  WHEN 'infinity' THEN NULL
  WHEN '-infinity' THEN NULL
  ELSE json_extract(value, '$.threadId')
END) = ?)
ORDER BY (CASE json_extract(value, '$.createdAt.__tanstack_db_persisted_type__')
  WHEN 'bigint' THEN CAST(json_extract(value, '$.createdAt.value') AS NUMERIC)
  WHEN 'date' THEN json_extract(value, '$.createdAt.value')
  WHEN 'nan' THEN NULL
  WHEN 'infinity' THEN NULL
  WHEN '-infinity' THEN NULL
  ELSE json_extract(value, '$.createdAt')
END) DESC NULLS FIRST,
(CASE json_extract(value, '$.id.__tanstack_db_persisted_type__')
  WHEN 'bigint' THEN CAST(json_extract(value, '$.id.value') AS NUMERIC)
  WHEN 'date' THEN json_extract(value, '$.id.value')
  WHEN 'nan' THEN NULL
  WHEN 'infinity' THEN NULL
  WHEN '-infinity' THEN NULL
  ELSE json_extract(value, '$.id')
END) DESC NULLS FIRST,
key ASC

with params shaped like:

['92ebe40d-e545-40f2-b0fd-bba86c41b86e']

That full query shape would let SQLite match the filter expression against the persisted threadId expression index. The remaining ORDER BY could still require a temp sort, which is fine and expected for this query shape.

This PR fixes that by compiling runtime ref expressions with inlined JSON-path literals in compileRefExpressionSql(...), while still keeping real filter values bound.

After this change, the runtime SQL shape matches the persisted index expression shape, which allows SQLite to use the index for the filter path.
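The idea can be sketched as follows. This is not the actual compileRefExpressionSql(...) from the package: sketchRefExpressionSql and jsonPathLiteral are hypothetical names, and the sketch assumes path segments are plain identifier-like keys so that no JSON-path or SQL quoting is needed.

```typescript
// Hypothetical sketch, NOT the real compileRefExpressionSql implementation.
const TYPE_TAG_KEY = '__tanstack_db_persisted_type__'

// Render a JSON path such as $.threadId as a single-quoted SQL literal.
// Assumption: segments are identifier-like; anything else is rejected
// rather than escaped, to keep the sketch safe and small.
function jsonPathLiteral(segments: string[]): string {
  for (const segment of segments) {
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(segment)) {
      throw new Error(`Unsupported path segment: ${segment}`)
    }
  }
  return `'$${segments.map((s) => `.${s}`).join('')}'`
}

// Build the CASE expression with every JSON path inlined as a literal.
function sketchRefExpressionSql(fieldPath: string[]): string {
  const typePath = jsonPathLiteral([...fieldPath, TYPE_TAG_KEY])
  const valuePath = jsonPathLiteral([...fieldPath, 'value'])
  const rawPath = jsonPathLiteral(fieldPath)
  return [
    `(CASE json_extract(value, ${typePath})`,
    `  WHEN 'bigint' THEN CAST(json_extract(value, ${valuePath}) AS NUMERIC)`,
    `  WHEN 'date' THEN json_extract(value, ${valuePath})`,
    `  WHEN 'nan' THEN NULL`,
    `  WHEN 'infinity' THEN NULL`,
    `  WHEN '-infinity' THEN NULL`,
    `  ELSE json_extract(value, ${rawPath})`,
    `END)`,
  ].join('\n')
}

// Only the real filter value stays bound.
const whereSql = `${sketchRefExpressionSql(['threadId'])} = ?`
const whereParams = ['92ebe40d-e545-40f2-b0fd-bba86c41b86e']
```

The paths come from the query's field references rather than from user input, which is what makes inlining them reasonable, while the thread ID itself continues to go through a placeholder.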

The PR also adds two regression tests:

core/shared regression: verifies runtime subset SQL inlines JSON paths but keeps actual values bound
node driver regression: verifies better-sqlite3 uses the expression index for the emitted runtime SQL via EXPLAIN QUERY PLAN
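A check in the spirit of the SQL-shape regression might look like the sketch below. It is not the test added in this PR: findBoundJsonPathProblems is a hypothetical helper, and treating any string param that starts with "$." as a JSON path is a heuristic that would misfire on a real filter value of that form.

```typescript
// Hypothetical check, loosely modeled on the SQL-shape regression described
// above. Returns a list of problems; an empty list means the shape looks right.
function findBoundJsonPathProblems(sql: string, params: unknown[]): string[] {
  const problems: string[] = []

  // The emitted SQL should not bind a JSON path through a placeholder.
  if (/json_extract\(\s*value\s*,\s*\?\s*\)/i.test(sql)) {
    problems.push('SQL still contains json_extract(value, ?)')
  }

  // Heuristic: a bound string starting with "$." is assumed to be a JSON path.
  for (const param of params) {
    if (typeof param === 'string' && param.startsWith('$.')) {
      problems.push(`JSON path passed as a bound param: ${param}`)
    }
  }

  return problems
}

const fixedProblems = findBoundJsonPathProblems(
  "SELECT key FROM t WHERE json_extract(value, '$.threadId') = ?",
  ['92ebe40d-e545-40f2-b0fd-bba86c41b86e'],
)

const unfixedProblems = findBoundJsonPathProblems(
  'SELECT key FROM t WHERE json_extract(value, ?) = ?',
  ['$.threadId', '92ebe40d-e545-40f2-b0fd-bba86c41b86e'],
)
```

The first call reports no problems, while the second reports both the placeholder in the SQL and the bound path in the params.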

✅ Checklist

I have tested this code locally with pnpm test.

🚀 Release Impact

This change affects published code, and I have generated a changeset.
This change is docs/CI/dev-only (no release).

Verification

Passed on the fixed tree:

pnpm exec vitest --run tests/sqlite-core-adapter-cli-runtime.test.ts
pnpm exec vitest --run tests/node-sqlite-core-adapter-contract.test.ts
pnpm --filter @tanstack/db-sqlite-persistence-core test

Negative verification:

I temporarily reverted only compileRefExpressionSql(...) locally and re-ran the targeted regressions.

Without the fix:

the SQL-shape regression failed because the emitted query contained json_extract(value, ?) instead of inlined JSON paths
the node planner regression failed because the runtime query reintroduced bound JSON-path params ('$.threadId.__tanstack_db_persisted_type__', '$.threadId.value', etc.) instead of keeping only the real filter value bound

lalitkapoor added 2 commits April 20, 2026 20:15

fix(sqlite): inline json paths for expression indexes

8ea215d

chore: add changeset for sqlite expression index fix

41236c3

lalitkapoor marked this pull request as ready for review April 21, 2026 01:36

lalitkapoor wants to merge 2 commits into TanStack:main from lalitkapoor:fix/sqlite-expression-index-json-paths
