@@ -66,24 +66,33 @@ BM25 (`ORDER BY bm25(<entity>_fts)`), highest signal first.
6666
6767``` python
6868try :
69- rows = conn.exec_driver_sql(match_sql, params).fetchall()
69+ async with self ._engine.session() as session:
70+ conn = await session.connection()
71+ cursor = await conn.exec_driver_sql(sql, (fts_match, limit))
72+ ids = [int (row[0 ]) for row in cursor.all()]
7073except OperationalError as exc:
7174 if not _is_fts_syntax_error(exc):
7275 raise # genuine OperationalError (locked DB, missing table) — propagate
73- rows = []
74- if not rows :
75- return await self .search_fuzzy( cls , query, limit = limit, ... )
76+ return await _fuzzy() # FTS5 syntax error → fuzzy
77+ if not ids :
78+ return await _fuzzy() # FTS5 returned no hits → fuzzy
7679```
7780
7881Two conditions trigger the fuzzy fall-through:
7982
80- 1 . ** FTS5 returned zero hits.** Common for partial matches, typos, or queries that
81- tokenize to an empty list (e.g., a query of only punctuation).
83+ 1 . ** FTS5 returned zero hits.** Common for partial matches and typos. Note that queries
84+ that * tokenize* to an empty list (whitespace-only or punctuation-only input) return
85+ ` [] ` directly without fuzzy — there's nothing to match against.
82861 . ** FTS5 raised an ` OperationalError ` whose underlying ` orig ` message starts with
8387 ` fts5: syntax error ` or ` fts5: unknown ` .** This is narrow on purpose — other
8488 ` OperationalError ` s (locked DB, missing FTS table, disk I/O) propagate so operators
8589 see real failures instead of degraded-but-silent search.
8690
91+ A third path exists: entities ** without** an FTS sidecar (the lookup-only types listed
92+ in the coverage table below) skip FTS entirely and go straight to fuzzy. ` smart_search `
93+ on ` CachedTaxRate ` , for example, has no ` tax_rate_fts ` table to query — fuzzy is the
94+ only available path.
95+
8796The narrowing is done via ` str(exc.orig).startswith(...) ` (with a defensive fallback to
8897` str(exc) ` if ` exc.orig ` happens to be ` None ` , which SQLAlchemy shouldn't produce in
8998practice but is guarded for safety). ` str(exc) ` on its own would include SQLAlchemy's
@@ -230,12 +239,14 @@ The four cache-only fields on `CachedVariant`:
230239| ` display_name ` | ` parent.name ` + variant ` config_attributes ` | Human-readable result rendering |
231240| ` supplier_item_codes_text ` | ` " ".join(supplier_item_codes) ` | FTS5 tokenizes whitespace-separated text; a list field wouldn't be indexable |
232241
233- The variant ` EntitySpec ` declares ` depends_on=("product", "material") ` so the parent
234- records exist by the time the postprocess hook reads them. Cold-start ordering — parents
235- before children — is the responsibility of the caller (today, ` ensure_variants_synced `
236- explicitly ` asyncio.gather ` s product+material syncs before running the variant sync);
237- ` depends_on ` is documentation + a future scheduler hook (see
238- ` _validate_dependency_graph ` ).
242+ ** An aside on ` depends_on ` ** : the postprocess hook gets parent data from the * extended
243+ API payload* (the variant fetch uses ` extend=[PRODUCT_OR_MATERIAL] ` so each variant
244+ arrives with its parent inlined), ** not** from cached parent rows. The variant
245+ ` EntitySpec ` 's ` depends_on=("product", "material") ` is therefore declarative metadata —
246+ useful for FK-join queries against the cache and for ` _validate_dependency_graph `
247+ cycle-detection at engine open, but not load-bearing for variant denormalization itself.
248+ Cold-start ordering today is handled explicitly: ` ensure_variants_synced `
249+ ` asyncio.gather ` s the product + material syncs before running the variant sync.
239250
240251## Soft-state filtering
241252
@@ -320,20 +331,23 @@ hits = await catalog.smart_search(CachedProduct, "kitchen knife")
320331Tokenize: ` ["kitchen", "knife"] ` . Match: ` "kitchen"* AND "knife"* ` . Behavior unchanged
321332from the legacy cache — multi-token AND with prefix expansion.
322333
323- ### Syntax-error fall-through
334+ ### Syntax-error fall-through (the test-only path)
324335
325- ``` python
326- hits = await catalog.smart_search(CachedVariant, " AND OR NOT" )
327- # → falls through to search_fuzzy; returns whatever difflib scores best
328- ```
336+ The ` \W+ ` tokenizer plus ` _build_fts_match ` 's double-quote escaping protect every
337+ user-typed query from producing an invalid FTS5 MATCH expression. Reserved-word queries
338+ like ` "AND OR NOT" ` tokenize to ` ["AND", "OR", "NOT"] ` and emit the valid (if odd)
339+ expression ` "AND"* AND "OR"* AND "NOT"* ` — the words land inside the quoted string
340+ literals and don't trigger FTS5's keyword parser.
329341
330- Tokenize produces tokens that, after FTS5 sees them, form an invalid match expression.
331- FTS5 raises ` OperationalError("fts5: syntax error near ...") ` . ` _is_fts_syntax_error `
332- returns ` True ` ; ` search_fuzzy ` runs.
342+ This is by design — ` _is_fts_syntax_error ` exists to handle the rare case where a future
343+ generator regression or a SQLite version bump produces a query the FTS5 parser rejects.
344+ The regression test for this path patches ` exec_driver_sql ` to raise
345+ ` OperationalError("fts5: syntax error near ...") ` directly and asserts the fall-through
346+ routes to fuzzy.
333347
334348A ` LockedError ` or ` OperationalError("no such table: ...") ` would ** not** match the
335- prefix check and would propagate. That's the design — silent fall-through is reserved
336- for the recoverable case.
349+ ` fts5: ` prefix check and would propagate. That's the design — silent fall-through is
350+ reserved for the recoverable case.
337351
338352## Where to look
339353
0 commit comments