Commit 7dbbe95
feat(search): require --type, route to server-side bm25_search/vector_distance
`hotdata search` now requires `--type vector|bm25` (no default; same rule
as `indexes create --type`) and a positional query text argument. Both
modes run entirely server-side with no client-side embedding.
Routing:
- `--type vector "<query>"` →
    SELECT *, vector_distance(<col>, '<query>') AS dist FROM <t> ORDER BY dist
  The server resolves the embedding column, model, dimensions, and metric
  from the index metadata. The user names the source text column.
- `--type bm25 "<query>"` → existing bm25_search() server-side path.
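The vector route above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `SearchType`, `sql_literal`, and `build_vector_sql` are hypothetical names, and doubling single quotes to escape the query literal is an assumption about how the server parses strings. The BM25 branch is left out because the commit message does not show the `bm25_search()` call shape.

```rust
use std::str::FromStr;

/// Search mode selected by `--type`. No `Default` impl, mirroring the
/// "no default" rule described in the commit message.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SearchType {
    Vector,
    Bm25,
}

impl FromStr for SearchType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "vector" => Ok(SearchType::Vector),
            "bm25" => Ok(SearchType::Bm25),
            other => Err(format!("invalid --type '{other}': expected vector|bm25")),
        }
    }
}

/// Wrap text in a single-quoted SQL literal.
/// Assumption: embedded quotes are escaped by doubling them.
fn sql_literal(text: &str) -> String {
    format!("'{}'", text.replace('\'', "''"))
}

/// Build the vector-mode statement quoted in the commit message.
/// `column` is the source text column; the server is expected to
/// resolve the embedding column and metric from index metadata.
fn build_vector_sql(table: &str, column: &str, query: &str) -> String {
    format!(
        "SELECT *, vector_distance({column}, {lit}) AS dist FROM {table} ORDER BY dist",
        lit = sql_literal(query)
    )
}
```

Under these assumptions, `"vector".parse::<SearchType>()` succeeds, an empty or unknown value returns an error instead of falling back to a default, and the client only ever sends query text, never a vector.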
Removed:
- `--model` flag (was: client-side OpenAI embedding + `l2_distance` SQL).
- Stdin-piped-vector path (was: read JSON vector from stdin, generate
`l2_distance` SQL).
- `src/embedding.rs` module (its only callers were the two paths above).
Both removed paths hardcoded `l2_distance` regardless of the index's
actual metric, which silently produced wrong rankings on cosine indexes.
They also required the user to point `--column` at the auto-generated
`_embedding` column rather than the source text column. Power users who
need client-side embedding or want to query with a precomputed vector
can use raw SQL via `hotdata query` (e.g. `SELECT *, cosine_distance(...)`).
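To make the ranking problem concrete, here is a small self-contained sketch showing that L2 and cosine distance can order the same candidates differently when vectors are not unit-normalized. The two functions are local Rust helpers written for illustration (they only share names with the SQL functions), and the vectors are toy values, not data from the test run.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
fn l2_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f64>()
        .sqrt()
}

/// Cosine distance, defined here as 1 - cosine similarity.
/// Assumes neither vector is all zeros.
fn cosine_distance(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}
```

With query q = (1, 0) and candidates a = (10, 0) and b = (1, 1), cosine distance ranks a first (0 versus about 0.29), while L2 distance ranks b first (1 versus 9). For unit-normalized embeddings the two orderings agree, so the mismatch only shows up when stored vectors have differing norms.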
Verified against prod on `my_ducklake.main.internet_pages_small`:
- BM25 "basketball" → finds the basketball ProCamp title (score 2.92)
- BM25 "HIV" → finds the HIV Story titles (score 4.81)
- Vector "sports games athletes" → ranks the basketball ProCamp first
(cosine distance 0.69), heart-attack-fitness second (0.80)
- Vector "travel vacation cruise" → ranks the cruise excursion first
(0.63), 48-hours-in-Cesky-Krumlov second (0.74)
The semantically meaningful vector results confirm auto-embedding produced
useful vectors AND the server-side rewrite correctly resolves
provider+metric+output_column from index metadata. Cleaned up indexes
after the test run.
1 parent f7a2532 commit 7dbbe95
5 files changed
Lines changed: 62 additions & 197 deletions
This file was deleted.