A `cachedAsyncBuffer` deduplicates footer / offset-index byte ranges across all the parallel `parquetRead` calls.
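As a rough illustration of the deduplication idea, the sketch below memoizes `slice()` calls by byte range so that parallel readers asking for the same footer bytes share one request. It is not the library's implementation: `cacheByRange` and `countingFile` are hypothetical names, and the `{ byteLength, slice }` shape is assumed for the example.

```javascript
// Hypothetical sketch: memoize slice() by byte range so concurrent readers
// share a single in-flight request for the same range.
function cacheByRange(file) {
  const cache = new Map() // "start-end" -> Promise<ArrayBuffer>
  return {
    byteLength: file.byteLength,
    slice(start, end) {
      const key = `${start}-${end}`
      let pending = cache.get(key)
      if (!pending) {
        // Cache the promise itself, so callers that arrive before the
        // read resolves still reuse it.
        pending = Promise.resolve(file.slice(start, end))
        cache.set(key, pending)
      }
      return pending
    },
  }
}

// Stand-in for a remote file that counts how often it is actually read.
function countingFile(bytes) {
  const file = {
    reads: 0,
    byteLength: bytes.byteLength,
    async slice(start, end) {
      file.reads++
      return bytes.slice(start, end)
    },
  }
  return file
}
```

Keying on the exact range is the simplest policy; a real cache might also merge overlapping ranges or skip caching for very large reads.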
-
**PQ + rerank path** (`algorithm: 'pq'`, or `auto` when a file has PQ but no binary column): scan compact `vector_pq` codes over the selected cluster ranges, approximate-score candidates with lookup tables built from the query and stored PQ codebooks, then fetch full float32 vectors only for the candidate pool and exact-rerank as above. When `clusters > 0`, PQ uses the same contiguous cluster row ranges as the binary path.
+
**IVF-PQ + rerank path** (`algorithm: 'pq'`, or `auto` when a file has PQ but no binary column): rank stored float IVF centroids against the query, scan compact residual `vector_pq` codes over the selected IVF row groups, approximate-score candidates with lookup tables built from the query, IVF centroid, and residual PQ codebooks, then fetch full float32 vectors only for the candidate pool and exact-rerank as above. IVF-PQ uses its own row ordering and should not be combined with binary `clusters`.
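
The lookup-table step can be sketched as follows, assuming squared L2 distance and a codebook layout of `codebooks[m][k]` (codeword `k` for subspace `m`). Both function names and the data layout are hypothetical, chosen only to illustrate residual scoring; the actual metric and storage format may differ.

```javascript
// Hypothetical sketch of residual-PQ scoring (squared L2 assumed).
// For each subspace m and codeword k, precompute the squared distance
// between the query residual (query - centroid) and the codeword.
function buildLookupTable(query, centroid, codebooks) {
  const subDim = query.length / codebooks.length
  return codebooks.map((codewords, m) =>
    codewords.map(codeword => {
      let dist = 0
      for (let j = 0; j < subDim; j++) {
        const i = m * subDim + j
        const diff = query[i] - centroid[i] - codeword[j]
        dist += diff * diff
      }
      return dist
    })
  )
}

// Approximate distance to one stored vector: one table lookup per subspace,
// indexed by that vector's PQ code for the subspace.
function approxDistance(table, codes) {
  let sum = 0
  for (let m = 0; m < codes.length; m++) sum += table[m][codes[m]]
  return sum
}
```

The table is built once per query and selected centroid, so scoring each candidate costs only as many additions as there are subspaces, which is why full float32 vectors are needed only for the final rerank.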
For pre-normalized vectors with `metric: 'cosine'`, the search normalizes the query once and scores via dot product to skip the per-candidate sqrt loop.
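
A minimal sketch of that shortcut, assuming the stored vectors already have unit length: the query is normalized once, and each candidate's cosine similarity reduces to a plain dot product. The helper names (`normalize`, `dot`, `cosineScores`) are hypothetical and not taken from the library.

```javascript
// Hypothetical sketch: cosine similarity against pre-normalized vectors.
function normalize(v) {
  let sumSq = 0
  for (const x of v) sumSq += x * x
  const norm = Math.sqrt(sumSq)
  // Leave an all-zero vector unchanged to avoid dividing by zero.
  return norm === 0 ? Array.from(v) : Array.from(v, x => x / norm)
}

function dot(a, b) {
  let sum = 0
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i]
  return sum
}

// unitVectors are assumed to have length 1 already, so no per-candidate
// norm (and no per-candidate sqrt) is needed.
function cosineScores(query, unitVectors) {
  const q = normalize(query) // the only sqrt in the whole scan
  return unitVectors.map(v => dot(q, v))
}
```

If the stored vectors are not actually unit length, the scores are scaled by each vector's norm and are no longer true cosine similarities.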
@@ -208,9 +208,13 @@ Key-value metadata:
| `hypvector.clusters` | number of k-means clusters (0 if not clustered) |