Commit 834fbb2

Normalize cosine similarity to [0,1] range
- Vector search now normalizes cosine similarity from [-1,1] to [0,1]
- Cache get also normalized to match
- Fixes threshold comparisons (0.85-0.9 now meaningful)
- Prevents cache miss due to similarity range mismatch
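The remapping this message describes is the affine map s' = (s + 1) / 2. The snippet below is a minimal sketch of that map and its inverse; the helper names `normalize_cosine` and `denormalize_cosine` are hypothetical and do not appear in the repository.

```python
def normalize_cosine(raw: float) -> float:
    """Map a raw cosine similarity from [-1, 1] onto [0, 1]."""
    return (raw + 1.0) / 2.0


def denormalize_cosine(normalized: float) -> float:
    """Inverse map: recover the raw cosine from a [0, 1] score."""
    return 2.0 * normalized - 1.0
```

Under this map, a threshold of 0.85 on the normalized scale corresponds to a raw cosine of about 0.70, and 0.9 corresponds to about 0.80. Orthogonal vectors (raw cosine 0) score 0.5 after normalization.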
1 parent 93b2655 commit 834fbb2

2 files changed

Lines changed: 4 additions & 0 deletions

src/toondb/database.py

Lines changed: 2 additions & 0 deletions
@@ -2283,6 +2283,8 @@ def cache_get(
 
 if query_norm > 0 and cached_norm > 0:
     score = dot_product / (query_norm * cached_norm)
+    # Normalize from [-1, 1] to [0, 1] for threshold comparisons
+    score = (score + 1.0) / 2.0
 else:
     score = 0.0
 

src/toondb/namespace.py

Lines changed: 2 additions & 0 deletions
@@ -594,6 +594,8 @@ def _vector_search(self, request: SearchRequest) -> SearchResults:
 
 if query_norm > 0 and doc_norm > 0:
     similarity = dot_product / (query_norm * doc_norm)
+    # Normalize from [-1, 1] to [0, 1] for threshold comparisons
+    similarity = (similarity + 1.0) / 2.0
 else:
     similarity = 0.0
 
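Both hunks apply the same pattern, so it may help to see it as one self-contained function. This is only a sketch: `normalized_cosine` is a hypothetical name, and the pure-Python dot product and norms stand in for whatever the repository actually computes before the lines shown in the diff.

```python
import math
from typing import Sequence


def normalized_cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of a and b, remapped from [-1, 1] to [0, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    dot_product = sum(x * y for x, y in zip(a, b))
    a_norm = math.sqrt(sum(x * x for x in a))
    b_norm = math.sqrt(sum(y * y for y in b))

    if a_norm > 0 and b_norm > 0:
        similarity = dot_product / (a_norm * b_norm)
        # Same remapping as in the diff: [-1, 1] -> [0, 1]
        return (similarity + 1.0) / 2.0

    # Mirrors the unchanged else-branch in the diff: a zero-norm
    # vector scores 0.0, which is not remapped.
    return 0.0
```

One consequence of leaving the else-branch at 0.0 is that a zero-norm vector now receives the same score as an exactly opposite vector, while an orthogonal vector scores 0.5. The commit does not say whether this is intended; for threshold checks in the 0.85-0.9 range, either value is a miss.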
