Commit b78963f

fix(udf): remove sqrt from l2_distance to match USearch L2sq metric
l2_distance UDF was computing actual L2 (with sqrt) while USearch and the rewritten execution paths all use L2sq (no sqrt). This caused the same query to return different numeric distance values depending on whether the optimizer rewrote it. Remove sqrt to match USearch's MetricKind::L2sq and DuckDB VSS's array_distance behavior. All paths now return consistent L2sq values.
1 parent 80059cc commit b78963f

3 files changed

Lines changed: 3 additions & 4 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -272,11 +272,11 @@ All three distance functions are **lower-is-closer**:
 
 | SQL function | Index metric | Kernel |
 |---|---|---|
-| `l2_distance(a, b)` | `L2sq` | `sqrt(sum((a_i - b_i)^2))` (UDF) / `sum((a_i - b_i)^2)` (index) |
+| `l2_distance(a, b)` | `L2sq` | `sum((a_i - b_i)^2)` |
 | `cosine_distance(a, b)` | `Cos` | `1 - dot(a,b) / (norm(a) * norm(b))` |
 | `negative_dot_product(a, b)` | `IP` | `-(a . b)` |
 
-Note: `l2_distance` UDF returns actual L2 (with sqrt) for human-readable distances; USearch uses L2sq internally (no sqrt). The sort order is identical.
+`l2_distance` returns squared L2 (no sqrt), matching USearch's `MetricKind::L2sq`. This ensures numeric consistency between the UDF, the rewritten index path, and the brute-force path.
 
 ### Running tests
 

src/lib.rs

Lines changed: 1 addition & 1 deletion
@@ -93,7 +93,7 @@ use datafusion::prelude::SessionContext;
 /// Register all extension components with a DataFusion [`SessionContext`].
 ///
 /// Registers:
-/// - `l2_distance(col, query)` — Euclidean distance (L2)
+/// - `l2_distance(col, query)` — squared Euclidean distance (L2sq)
 /// - `cosine_distance(col, query)` — cosine distance
 /// - `negative_dot_product(col, query)` — negated inner product
 /// - `vector_usearch(table, query, k)` — explicit ANN table function

src/udf.rs

Lines changed: 0 additions & 1 deletion
@@ -27,7 +27,6 @@ fn l2_kernel(a: &[f32], b: &[f32]) -> f32 {
         .zip(b.iter())
         .map(|(x, y)| (x - y) * (x - y))
         .sum::<f32>()
-        .sqrt()
 }
 
 fn cosine_kernel(a: &[f32], b: &[f32]) -> f32 {
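
For readers who want to see the effect of the change in isolation, the following is a minimal standalone sketch, not the crate's actual code. `l2sq` mirrors the post-commit kernel shape shown in the hunk above (the `a.iter()` receiver is assumed, since the hunk starts at `.zip`), and `l2_from_l2sq` is a hypothetical helper, not part of this commit, for callers who still want a true Euclidean distance.

```rust
/// Squared Euclidean distance (L2sq): the sum of squared component
/// differences, with no final square root. Sketch of the post-commit
/// kernel; extra elements of the longer slice are ignored by `zip`.
fn l2sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
}

/// Hypothetical helper (an assumption, not in the commit): recover the
/// true L2 distance from an L2sq value by taking the square root.
fn l2_from_l2sq(d: f32) -> f32 {
    d.sqrt()
}
```

With a query at the origin, the point `[3.0, 4.0]` gets an L2sq value of 25 (true L2 of 5) and `[6.0, 8.0]` gets 100 (true L2 of 10). Because the square root is monotonic on non-negative values, ranking by L2sq and ranking by L2 give the same order; only the reported numbers differ.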
