Commit e1012ae

fix(contract): loosen Fisher-z clamp ±0.999 → ±0.9999 so self-match reads ~1
`Distance::similarity_z` clamps the similarity away from ±1 to keep the `atanh` (the `ln` term) finite. The bound was ±0.999, which made `tanh(atanh(0.999)) = 0.999` the maximum value `cohort_similarity_z` could ever return, so a perfect self-match read ~0.99899 and any "self ≈ 1.0" assertion (`s > 0.999`) was unreachable.

±0.9999 keeps atanh finite (≈4.95) while letting a self-match round-trip to ≈0.99986, which reads as "essentially identical". Pure numerical guard, no semantic change for moderate similarities; the existing distance tests (s=0.8 roundtrip, z=0.5 averaging, sign-only checks) are unaffected.

Fixes medcare-analytics graph_contract::cohort_similarity_z_self_returns_one (latent: the lance-phase2-rbac suite had never compiled in the dev env until protoc was installed this session).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EYvNjD8M8LMNYbRy3gq2FP
1 parent ddb6c84 commit e1012ae

1 file changed

Lines changed: 7 additions & 1 deletion

crates/lance-graph-contract/src/distance.rs

@@ -41,7 +41,13 @@ pub trait Distance: Sized {
     #[inline]
     fn similarity_z(&self, other: &Self) -> f32 {
         let s = self.similarity(other);
-        let clamped = s.clamp(-0.999, 0.999);
+        // Clamp away from ±1 so `atanh` (the `ln` below) stays finite.
+        // The bound is ±0.9999, not ±0.999: a self-match (s = 1.0) must
+        // round-trip back through `tanh(atanh(clamp)) = clamp` to a value
+        // that reads as "essentially identical" (≈0.99986), not be capped
+        // at 0.999 — otherwise `cohort_similarity_z(self) > 0.999` is
+        // unreachable. atanh(0.9999) ≈ 4.95 is comfortably finite.
+        let clamped = s.clamp(-0.9999, 0.9999);
         ((1.0 + clamped) / (1.0 - clamped)).ln() * 0.5
     }
 }
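
The effect of the bound can be reproduced outside the crate with a minimal standalone sketch. It copies only the arithmetic from the hunk above; `fisher_z` and `round_trip` are hypothetical free functions introduced for illustration, not part of `lance-graph-contract`. Mapping back with `tanh` is an assumption based on the commit message's description of `cohort_similarity_z`. Exact printed digits depend on `f32` rounding, so none are asserted here.

```rust
/// Fisher z-transform of a similarity `s`, clamped to ±`bound` first.
/// Same arithmetic as the diff: 0.5 * ln((1 + c) / (1 - c)) = atanh(c).
/// (Hypothetical free function; the real code is a trait method.)
fn fisher_z(s: f32, bound: f32) -> f32 {
    let clamped = s.clamp(-bound, bound);
    ((1.0 + clamped) / (1.0 - clamped)).ln() * 0.5
}

/// Hypothetical helper: map the z value back to similarity space with tanh.
/// Assumes the caller inverts the transform this way.
fn round_trip(s: f32, bound: f32) -> f32 {
    fisher_z(s, bound).tanh()
}

fn main() {
    // Compare the old and new bounds for a perfect self-match (s = 1.0).
    for bound in [0.999_f32, 0.9999_f32] {
        let z = fisher_z(1.0, bound);
        let back = round_trip(1.0, bound);
        println!(
            "bound={bound}: z={z:.4}, round-trip={back:.6}, passes `s > 0.999`: {}",
            back > 0.999
        );
    }

    // A bound of 1.0 is effectively no clamp: (1 + 1) / (1 - 1) divides by
    // zero, so the logarithm is infinite. This is what the clamp prevents.
    println!("bound=1.0: z={}", fisher_z(1.0, 1.0));
}
```

Moderate similarities are not affected by either bound, since the clamp only takes effect when `|s|` exceeds it; for example, `round_trip(0.8, 0.9999)` comes back as approximately 0.8.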
