Harden TurboQuant L2Norm fast path and document the crate

Fix two correctness bugs in the L2Norm(TQDecode(_)) fast path. (1) The
kernel coerced the returned norms to the child's nullability rather than
the parent's, so wider-child-validity storage shapes that parse_storage
accepts errored out at the dtype invariant. The kernel now coerces to
parent.dtype().nullability() and a new test mirrors the malformed.rs
shape. (2) The per-row inv_direction_norm computation could store a 0.0
sentinel for finite rows whose squared sum overflowed to +inf in f32 (or
a +inf for denormal norm_squared), making decode emit zeros while the
kernel returned the nonzero stored norm. Encode now rejects non-finite
input norms up front and the denormal recip is guarded by is_normal();
regression tests cover both cases.
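
As a rough illustration of failure mode (2), the sketch below shows how
an f32 squared sum can overflow for a finite row and how an is_normal()
guard keeps a subnormal norm from producing a +inf reciprocal. The
helper names (l2_norm, checked_norm, guarded_recip) are hypothetical
and are not the crate's API; which value the real code tests with
is_normal() is an assumption, so the sketch checks both the norm and
its reciprocal.

```rust
// Illustrative sketch only: these helpers are hypothetical and are not
// the crate's actual functions.

/// L2 norm of one row, accumulated in f32. The squared sum can overflow
/// to +inf even when every element is finite.
fn l2_norm(row: &[f32]) -> f32 {
    row.iter().map(|x| x * x).sum::<f32>().sqrt()
}

/// Reject rows whose norm is not finite instead of storing a sentinel.
fn checked_norm(row: &[f32]) -> Result<f32, String> {
    let norm = l2_norm(row);
    if norm.is_finite() {
        Ok(norm)
    } else {
        Err(format!("non-finite input norm: {norm}"))
    }
}

/// Reciprocal guarded by is_normal(): zero, subnormal, inf, and NaN
/// norms, as well as non-normal reciprocals, map to the 0.0 sentinel.
fn guarded_recip(norm: f32) -> f32 {
    if !norm.is_normal() {
        return 0.0;
    }
    let inv = norm.recip();
    if inv.is_normal() { inv } else { 0.0 }
}
```

Under these assumptions, checked_norm(&[3.0e19, 4.0e19]) returns an
error because the f32 squared sum overflows, and guarded_recip(1.0e-40)
yields the 0.0 sentinel rather than +inf.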

Several should-fix items go with the must-fix: parse_storage_norms_only
lets the kernel skip executing the codes and inv_direction_norms
children it does not consume; the parity test pins down the exact
new = old * inv_direction_norm[row] relationship rather than asserting
"the values differ"; file roundtrip now asserts the new field survives
serialization and the kernel still preserves stored norms; tests are
parameterized over f16/f32/f64 and across padded vs unpadded dimensions;
the kernel result is cross-checked against canonical L2Norm of the
materialized decode. The hypothetical defensive metadata check on the
kernel is dropped (registry key plus TQDecode signature already enforce
shape). The dev-dep on vortex-array switches to workspace = true to
match sibling encodings. Over-long doc lines are reflowed.

Every type in the crate now has a doc comment, emphasizing the new
inv_direction_norms storage child and the 0.0 sentinel semantics.
Module docs single-source the storage schema rationale in storage.rs;
lib.rs and the scalar-fn modules defer to it.

Verified: cargo check, cargo clippy --all-targets --all-features,
cargo +nightly fmt --all --check, cargo doc --no-deps, and
cargo nextest run (102 tests, +14 new) all clean.

Signed-off-by: "Connor Tsui" <connor.tsui20@gmail.com>