Commit 9a1caeb
nsm: replace alignment-unsafe u8->f32 cast with safe LE decode
build_distance_matrix_from_cam reinterpreted a &[u8] codebook buffer
as &[f32] via `as_ptr() as *const f32` + from_raw_parts. A &[u8]
carries no alignment guarantee, so on an unaligned buffer (mmap'd
file, sub-slice) the f32 reinterpret is UB. Every other byte cast in
the repo goes [u64]->[u8] (required alignment decreases = sound); this
one went the other way, raising the required alignment from 1 to 4,
and was the lone genuine soundness risk found in the unsafe audit.
Replace with chunks_exact(4) + f32::from_le_bytes: alignment-free,
endian-correct (matches the workspace LE contract), no unsafe. The
codebook is read-only downstream, so owning a Vec<f32> is fine.
The CAM-PQ codebook centroids are f32 by definition (6 subspaces x
256 centroids x subspace_dim); the stored word distance remains the
u8-quantized L2 in WordDistanceMatrix.
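
The decode described above can be sketched as follows. This is a minimal illustration, not the code from the commit: `decode_f32_le` and `demo_unaligned` are hypothetical names, and the real change lives inside build_distance_matrix_from_cam. Note that chunks_exact(4) silently ignores any trailing bytes that do not fill a whole f32, so a caller that cares should check the buffer length first.

```rust
/// Decode a little-endian byte buffer into owned f32 values.
/// Hypothetical helper for illustration; not the function from the commit.
/// Works for any byte offset because it copies 4 bytes at a time
/// instead of reinterpreting the pointer.
fn decode_f32_le(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4) // trailing 1..=3 bytes, if any, are skipped
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

/// Build a buffer whose f32 payload starts at byte offset 1, so the
/// payload sub-slice is very likely not 4-byte aligned, then decode it.
fn demo_unaligned() -> Vec<f32> {
    let mut raw = vec![0u8]; // one pad byte to shift the payload
    for v in [1.0f32, -2.5, 3.25] {
        raw.extend_from_slice(&v.to_le_bytes());
    }
    decode_f32_le(&raw[1..])
}
```

Because the result is an owned Vec<f32>, the cost is one copy of the codebook, which the commit message argues is acceptable since the codebook is read-only downstream.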
https://claude.ai/code/session_0147hSzjmWZDuy2MSQNrhEK51
parent 4e537c7 commit 9a1caeb
1 file changed
Lines changed: 10 additions & 7 deletions
[Diff body not recovered. Original lines 174-180 removed; new lines 174-183 added; original lines 171-173 and 181-183 (new 184-186) are unchanged context.]