Commit 7de925b
Fix int16 overflow in l2_sqr_int8_neon SIMD distance
vmulq_s16(diff, diff) produced int16 results, but diff can be up to
255 for int8 vectors (-128 vs 127), and 255^2 = 65025 overflows
int16 (max 32767). This caused NaN/wrong results for int8 vectors
with large differences.
Fix: use vmull_s16 (widening multiply) to produce int32 results
directly, avoiding the intermediate int16 overflow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>1 parent 4bee883 commit 7de925b
2 files changed
Lines changed: 14 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
258 | 258 | | |
259 | 259 | | |
260 | 260 | | |
261 | | - | |
| 261 | + | |
262 | 262 | | |
263 | 263 | | |
264 | | - | |
265 | 264 | | |
266 | | - | |
267 | | - | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
268 | 271 | | |
269 | 272 | | |
270 | 273 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
381 | 381 | | |
382 | 382 | | |
383 | 383 | | |
384 | | - | |
| 384 | + | |
385 | 385 | | |
386 | 386 | | |
387 | 387 | | |
388 | 388 | | |
| 389 | + | |
| 390 | + | |
| 391 | + | |
| 392 | + | |
| 393 | + | |
| 394 | + | |
389 | 395 | | |
390 | 396 | | |
391 | 397 | | |
| |||
0 commit comments