fix(simd): preserve NaN in simd_exp_f32 (codex review on PR #142)
The pre-clamp via simd_clamp silently destroyed NaN inputs. simd_clamp is
implemented as max(lo).min(hi); _mm512_max_ps returns the SECOND operand
when the first is NaN (per Intel SDM § MAXPS), so NaN got clamped to lo
(-87.336) and exp(-87.336) ≈ 1.18e-38, a tiny finite value pretending to
be valid.
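The failure mode can be reproduced without any intrinsics. The sketch below is a scalar model of one lane, not the crate's actual simd_clamp: maxps_lane, minps_lane and clamp_lane are hypothetical names, and the upper bound used in the usage note is an assumed value, since this message only states lo.

```rust
/// Scalar model of one MAXPS lane: the result is `b` unless `a > b`.
/// Every ordered comparison with NaN is false, so a NaN in `a` yields `b`.
fn maxps_lane(a: f32, b: f32) -> f32 {
    if a > b { a } else { b }
}

/// Scalar model of one MINPS lane, with the same second-operand rule.
fn minps_lane(a: f32, b: f32) -> f32 {
    if a < b { a } else { b }
}

/// Clamp built as max(lo).min(hi), mirroring the shape described above.
/// (Hypothetical helper for illustration only.)
fn clamp_lane(x: f32, lo: f32, hi: f32) -> f32 {
    minps_lane(maxps_lane(x, lo), hi)
}
```

Under this model, clamp_lane(f32::NAN, -87.336, 88.0) comes back as the lower bound rather than NaN, which is the behaviour the fix has to work around.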
Fix: capture NaN lanes via x.simd_ne(x) (NaN ≠ itself per IEEE 754) BEFORE
the clamp, then mask-select NaN back into those lanes after the polynomial.
NaN propagates per-lane; finite lanes are unchanged.
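A minimal sketch of the capture-then-restore pattern, written over plain arrays so it builds on stable Rust. It is not the code in this commit: exp_preserving_nan, LANES, LO and HI are illustrative names, HI is an assumed bound, and scalar f32::exp stands in for the polynomial.

```rust
const LANES: usize = 16; // a 512-bit vector holds 16 f32 lanes

const LO: f32 = -87.336; // lower clamp bound quoted in this message
const HI: f32 = 88.0; // assumed upper bound, for illustration only

/// Array-based model of the fix: record NaN lanes, clamp, evaluate,
/// then write NaN back into the recorded lanes.
#[allow(clippy::eq_op)]
fn exp_preserving_nan(x: [f32; LANES]) -> [f32; LANES] {
    // 1. Capture NaN lanes BEFORE the clamp: NaN is the only value with v != v.
    let nan_mask: [bool; LANES] = x.map(|v| v != v);

    // 2. Clamp with MAXPS/MINPS-like semantics (NaN collapses to LO here).
    let clamped = x.map(|v| {
        let m = if v > LO { v } else { LO };
        if m < HI { m } else { HI }
    });

    // 3. Stand-in for the polynomial evaluation.
    let mut out = clamped.map(f32::exp);

    // 4. Mask-select NaN back into the lanes recorded in step 1.
    for i in 0..LANES {
        if nan_mask[i] {
            out[i] = f32::NAN;
        }
    }
    out
}
```

Taking the mask before step 2 is the essential ordering: once the clamp has run, a former NaN lane is indistinguishable from a lane that legitimately held LO.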
Two regression tests:
- simd_exp_f32_propagates_nan: full-NaN vector returns full-NaN
- simd_exp_f32_propagates_nan_per_lane: mixed NaN/0.0 input; NaN lanes
  propagate, finite lanes compute exp(0)=1 unaffected
Test suite: 1788 passed (+2 from 1786).
Reported-by: codex review on PR #142.