Skip to content

fix(mex): -ffast-math broke NaN detection in to_step_function_mex; CI perf gates report-only#201

Open
HanSur94 wants to merge 1 commit into
mainfrom
claude/mex-nan-fastmath
Open

fix(mex): -ffast-math broke NaN detection in to_step_function_mex; CI perf gates report-only#201
HanSur94 wants to merge 1 commit into
mainfrom
claude/mex-nan-fastmath

Conversation

@HanSur94

Copy link
Copy Markdown
Owner

Summary

Third batch from the review sweep — the remaining correctness bug plus the CI flake source that hit both previous PRs.

-ffast-math silently broke NaN detection (correctness)

-ffast-math implies -ffinite-math-only: the compiler assumes no NaNs exist and folds the IEEE self-compare NaN test (v == v) to constant true — including clang's IR lowering of the NEON/AVX compare intrinsics. to_step_function_mex's NaN-segment scan therefore saw every segment as active: an all-NaN input produced a full step function, and NaN gaps in StateTag staircases rendered as solid steps. This is why all 9 TestToStepFunctionMex cases failed on ARM64 MATLAB (the suite is gated off on headless-Linux CI, which is how it went unnoticed; the prebuilt binaries from #134 carry the bug).

Fix, belt and braces:

  • build_mex.m appends -fno-finite-math-only after every -ffast-math (6 sites) — restores NaN/Inf semantics while keeping the reassociation/FMA/no-signed-zero wins. Rationale comment added.
  • The kernel's scalar tails now use mxIsNaN (an opaque libmx call the optimizer can't fold — survives any fast-math mode, including MSVC /fp:fast, where there is no granular opt-out). Constraint documented in the kernel header.
  • ARM64 binaries rebuilt and committed (FastSense + SensorThreshold copies); the refresh-mex-binaries workflow triggers on these paths and regenerates the remaining platforms.

Kernel audit: only to_step_function_mex.c used the vulnerable idiom — build_store_mex.c already used isnan() with a comment warning about exactly this hazard, violation_cull_mex.c uses mxIsNaN, and the minmax/LTTB kernels receive pre-segmented NaN-free data by contract.

Verified live (R2025b / Apple Silicon): to_step_function_mex([1 5 10],[NaN NaN NaN],20) → empty (was 6 elements); TestToStepFunctionMex 13/13 (was 9 failures); TestStateTag 18/18; flat test_statetag green.

CI perf gates become report-only on shared runners

All five TestTagPerfRegression benches gate on wall-clock measurements (relative overhead, absolute ms, micro-timing ratios). On GitHub's shared runners they produced three false failures across two unrelated PRs in one day (#197/#198consumerMigrationTickGate 23→35% on identical code that also passed, bench_sensortag_getxy tripping alongside). invokeBenchOrSkip_ now converts gate trips into assumption-skips with the full measurement diagnostic when CI is set; FASTSENSE_PERF_GATES=strict opts a job back into hard enforcement (for a dedicated/self-hosted runner), and gates remain hard on developer machines. Verified in both modes (simulated CI=true: 3 passed / 2 report-only skips / 0 failed; local: unchanged hard gates).

🤖 Generated with Claude Code

… perf gates report-only

-ffast-math implies -ffinite-math-only, letting the compiler assume no
NaNs and fold the IEEE self-compare NaN test (v == v) to constant true —
including clang's IR lowering of the NEON/AVX compare intrinsics. The
to_step_function_mex NaN-segment scan therefore treated every NaN segment
as active: all-NaN inputs produced full step functions and NaN gaps in
StateTag staircases rendered as solid steps (all 9 TestToStepFunctionMex
failures on ARM64 MATLAB).

- build_mex.m: append -fno-finite-math-only after every -ffast-math
  (6 sites) — restores NaN/Inf semantics, keeps reassociation + FMA wins
- to_step_function_mex.c: scalar tails use mxIsNaN (opaque libmx call,
  survives any fast-math mode incl. MSVC /fp:fast); FAST-MATH CONSTRAINT
  documented in the header
- rebuilt ARM64 binaries (FastSense + SensorThreshold copies); the
  refresh-mex-binaries workflow triggers on these paths and regenerates
  the remaining platforms
- kernel audit: only this kernel used the vulnerable idiom (build_store
  already used isnan with a hazard comment; violation_cull uses mxIsNaN;
  minmax/lttb receive pre-segmented NaN-free data)
- TestTagPerfRegression: wall-clock gates become report-only skips on
  shared CI runners (three false failures across two unrelated PRs on
  2026-06-10); FASTSENSE_PERF_GATES=strict opts back in; gates stay hard
  on developer machines

Verified R2025b: TestToStepFunctionMex 13/13 (was 9 failures),
TestStateTag 18/18, flat test_statetag green; gate policy verified in
simulated-CI and local modes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
libs/FastSense/build_mex.m 0.00% 6 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant