Commit de52a44
fix(simd): SimdProfile::detect() consults amx_available() — Risk #3 closure
Integration plan risk #3 ("Detection robustness across hypervisors"):
CPUID may advertise AMX-TILE while the OS/hypervisor has not enabled
the tile XSAVE state. Without the OS-level check, the dispatch table
routes to AMX kernels that SIGILL at first use.
Fix: SimdProfile::detect() now reads `simd_amx::amx_available()` (the
existing 4-step gate: CPUID → OSXSAVE → XCR0[17,18] → arch_prctl
XCOMP_PERM on Linux 5.19+) and demotes when CPUID and OS disagree.
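
The four gate steps can be sketched as a pure function over raw register values, which makes the order of checks explicit and testable without AMX hardware. This is an illustration, not the body of `simd_amx::amx_available()`: the `AmxGateInputs` struct, the `AmxGate` enum and the `amx_gate` function are hypothetical names. The bit positions are the architectural ones (CPUID.7.0:EDX[24] AMX-TILE, CPUID.1:ECX[27] OSXSAVE, XCR0[17] XTILECFG, XCR0[18] XTILEDATA).

```rust
/// Raw inputs to the gate. Hypothetical type; the real function reads
/// these from the CPU and the kernel itself.
#[derive(Clone, Copy, Debug)]
pub struct AmxGateInputs {
    /// CPUID.(EAX=7,ECX=0):EDX
    pub cpuid7_edx: u32,
    /// CPUID.(EAX=1):ECX
    pub cpuid1_ecx: u32,
    /// XCR0 as returned by XGETBV(0); only meaningful when OSXSAVE is set.
    pub xcr0: u64,
    /// Whether the arch_prctl XCOMP_PERM request for tile data succeeded.
    pub xcomp_perm_granted: bool,
}

const CPUID7_EDX_AMX_TILE: u32 = 1 << 24;
const CPUID1_ECX_OSXSAVE: u32 = 1 << 27;
const XCR0_XTILECFG: u64 = 1 << 17;
const XCR0_XTILEDATA: u64 = 1 << 18;

/// Outcome of the gate, naming the first step that failed.
#[derive(Debug, PartialEq, Eq)]
pub enum AmxGate {
    Usable,
    NoCpuid,
    NoOsxsave,
    TileStateDisabled,
    PermissionDenied,
}

/// Walks CPUID -> OSXSAVE -> XCR0[17,18] -> XCOMP_PERM in that order.
pub fn amx_gate(i: AmxGateInputs) -> AmxGate {
    if (i.cpuid7_edx & CPUID7_EDX_AMX_TILE) == 0 {
        return AmxGate::NoCpuid;
    }
    if (i.cpuid1_ecx & CPUID1_ECX_OSXSAVE) == 0 {
        return AmxGate::NoOsxsave;
    }
    let tile_bits = XCR0_XTILECFG | XCR0_XTILEDATA;
    if (i.xcr0 & tile_bits) != tile_bits {
        return AmxGate::TileStateDisabled;
    }
    if !i.xcomp_perm_granted {
        return AmxGate::PermissionDenied;
    }
    AmxGate::Usable
}
```

A caller would treat anything other than `AmxGate::Usable` as "AMX not available", which corresponds to the demotion case where CPUID and the OS disagree.
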
The GraniteRapids and SapphireRapids arms now require both the CPUID
bits AND `amx_usable`; the Zen4Avx512 arm catches SPR-class CPUID
with locked-down hypervisor XSAVE so dispatch falls to the AVX-512
BF16/FP16 path instead.
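
A minimal sketch of the arm ordering described above, under stated assumptions: the variant names `GraniteRapids`, `SapphireRapids` and `Zen4Avx512` come from this commit, but the `CpuidBits` struct, the `Fallback` variant, the `resolve` function and the exact per-arm conditions (for example using AMX-FP16 to tell Granite Rapids apart) are hypothetical and may differ from the real `SimdProfile::detect()`.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SimdProfile {
    GraniteRapids,
    SapphireRapids,
    Zen4Avx512,
    /// Hypothetical catch-all; the real enum likely has more variants.
    Fallback,
}

/// Hypothetical snapshot of the CPUID feature bits the arms look at.
#[derive(Debug, Clone, Copy)]
pub struct CpuidBits {
    pub amx_tile: bool,
    pub amx_int8: bool,
    pub amx_bf16: bool,
    pub amx_fp16: bool,
    pub avx512_bf16: bool,
}

/// AMX arms require the CPUID bits AND `amx_usable`; when the OS gate
/// fails, SPR-class silicon falls through to the AVX-512 arm.
pub fn resolve(c: &CpuidBits, amx_usable: bool) -> SimdProfile {
    let amx_cpuid = c.amx_tile && c.amx_int8 && c.amx_bf16;
    if amx_cpuid && c.amx_fp16 && amx_usable {
        SimdProfile::GraniteRapids
    } else if amx_cpuid && amx_usable {
        SimdProfile::SapphireRapids
    } else if c.avx512_bf16 {
        // Also catches AMX-capable CPUID with tile state locked down.
        SimdProfile::Zen4Avx512
    } else {
        SimdProfile::Fallback
    }
}
```

With `amx_usable == false`, a CPU reporting every AMX bit resolves to `Zen4Avx512` in this sketch, which is the demotion this commit introduces.
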
Verified on the build host (Sapphire Rapids silicon, kernel 6.18.5):
- CPUID reports amx_tile=1, amx_int8=1, amx_bf16=1 (all true)
- simd_amx::amx_available() returns false (hypervisor masks
  XCR0[17,18] or the arch_prctl(XCOMP_PERM) request fails)
- SimdProfile::detect() correctly resolves to Zen4Avx512, not
  SapphireRapids; the AMX kernels are not reachable from
  dispatch on this OS state.
Without this fix, the e40f3a3 detect path would have resolved to
SapphireRapids on this exact silicon/OS combination, then SIGILL'd
the first time a dispatch table called an AMX kernel. Bug closed
before any consumer was wired to the dispatch table.
The probe binary (examples/simd_profile_probe.rs) gains a new "AMX
gating (CPUID vs OS)" section so the CPUID-vs-OS gap is visible
without reading source. Format mirrors how the matrix-doc cell
summary appears: terse, two lines plus an optional demotion note
when the bits disagree.
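
For illustration only, a section of that shape could be produced as follows. The actual output of examples/simd_profile_probe.rs is not shown in this commit message, so the `amx_gating_section` function, the labels and the note wording are all assumptions; only the section title and the "two lines plus an optional demotion note" shape come from the text above.

```rust
/// Hypothetical formatter: a title, two status lines, and a demotion
/// note that is emitted only when CPUID and the OS disagree.
pub fn amx_gating_section(cpuid_amx: bool, os_amx: bool) -> String {
    let mut out = String::from("AMX gating (CPUID vs OS)\n");
    out.push_str(&format!("  cpuid advertises AMX : {}\n", cpuid_amx));
    out.push_str(&format!("  OS enables AMX       : {}\n", os_amx));
    if cpuid_amx && !os_amx {
        // Wording of the note is illustrative, not the probe's own text.
        out.push_str("  note: demoted, AMX kernels not reachable from dispatch\n");
    }
    out
}
```

Calling `amx_gating_section(true, false)` yields the title, the two status lines and the note; any other combination omits the note.
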
Pinned mode (cpu-* cargo features) intentionally bypasses this gate
since pinning is a build-time assertion that the target OS supports
the chosen variant — pinned binaries are non-portable by design.
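
The bypass can be pictured as a compile-time branch ahead of the runtime probe. This is a sketch under assumptions: `cpu-spr` is the feature named in this commit, but the `detect_with` helper and its closure parameter are hypothetical and stand in for whatever the real detect path does.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SimdProfile {
    SapphireRapids,
    Zen4Avx512,
}

/// Hypothetical helper: when the build is pinned, the profile is fixed at
/// compile time and the runtime probe (including the OS-level AMX gate)
/// never runs.
pub fn detect_with(runtime_probe: impl FnOnce() -> SimdProfile) -> SimdProfile {
    if cfg!(feature = "cpu-spr") {
        // Pinned: a build-time assertion that the target OS supports AMX.
        SimdProfile::SapphireRapids
    } else {
        // Unpinned: CPUID plus the OS gate decide at runtime.
        runtime_probe()
    }
}
```

In an unpinned build the closure's result is returned unchanged; in a build with `--features cpu-spr` the closure is never called, which is why such a binary can still fault on a host where the OS has not enabled tile state.
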
Tests: 2077/2077 lib pass. cargo clippy --lib clean under default
and --features cpu-spr. Behaviour on hardware with proper AMX
enablement (full prctl path success) is unchanged: SapphireRapids
still resolves to SapphireRapids when amx_available() returns true.
1 parent 03b30e5 commit de52a44
2 files changed
Lines changed: 37 additions & 4 deletions