Commit 2cd3d8b
feat(backend): unified INT8/BF16 GEMM dispatch + CBLAS-compat aliases
Adds auto-dispatched gemm_i8 and gemm_bf16 to the backend module,
plus CBLAS-compat aliases so consumers have ONE call for each dtype:
    ndarray::backend::gemm_f32(...)                // f32 (AVX-512/AVX2/NEON)
    ndarray::backend::gemm_f64(...)                // f64
    ndarray::backend::gemm_i8(...)                 // i8 (VNNI → scalar)
    ndarray::backend::gemm_bf16(...)               // bf16 (tiled bf16_gemm_f32)
    ndarray::backend::cblas_sgemm(...)             // MKL drop-in
    ndarray::backend::cblas_dgemm(...)             // MKL drop-in
    ndarray::backend::cblas_gemm_s8s8s32(...)      // MKL drop-in
    ndarray::backend::cblas_gemm_bf16bf16f32(...)  // MKL drop-in
INT8 dispatch: vnni_gemm::int8_gemm_vnni handles VNNI detection
internally (VPDPBUSD when available, scalar fallback otherwise).
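For readers unfamiliar with what a scalar INT8 fallback computes, here is a minimal sketch, assuming row-major storage and i32 accumulation. The function name `gemm_i8_scalar` and its signature are hypothetical and are not taken from this commit; the actual `vnni_gemm::int8_gemm_vnni` may differ in layout, signedness handling, and arguments.

```rust
/// Hypothetical scalar reference (not the commit's code):
/// C (m x n, i32) = A (m x k, i8) * B (k x n, i8), all row-major.
fn gemm_i8_scalar(a: &[i8], b: &[i8], c: &mut [i32], m: usize, n: usize, k: usize) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    for i in 0..m {
        for j in 0..n {
            // Widen each i8 operand to i32 before multiplying so the
            // product and the running sum cannot overflow 8 bits.
            let mut acc: i32 = 0;
            for p in 0..k {
                acc += i32::from(a[i * k + p]) * i32::from(b[p * n + j]);
            }
            c[i * n + j] = acc;
        }
    }
}
```

A VNNI path would produce the same i32 results while processing several 8-bit products per instruction; the scalar loop is useful as a correctness reference for such a path.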
BF16 dispatch: quantized::bf16_gemm_f32 (tiled, f32 accumulation).
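To illustrate "f32 accumulation" for BF16 inputs, here is a minimal untiled sketch, assuming bf16 values are stored as raw `u16` bit patterns in row-major order. The names `bf16_to_f32`, `f32_to_bf16_truncate`, and `gemm_bf16_ref` are hypothetical; the commit's `quantized::bf16_gemm_f32` is tiled and may use a different element type and signature.

```rust
/// Widen a bf16 bit pattern to f32: bf16 is the upper 16 bits of an f32.
fn bf16_to_f32(bits: u16) -> f32 {
    f32::from_bits(u32::from(bits) << 16)
}

/// Narrow an f32 to bf16 by truncation (no rounding); used here only to build inputs.
fn f32_to_bf16_truncate(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// Hypothetical untiled reference (not the commit's code):
/// C (m x n, f32) = A (m x k, bf16) * B (k x n, bf16), all row-major.
fn gemm_bf16_ref(a: &[u16], b: &[u16], c: &mut [f32], m: usize, n: usize, k: usize) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    for i in 0..m {
        for j in 0..n {
            // Products and the running sum are kept in f32, so precision
            // is lost only in the bf16 inputs, not in the accumulation.
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += bf16_to_f32(a[i * k + p]) * bf16_to_f32(b[p * n + j]);
            }
            c[i * n + j] = acc;
        }
    }
}
```

A tiled implementation would compute the same sums block by block to improve cache reuse; the order of additions, and therefore the last bits of the result, can differ from this reference.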
All 1767 tests pass.
https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj1
parent dfa25a6, commit 2cd3d8b
1 file changed
Lines changed: 75 additions & 0 deletions
Diff (content not captured): context lines 203-205 unchanged; lines 206-280 added.