| 1 | +# Direct-to-master audit — burn-parity post-sprint (2026-04-30)
| 2 | + |
| 3 | +Five commits were pushed directly to master during a live session. This file
| 4 | +documents the rationale for each, restoring the audit trail that was skipped
| 5 | +by pushing directly.
| 6 | + |
| 7 | +## Commits |
| 8 | + |
| 9 | +| SHA | Title | LOC | |
| 10 | +|---|---|---| |
| 11 | +| `ccf5b77b` | fix(deps): surgical hpc-extras gate | +24/-19 | |
| 12 | +| `dfa25a62` | fix(backend): missing cfg gate + CBLAS aliases | +40/-1 | |
| 13 | +| `2cd3d8b1` | feat(backend): unified INT8/BF16 GEMM dispatch | +75 | |
| 14 | +| `00b6ee57` | feat(backend): re-export all slice-level ops | +44 | |
| 15 | +| `c1c7ae42` | feat(simd): elementwise slice ops (simd_ops.rs) | +294 | |
| 16 | + |
| 17 | +## ccf5b77b — surgical hpc-extras gate |
| 18 | + |
| 19 | +PR #116 (sprint A1) gated the entire `pub mod hpc;` behind `hpc-extras`.
| 20 | +That hid BF16, F16, quantization, fingerprints, VSA, plane, and seal:
| 21 | +everything burn-ndarray and lance-graph need daily.
| 22 | + |
| 23 | +Fix: `pub mod hpc;` is now `#[cfg(feature = "std")]` (always available).
| 24 | +Only 5 research modules stay gated: p64_bridge, crystal_encoder, deepnsm,
| 25 | +spo_bundle, compression_curves. blake3 is now unconditional.
| 26 | + |
| 27 | +## dfa25a62 — CBLAS-compat aliases |
| 28 | + |
| 29 | +`pub use mkl::{ gemm_f32, ... }` was missing its `#[cfg(feature = "intel-mkl")]`
| 30 | +gate, so it broke without the feature. Restored the gate and added `cblas_sgemm` /
| 31 | +`cblas_dgemm` as MKL drop-in replacements that route through native SIMD.
| 32 | + |
| 33 | +## 2cd3d8b1 — unified GEMM dispatch |
| 34 | + |
| 35 | +INT8 GEMM existed in 3 places and BF16 in 2, with no unified entry point.
| 36 | +Added `backend::gemm_i8()` (VNNI, falling back to scalar) and `backend::gemm_bf16()`,
| 37 | +plus the CBLAS aliases `cblas_gemm_s8s8s32` / `cblas_gemm_bf16bf16f32`.
| 38 | + |
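The "VNNI, then scalar" dispatch described above can be sketched as follows. This is an illustration under stated assumptions, not the committed code: the signature of `gemm_i8`, the row-major layout, and the helper names `vnni_available` and `gemm_i8_scalar` are all hypothetical, and the VNNI branch reuses the scalar loop as a placeholder for a real kernel.

```rust
// Hypothetical sketch of a unified INT8 GEMM entry point with runtime dispatch.
// Layout assumption: row-major A (m x k), B (k x n), C (m x n), i32 accumulation.

/// Runtime check for AVX-512 VNNI; always false off x86_64.
#[cfg(target_arch = "x86_64")]
fn vnni_available() -> bool {
    is_x86_feature_detected!("avx512vnni")
}

#[cfg(not(target_arch = "x86_64"))]
fn vnni_available() -> bool {
    false
}

/// Scalar reference kernel: C = A * B with i8 inputs widened to i32.
fn gemm_i8_scalar(a: &[i8], b: &[i8], c: &mut [i32], m: usize, k: usize, n: usize) {
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0i32;
            for p in 0..k {
                acc += a[i * k + p] as i32 * b[p * n + j] as i32;
            }
            c[i * n + j] = acc;
        }
    }
}

/// Single entry point: pick the fastest available path, fall back to scalar.
pub fn gemm_i8(a: &[i8], b: &[i8], c: &mut [i32], m: usize, k: usize, n: usize) {
    assert_eq!(a.len(), m * k, "A must be m x k");
    assert_eq!(b.len(), k * n, "B must be k x n");
    assert_eq!(c.len(), m * n, "C must be m x n");
    if vnni_available() {
        // Placeholder: a real VNNI kernel would be called here.
        // The sketch reuses the scalar loop so both paths agree.
        gemm_i8_scalar(a, b, c, m, k, n);
    } else {
        gemm_i8_scalar(a, b, c, m, k, n);
    }
}
```

Callers see one function regardless of CPU, which is the point of a unified entry point: the feature check happens once inside the dispatcher instead of at every call site.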
| 39 | +## 00b6ee57 — unified slice-op re-exports |
| 40 | + |
| 41 | +Slice-level ops were scattered across kernels_avx512 (pub(crate)), simd_int_ops,
| 42 | +simd_half, and hpc/reductions. They are now all reachable from `ndarray::backend::*`.
| 43 | + |
| 44 | +## c1c7ae42 — simd_ops.rs |
| 45 | + |
| 46 | +Portable elementwise slice ops built on the operator traits of the polyfill types,
| 47 | +exposed as `ndarray::simd::{add_f32, mul_f32, scale_f32, ...}`.
| 48 | +Works on all platforms. Adds 11 tests; 1778 tests pass in total.
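To illustrate what "operator traits on polyfill types" can look like, here is a minimal sketch. The `F32x8` type, its `load`/`splat`/`store` helpers, and the three-slice signatures are assumptions made for this example; the actual types and signatures in `simd_ops.rs` may differ.

```rust
use std::ops::{Add, Mul};

/// Hypothetical 8-lane polyfill vector backed by a plain array.
#[derive(Clone, Copy)]
struct F32x8([f32; 8]);

impl F32x8 {
    /// Load the first 8 elements of `s`.
    fn load(s: &[f32]) -> Self {
        let mut lanes = [0.0f32; 8];
        lanes.copy_from_slice(&s[..8]);
        F32x8(lanes)
    }
    /// Broadcast one value to all lanes.
    fn splat(v: f32) -> Self {
        F32x8([v; 8])
    }
    /// Write all 8 lanes to the start of `out`.
    fn store(self, out: &mut [f32]) {
        out[..8].copy_from_slice(&self.0);
    }
}

impl Add for F32x8 {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        let mut lanes = self.0;
        for i in 0..8 {
            lanes[i] += rhs.0[i];
        }
        F32x8(lanes)
    }
}

impl Mul for F32x8 {
    type Output = Self;
    fn mul(self, rhs: Self) -> Self {
        let mut lanes = self.0;
        for i in 0..8 {
            lanes[i] *= rhs.0[i];
        }
        F32x8(lanes)
    }
}

/// out[i] = a[i] + b[i]: full 8-lane chunks first, then a scalar tail.
pub fn add_f32(a: &[f32], b: &[f32], out: &mut [f32]) {
    assert!(a.len() == b.len() && a.len() == out.len(), "length mismatch");
    let split = a.len() - a.len() % 8;
    for i in (0..split).step_by(8) {
        (F32x8::load(&a[i..]) + F32x8::load(&b[i..])).store(&mut out[i..]);
    }
    for i in split..a.len() {
        out[i] = a[i] + b[i];
    }
}

/// out[i] = a[i] * s, using the same chunk-plus-tail structure.
pub fn scale_f32(a: &[f32], s: f32, out: &mut [f32]) {
    assert_eq!(a.len(), out.len(), "length mismatch");
    let split = a.len() - a.len() % 8;
    let sv = F32x8::splat(s);
    for i in (0..split).step_by(8) {
        (F32x8::load(&a[i..]) * sv).store(&mut out[i..]);
    }
    for i in split..a.len() {
        out[i] = a[i] * s;
    }
}
```

Because the slice functions only use `+` and `*` on the vector type, an array-backed polyfill could be swapped for a hardware-backed type with the same trait impls without changing them, which is one plausible reason such code runs on every platform.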