Commit 5b98eee
perf: use BF16 SIMD batch converter for GGUF dequantization
Replace the scalar bf16_to_f32 loop with the quantized::bf16_to_f32_slice
batch path. The BF16 representation is unchanged (a transparent wrapper
over u16), so the raw bytes are reinterpreted as a BF16 slice without
copying and then batch-converted to f32.
https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o71
parent c21572b, commit 5b98eee
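The diff body is not shown here, so the following is only a minimal sketch of the approach the commit message describes, not the actual change. The `Bf16` type, the signature of `bf16_to_f32_slice`, and the `dequantize_bf16` helper are all hypothetical stand-ins. The real `quantized::bf16_to_f32_slice` may differ and presumably uses explicit SIMD, whereas this version relies on a plain loop that the compiler may auto-vectorize.

```rust
/// Hypothetical stand-in for the crate's BF16 type: a transparent wrapper over u16.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Bf16(pub u16);

/// Scalar conversion: a BF16 value is the upper 16 bits of an IEEE 754 binary32.
#[inline]
pub fn bf16_to_f32(x: Bf16) -> f32 {
    f32::from_bits((x.0 as u32) << 16)
}

/// Batch conversion over whole slices (assumed signature, not the crate's API).
pub fn bf16_to_f32_slice(src: &[Bf16], dst: &mut [f32]) {
    assert_eq!(src.len(), dst.len(), "source and destination lengths differ");
    for (d, s) in dst.iter_mut().zip(src) {
        *d = bf16_to_f32(*s);
    }
}

/// Dequantize raw little-endian BF16 bytes into f32 values (hypothetical helper).
pub fn dequantize_bf16(raw: &[u8]) -> Vec<f32> {
    assert!(raw.len() % 2 == 0, "BF16 data must have an even byte length");
    let mut out = vec![0.0f32; raw.len() / 2];

    // SAFETY: Bf16 is repr(transparent) over u16, and every u16 bit pattern
    // is a valid Bf16, so viewing aligned bytes as Bf16 is sound.
    let (prefix, mid, suffix) = unsafe { raw.align_to::<Bf16>() };

    if prefix.is_empty() && suffix.is_empty() && cfg!(target_endian = "little") {
        // Zero-copy path: the whole buffer is viewed as a BF16 slice.
        bf16_to_f32_slice(mid, &mut out);
    } else {
        // Fallback for misaligned buffers or big-endian targets: decode pairwise.
        for (d, c) in out.iter_mut().zip(raw.chunks_exact(2)) {
            *d = bf16_to_f32(Bf16(u16::from_le_bytes([c[0], c[1]])));
        }
    }
    out
}
```

The alignment check matters because a `&[u8]` taken from a memory-mapped or file-backed buffer is not guaranteed to be 2-byte aligned; the sketch falls back to per-element decoding in that case rather than assuming the reinterpretation always succeeds.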
1 file changed
Lines changed: 11 additions & 6 deletions
@@ -215,12 +215,17 @@
Old lines 218-223 removed (6 deletions); new lines 218-228 added (11 additions). Old lines 215-217 and 224-226 (new 229-231) are unchanged context.