Commit 09728b0

BF16-direct indexer: skip f32, strided octave + halftone projection

Replace chunked f32 streaming with BF16-direct path:
- read_tensor_bf16_raw: reusable Vec<u16> buffer, no f32 alloc (141 MB vs 424 MB)
- project_row_bf16_direct: inline BF16→f64, 136 bytes stack
- project_row_bf16_strided: octave stride + halftone drop (97% fewer conversions)
- stream_index_gguf_bf16: combined optimized indexer with octave_stride param
- HALFTONE_POS/HALFTONE_TO_BIN: compile-time position tables
- 3 new unit tests: halftone coverage, bf16_to_f64 accuracy, strided vs full

https://claude.ai/code/session_01HmdXNPit7QsTCfhJFef3Ee
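To illustrate the ideas named in the message, here is a minimal sketch of a BF16-direct path. It is not the code from gguf_indexer.rs: the function names fill_bf16_buffer and project_strided, the round-robin bin assignment, and the little-endian byte layout are assumptions made for illustration, and the halftone drop is not modeled. The one part that is standard is the widening step: a BF16 value is the upper 16 bits of an IEEE-754 f32, so shifting left by 16 and widening to f64 is exact.

```rust
/// Widen one BF16 value (raw u16 bits) to f64 without a separate f32 buffer.
/// BF16 is the top half of an f32, so the shift loses nothing.
fn bf16_to_f64(bits: u16) -> f64 {
    f64::from(f32::from_bits(u32::from(bits) << 16))
}

/// Decode little-endian BF16 bytes into a caller-owned buffer.
/// `clear` keeps the existing capacity, so repeated calls with tensors of
/// similar size reuse the same allocation (hypothetical helper).
fn fill_bf16_buffer(raw: &[u8], buf: &mut Vec<u16>) {
    buf.clear();
    buf.extend(
        raw.chunks_exact(2)
            .map(|pair| u16::from_le_bytes([pair[0], pair[1]])),
    );
}

/// Hypothetical strided projection: visit every `stride`-th element of the
/// row and accumulate it into `bins` round-robin. Only the visited elements
/// are converted, which is where the saving in conversions comes from.
fn project_strided(row: &[u16], stride: usize, bins: &mut [f64]) {
    assert!(stride > 0 && !bins.is_empty());
    for (k, &bits) in row.iter().step_by(stride).enumerate() {
        bins[k % bins.len()] += bf16_to_f64(bits);
    }
}
```

With a stride of 1 the projection touches every element, so comparing its output against a larger stride is one way to check how much the strided variant deviates from the full one.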

1 parent fb9d03b, commit 09728b0

1 file changed

src/hpc/gguf_indexer.rs
