Commit 0ee29e5
docs: Add NVFP4 implementation guide
Comprehensive technical guide covering the E2M1 format, two-level
micro-block scaling, Blackwell hardware (tcgen05/mma.sync), rotation-based
quantization, QuTLASS/FP-Quant/Four Over Six implementations, and
CUDA-level implementation details for NVFP4 in bitsandbytes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parent: ffadf57
1 file changed, 954 insertions(+), 0 deletions(-)