Commit 70ade20
Use precision-matched quantized references in INT4 matmul tests
Replace eager float32 references with precision-matched quantized references
that align with each kernel's internal dequant precision:
- dequant_w4_to_bf16: bitwise exact vs pure-Python dequant (was atol=0.01)
- int4_matmul: cuBLAS bf16 GEMM reference (both truncate to bf16)
- int4_matvec: f32 matmul reference (both keep dequant in f32, atol=1e-3
vs prior atol=1.0)
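To illustrate what "precision-matched" means in the list above, here is a minimal pure-Python sketch. It is not the code from this commit. The names `to_f32`, `to_bf16`, `unpack_int4`, and `dequant_reference` are hypothetical, and the packing layout (two signed nibbles per byte, low nibble first, one shared scale) and the round-to-nearest-even bf16 conversion are assumptions made for the example. The point is only that the reference is cast to the same precision the kernel uses internally, so the comparison tolerance can be tightened.

```python
import struct

def to_f32(x):
    """Round a Python float (f64) to float32 precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_bf16(x):
    """Round a finite float to bfloat16 precision (round-to-nearest-even).

    Assumption: the kernel's bf16 conversion rounds this way; NaN/inf are
    not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add the rounding bias, then keep only the upper 16 bits.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def unpack_int4(packed):
    """Split each byte into two signed 4-bit values, low nibble first (assumed layout)."""
    out = []
    for byte in packed:
        for nib in (byte & 0x0F, byte >> 4):
            out.append(nib - 16 if nib >= 8 else nib)
    return out

def dequant_reference(packed, scale, precision):
    """Dequantize with the same output precision as the kernel under test."""
    cast = to_bf16 if precision == "bf16" else to_f32
    return [cast(q * scale) for q in unpack_int4(packed)]

packed = bytes([0x21, 0xF8])  # nibbles: 1, 2, -8, -1
ref_bf16 = dequant_reference(packed, 0.1, "bf16")  # for a bf16 dequant kernel
ref_f32 = dequant_reference(packed, 0.1, "f32")    # for a kernel that stays in f32
```

The two references differ in the low bits (for example, the first element is 0.10009765625 in bf16 but about 0.1000000015 in f32), which is why comparing a bf16 kernel against an f32 reference forces a loose tolerance, while a matched reference can allow a tight or even bitwise-exact check.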
Co-authored-by: Claude <noreply@anthropic.com>
1 parent 8ae05c2 commit 70ade20
2 files changed
Lines changed: 327 additions & 100 deletions