Commit 03fb3bc

NVFP4: make rich+RSF adaptive scaling default

Enable top1 rich scale search for NVFP4 adaptive quantization, using slots 6, 5, 4, 3, 2, 1.5, 1 and a ±1 UE4M3 code search around each candidate. Add RSF-lite per-tensor scale multiplier fitting, enabled by default, plus summary stats for rich candidates, slot distribution, and RSF multipliers.

1 parent 278d8a6 commit 03fb3bc

4 files changed

ggml/src
- ggml-quants.c
- ggml-quants.h
src
- llama-quant.cpp
tools/quantize
- quantize.cpp
