[FP16] Improved performance by fusing dequantize with compute in kernels: 20-30% Inference Speedup #78

Merged: mikepapadim merged 68 commits into main from feat/deq-n-compute on Dec 11, 2025

Remove redundant workers in Qwen3 Q8_0 FFN layers.

664f160