Commit 39fac95
ROCm: wire up tiled 8-bit QMV launches for fp16 and bf16
Add explicit tiled QMV launch cases for 8-bit affine quantization in the
ROCm quantized matmul path.
This fixes 8-bit models being left off the tiled fast path and restores
correct, faster decode behavior for tested Qwen 8-bit models.

1 parent 4f60779 commit 39fac95
1 file changed
Lines changed: 8 additions & 0 deletions
@@ -2959,12 +2959,20 @@
Diff line content was not captured. Added lines in the new file: 2962-2965 and 2972-2975 (4 lines each).
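The commit message describes adding explicit launch cases so that 8-bit affine quantization reaches the tiled QMV path for fp16 and bf16. The following is a minimal host-side sketch of that kind of dispatch gap, not the actual ROCm code. All names here (`DType`, `Path`, `select_qmv_path`, `wire_8bit`) are hypothetical, and the pre-existing 4-bit tiled case is an assumption made only for illustration.

```cpp
// Hypothetical sketch: which QMV path a (dtype, bits) pair would take.
// None of these names come from the real source file.
enum class DType { F32, F16, BF16 };
enum class Path { Tiled, Generic };

// `wire_8bit` stands in for "the commit's extra cases are present".
constexpr Path select_qmv_path(DType dtype, int bits, bool wire_8bit) {
  const bool half_like = (dtype == DType::F16 || dtype == DType::BF16);
  // Assumed pre-existing tiled case (4-bit is an illustrative guess).
  if (half_like && bits == 4) return Path::Tiled;
  // The kind of case the commit describes: 8-bit for fp16 and bf16.
  if (half_like && bits == 8 && wire_8bit) return Path::Tiled;
  // Anything without an explicit case falls back to the generic path.
  return Path::Generic;
}

// Without the extra cases, 8-bit fp16 falls off the tiled path.
static_assert(select_qmv_path(DType::F16, 8, false) == Path::Generic, "");
// With them, both fp16 and bf16 8-bit take the tiled path.
static_assert(select_qmv_path(DType::F16, 8, true) == Path::Tiled, "");
static_assert(select_qmv_path(DType::BF16, 8, true) == Path::Tiled, "");
```

In a dispatch of this shape, a missing case does not fail loudly; it silently routes to the fallback, which is consistent with the commit message's description of 8-bit models being "left off" the fast path.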