Commit b39a7bf

ggml-cuda: tune RDNA3 Q6_K MMVQ nwarps (#23349)
1 parent b28a2f3 commit b39a7bf

1 file changed

Lines changed: 2 additions & 0 deletions

ggml/src/ggml-cuda/mmvq.cu

@@ -359,7 +359,9 @@ static constexpr __host__ __device__ int calc_nwarps(ggml_type type, int ncols_d
             case GGML_TYPE_Q5_1:
             case GGML_TYPE_Q8_0:
             case GGML_TYPE_Q4_K:
+                return 8;
             case GGML_TYPE_Q6_K:
+                return 2;
             case GGML_TYPE_IQ4_NL:
                 return 8;
             default:
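
Before this change, the `GGML_TYPE_Q4_K` and `GGML_TYPE_Q6_K` labels fell through to the `return 8;` under `GGML_TYPE_IQ4_NL`. The two added lines end that fall-through, so within this hunk only Q6_K changes value (8 to 2). A minimal host-only sketch of the resulting grouping follows. The `QType` enum and `nwarps_sketch` function are hypothetical stand-ins, not the real `ggml_type` or `calc_nwarps`; the `__host__ __device__` qualifiers are dropped so it compiles as plain C++14; and the `default` value is a placeholder, since the real one lies outside the hunk.

```cpp
// Hypothetical stand-in for the ggml_type values visible in the hunk.
enum class QType { Q5_1, Q8_0, Q4_K, Q6_K, IQ4_NL, Other };

// Sketch of the post-commit case grouping; not the real calc_nwarps.
constexpr int nwarps_sketch(QType t) {
    switch (t) {
        case QType::Q5_1:
        case QType::Q8_0:
        case QType::Q4_K:
            return 8;   // added line: this group now returns explicitly
        case QType::Q6_K:
            return 2;   // added line: Q6_K gets its own, smaller value
        case QType::IQ4_NL:
            return 8;   // unchanged line from the original code
        default:
            return -1;  // placeholder: the real default is outside the hunk
    }
}

// Compile-time checks of the grouping shown in the diff.
static_assert(nwarps_sketch(QType::Q6_K) == 2, "Q6_K is split out");
static_assert(nwarps_sketch(QType::Q4_K) == 8, "Q4_K keeps 8");
static_assert(nwarps_sketch(QType::IQ4_NL) == 8, "IQ4_NL keeps 8");
```

Because the function is `constexpr`, the `static_assert` lines verify the grouping at compile time without running anything on a GPU.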
