Commit b28a2f3

shaofeiqi and lhez authored
opencl: add MoE support for q4_k, q5_k, q6_k on Adreno (ggml-org#23303)
* opencl: add q4_k moe support
* opencl: add q5_k moe support
* opencl: add q6_k moe support
* opencl: adjust format

---------

Co-authored-by: Li He <lih@qti.qualcomm.com>
1 parent 17d22a3 commit b28a2f3

9 files changed

Lines changed: 2601 additions & 8 deletions

ggml/src/ggml-opencl/CMakeLists.txt

Lines changed: 6 additions & 0 deletions
@@ -110,6 +110,12 @@ set(GGML_OPENCL_KERNELS
     gemv_moe_q5_0_f32_ns
     gemm_moe_q5_1_f32_ns
     gemv_moe_q5_1_f32_ns
+    gemm_moe_q4_k_f32_ns
+    gemv_moe_q4_k_f32_ns
+    gemm_moe_q5_k_f32_ns
+    gemv_moe_q5_k_f32_ns
+    gemm_moe_q6_k_f32_ns
+    gemv_moe_q6_k_f32_ns
     gemm_moe_mxfp4_f32
     gemv_moe_mxfp4_f32
     gemm_moe_mxfp4_f32_ns
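
The six added entries appear to follow one naming pattern: a `gemm` and a `gemv` variant per K-quant type, each ending in `_f32_ns`. The sketch below only illustrates that pattern. `moe_kernel_names` and `added_moe_kernels` are hypothetical helpers written for this note, not functions in ggml, and the hidden `ggml-opencl.cpp` diff may register the kernels differently.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical helper (not part of ggml): builds the two kernel names
// listed in the hunk for one quantization type, e.g. "q4_k".
std::vector<std::string> moe_kernel_names(const std::string & quant) {
    std::vector<std::string> names;
    for (const char * op : {"gemm", "gemv"}) {
        names.push_back(std::string(op) + "_moe_" + quant + "_f32_ns");
    }
    return names;
}

// Hypothetical helper: collects the names for the three K-quant types
// named in the commit title, in the same order as the CMake hunk.
std::vector<std::string> added_moe_kernels() {
    std::vector<std::string> all;
    for (const char * quant : {"q4_k", "q5_k", "q6_k"}) {
        for (const std::string & name : moe_kernel_names(quant)) {
            all.push_back(name);
        }
    }
    return all;
}
```

Calling `added_moe_kernels()` yields the six names in the order they appear as `+` lines above, which can serve as a quick check that no `gemm`/`gemv` pair was left out of the list.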

ggml/src/ggml-opencl/ggml-opencl.cpp

Lines changed: 940 additions & 8 deletions
Large diffs are not rendered by default.
