Commit 1772701

opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
* opencl: add q6_K noshuffle kernels, initial q6_K gemv, some host code
* opencl: add q6_K transpose
* opencl: fix cvt kernel name
* opencl: add call to q6_K gemv
* opencl: fix q6_K scale transpose
* opencl: fix loading for gemv q6_K, refactor
* opencl: fix transpose_8_buf kernel assignment, refactor
* opencl: refactor q6_K transpose
* opencl: add gemm_noshuffle_q6_k_f32
* opencl: fix qh loading
* opencl: refactor q6_K gemv host side, release bufs and imgs
* opencl: refactor
* opencl: fix q6_K dequant and scale selection
* opencl: workaround compiler bug, fix dump_tensor
* opencl: refactor q6_K convert kernels
* opencl: unpack transformed q6_K in get_tensor
* opencl: refactor, handle non-uniform workgroups
* opencl: support non-vector subgroup bcast
1 parent 39bf0d3 commit 1772701

5 files changed

Lines changed: 920 additions & 40 deletions

ggml/src/ggml-opencl/CMakeLists.txt

Lines changed: 2 additions & 0 deletions
@@ -114,6 +114,8 @@ set(GGML_OPENCL_KERNELS
 gemv_noshuffle_q4_1_f32
 gemm_noshuffle_q4_1_f32
 gemv_noshuffle_general_q8_0_f32
+gemv_noshuffle_q6_k_f32
+gemm_noshuffle_q6_k_f32
 mul
 neg
 norm
