Commit 1772701
opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
* opencl: add q6_K noshuffle kernels, initial q6_K gemv, some host code
* opencl: add q6_K transpose
* opencl: fix cvt kernel name
* opencl: add call to q6_K gemv
* opencl: fix q6_K scale transpose
* opencl: fix loading for gemv q6_K, refactor
* opencl: fix transpose_8_buf kernel assignment, refactor
* opencl: refactor q6_K transpose
* opencl: add gemm_noshuffle_q6_k_f32
* opencl: fix qh loading
* opencl: refactor q6_K gemv host side, release bufs and imgs
* opencl: refactor
* opencl: fix q6_K dequant and scale selection
* opencl: workaround compiler bug, fix dump_tensor
* opencl: refactor q6_K convert kernels
* opencl: unpack transformed q6_K in get_tensor
* opencl: refactor, handle non-uniform workgroups
* opencl: support non-vector subgroup bcast
1 parent 39bf0d3 commit 1772701
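
For readers unfamiliar with q6_K, the following is a minimal scalar sketch of what the "q6_K dequant and scale selection" and "gemv" items refer to. It is not the Adreno OpenCL code from this commit. It assumes the q6_K layout of ggml's CPU reference path (256 weights per super-block, 4 low bits in `ql`, 2 high bits in `qh`, one 8-bit scale per 16 weights). The names `block_q6_K_sketch`, `dequantize_q6_K_sketch`, and `gemv_row_q6_K_sketch` are hypothetical, and the super-block scale `d` is stored as a `float` here for simplicity, whereas ggml stores it as fp16.

```c
#include <stddef.h>
#include <stdint.h>

#define QK_K 256 /* weights per super-block (assumed, as in ggml) */

/* Hypothetical simplified block; ggml keeps d as fp16, not float. */
typedef struct {
    uint8_t ql[QK_K / 2];      /* low 4 bits of each 6-bit quant */
    uint8_t qh[QK_K / 4];      /* high 2 bits of each 6-bit quant */
    int8_t  scales[QK_K / 16]; /* one signed scale per 16 weights */
    float   d;                 /* super-block scale */
} block_q6_K_sketch;

/* Expand one super-block into QK_K floats: y = d * scale * (q - 32). */
static void dequantize_q6_K_sketch(const block_q6_K_sketch *b, float *y) {
    const uint8_t *ql = b->ql;
    const uint8_t *qh = b->qh;
    const int8_t  *sc = b->scales;

    /* Each half of 128 weights uses 64 ql bytes, 32 qh bytes, 8 scales. */
    for (int n = 0; n < QK_K; n += 128) {
        for (int l = 0; l < 32; ++l) {
            const int is = l / 16; /* scale index within a group of 32 */
            const int q1 = ((ql[l]      & 0xF) | (((qh[l] >> 0) & 3) << 4)) - 32;
            const int q2 = ((ql[l + 32] & 0xF) | (((qh[l] >> 2) & 3) << 4)) - 32;
            const int q3 = ((ql[l]      >> 4)  | (((qh[l] >> 4) & 3) << 4)) - 32;
            const int q4 = ((ql[l + 32] >> 4)  | (((qh[l] >> 6) & 3) << 4)) - 32;
            y[l]      = b->d * (float)sc[is + 0] * (float)q1;
            y[l + 32] = b->d * (float)sc[is + 2] * (float)q2;
            y[l + 64] = b->d * (float)sc[is + 4] * (float)q3;
            y[l + 96] = b->d * (float)sc[is + 6] * (float)q4;
        }
        y  += 128;
        ql += 64;
        qh += 32;
        sc += 8;
    }
}

/* One output element of a gemv: dot product of a quantized row with x. */
static float gemv_row_q6_K_sketch(const block_q6_K_sketch *row,
                                  size_t nblocks, const float *x) {
    float acc = 0.0f;
    float tmp[QK_K];
    for (size_t i = 0; i < nblocks; ++i) {
        dequantize_q6_K_sketch(&row[i], tmp);
        for (int j = 0; j < QK_K; ++j) {
            acc += tmp[j] * x[i * QK_K + j];
        }
    }
    return acc;
}
```

A GPU kernel would typically fuse the dequantization into the dot product and spread the work across a subgroup rather than materializing `tmp`; the sketch keeps the two steps separate only to make the bit layout easy to follow.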
5 files changed
Lines changed: 920 additions & 40 deletions
File tree
- ggml/src/ggml-opencl
- kernels