Commit 9725a31
CUDA: reduce MMQ stream-k overhead (ggml-org#22298)
* CUDA: reduce MMQ stream-k overhead
* use 32 bit integers for kbc
1 file changed
Lines changed: 138 additions & 139 deletions
1 parent d164904 commit 9725a31