Commit ff76165

Metal backend: port quantized GEMM from MLX (#17362)
Optimize the 4-bit linear op in the Metal backend for the matrix-matrix case (M > 1) by porting the quantized GEMM (QMM) shaders from MLX.
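
For orientation, here is a minimal pure-Python sketch of what a 4-bit quantized matrix-matrix product computes. It is a reference for the semantics only, not the Metal shader ported in this commit. The packing order (low nibble first), the per-group affine scheme `w = scale * q + bias`, the tiny `GROUP_SIZE`, and the names `unpack_int4`, `dequantize_row`, and `qmm` are all assumptions made for illustration.

```python
# Reference semantics only; not the Metal shader from this commit.
# Assumed layout (hypothetical): two 4-bit values per byte, low nibble first,
# with one (scale, bias) pair per group of GROUP_SIZE weights along K.
GROUP_SIZE = 4


def unpack_int4(packed):
    """Expand bytes into 4-bit values in [0, 15], low nibble first."""
    out = []
    for byte in packed:
        out.append(byte & 0x0F)
        out.append((byte >> 4) & 0x0F)
    return out


def dequantize_row(packed, scales, biases, group_size=GROUP_SIZE):
    """Affine dequantization per group: w = scale * q + bias."""
    q = unpack_int4(packed)
    return [scales[i // group_size] * v + biases[i // group_size]
            for i, v in enumerate(q)]


def qmm(x, w_packed, scales, biases, group_size=GROUP_SIZE):
    """Compute y = x @ W^T for x of shape (M, K) and quantized W of shape (N, K)."""
    w = [dequantize_row(w_packed[n], scales[n], biases[n], group_size)
         for n in range(len(w_packed))]
    return [[sum(a * b for a, b in zip(row, w_row)) for w_row in w]
            for row in x]


# M = 2 activation rows, N = 1 output feature, K = 4 inputs (a single group).
x = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
w_packed = [bytes([0x21, 0x43])]   # unpacks to q = [1, 2, 3, 4]
scales = [[0.5]]
biases = [[-1.0]]
y = qmm(x, w_packed, scales, biases)
```

With M > 1, each dequantized weight row is reused across every row of `x`. That reuse is presumably what a dedicated matrix-matrix kernel exploits, compared with running a matrix-vector kernel once per row.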
1 parent fd0adb6 commit ff76165

1 file changed

Lines changed: 808 additions & 246 deletions
