Add k-bit blockwise quantization (K=2-5) with warp-level CUDA kernels #2159

Job Run time
1m 33s
15s
5m 47s
22s
1m 12s
14s
4m 30s
4m 53s
4m 34s
4m 59s
4m 19s
5m 51s
4m 40s
4m 6s
4m 22s
4m 11s
4m 27s
4m 26s
4m 23s
5m 49s
5m 49s
3m 19s
4m 45s
3m 40s
3m 24s
5m 50s
4m 39s
5m 11s
5m 49s
4m 21s
3m 30s
5m 48s
5m 49s
4m 17s
4m 56s
3m 19s
4m 33s
4m 4s
5m 49s
3m 42s
4m 25s
3m 23s
5m 44s
5m 49s
4m 53s
1s
1s
1s
1s
3h 11m 45s