[ROCm] Enable blocksize 32 4-bit quantization and GEMV kernels on AMD CDNA #168
| Job | Run time |
|---|---|
| | 26s |
| | 14s |
| | 16s |
| | 14s |
| | 18s |
| | 3m 3s |
| | 2m 59s |
| | 14s |
| | 18s |
| | 17s |
| | 3m 16s |
| | 3m 30s |
| | 3m 44s |
| | 3m 8s |
| | 2m 56s |
| | 3m 13s |
| | 3m 24s |
| | 2m 55s |
| | 15m 39s |
| | 12m 59s |
| | 8m 37s |
| | 10m 44s |
| | 9m 45s |
| | 9m 31s |
| | 9m 10s |
| | 8m 3s |
| | 7m 10s |
| | 9m 5s |
| | 6m 35s |
| | 7m 57s |
| | 4m 28s |
| | 9m 25s |
| | 6m 11s |
| | 5m 45s |
| | 3m 48s |
| | 33m 8s |
| Total | 3h 32m 25s |