Commit 67855db
vulkan: add Q1_0_g128 (1-bit binary) shader support
Add Vulkan compute shader support for the GGML_TYPE_Q1_0_g128
quantization format (1-bit sign / binary quantization, group size 128).
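To make the format description concrete, here is a minimal CPU-side sketch of dequantizing one 128-element group. The struct name `BlockQ1G128Sketch`, its field layout (a `float` scale plus 16 bytes of sign bits), the LSB-first bit order, and the mapping of a set bit to `+d` and a clear bit to `-d` are all assumptions made for illustration; the commit message only states that the format is 1-bit sign quantization with a group size of 128 and a single scale.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Group size stated in the commit message.
constexpr std::size_t kGroupSize = 128;

// Hypothetical block layout: one scale and one sign bit per weight.
// The real block_q1_0_g128 struct may differ (e.g. an fp16 scale).
struct BlockQ1G128Sketch {
    float d;                                      // per-group scale (assumed float here)
    std::array<std::uint8_t, kGroupSize / 8> qs;  // 128 sign bits packed into 16 bytes
};

// Expand one block into 128 floats: bit set -> +d, bit clear -> -d (assumed mapping).
inline void dequantize_block_sketch(const BlockQ1G128Sketch& b, float* out) {
    assert(out != nullptr);
    for (std::size_t i = 0; i < kGroupSize; ++i) {
        // Assumed LSB-first bit order within each byte.
        const unsigned bit = (b.qs[i / 8] >> (i % 8)) & 1u;
        out[i] = bit ? b.d : -b.d;
    }
}
```

With a single scale per group, every dequantized weight has the same magnitude, so only the sign carries per-weight information.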
New files:
- dequant_q1_0_g128.comp: standalone dequantization shader
- mul_mat_vec_q1_0_g128.comp: fused matrix-vector multiply shader
  (4 threads/block, 32 elements/thread, 8x dot(vec4,vec4))
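The work split quoted for the fused shader (4 threads per block, 32 elements per thread, 8 four-wide dot products) can be sketched on the CPU as follows. This is an illustration under assumptions, not the shader itself: the contiguous assignment of elements to threads, the LSB-first bit order, the set-bit-means-positive mapping, and all function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBlockElems      = 128;                             // group size
constexpr std::size_t kThreadsPerBlock = 4;                               // from the commit message
constexpr std::size_t kElemsPerThread  = kBlockElems / kThreadsPerBlock;  // 32
constexpr std::size_t kDotsPerThread   = kElemsPerThread / 4;             // 8 dot(vec4, vec4)

static_assert(kElemsPerThread == 32 && kDotsPerThread == 8, "matches the quoted split");

// Stand-in for GLSL dot(vec4, vec4).
inline float dot4(const std::array<float, 4>& a, const std::array<float, 4>& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Work done by one of the four threads: 8 four-wide dot products over its 32 elements.
inline float thread_partial_sketch(const std::uint8_t* qs, const float* x, std::size_t tid) {
    assert(tid < kThreadsPerBlock);
    const std::size_t base = tid * kElemsPerThread;  // assumed contiguous slice per thread
    float acc = 0.0f;
    for (std::size_t k = 0; k < kDotsPerThread; ++k) {
        std::array<float, 4> w{}, v{};
        for (std::size_t j = 0; j < 4; ++j) {
            const std::size_t i = base + 4 * k + j;
            const unsigned bit = (qs[i / 8] >> (i % 8)) & 1u;  // assumed LSB-first
            w[j] = bit ? 1.0f : -1.0f;                         // unscaled sign
            v[j] = x[i];
        }
        acc += dot4(w, v);
    }
    return acc;
}

// Sum the four per-thread partials and apply the single group scale once.
inline float block_dot_sketch(float d, const std::uint8_t* qs, const float* x) {
    float sum = 0.0f;
    for (std::size_t tid = 0; tid < kThreadsPerBlock; ++tid) {
        sum += thread_partial_sketch(qs, x, tid);
    }
    return d * sum;
}
```

Because the scale is shared by the whole group, it can be factored out of the inner loop and multiplied in once per block, which is what makes fusing dequantization into the dot product cheap.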
Modified files:
- types.glsl: block_q1_0_g128 struct, QUANT_K=128, QUANT_R=1
- dequant_funcs.glsl: dequantize/dequantize4 + single-scale get_dm
- mul_mm_funcs.glsl: branchless FMA load path for batched matmul
- vulkan-shaders-gen.cpp: type registration, LOAD_VEC_A=4, excluded
  from coopmat2 flash attention and integer dot product Q8_1 paths
- ggml-vulkan.cpp: pipeline registration for dequant, get_rows,
  mul_mat_vec (f32/f16/id), mul_mat_mat, mul_mat_mat_id, supports_op
- test-backend-ops.cpp: Q1_0_g128 test cases for get_rows, mul_mat,
  mul_mat_id
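One way to read the "branchless FMA load path" mentioned for mul_mm_funcs.glsl is that the sign bit is turned into a weight with a single fused multiply-add instead of a conditional. The sketch below shows that idea in C++; the formula `fma(bit, 2d, -d)` and the function names are assumptions for illustration, and the actual shader code may use a different expression.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Reference version with a branch: bit set -> +d, bit clear -> -d (assumed mapping).
inline float sign_weight_branchy(std::uint32_t bit, float d) {
    return bit ? d : -d;
}

// Branchless version: bit * (2d) + (-d) yields +d for bit == 1 and -d for bit == 0.
inline float sign_weight_fma(std::uint32_t bit, float d) {
    assert(bit <= 1u);
    return std::fma(static_cast<float>(bit), 2.0f * d, -d);
}
```

Doubling a float is exact barring overflow, and the fused operation rounds only once, so for finite scales away from the overflow limit both versions return the same value.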
Performance on AMD Radeon 680M (RDNA2 iGPU):
eval: 0.28 -> 23.4 t/s (84x), prompt: 0.31 -> 38.3 t/s (124x)
graph splits: 291 -> 21
parent f5dda72, commit 67855db
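The quoted speedup factors can be checked from the before/after throughput figures: 23.4 / 0.28 is about 83.6 and 38.3 / 0.31 is about 123.5, which round to the stated 84x and 124x, and the graph split count drops by a factor of about 13.9. A small helper for that arithmetic, with hypothetical function names:

```cpp
#include <cassert>
#include <cmath>

// Ratio of the new figure to the old one.
inline double speedup(double before, double after) {
    assert(before > 0.0);
    return after / before;
}

// Speedup rounded to the nearest whole factor, as quoted in the commit message.
inline long rounded_speedup(double before, double after) {
    return std::lround(speedup(before, after));
}
```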
8 files changed (+272, -10)
- ggml/src/ggml-vulkan
- vulkan-shaders
- tests
Per-file diffs omitted. Line counts reported for three of the files: 29 additions & 0 deletions; 108 additions & 0 deletions; 9 additions & 8 deletions.