Commit fa3eedc
[SYCL] Fix reorder MMVQ assert on unaligned vocab sizes (ggml-org#22035)
* [SYCL] Fix reorder MMVQ assert on unaligned vocab sizes
The reorder mul_mat_vec_q dispatchers for Q4_0, Q8_0, Q4_K, and Q6_K
asserted that block_num_y was a multiple of 16 subgroups. Models with
a vocab size not divisible by 16 (for example HY-MT at 120818) aborted
on model load when the output projection tripped the assert.
I replaced the assert with padding: block_num_y now rounds up to a
whole number of subgroup-sized workgroups. The kernel already has the
row bounds check (`if (row >= nrows) return;`) so the extra padded
threads early-exit cleanly. Row values are uniform across a subgroup
so the collective reduce stays safe.
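The padding and early exit described above can be sketched on the host in plain C++ (no SYCL). This is an illustration only, not the code from this commit: `pad_to_multiple`, `LaunchStats`, and `simulate_launch` are hypothetical names, and the sketch assumes 16 subgroups per workgroup, one row per subgroup, and block_num_y equal to the row count.

```cpp
#include <cstddef>

// Assumed for illustration: 16 subgroups per workgroup, one row per subgroup.
constexpr std::size_t kNumSubgroups = 16;

// Round n up to the next multiple of m (m must be > 0).
constexpr std::size_t pad_to_multiple(std::size_t n, std::size_t m) {
    return (n + m - 1) / m * m;
}

struct LaunchStats {
    std::size_t launched;  // subgroups launched after padding
    std::size_t active;    // subgroups that pass the row bounds check
    std::size_t exited;    // padded subgroups that return early
};

// Walk every subgroup of a padded launch and apply the same bounds check
// the kernel uses: `if (row >= nrows) return;`.
// Every lane of a subgroup shares one row in this model, so a subgroup
// either exits as a whole or runs as a whole.
LaunchStats simulate_launch(std::size_t nrows) {
    const std::size_t padded = pad_to_multiple(nrows, kNumSubgroups);
    LaunchStats stats{padded, 0, 0};
    for (std::size_t wg = 0; wg < padded / kNumSubgroups; ++wg) {
        for (std::size_t sg = 0; sg < kNumSubgroups; ++sg) {
            const std::size_t row = wg * kNumSubgroups + sg;
            if (row >= nrows) {
                ++stats.exited;  // padded subgroup: early exit
                continue;
            }
            ++stats.active;      // real row: would compute the dot product
        }
    }
    return stats;
}
```

Under these assumptions, `simulate_launch(120818)` launches 120832 subgroups, of which 14 take the early exit and no subgroup is split between the two paths.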
For aligned vocab sizes the padded block_num_y equals the old value,
so the kernel launch is identical and there is no regression.
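The no-regression claim can be checked with a compile-time sketch. The helper names and the constant below are hypothetical and assume 16 subgroups per workgroup; 32000 is an arbitrary size that divides evenly by 16, while 120818 is the HY-MT vocab size quoted above.

```cpp
#include <cstddef>

// Assumed value for illustration: subgroups per workgroup on the Intel target.
constexpr std::size_t kSubgroupsPerWorkgroup = 16;

// Hypothetical helper: block_num_y rounded up to a whole number of workgroups.
constexpr std::size_t padded_block_num_y(std::size_t block_num_y) {
    return (block_num_y + kSubgroupsPerWorkgroup - 1) / kSubgroupsPerWorkgroup
           * kSubgroupsPerWorkgroup;
}

// True when padding leaves the launch size unchanged.
constexpr bool launch_unchanged(std::size_t block_num_y) {
    return padded_block_num_y(block_num_y) == block_num_y;
}

// Aligned size: the padded value equals the old value.
static_assert(launch_unchanged(32000), "aligned size must not be padded");
// Unaligned size: rounded up to the next multiple of 16.
static_assert(!launch_unchanged(120818), "unaligned size must be padded");
static_assert(padded_block_num_y(120818) == 120832, "120818 pads to 120832");
```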
Thanks to @arthw for flagging the relationship to ggml-org#21527.
Fixes ggml-org#22020.
AI assisted coding, tested on Intel B70 hardware.
* sycl: use WARP_SIZE for num_subgroups in reorder MMVQ launches
Replaces the hardcoded 16 with WARP_SIZE in the four reorder_mul_mat_vec
launch helpers (Q4_0, Q8_0, Q4_K, Q6_K). Compile-time no-op on the Intel
target where WARP_SIZE is 16, but makes the relationship to subgroup
size explicit. Per review by @NeoZhangJianyu on ggml-org#22035.
Assisted by Claude.
1 parent ae1f5b8 commit fa3eedc
1 file changed
Lines changed: 12 additions & 12 deletions
| Hunk | Lines shown | Removed (original file lines) | Added (new file lines) |
|---|---|---|---|
| 1 | 537-545 | 540-542 | 540-542 |
| 2 | 682-690 | 685-687 | 685-687 |
| 3 | 798-806 | 801-803 | 801-803 |
| 4 | 842-850 | 845-847 | 845-847 |