Commit 009acc3
fix: gate turbo V unpad on V type, not K type (#42)
When using asymmetric KV (-ctk q8_0 -ctv turbo4), the V unpad code
was gated on k->type being turbo. Since K is q8_0, the unpad was
skipped even when V was turbo and padded to 128. This caused a shape
mismatch at the wo matmul (ggml_can_mul_mat assertion) for models
with non-128-aligned head_dim (e.g., GPT-OSS-120B with head_dim=64,
openai_moe_iswa architecture).
Fix: check v->type instead of k->type for V unpad blocks in both
build_attn overloads. Q rotation remains correctly gated on k->type.
Reported-by: NigelTufnel12345
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: tturney@psyguard.ai
1 parent 1073622 commit 009acc3
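The gating bug described in the commit message can be illustrated with a small standalone sketch. This is not the project's code: `kv_type`, `is_turbo`, `padded_head_dim`, `attn_out_dim`, `before_fix`, and `after_fix` are hypothetical names invented for illustration, and the sketch assumes that a turbo V cache pads each head up to the next multiple of 128 and that the unpad step restores the original head_dim.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the cache tensor types named in the commit message.
enum class kv_type { f16, q8_0, turbo4 };

// Assumption: turbo4 is the only "turbo" type modeled in this sketch.
constexpr bool is_turbo(kv_type t) { return t == kv_type::turbo4; }

// Round head_dim up to the next multiple of 128 (the padding the message describes).
constexpr int64_t padded_head_dim(int64_t head_dim) {
    return (head_dim + 127) / 128 * 128;
}

// Per-head width of the attention output that reaches the wo matmul.
// gate_type is the type the unpad check inspects; v_type is the actual V cache type.
// If the gate fires, the output is unpadded back to head_dim; otherwise it keeps
// whatever width V has (padded when V is turbo).
constexpr int64_t attn_out_dim(kv_type gate_type, kv_type v_type, int64_t head_dim) {
    return is_turbo(gate_type)
        ? head_dim
        : (is_turbo(v_type) ? padded_head_dim(head_dim) : head_dim);
}

// Before the fix the gate looked at the K type; after the fix it looks at the V type.
constexpr int64_t before_fix(kv_type k, kv_type v, int64_t hd) { return attn_out_dim(k, v, hd); }
constexpr int64_t after_fix(kv_type /*k*/, kv_type v, int64_t hd) { return attn_out_dim(v, v, hd); }

// Asymmetric case from the report: K = q8_0, V = turbo4, head_dim = 64.
static_assert(before_fix(kv_type::q8_0, kv_type::turbo4, 64) == 128,
              "unpad skipped: output stays padded and no longer matches head_dim");
static_assert(after_fix(kv_type::q8_0, kv_type::turbo4, 64) == 64,
              "unpad applied: output width matches head_dim again");
```

In this model the pre-fix path leaves a 128-wide output where the wo weights expect 64, which corresponds to the shape mismatch the message attributes to the ggml_can_mul_mat assertion. Gating on the V type restores the expected width regardless of the K type.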
1 file changed
Lines changed: 4 additions & 2 deletions
| Hunk | Original file lines | New file lines | Change |
|---|---|---|---|
| 1 | 2189-2195 | 2189-2196 | line 2192 removed; lines 2192-2193 added |
| 2 | 2415-2421 | 2416-2423 | line 2418 removed; lines 2419-2420 added |