Commit dd42d51
kv-cache : disable attn_rot_k/v when SPLIT_MODE_TENSOR is active
The Hadamard rotation path (ggml_mul_mat_aux) reshapes the K/V tensor
from [n_embd_head, n_head_kv, n_tokens] to a 2-D layout for matmul,
then restores it. The split-axis tracker in ggml-backend-meta.cpp does
not follow this reshape correctly when the source split axis falls on
the collapsed dimension, producing an AXIS_1 tag on the values passed
to ggml_set_rows, which then trips an assertion.
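
The failure mode described above can be illustrated with a small sketch. This is not the tracker from ggml-backend-meta.cpp: `map_split_axis`, `Shape`, and `AXIS_NONE` are hypothetical names, and the example shape (128 x 8 x 32 flattened to 128 x 256) is an assumption. The idea is that a split axis survives a reshape only if some destination dimension covers exactly the same element range, so a tracker that simply carries the axis index across ends up tagging a dimension that no longer means "heads".

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical tag for "the split can no longer be expressed as one axis".
constexpr int AXIS_NONE = -1;

// ggml-style shape: index 0 is the fastest-varying dimension.
using Shape = std::array<int64_t, 4>;

// Map a split axis across a reshape by comparing cumulative element counts.
// The axis is preserved only if a destination dimension starts and ends at
// the same element boundaries as the source axis.
int map_split_axis(const Shape & src, const Shape & dst, int src_axis) {
    assert(src_axis >= 0 && src_axis < 4);

    int64_t lo = 1; // number of elements below the split axis
    for (int i = 0; i < src_axis; ++i) {
        lo *= src[i];
    }
    const int64_t hi = lo * src[src_axis]; // elements up to and including it

    int64_t acc = 1;
    for (int j = 0; j < 4; ++j) {
        const int64_t next = acc * dst[j];
        if (acc == lo && next == hi) {
            return j; // same boundaries: the split maps cleanly to axis j
        }
        acc = next;
    }
    return AXIS_NONE; // the axis was merged into a larger dimension
}
```

With `src = {128, 8, 32, 1}` split on axis 1 (heads) and `dst = {128, 256, 1, 1}`, the function returns `AXIS_NONE`, because axis 1 of the 2-D view now spans heads and tokens together.
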
Disabling attn_rot when tensor parallelism is in use sidesteps the
incompatibility without changing inference quality: the Hadamard
pre-rotation is a lossless precision aid, not a model requirement.
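
As a rough illustration of why the rotation is lossless, the sketch below applies a normalized Hadamard transform to two vectors and checks that their dot product is unchanged. It assumes the same rotation is applied to both sides of the dot product. `hadamard_rotate` and `dot` are hypothetical helpers written for this example, not functions from ggml or llama.cpp.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// In-place normalized fast Walsh-Hadamard transform.
// The length must be a power of two. The 1/sqrt(n) scale makes it orthogonal.
void hadamard_rotate(std::vector<double> & x) {
    const std::size_t n = x.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    for (std::size_t len = 1; len < n; len *= 2) {
        for (std::size_t i = 0; i < n; i += 2 * len) {
            for (std::size_t j = i; j < i + len; ++j) {
                const double a = x[j];
                const double b = x[j + len];
                x[j]       = a + b; // butterfly: sum
                x[j + len] = a - b; // butterfly: difference
            }
        }
    }

    const double scale = 1.0 / std::sqrt(static_cast<double>(n));
    for (double & v : x) {
        v *= scale;
    }
}

// Plain dot product of two equal-length vectors.
double dot(const std::vector<double> & a, const std::vector<double> & b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Because an orthogonal transform preserves inner products, `dot(q, k)` matches before and after rotating both vectors, up to floating-point rounding. Skipping the rotation therefore leaves the exact-arithmetic result the same.
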
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

1 parent 1c220db commit dd42d51
1 file changed
Lines changed: 2 additions & 0 deletions
[Diff content not preserved. The two added lines fall at new-file line numbers 292 and 299.]