Commit ce3710d

Refine attention_begin structure by splitting CUDA/NPU and MLA/GQA
1 parent: bc80233

2 files changed

Lines changed: 213 additions & 230 deletions
