[CUDA] Enable XQA decode for GroupQueryAttention with attention sink #29162

Merged
tianleiwu merged 7 commits into main from tlwu/20260618/xqa_head_sink on Jun 20, 2026

Address PR review feedback: fix head_size comment, profile_gqa import…

7306218
Azure Pipelines / Linux Android Emulator QNN CI Pipeline succeeded Jun 19, 2026 in 13m 27s

Build #20260619.11 succeeded