Add flash attention for non-quantized CPU GroupQueryAttention #28962

Merged
tianleiwu merged 6 commits into main from tlwu/20260608/gqa_cpu_flash_att
Jun 24, 2026

Remove dead flash-decoding code from FP32 GQA path

11195e7
Azure Pipelines / Linux Android Emulator QNN CI Pipeline succeeded Jun 23, 2026 in 13m 37s

Build #20260622.43 succeeded