Pull requests: fla-org/flash-linear-attention

Pull requests list

[Fix] fix benchamrk issues and add ub management for npu
#1001 opened Jul 1, 2026 by sunyi0505 Contributor
[GDN] Add FlashQLA backend dispatch
#998 opened Jul 1, 2026 by Erix025
7 tasks done
[Perf] Generalize fused q/k/v short convolution across layers
#977 opened Jun 23, 2026 by zhiyuan1i Collaborator
[GDN] Fix GDN precision on Blackwell
#948 opened Jun 14, 2026 by syeehyn
[Fix] Fix shared memory race in tilelang chunk_bwd dg_last accumulation (label: help wanted)
#890 opened May 11, 2026 by Erix025
[SSE] Add SSE integration
#882 opened May 9, 2026 by Pan-Yuqi Contributor
[GDN] Tricked kernels: ungated KKT + fused inference via similarity transform
#797 opened Mar 28, 2026 by hypnopump Contributor
5 tasks
[Layernorm] Fix autotuner crash and OOB writes in layer_norm_bwd on high-SM GPUs
#796 opened Mar 28, 2026 by mpurland Contributor
5 tasks done
Add fused short convolution kernel with L2 norm
#661 opened Nov 24, 2025 by sustcsonglin Collaborator
[kda] add recursive block intra implementation
#656 opened Nov 22, 2025 by sustcsonglin Collaborator