[cuda backend][gemma4_31b] TQ4 SDPA: no-spill prefill kernel + analytic causal #20512

Merged
Gasoonjia merged 7 commits into main from gemma4_31b-tq4-prefill-decode-tuned on Jun 25, 2026