1 parent 8fdd559 commit 87928a4
src/liger_kernel/ops/backends/_ascend/ops/multi_token_attention.py
@@ -1,10 +1,3 @@
-"""
-Fused causal masking + softmax/sparsemax Triton kernels for NPU.
-
-This implementation fuses causal masking with softmax and sparsemax forward and backward
-operations in single kernels to reduce memory traffic and improve performance.
-"""
-
 import torch
 import torch.nn.functional as F
 import triton
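
The removed docstring describes fusing causal masking with softmax into a single kernel to cut memory traffic. For reference, here is a minimal sketch of the unfused computation such a kernel replaces, written in plain PyTorch; the function name and tensor shapes are illustrative assumptions, not taken from the actual module:

```python
import torch
import torch.nn.functional as F

def causal_masked_softmax(scores: torch.Tensor) -> torch.Tensor:
    """Apply a causal (lower-triangular) mask, then softmax over the last dim.

    scores: (..., seq_len, seq_len) attention logits. Hypothetical reference
    implementation; a fused kernel computes the same result in one pass.
    """
    seq_len = scores.size(-1)
    # Positions above the diagonal attend to "future" tokens; mask them out
    # with -inf so softmax assigns them zero probability.
    mask = torch.triu(
        torch.ones(seq_len, seq_len, dtype=torch.bool, device=scores.device),
        diagonal=1,
    )
    masked = scores.masked_fill(mask, float("-inf"))
    return F.softmax(masked, dim=-1)

# Example: a 1-head, 4-token score matrix.
probs = causal_masked_softmax(torch.randn(1, 4, 4))
```

Materializing `masked` and `probs` as separate tensors is exactly the extra memory traffic the fused Triton kernel avoids.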