1 parent 1a0ed36 commit 7c9ff55
1 file changed
src/liger_kernel/ops/backends/_ascend/ops/multi_token_attention.py
```diff
@@ -1,10 +1,3 @@
-"""
-Fused causal masking + softmax/sparsemax Triton kernels for NPU.
-
-This implementation fuses causal masking with softmax and sparsemax forward and backward
-operations in single kernels to reduce memory traffic and improve performance.
-"""
-
 import torch
 import torch.nn.functional as F
 import triton
```
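For context, the removed docstring described Triton kernels that fuse causal masking with softmax and sparsemax to cut memory traffic. The sketch below is a minimal, unfused NumPy reference for the forward semantics only; it is not the deleted NPU kernels, and the function names `causal_softmax` and `sparsemax` are illustrative assumptions, not identifiers from this repository.

```python
import numpy as np

def causal_softmax(scores):
    # scores: (seq_len, seq_len) attention logits.
    # Mask out future positions (strict upper triangle), then softmax per row.
    n = scores.shape[-1]
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    masked = np.where(future, -np.inf, scores)
    shifted = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(shifted)  # exp(-inf) -> 0, so masked entries vanish
    return e / e.sum(axis=-1, keepdims=True)

def sparsemax(z):
    # 1-D sparsemax (Euclidean projection onto the probability simplex):
    # produces exact zeros where softmax would give small positive mass.
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, z.size + 1)
    cssv = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cssv      # which sorted entries stay positive
    k_max = k[support][-1]
    tau = (cssv[support][-1] - 1) / k_max  # threshold subtracted from all logits
    return np.maximum(z - tau, 0.0)
```

A fused kernel computes the same result while applying the mask and the normalization in a single pass over the tile, avoiding a separate materialized mask tensor.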