Commit ad30176
[None][feat] add paged DSV4 sparse attention cache
Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>
1 parent e3a1b95 commit ad30176
3 files changed
Lines changed: 877 additions & 173 deletions
File tree
- tensorrt_llm/_torch/auto_deploy/custom_ops
- attention
- tests/unittest/auto_deploy/singlegpu/custom_ops/attention