feat(sparsity): add VecAttention sparse prefill for VLM #320

Merged
anminliu merged 2 commits into Tencent:main from anminliu:dev_vecattention
May 29, 2026

Conversation

@anminliu
Collaborator

Integrate VecAttention into AngelSlim as a sparse attention method for Vision-Language Models (Qwen2.5-VL).

  • Add vecattention subpackage under compressor/sparsity/
  • Add vllm-flash-attention as git submodule for sparse_attn_func kernel
  • Add Triton kernels for MinP threshold selection and query pooling
  • Add run_vecattention.py tool for image/video inference
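
For readers unfamiliar with the terms in the list above, here is a rough, pure-Python sketch of what "query pooling" followed by "MinP threshold selection" could look like. It is not taken from this PR's Triton kernels. It assumes that queries and keys are mean-pooled into fixed-size blocks and that a key block is kept when its softmax probability is at least `min_p` times the largest probability in that row, which is the usual min-p rule. All function names and parameters (`mean_pool`, `min_p_select`, `select_key_blocks`, `block`, `min_p`) are hypothetical, and causal masking is omitted.

```python
import math


def mean_pool(rows, block):
    """Average consecutive groups of `block` vectors (the last group may be shorter)."""
    pooled = []
    for start in range(0, len(rows), block):
        group = rows[start:start + block]
        dim = len(group[0])
        pooled.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_select(probs, min_p):
    """Keep indices whose probability is at least min_p * max(probs)."""
    cutoff = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= cutoff]


def select_key_blocks(queries, keys, block, min_p):
    """For each pooled query block, list the key blocks that survive the min-p cutoff.

    Assumption: block-level scores are scaled dot products of pooled vectors.
    """
    q_blocks = mean_pool(queries, block)
    k_blocks = mean_pool(keys, block)
    scale = 1.0 / math.sqrt(len(queries[0]))
    selected = []
    for q in q_blocks:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blocks]
        selected.append(min_p_select(softmax(scores), min_p))
    return selected
```

In a sketch like this, a larger `min_p` keeps fewer key blocks per query block (sparser attention), while `min_p = 0` keeps every block and falls back to dense attention. The resulting block indices would then be handed to a sparse attention kernel.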

yghstill previously approved these changes May 28, 2026
@anminliu force-pushed the dev_vecattention branch from 033eb00 to 3197960 May 28, 2026 13:10
@anminliu requested a review from yghstill May 28, 2026 14:20
@anminliu merged commit bec53be into Tencent:main May 29, 2026
5 checks passed
2 participants