feat: support Qwen3.5-VL model on npu device[6/N].#1212

Closed
yingxudeng wants to merge 2 commits into jd-opensource:main from yingxudeng:feat/qwen35_video_2_ok_3_ing_2

Conversation

@yingxudeng
Collaborator

No description provided.

Contributor

@gemini-code-assist (Bot) left a comment

Code Review

This pull request adds support for the Qwen3.5 VL model, including updates for linear attention, multimodal Rotary Positional Embeddings (mRoPE), and deepstack processing. It also refactors model type detection and enhances NPU attention layers. Review feedback highlights a critical compilation error from modifying a constant reference, a memory allocation mismatch in the KV cache, and a style guide violation regarding anonymous namespaces. Additionally, suggestions were made to fix hardcoded logic in RoPE and deepstack retrieval to ensure safety and configuration compliance.
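To illustrate the first class of issue the review mentions (modifying data through a constant reference), here is a minimal sketch. It is not taken from the PR diff: `ModelInput`, `positions`, and `shifted` are hypothetical names chosen for illustration, and copying the argument is only one of several possible fixes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a per-request input struct; not from the PR.
struct ModelInput {
  std::vector<int> positions;
};

// This version does not compile if uncommented: `input` is a const
// reference, so iterating it yields `const int` elements, which cannot
// bind to `int&`.
//
// void shift_in_place(const ModelInput& input, int offset) {
//   for (int& p : input.positions) p += offset;
// }

// One possible fix (an assumption, not the PR's actual change): take the
// argument by value, modify the local copy, and return it. The caller's
// object is left untouched.
ModelInput shifted(ModelInput input, int offset) {
  for (int& p : input.positions) p += offset;
  return input;
}
```

Taking the parameter by value keeps the caller's `const` contract intact at the cost of one copy. If the mutation is meant to be visible to the caller, changing the signature to a non-const reference is the alternative.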

Comment thread xllm/models/vlm/qwen3_5_vl.h
Comment thread xllm/core/distributed_runtime/vlm_engine.cpp
Comment thread xllm/core/layers/common/rotary_embedding_util.cpp
Comment thread xllm/core/layers/npu_torch/qwen3_next_attention.cpp
Comment thread xllm/models/vlm/qwen3_5_vl.h
@yingxudeng changed the title from "feat: support Qwen3.5-VL model on npu device." to "feat: support Qwen3.5-VL model on npu device[6/N]." on Apr 8, 2026
@yingxudeng marked this pull request as ready for review on April 8, 2026 08:34
@yingxudeng force-pushed the feat/qwen35_video_2_ok_3_ing_2 branch from f137ab2 to 8f60c20 on April 9, 2026 17:37
@yingxudeng force-pushed the feat/qwen35_video_2_ok_3_ing_2 branch from 8f60c20 to 9ddb0d7 on April 10, 2026 06:13
@wly-115 self-requested a review on April 10, 2026 07:11
@wly-115
Collaborator

wly-115 commented Apr 10, 2026

`torch::Tensor deepstack_process(torch::Tensor hidden_states, torch::Tensor visual_pos_masks, const torch::Tensor& visual_embeds)`
As far as I remember, Qwen3.5 has already removed DeepStack.
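For readers unfamiliar with the function under discussion, the sketch below shows what a DeepStack-style step with this signature is commonly understood to do: add visual feature rows onto the hidden-state rows at visual token positions. It is written against plain `std::vector` instead of `torch::Tensor` so that it is self-contained. The name `deepstack_process_sketch`, the `Row` alias, and the shape checks are assumptions for illustration, not the PR's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Row = std::vector<float>;  // one token's hidden vector (illustrative)

// Adds the k-th row of visual_embeds to the k-th row of hidden_states whose
// mask entry is true. Rows with a false mask entry are returned unchanged.
std::vector<Row> deepstack_process_sketch(
    std::vector<Row> hidden_states,
    const std::vector<bool>& visual_pos_masks,
    const std::vector<Row>& visual_embeds) {
  if (visual_pos_masks.size() != hidden_states.size()) {
    throw std::invalid_argument("mask length must equal number of tokens");
  }
  std::size_t next = 0;  // index of the next unused visual embedding row
  for (std::size_t t = 0; t < hidden_states.size(); ++t) {
    if (!visual_pos_masks[t]) continue;
    if (next >= visual_embeds.size()) {
      throw std::invalid_argument("more masked positions than visual rows");
    }
    const Row& e = visual_embeds[next++];
    if (e.size() != hidden_states[t].size()) {
      throw std::invalid_argument("hidden size mismatch");
    }
    for (std::size_t j = 0; j < e.size(); ++j) {
      hidden_states[t][j] += e[j];
    }
  }
  if (next != visual_embeds.size()) {
    throw std::invalid_argument("unused visual embedding rows");
  }
  return hidden_states;
}
```

If the target model has no DeepStack stage, as the comment suggests, a step like this would simply not be called.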

@yingxudeng closed this on Apr 28, 2026
2 participants