feat: support Qwen3.5-VL model on npu device[6/N]. #1212
yingxudeng wants to merge 2 commits into jd-opensource:main
Code Review
This pull request adds support for the Qwen3.5-VL model, including updates for linear attention, multimodal Rotary Positional Embeddings (mRoPE), and deepstack processing. It also refactors model type detection and enhances the NPU attention layers. Review feedback highlights a critical compilation error caused by modifying a constant reference, a memory allocation mismatch in the KV cache, and a style guide violation regarding anonymous namespaces. The review also suggests fixing hardcoded logic in RoPE and deepstack retrieval to ensure safety and configuration compliance.
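To illustrate the kind of compilation error flagged above, here is a minimal C++ sketch. It is not taken from this PR's diff: `AttentionParams`, `use_mrope`, and `with_mrope` are hypothetical names chosen only for illustration. It shows why writing through a `const` reference is rejected by the compiler, and one possible fix (copy, then modify the copy).

```cpp
#include <cassert>
#include <vector>

// Hypothetical parameter struct, assumed for this example only.
struct AttentionParams {
  std::vector<int> positions;
  bool use_mrope = false;
};

// This version would NOT compile: `params` is a const reference,
// so assigning to one of its members is an error.
//
// void enable_mrope_in_place(const AttentionParams& params) {
//   params.use_mrope = true;  // error: object is read-only
// }

// One possible fix: leave the caller's object untouched and
// return a modified copy instead.
AttentionParams with_mrope(const AttentionParams& params) {
  AttentionParams updated = params;  // copy the read-only input
  updated.use_mrope = true;          // modify only the copy
  return updated;
}
```

Whether a copy, a non-const reference, or a restructured call site is the right fix depends on the actual code in the PR; the sketch only shows the general pattern.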
f137ab2 to 8f60c20
8f60c20 to 9ddb0d7
No description provided.