Qwen3.5 prefix_caching #2256

Open
wenbinc-Bin wants to merge 84 commits into HabanaAI:aice/v1.22.0 from wenbinc-Bin:qwen3.5_updatePR_prefix_v2

@wenbinc-Bin

Add prefix caching support for Qwen3.5.
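
For readers unfamiliar with the feature, here is a minimal sketch of the general idea behind block-level prefix caching: each full block of prompt tokens is hashed together with the hash of the block before it, so two requests that share a prefix resolve to the same cached blocks. This is an illustration only, not the code in this PR. `ToyPrefixCache`, `block_hashes`, and `BLOCK_SIZE` are hypothetical names, and nothing Qwen3.5-specific is modeled here.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # hypothetical number of tokens per cache block


def block_hashes(token_ids: List[int], block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash every full block, chaining in the previous block's hash."""
    hashes: List[str] = []
    parent = ""
    for i in range(len(token_ids) // block_size):
        block = token_ids[i * block_size:(i + 1) * block_size]
        payload = parent + "|" + ",".join(map(str, block))
        parent = hashlib.sha256(payload.encode()).hexdigest()
        hashes.append(parent)
    return hashes


class ToyPrefixCache:
    """Toy cache mapping chained block hashes to block ids (no eviction)."""

    def __init__(self) -> None:
        self._blocks: Dict[str, int] = {}

    def match_and_insert(self, token_ids: List[int]) -> int:
        """Return how many leading tokens hit the cache, then store the rest."""
        hashes = block_hashes(token_ids)
        hit_blocks = 0
        for h in hashes:
            if h not in self._blocks:
                break
            hit_blocks += 1
        # Register the blocks that missed so later requests can reuse them.
        for h in hashes[hit_blocks:]:
            self._blocks[h] = len(self._blocks)
        return hit_blocks * BLOCK_SIZE


cache = ToyPrefixCache()
first = cache.match_and_insert([1, 2, 3, 4, 5, 6, 7, 8, 9])
second = cache.match_and_insert([1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13])
print(first, second)  # 0 8
```

The second request reuses the two full blocks it shares with the first, so only its final block would need to be computed. Chaining the parent hash is what makes a hit on block N imply a hit on every block before it.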

wenbinc-Bin and others added 30 commits February 11, 2026 06:51
vllm-project#34110
missing changes in
vllm/transformers_utils/model_arch_config_convertor.py
vllm/v1/spec_decode/eagle.py

Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
We don't support mtp so we can remove slicing.

Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
HabanaAI#2238)

Update vllm推理手册 (the vLLM inference manual) and add the scripts and files needed for MiniMax-M2.5 deployment:
convert_for_minimax_unit_scale.py
minimax-m2.5-input-scale.tar.safetensor, packaged in minimax-m2.5-input-scale.tar.gz
HabanaAI#2240)

Update vllm推理手册 (the vLLM inference manual): change the vLLM-fork commit ID it lists for MiniMax-M2.5 so that it includes the related scripts and packages.
Jing1Ling and others added 21 commits April 9, 2026 17:23
Signed-off-by: Jing <jing1.ling@intel.com>
docs: update default context length to 256K
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Signed-off-by: Chen, Wenbin <wenbin.chen@intel.com>

6 participants