[TRTLLM-12500][feat] Add support for Qwen3.5 VL MoE (with the MTP fixes)#14599
moraxu wants to merge 6 commits into
Conversation
/bot run
PR_Github #50402 [ run ] triggered by Bot. Commit:
PR_Github #50402 [ run ] completed with state
/bot run
PR_Github #50459 [ run ] triggered by Bot. Commit:
PR_Github #50459 [ run ] completed with state
/bot run
PR_Github #50590 [ run ] triggered by Bot. Commit:
PR_Github #50590 [ run ] completed with state
/bot run
PR_Github #50604 [ run ] triggered by Bot. Commit:
PR_Github #50604 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #50613 [ run ] triggered by Bot. Commit:
PR_Github #50613 [ run ] completed with state
/bot run
PR_Github #50720 [ run ] triggered by Bot. Commit:
PR_Github #50720 [ run ] completed with state
/bot run
PR_Github #50739 [ run ] triggered by Bot. Commit:
PR_Github #50739 [ run ] completed with state
/bot run
PR_Github #50834 [ run ] triggered by Bot. Commit:
PR_Github #50834 [ run ] completed with state
/bot run
PR_Github #50866 [ run ] triggered by Bot. Commit:
PR_Github #50866 [ run ] completed with state
/bot run
/bot run
PR_Github #53962 [ run ] triggered by Bot. Commit:
PR_Github #53962 [ run ] completed with state
/bot run
venkywonka left a comment
lgtm from doc-owners
PR_Github #55103 [ run ] triggered by Bot. Commit:
PR_Github #55103 [ run ] completed with state
/bot run
PR_Github #55170 [ run ] triggered by Bot. Commit:
PR_Github #55170 [ run ] completed with state
/bot run
PR_Github #55537 [ run ] triggered by Bot. Commit:
PR_Github #55537 [ run ] completed with state
Please rebase or merge origin/main to resolve the GB200-4_GPUs-PyTorch-PerfSanity-* failures with 91dc145.
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Force-pushed from 260144a to 4035427
/bot run
PR_Github #55617 [ run ] triggered by Bot. Commit:
PR_Github #55617 [ run ] completed with state
/bot run
PR_Github #55701 [ run ] triggered by Bot. Commit:
PR_Github #55701 [ run ] completed with state
/bot run
PR_Github #55742 [ run ] triggered by Bot. Commit:
PR_Github #55742 [ run ] completed with state
Summary by CodeRabbit
Release Notes
New Features
Documentation
Tests
Description
- Adds support for Qwen3.5-MoE-VL (`Qwen3_5MoeForConditionalGeneration`) on top of #12611.
- Uses `transformers.Qwen3_5MoeConfig` (present in `5.3.0`), adds a thin post-load normalizer that materializes the handful of aliases the reused `Qwen3Next` runtime expects on `text_config` (`intermediate_size` from the MoE fields; `rope_theta` / `partial_rotary_factor` / `rope_scaling` from `rope_parameters`), and centralizes hybrid-cache dtype resolution in two helpers.
- Makes `Qwen3VLModelBase` MTP/eagle-compatible: threads `spec_metadata` / `resource_manager` / pre-fusion `orig_input_ids` to the inner LM. Unblocks `TestQwen3_5_35B_A3B::test_bf16_mtp[mtp_on]`, which started failing after #12646 ([TRTLLM-11547][feat] Add Qwen3.5 MTP support) added the MTP test on top of the original #14164 ([TRTLLM-12500][feat] Add support for Qwen3.5 VL MoE - REVERTED by #14599) and the same `Qwen3.5-35B-A3B` checkpoint started routing through the VLM wrapper. See `nvbugs/6206179` for details.

Test Coverage
Accuracy & unit tests
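
For reviewers who want a picture of the post-load normalizer mentioned in the description, here is a minimal sketch of the idea. It is not the code in this PR. The function name `normalize_text_config`, the use of `moe_intermediate_size` as the source MoE field, and the rule that `rope_scaling` receives the leftover `rope_parameters` entries are all assumptions made for illustration; a `SimpleNamespace` stands in for the real `text_config`.

```python
from types import SimpleNamespace


def normalize_text_config(text_config):
    """Hypothetical normalizer: fill in legacy aliases only where missing."""
    # Assumption: `moe_intermediate_size` is the MoE field that backs
    # `intermediate_size`; the PR text only says "from the MoE fields".
    if getattr(text_config, "intermediate_size", None) is None:
        moe_size = getattr(text_config, "moe_intermediate_size", None)
        if moe_size is not None:
            text_config.intermediate_size = moe_size

    # Copy so the original rope_parameters dict is never mutated.
    rope_params = dict(getattr(text_config, "rope_parameters", None) or {})

    # Scalar aliases are lifted straight out of rope_parameters.
    for alias in ("rope_theta", "partial_rotary_factor"):
        if getattr(text_config, alias, None) is None and alias in rope_params:
            setattr(text_config, alias, rope_params[alias])

    # Assumption: whatever remains (e.g. rope_type) is exposed as rope_scaling.
    if getattr(text_config, "rope_scaling", None) is None:
        remaining = {
            key: value
            for key, value in rope_params.items()
            if key not in ("rope_theta", "partial_rotary_factor")
        }
        if remaining:
            text_config.rope_scaling = remaining

    return text_config


# Illustrative values only; they are not taken from any real checkpoint.
cfg = SimpleNamespace(
    moe_intermediate_size=512,
    rope_parameters={
        "rope_theta": 10000000.0,
        "partial_rotary_factor": 0.25,
        "rope_type": "default",
    },
)
normalize_text_config(cfg)
```

Filling an alias only when it is absent keeps the helper idempotent, so it can run on every load path without overriding values a checkpoint already sets.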
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either `api-compatible` or `api-breaking`. For `api-breaking`, include `BREAKING` in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.