[None][feat] Revert Add support for Qwen3.5 VL MoE (#14164)#14465

Merged
nv-guomingz merged 1 commit into main from revert-14164-qwen3_5_vl_moe
May 23, 2026

@nv-guomingz nv-guomingz commented May 22, 2026

Collaborator

This reverts commit 96a4a09.

Summary by CodeRabbit

  • Documentation

    • Removed Qwen3.5 MoE Vision Language model from the supported models matrix.
  • Refactor

    • Simplified model weight loading and configuration handling for supported Qwen3.5 variants.
    • Consolidated configuration normalization logic for improved maintainability.
  • Tests

    • Removed test coverage for the discontinued Qwen3.5 MoE Vision Language model variant.
    • Updated related test configurations and lists accordingly.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@nv-guomingz

Collaborator Author

/bot run --disable-fail-fast

@nv-guomingz

Collaborator Author

#14164 blocked L0; please refer to #14455.

@coderabbitai

coderabbitai Bot commented May 22, 2026

Contributor
📝 Walkthrough

This PR removes Qwen3.5-MoE VL conditional generation model support by deleting model exports, updating weight mapper registration, relocating config normalization logic, simplifying model signatures and KV-cache handling, and removing all related tests and documentation.
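To make the weight mapper re-registration mentioned in the walkthrough concrete, here is a minimal sketch of a decorator-based registry keyed by checkpoint format and architecture name. The `register_mapper` decorator, the `MAPPER_REGISTRY` dict, and the `lookup_mapper` helper are hypothetical stand-ins written for illustration and are not claimed to match TensorRT-LLM's actual API; only the mapper class name and the two architecture names come from this PR.

```python
from typing import Callable, Dict, Tuple, Type

# Hypothetical registry: (checkpoint format, architecture name) -> mapper class.
MAPPER_REGISTRY: Dict[Tuple[str, str], Type] = {}


def register_mapper(fmt: str, architecture: str) -> Callable[[Type], Type]:
    """Hypothetical decorator that records a mapper class under a lookup key."""

    def decorator(cls: Type) -> Type:
        MAPPER_REGISTRY[(fmt, architecture)] = cls
        return cls

    return decorator


# After the revert, the mapper is keyed to the causal-LM architecture only.
@register_mapper("HF", "Qwen3_5MoeForCausalLM")
class Qwen3_5MoeHfWeightMapper:
    """Placeholder body; the real mapper logic is out of scope for this sketch."""


def lookup_mapper(fmt: str, architecture: str) -> Type:
    """Return the registered mapper class, or raise a descriptive KeyError."""
    try:
        return MAPPER_REGISTRY[(fmt, architecture)]
    except KeyError:
        raise KeyError(
            f"no {fmt} weight mapper registered for {architecture}"
        ) from None
```

In this toy registry, looking up `Qwen3_5MoeForConditionalGeneration` raises `KeyError`, which illustrates why changing the registration key is enough to drop an architecture from mapper resolution.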

Changes

Qwen3.5-MoE-VL Support Removal and Config Refactoring

| Layer | File(s) | Summary |
|---|---|---|
| Model export and documentation cleanup | `tensorrt_llm/_torch/models/__init__.py`, `docs/source/models/supported-models.md` | Remove `Qwen3_5MoeVLModel` from public module exports and `__all__` registry; delete conditional generation model from supported models documentation. |
| Weight mapper registration update | `tensorrt_llm/_torch/models/checkpoints/hf/qwen3_5_weight_mapper.py` | Reregister `Qwen3_5MoeHfWeightMapper` to `Qwen3_5MoeForCausalLM` instead of `Qwen3_5MoeForConditionalGeneration`. |
| Config compatibility consolidation and relocation | `tensorrt_llm/_torch/models/modeling_qwen3_5.py`, `tensorrt_llm/_torch/pyexecutor/config_utils.py` | Move `Qwen35ConfigCompat` from modeling to config utilities as `_Qwen35ConfigCompat`, consolidating text-config extraction, architecture selection, quantization module rewriting, linear-attention bf16 workaround, and RoPE parameter flattening; update modeling imports and docstrings. |
| Config loading normalization | `tensorrt_llm/_torch/pyexecutor/config_utils.py` | Update `load_pretrained_config` to use local `_Qwen35ConfigCompat.normalize()` for Qwen3.5 configs and remove older VLM normalization branch. |
| KV cache dtype resolution simplification | `tensorrt_llm/_torch/pyexecutor/config_utils.py`, `tensorrt_llm/_torch/pyexecutor/model_loader.py` | Remove dtype-coercion helpers and update mamba KV cache dtype resolution to use direct config field access, eliminating fallback chains. |
| Model signature simplification | `tensorrt_llm/_torch/models/modeling_qwen3_next.py` | Simplify `Qwen3NextForCausalLM.load_weights` to accept only `weights` and `weight_mapper`, removing optional `params_map` and `allow_partial_loading` parameters. |
| VL model architecture support removal | `tensorrt_llm/_torch/models/modeling_qwen3vl.py` | Remove conditional generation architecture branch and simplify head dimension calculation in `Qwen3VLModelBase`. |
| Test utility simplification | `tests/unittest/_torch/modeling/test_modeling_multimodal.py` | Remove hybrid KV-cache manager support and related hooks; simplify `init_kv_cache_manager` to always use standard `get_kv_cache_manager` path; clean up related imports and `setUp` logic. |
| Test case and configuration removal | `tests/integration/defs/accuracy/references/mmmu.yaml`, `tests/integration/defs/accuracy/test_llm_api_pytorch_multimodal.py`, `tests/integration/test_lists/qa/llm_function_core.txt`, `tests/integration/test_lists/test-db/l0_l40s.yml`, `tests/unittest/_torch/modeling/test_modeling_qwen3_5_vl_moe.py` | Remove `TestQwen3_5_35B_A3B_VL` test class and test entries; update MMMU config to explicit bfloat16 dtype; delete entire Qwen3.5 MoE-VL test module. |
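
Two of the normalization steps named in the changes above, text-config extraction and RoPE parameter flattening, can be sketched on plain dictionaries. This is an illustrative guess at the general shape of such a step, not the code of `_Qwen35ConfigCompat`: the function name `normalize_config`, the keys `text_config`, `rope_parameters`, `rope_theta`, and `rope_type`, and all values in the example are assumptions chosen for the demonstration.

```python
from typing import Any, Dict


def normalize_config(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical normalization: lift the text sub-config, flatten RoPE keys."""
    # Step 1: if the config nests a text sub-config, use a copy of it as the
    # base; otherwise copy the config itself. The input is never mutated.
    cfg = dict(raw.get("text_config", raw))

    # Step 2: move nested RoPE parameters to top-level keys, keeping any
    # top-level value that is already present.
    rope = cfg.pop("rope_parameters", None) or {}
    for key, value in rope.items():
        cfg.setdefault(key, value)
    return cfg


# Made-up example config; names and numbers are placeholders, not real values.
example = {
    "architectures": ["ExampleForConditionalGeneration"],
    "text_config": {
        "hidden_size": 2048,
        "rope_parameters": {"rope_theta": 10000.0, "rope_type": "default"},
    },
}
flat = normalize_config(example)
```

Keeping a step like this in one place, as the PR does by moving the compatibility class into config utilities, means callers only ever see the flattened form.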

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/TensorRT-LLM#14164: This PR reverses the Qwen3.5 MoE-VL support addition by removing the model exports, weight mapper registration, and test coverage that were previously introduced.

Suggested reviewers

  • 2ez4bz
  • liji-nv
  • syuoni

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 54.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | The PR description is incomplete. While it identifies that this is a revert, it lacks an explanation of why the revert is necessary, what issues prompted it, and provides no test coverage information or completed PR checklist. | Add a clear rationale for the revert, describe the motivation and any related issues, list relevant test coverage that validates the revert, and complete the PR checklist items. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title clearly summarizes the main change: reverting support for Qwen3.5 VL MoE from the referenced commit, which matches the changeset's comprehensive removal of related code and tests. |

@tensorrt-cicd

Collaborator

PR_Github #49944 [ run ] triggered by Bot. Commit: ee593dc Link to invocation

@nv-guomingz nv-guomingz changed the title from Revert "[TRTLLM-12500][feat] Add support for Qwen3.5 VL MoE (#14164)" to [None][feat] Revert Add support for Qwen3.5 VL MoE (#14164) on May 22, 2026

@tensorrt-cicd

Collaborator

PR_Github #49944 [ run ] completed with state SUCCESS. Commit: ee593dc
/LLM/main/L0_MergeRequest_PR pipeline #39515 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@moraxu

moraxu commented May 22, 2026

Collaborator

/bot run

@tensorrt-cicd

Collaborator

PR_Github #49997 [ run ] triggered by Bot. Commit: ee593dc Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #49997 [ run ] completed with state SUCCESS. Commit: ee593dc
/LLM/main/L0_MergeRequest_PR pipeline #39562 completed with status: 'SUCCESS'

CI Report

Link to invocation

@nv-guomingz nv-guomingz merged commit 751be5d into main May 23, 2026
16 of 21 checks passed

8 participants