[None][feat] Revert Add support for Qwen3.5 VL MoE (#14164) by nv-guomingz · Pull Request #14465 · NVIDIA/TensorRT-LLM

nv-guomingz · 2026-05-22T13:42:51Z

This reverts commit 96a4a09.

Summary by CodeRabbit

Documentation
- Removed Qwen3.5 MoE Vision Language model from the supported models matrix.
Refactor
- Simplified model weight loading and configuration handling for supported Qwen3.5 variants.
- Consolidated configuration normalization logic for improved maintainability.
Tests
- Removed test coverage for the discontinued Qwen3.5 MoE Vision Language model variant.
- Updated related test configurations and lists accordingly.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

nv-guomingz · 2026-05-22T13:43:17Z

/bot run --disable-fail-fast

nv-guomingz · 2026-05-22T13:46:51Z

#14164 blocked the L0 pipeline; please refer to #14455.

coderabbitai · 2026-05-22T13:47:22Z

📝 Walkthrough

This PR removes Qwen3.5-MoE VL conditional generation model support by deleting model exports, updating weight mapper registration, relocating config normalization logic, simplifying model signatures and KV-cache handling, and removing all related tests and documentation.

Changes

Qwen3.5-MoE-VL Support Removal and Config Refactoring

| Layer / File(s) | Summary |
| --- | --- |
| Model export and documentation cleanup `tensorrt_llm/_torch/models/__init__.py`, `docs/source/models/supported-models.md` | Remove `Qwen3_5MoeVLModel` from public module exports and `__all__` registry; delete conditional generation model from supported models documentation. |
| Weight mapper registration update `tensorrt_llm/_torch/models/checkpoints/hf/qwen3_5_weight_mapper.py` | Reregister `Qwen3_5MoeHfWeightMapper` to `Qwen3_5MoeForCausalLM` instead of `Qwen3_5MoeForConditionalGeneration`. |
| Config compatibility consolidation and relocation `tensorrt_llm/_torch/models/modeling_qwen3_5.py`, `tensorrt_llm/_torch/pyexecutor/config_utils.py` | Move `Qwen35ConfigCompat` from modeling to config utilities as `_Qwen35ConfigCompat`, consolidating text-config extraction, architecture selection, quantization module rewriting, linear-attention bf16 workaround, and RoPE parameter flattening; update modeling imports and docstrings. |
| Config loading normalization `tensorrt_llm/_torch/pyexecutor/config_utils.py` | Update `load_pretrained_config` to use local `_Qwen35ConfigCompat.normalize()` for Qwen3.5 configs and remove older VLM normalization branch. |
| KV cache dtype resolution simplification `tensorrt_llm/_torch/pyexecutor/config_utils.py`, `tensorrt_llm/_torch/pyexecutor/model_loader.py` | Remove dtype-coercion helpers and update mamba KV cache dtype resolution to use direct config field access, eliminating fallback chains. |
| Model signature simplification `tensorrt_llm/_torch/models/modeling_qwen3_next.py` | Simplify `Qwen3NextForCausalLM.load_weights` to accept only `weights` and `weight_mapper`, removing optional `params_map` and `allow_partial_loading` parameters. |
| VL model architecture support removal `tensorrt_llm/_torch/models/modeling_qwen3vl.py` | Remove conditional generation architecture branch and simplify head dimension calculation in `Qwen3VLModelBase`. |
| Test utility simplification `tests/unittest/_torch/modeling/test_modeling_multimodal.py` | Remove hybrid KV-cache manager support and related hooks; simplify `init_kv_cache_manager` to always use standard `get_kv_cache_manager` path; clean up related imports and setUp logic. |
| Test case and configuration removal `tests/integration/defs/accuracy/references/mmmu.yaml`, `tests/integration/defs/accuracy/test_llm_api_pytorch_multimodal.py`, `tests/integration/test_lists/qa/llm_function_core.txt`, `tests/integration/test_lists/test-db/l0_l40s.yml`, `tests/unittest/_torch/modeling/test_modeling_qwen3_5_vl_moe.py` | Remove `TestQwen3_5_35B_A3B_VL` test class and test entries; update MMMU config to explicit bfloat16 dtype; delete entire Qwen3.5 MoE-VL test module. |
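To make the config-normalization rows above more concrete, here is a rough sketch of three of the listed steps (text-config extraction, architecture selection, RoPE parameter flattening) on plain dictionaries. It is not the code of `_Qwen35ConfigCompat.normalize()`: the function name `normalize_config_sketch` and the field names `num_experts`, `rope_parameters`, `rope_theta`, and `rope_type` are assumptions chosen for illustration, and the quantization rewrite and bf16 workaround are left out.

```python
from copy import deepcopy
from typing import Any, Dict


def normalize_config_sketch(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Illustrative normalization of a (possibly nested) config dictionary."""
    # Text-config extraction: use the nested text block when present,
    # otherwise treat the top-level config as the text config.
    # deepcopy keeps the caller's dictionary untouched.
    text = deepcopy(raw.get("text_config", raw))

    # Architecture selection: an MoE text config is routed to the causal-LM
    # architecture named in the walkthrough. `num_experts` as the MoE marker
    # is an assumption.
    if "num_experts" in text:
        text["architectures"] = ["Qwen3_5MoeForCausalLM"]

    # RoPE parameter flattening: lift nested entries to the top level
    # without overwriting values that are already set there.
    rope = text.pop("rope_parameters", None) or {}
    for key, value in rope.items():
        text.setdefault(key, value)

    return text


example = {
    "architectures": ["Qwen3_5MoeForConditionalGeneration"],
    "text_config": {
        "hidden_size": 2048,
        "num_experts": 8,
        "rope_parameters": {"rope_theta": 10000000.0, "rope_type": "default"},
    },
}
normalized = normalize_config_sketch(example)
```

Keeping such logic in one helper that runs at config-load time, rather than inside the modeling file, is the kind of consolidation the table describes; the sketch only shows the shape of that idea.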

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

NVIDIA/TensorRT-LLM#14164: This PR reverses the Qwen3.5 MoE-VL support addition by removing the model exports, weight mapper registration, and test coverage that were previously introduced.

Suggested reviewers

2ez4bz
liji-nv
syuoni

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 54.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ⚠️ Warning | The PR description is incomplete. While it identifies that this is a revert, it lacks an explanation of why the revert is necessary, what issues prompted it, and provides no test coverage information or completed PR checklist. | Add a clear rationale for the revert, describe the motivation and any related issues, list relevant test coverage that validates the revert, and complete the PR checklist items. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title clearly summarizes the main change: reverting support for Qwen3.5 VL MoE from the referenced commit, which matches the changeset's comprehensive removal of related code and tests. |

tensorrt-cicd · 2026-05-22T13:49:44Z

PR_Github #49944 [ run ] triggered by Bot. Commit: ee593dc Link to invocation

tensorrt-cicd · 2026-05-22T21:33:09Z

PR_Github #49944 [ run ] completed with state SUCCESS. Commit: ee593dc
/LLM/main/L0_MergeRequest_PR pipeline #39515 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

- Please check the failed tests and fix your PR
- If you cannot view the failures, ask the CI triggerer to share details
- Once fixed, request an NVIDIA team member to trigger CI again

moraxu · 2026-05-22T22:25:54Z

/bot run

tensorrt-cicd · 2026-05-22T22:31:46Z

PR_Github #49997 [ run ] triggered by Bot. Commit: ee593dc Link to invocation

tensorrt-cicd · 2026-05-23T02:08:50Z

PR_Github #49997 [ run ] completed with state SUCCESS. Commit: ee593dc
/LLM/main/L0_MergeRequest_PR pipeline #39562 completed with status: 'SUCCESS'

…IDIA#14465)

Revert "[TRTLLM-12500][feat] Add support for Qwen3.5 VL MoE (#14164)"

ee593dc

This reverts commit 96a4a09.

nv-guomingz requested review from a team as code owners May 22, 2026 13:42

nv-guomingz requested review from Shixiaowei02, arysef, dongxuy04, moraxu, tijyojwad and yechank-nvidia May 22, 2026 13:42

github-actions Bot assigned nv-guomingz May 22, 2026

nv-guomingz changed the title ~~Revert "[TRTLLM-12500][feat] Add support for Qwen3.5 VL MoE (#14164)"~~ [None][feat] Revert Add support for Qwen3.5 VL MoE (#14164) May 22, 2026

venkywonka approved these changes May 22, 2026

moraxu approved these changes May 22, 2026

2ez4bz approved these changes May 22, 2026

LarryXFly approved these changes May 23, 2026

xinhe-nv approved these changes May 23, 2026

Funatiq approved these changes May 23, 2026

nv-guomingz merged commit 751be5d into main May 23, 2026
16 of 21 checks passed

nv-guomingz deleted the revert-14164-qwen3_5_vl_moe branch May 23, 2026 09:52

KleinBlueC pushed a commit to KleinBlueC/TensorRT-LLM that referenced this pull request May 26, 2026

[None][feat] Revert Add support for Qwen3.5 VL MoE (NVIDIA#14164) (NV…

75f79f9

…IDIA#14465)

bmarimuthu-nv pushed a commit to nv-auto-deploy/TensorRT-LLM that referenced this pull request May 28, 2026

[None][feat] Revert Add support for Qwen3.5 VL MoE (NVIDIA#14164) (NV…

9ae3cbf

…IDIA#14465)

coderabbitai Bot mentioned this pull request Jun 2, 2026

[TRTLLM-12500][feat] Add support for Qwen3.5 VL MoE (with the MTP fixes) #14599

Open

1 task

coderabbitai Bot mentioned this pull request Jun 11, 2026

[TRTLLM-13383][feat] Add support for Qwen3.5 VL Dense #15249

Open

1 task
