
[None][fix] Add missing allow_partial_loading param to CuteDSL and ConfigurableMoE load_weights#12761

Merged
qiaoxj07 merged 2 commits into NVIDIA:main from qiaoxj07:fix/cute-dsl-load-weights-signature
Apr 8, 2026
Conversation

@qiaoxj07
Collaborator

@qiaoxj07 qiaoxj07 commented Apr 4, 2026

Summary

  • PR [None][feat] Add DWDP (Distributed Weight Data Parallelism) support for MoE inference #12136 (DWDP) added a load_weights override in CuteDslFusedMoE that dropped the allow_partial_loading parameter from the base class signature.
  • ConfigurableMoE.load_weights also lacked this parameter.
  • This causes TypeError: CuteDslFusedMoE.load_weights() got an unexpected keyword argument 'allow_partial_loading' when qwen2_moe_weight_mapper calls module.load_weights(weights=..., allow_partial_loading=...) on models using the CuteDSL or ConfigurableMoE backend (e.g., Qwen3 MoE).
  • Fix: match the base class signature load_weights(self, weights, allow_partial_loading=False) in both overrides and pass through to super() / backend.
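The mismatch and fix described above can be sketched with stand-in classes. This is a minimal illustration of the signature problem, not the real TensorRT-LLM code; the class names mirror the PR but the bodies are placeholders:

```python
# Minimal sketch of the bug and the fix described above. These stand-in classes
# only model the method signatures; the real implementations live in
# tensorrt_llm/_torch/modules/fused_moe and do actual weight loading.
from typing import Dict, List


class BaseMoE:
    def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):
        return ("base", allow_partial_loading)


class BrokenCuteDslFusedMoE(BaseMoE):
    # Pre-fix override: the keyword parameter was dropped, so any caller that
    # passes allow_partial_loading= by name raises TypeError.
    def load_weights(self, weights: List[Dict]):
        return super().load_weights(weights)


class FixedCuteDslFusedMoE(BaseMoE):
    # Post-fix override: matches the base signature and forwards the flag.
    def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):
        return super().load_weights(weights, allow_partial_loading)


weights: List[Dict] = [{}]
try:
    BrokenCuteDslFusedMoE().load_weights(weights=weights, allow_partial_loading=True)
except TypeError as exc:
    print(exc)  # message names the unexpected keyword 'allow_partial_loading'

print(FixedCuteDslFusedMoE().load_weights(weights=weights, allow_partial_loading=True))
# ('base', True)
```

This is the same failure mode qwen2_moe_weight_mapper hits: it calls load_weights with the keyword argument, so any override whose signature omits the parameter breaks the call chain.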

Test plan

  • Verify Qwen3 MoE model loads without TypeError with CuteDSL backend
  • Verify existing MoE backends (Cutlass, Triton, etc.) still work

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added optional allow_partial_loading parameter to weight loading in Mixture of Experts modules, enabling flexible control over partial weight loading behavior.

…nfigurableMoE load_weights

PR NVIDIA#12136 (DWDP) added a load_weights override in CuteDslFusedMoE that
dropped the allow_partial_loading parameter from the base class
signature. ConfigurableMoE.load_weights also lacked this parameter.
This causes TypeError when qwen2_moe_weight_mapper calls
module.load_weights(weights=..., allow_partial_loading=...) on models
using the CuteDSL or ConfigurableMoE backend (e.g., Qwen3 MoE).

Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
@qiaoxj07 qiaoxj07 requested a review from a team as a code owner April 4, 2026 09:49
@qiaoxj07 qiaoxj07 requested a review from HuiGao-NV April 4, 2026 09:49
@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 4, 2026

/bot run --disable-fail-fast

@coderabbitai
Contributor

coderabbitai bot commented Apr 4, 2026

📝 Walkthrough

Walkthrough

This PR introduces an optional allow_partial_loading parameter to the load_weights method across two MoE-related modules, enabling callers to control partial weight loading behavior. The parameter is propagated through the inheritance chain to backend implementations.

Changes

Cohort / File(s) Summary
ConfigurableMoE Enhancement
tensorrt_llm/_torch/modules/fused_moe/configurable_moe.py
Added optional allow_partial_loading: bool = False parameter to load_weights() method and forwarded it to the backend via self.backend.load_weights(weights, allow_partial_loading).
CuteDslFusedMoE Signature Update
tensorrt_llm/_torch/modules/fused_moe/fused_moe_cute_dsl.py
Changed weights parameter type from Dict[str, torch.Tensor] to List[Dict], added optional allow_partial_loading: bool = False parameter, and forwarded both to the base class via super().load_weights(weights, allow_partial_loading).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 60.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: the title accurately describes the main change, adding the missing allow_partial_loading parameter to the CuteDSL and ConfigurableMoE load_weights methods.
  • Description check ✅ Passed: the PR description clearly explains the issue, root cause, solution, and test plan, though some template sections are not filled in (the Test Coverage section lacks detail).



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Inline comments:

In `tensorrt_llm/_torch/modules/fused_moe/configurable_moe.py`:
  • Around lines 1240-1242: run the repository formatter (ruff/black/ruff-format, as used in CI) on the load_weights signature ("def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):") so the function header is line-wrapped to match project style, then commit the result so the ruff-format CI check passes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dc295e8b-7075-434c-b821-87712fe277d3

📥 Commits

Reviewing files that changed from the base of the PR and between fd7cc85 and c81239f.

📒 Files selected for processing (2)
  • tensorrt_llm/_torch/modules/fused_moe/configurable_moe.py
  • tensorrt_llm/_torch/modules/fused_moe/fused_moe_cute_dsl.py

@tensorrt-cicd
Collaborator

PR_Github #41815 [ run ] triggered by Bot. Commit: c81239f

@tensorrt-cicd
Collaborator

PR_Github #41815 [ run ] completed with state DISABLED
CI server is currently disabled for scheduled maintenance. Estimated completion time: 9 PM PST on 4/4.


@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 4, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41818 [ run ] triggered by Bot. Commit: c81239f

@tensorrt-cicd
Collaborator

PR_Github #41818 [ run ] completed with state DISABLED
CI server is currently disabled for scheduled maintenance. Estimated completion time: 9 PM PST on 4/4.


Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 5, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41850 [ run ] triggered by Bot. Commit: 5d4843f

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #41850 [ run ] completed with state SUCCESS. Commit: 5d4843f
/LLM/main/L0_MergeRequest_PR pipeline #32718 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again


@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 5, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41882 [ run ] triggered by Bot. Commit: 5d4843f

@tensorrt-cicd
Collaborator

PR_Github #41882 [ run ] completed with state SUCCESS. Commit: 5d4843f
/LLM/main/L0_MergeRequest_PR pipeline #32747 completed with status: 'SUCCESS'

CI Report


@qiaoxj07 qiaoxj07 requested review from QiJune and xxi-nv April 6, 2026 02:06
@VALLIS-NERIA
Collaborator

please link to https://nvbugspro.nvidia.com/bug/6051275

@qiaoxj07 qiaoxj07 enabled auto-merge (squash) April 8, 2026 01:37
@qiaoxj07 qiaoxj07 merged commit eb9ffd5 into NVIDIA:main Apr 8, 2026
5 checks passed
suyoggupta pushed a commit to nv-auto-deploy/TensorRT-LLM that referenced this pull request Apr 8, 2026
…nfigurableMoE load_weights (NVIDIA#12761)

Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>


4 participants