
[None][fix] Add missing allow_partial_loading param to CuteDSL and ConfigurableMoE load_weights#12761

Merged
qiaoxj07 merged 2 commits into NVIDIA:main from qiaoxj07:fix/cute-dsl-load-weights-signature
Apr 8, 2026
Conversation

@qiaoxj07
Collaborator

@qiaoxj07 qiaoxj07 commented Apr 4, 2026

Summary

  • PR [None][feat] Add DWDP (Distributed Weight Data Parallelism) support for MoE inference #12136 (DWDP) added a load_weights override in CuteDslFusedMoE that dropped the allow_partial_loading parameter from the base class signature.
  • ConfigurableMoE.load_weights also lacked this parameter.
  • This causes TypeError: CuteDslFusedMoE.load_weights() got an unexpected keyword argument 'allow_partial_loading' when qwen2_moe_weight_mapper calls module.load_weights(weights=..., allow_partial_loading=...) on models using the CuteDSL or ConfigurableMoE backend (e.g., Qwen3 MoE).
  • Fix: match the base class signature load_weights(self, weights, allow_partial_loading=False) in both overrides and pass through to super() / backend.
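The mismatch and fix described above can be sketched with stand-in classes. This is a minimal illustration of the signature problem, not the real TensorRT-LLM code; the class names mirror the PR but the bodies are placeholders:

```python
# Minimal sketch of the bug and the fix described above. These stand-in classes
# only model the method signatures; the real implementations live in
# tensorrt_llm/_torch/modules/fused_moe and do actual weight loading.
from typing import Dict, List


class BaseMoE:
    def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):
        return ("base", allow_partial_loading)


class BrokenCuteDslFusedMoE(BaseMoE):
    # Pre-fix override: the keyword parameter was dropped, so any caller that
    # passes allow_partial_loading= by name raises TypeError.
    def load_weights(self, weights: List[Dict]):
        return super().load_weights(weights)


class FixedCuteDslFusedMoE(BaseMoE):
    # Post-fix override: matches the base signature and forwards the flag.
    def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):
        return super().load_weights(weights, allow_partial_loading)


weights: List[Dict] = [{}]
try:
    BrokenCuteDslFusedMoE().load_weights(weights=weights, allow_partial_loading=True)
except TypeError as exc:
    print(exc)  # message names the unexpected keyword 'allow_partial_loading'

print(FixedCuteDslFusedMoE().load_weights(weights=weights, allow_partial_loading=True))
# ('base', True)
```

This is the same failure mode qwen2_moe_weight_mapper hits: it calls load_weights with the keyword argument, so any override whose signature omits the parameter breaks the call chain.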

Test plan

  • Verify Qwen3 MoE model loads without TypeError with CuteDSL backend
  • Verify existing MoE backends (Cutlass, Triton, etc.) still work

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added optional allow_partial_loading parameter to weight loading in Mixture of Experts modules, enabling flexible control over partial weight loading behavior.

…nfigurableMoE load_weights

PR NVIDIA#12136 (DWDP) added a load_weights override in CuteDslFusedMoE that
dropped the allow_partial_loading parameter from the base class
signature. ConfigurableMoE.load_weights also lacked this parameter.
This causes TypeError when qwen2_moe_weight_mapper calls
module.load_weights(weights=..., allow_partial_loading=...) on models
using the CuteDSL or ConfigurableMoE backend (e.g., Qwen3 MoE).

Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
@qiaoxj07 qiaoxj07 requested a review from a team as a code owner April 4, 2026 09:49
@qiaoxj07 qiaoxj07 requested a review from HuiGao-NV April 4, 2026 09:49
@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 4, 2026

/bot run --disable-fail-fast

@coderabbitai
Contributor

coderabbitai bot commented Apr 4, 2026

📝 Walkthrough

Walkthrough

This PR introduces an optional allow_partial_loading parameter to the load_weights method across two MoE-related modules, enabling callers to control partial weight loading behavior. The parameter is propagated through the inheritance chain to backend implementations.

Changes

Cohort / File(s) Summary
ConfigurableMoE Enhancement
tensorrt_llm/_torch/modules/fused_moe/configurable_moe.py
Added optional allow_partial_loading: bool = False parameter to load_weights() method and forwarded it to the backend via self.backend.load_weights(weights, allow_partial_loading).
CuteDslFusedMoE Signature Update
tensorrt_llm/_torch/modules/fused_moe/fused_moe_cute_dsl.py
Changed weights parameter type from Dict[str, torch.Tensor] to List[Dict], added optional allow_partial_loading: bool = False parameter, and forwarded both to the base class via super().load_weights(weights, allow_partial_loading).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 60.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: the title accurately describes the main change, adding the missing allow_partial_loading parameter to the CuteDSL and ConfigurableMoE load_weights methods.
  • Description check ✅ Passed: the PR description clearly explains the issue, root cause, solution, and test plan, though some template sections are not filled in (the Test Coverage section lacks detail).



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Inline comments:

In `tensorrt_llm/_torch/modules/fused_moe/configurable_moe.py`:
  • Around lines 1240-1242: run the repository formatter (ruff/black/ruff-format, as used in CI) on the load_weights signature ("def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):") so the function header is line-wrapped to match project style, then commit the result so the ruff-format CI check passes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dc295e8b-7075-434c-b821-87712fe277d3

📥 Commits

Reviewing files that changed from the base of the PR and between fd7cc85 and c81239f.

📒 Files selected for processing (2)
  • tensorrt_llm/_torch/modules/fused_moe/configurable_moe.py
  • tensorrt_llm/_torch/modules/fused_moe/fused_moe_cute_dsl.py

@tensorrt-cicd
Collaborator

PR_Github #41815 [ run ] triggered by Bot. Commit: c81239f

@tensorrt-cicd
Collaborator

PR_Github #41815 [ run ] completed with state DISABLED
CI server is currently disabled for scheduled maintenance. Estimated completion time: 9 PM PST on 4/4.


@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 4, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41818 [ run ] triggered by Bot. Commit: c81239f

@tensorrt-cicd
Collaborator

PR_Github #41818 [ run ] completed with state DISABLED
CI server is currently disabled for scheduled maintenance. Estimated completion time: 9 PM PST on 4/4.


Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 5, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41850 [ run ] triggered by Bot. Commit: 5d4843f

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #41850 [ run ] completed with state SUCCESS. Commit: 5d4843f
/LLM/main/L0_MergeRequest_PR pipeline #32718 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again


@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 5, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41882 [ run ] triggered by Bot. Commit: 5d4843f

@tensorrt-cicd
Collaborator

PR_Github #41882 [ run ] completed with state SUCCESS. Commit: 5d4843f
/LLM/main/L0_MergeRequest_PR pipeline #32747 completed with status: 'SUCCESS'

CI Report


@qiaoxj07 qiaoxj07 requested review from QiJune and xxi-nv April 6, 2026 02:06
@VALLIS-NERIA
Collaborator

please link to https://nvbugspro.nvidia.com/bug/6051275

@qiaoxj07 qiaoxj07 enabled auto-merge (squash) April 8, 2026 01:37
@qiaoxj07 qiaoxj07 merged commit eb9ffd5 into NVIDIA:main Apr 8, 2026
5 checks passed
suyoggupta pushed a commit to nv-auto-deploy/TensorRT-LLM that referenced this pull request Apr 8, 2026
…nfigurableMoE load_weights (NVIDIA#12761)

Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>


4 participants