[None][fix] Add missing allow_partial_loading param to CuteDSL and ConfigurableMoE load_weights #12761
Conversation
[None][fix] Add missing allow_partial_loading param to CuteDSL and ConfigurableMoE load_weights

PR NVIDIA#12136 (DWDP) added a `load_weights` override in `CuteDslFusedMoE` that dropped the `allow_partial_loading` parameter from the base class signature. `ConfigurableMoE.load_weights` also lacked this parameter. This causes a `TypeError` when `qwen2_moe_weight_mapper` calls `module.load_weights(weights=..., allow_partial_loading=...)` on models using the CuteDSL or ConfigurableMoE backend (e.g., Qwen3 MoE).

Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
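The failure mode described above can be reproduced with a minimal sketch. The class names below are hypothetical stand-ins for the real modules; the point is that an override which drops a keyword parameter from the base-class signature raises `TypeError` for any caller that passes that parameter by name.

```python
# Minimal sketch of the bug (hypothetical stand-in classes, not the real
# TensorRT-LLM modules): a subclass override drops a base-class keyword.

class MoEBase:
    def load_weights(self, weights, allow_partial_loading=False):
        return ("base", allow_partial_loading)


class CuteDslLike(MoEBase):
    # Buggy override: `allow_partial_loading` was dropped from the signature.
    def load_weights(self, weights):
        return super().load_weights(weights)


moe = CuteDslLike()
try:
    # Mirrors the weight-mapper call site: the flag is passed by keyword.
    moe.load_weights(weights=[], allow_partial_loading=True)
except TypeError as exc:
    print("TypeError:", exc)
```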
/bot run --disable-fail-fast
📝 Walkthrough

This PR introduces an optional `allow_partial_loading` parameter.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tensorrt_llm/_torch/modules/fused_moe/configurable_moe.py`:
- Around lines 1240-1242: Run the repository formatter (ruff/black/ruff-format, as used in CI) and commit the reformatted signature for `ConfigurableMoE.load_weights` so the function header is line-wrapped to match project style. Specifically, reflow the signature `def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):` with the project's formatter and commit the result so the ruff-format CI check passes.
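As an illustration of the review ask, the reflowed header would look roughly like the sketch below. This is a hypothetical rendering of formatter output (formatters such as black/ruff-format wrap long `def` headers one parameter per line with a trailing comma), not the committed code.

```python
# Hypothetical illustration of the requested reflow (stand-in class name).
from typing import Dict, List


class ConfigurableMoELike:
    # Before: def load_weights(self, weights: List[Dict], allow_partial_loading: bool = False):
    # After running the project formatter, a long header is wrapped:
    def load_weights(
        self,
        weights: List[Dict],
        allow_partial_loading: bool = False,
    ):
        return allow_partial_loading
```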
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: dc295e8b-7075-434c-b821-87712fe277d3
📒 Files selected for processing (2)
tensorrt_llm/_torch/modules/fused_moe/configurable_moe.py
tensorrt_llm/_torch/modules/fused_moe/fused_moe_cute_dsl.py
PR_Github #41815 [ run ] triggered by Bot. Commit:

PR_Github #41815 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #41818 [ run ] triggered by Bot. Commit:

PR_Github #41818 [ run ] completed with state

Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>

/bot run --disable-fail-fast

PR_Github #41850 [ run ] triggered by Bot. Commit:

PR_Github #41850 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #41882 [ run ] triggered by Bot. Commit:

PR_Github #41882 [ run ] completed with state

please link to https://nvbugspro.nvidia.com/bug/6051275
[None][fix] Add missing allow_partial_loading param to CuteDSL and ConfigurableMoE load_weights (NVIDIA#12761) Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
Summary

- PR NVIDIA#12136 (DWDP) added a `load_weights` override in `CuteDslFusedMoE` that dropped the `allow_partial_loading` parameter from the base class signature. `ConfigurableMoE.load_weights` also lacked this parameter.
- This caused `TypeError: CuteDslFusedMoE.load_weights() got an unexpected keyword argument 'allow_partial_loading'` when `qwen2_moe_weight_mapper` calls `module.load_weights(weights=..., allow_partial_loading=...)` on models using the CuteDSL or ConfigurableMoE backend (e.g., Qwen3 MoE).
- Fix: accept `load_weights(self, weights, allow_partial_loading=False)` in both overrides and pass the flag through to `super()` / the backend.

Test plan
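The fix pattern — accept the keyword in the override and forward it — can be sketched as follows. The classes here are hypothetical stand-ins that mirror the described change, not the actual TensorRT-LLM implementations.

```python
# Sketch of the fix (hypothetical stand-in classes): the override accepts
# `allow_partial_loading` and forwards it to the base class instead of
# dropping it from the signature.

class MoEBase:
    def load_weights(self, weights, allow_partial_loading=False):
        return ("base", list(weights), allow_partial_loading)


class CuteDslFusedMoELike(MoEBase):
    def load_weights(self, weights, allow_partial_loading=False):
        # Pass the flag through rather than silently narrowing the signature.
        return super().load_weights(
            weights, allow_partial_loading=allow_partial_loading
        )


result = CuteDslFusedMoELike().load_weights([{}], allow_partial_loading=True)
print(result)  # → ('base', [{}], True)
```

Keeping override signatures in sync with the base class (or using `*args, **kwargs` pass-through) avoids this class of `TypeError` when callers invoke the method polymorphically by keyword.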
🤖 Generated with Claude Code
Summary by CodeRabbit

- Added an `allow_partial_loading` parameter to weight loading in Mixture of Experts modules, enabling flexible control over partial weight loading behavior.