Conversation
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
📝 **Walkthrough**

The change adds per-expert weight initialization in a test file to diversify amax values across local experts by assigning unique weight constants to each expert's linear layers prior to quantization.
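The reason unique constants matter can be sketched outside of Megatron: if every expert's weights are filled with the same value, every expert calibrates to the same amax, while distinct per-expert constants give each expert a distinct amax. The snippet below is an illustrative stand-in using plain tensors and a hand-rolled amax, not the actual Megatron/ModelOpt calibration code.

```python
# Hedged sketch: why distinct per-expert constants diversify amax values.
# "amax" here is the per-tensor max-absolute value used by quantization
# calibration; the real Megatron/ModelOpt plumbing is assumed, not shown.
import torch

identical = [torch.full((8, 4), 0.1) for _ in range(3)]
distinct = [torch.full((8, 4), 0.1 + i * 0.05) for i in range(3)]

amax = lambda w: w.abs().max().item()
print([round(amax(w), 2) for w in identical])  # all experts share one amax
print([round(amax(w), 2) for w in distinct])   # each expert gets a unique amax
```

With identical fills all three experts report the same amax (0.1), while the staggered fills yield 0.1, 0.15, and 0.2, which is what the test relies on to detect sync issues.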
**Estimated code review effort:** 🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 **Pre-merge checks:** ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
🧹 Nitpick comments (1)
tests/gpu_megatron/torch/quantization/plugins/test_megatron.py (1)
**771-772: Prefer `torch.no_grad()` over `.data` for test initialization.**

Lines 771–772 use `.data.fill_()` for weight initialization. The idiomatic PyTorch pattern is to wrap parameter mutations in a `torch.no_grad()` context and call `fill_()` directly on the parameter.

♻️ Proposed fix

```diff
-    for layer in model.decoder.layers:
-        for i, expert in enumerate(layer.mlp.experts.local_experts):
-            expert.linear_fc1.weight.data.fill_(0.1 + i * 0.05)
-            expert.linear_fc2.weight.data.fill_(0.2 + i * 0.05)
+    with torch.no_grad():
+        for layer in model.decoder.layers:
+            for i, expert in enumerate(layer.mlp.experts.local_experts):
+                expert.linear_fc1.weight.fill_(0.1 + i * 0.05)
+                expert.linear_fc2.weight.fill_(0.2 + i * 0.05)
```

🤖 Prompt for AI Agents

```
Verify each finding against the current code and only fix it if needed. In
tests/gpu_megatron/torch/quantization/plugins/test_megatron.py around lines
771-772, the test initializes expert weights using the deprecated .data API;
wrap the two fill_() calls in a torch.no_grad() context and call fill_()
directly on the parameter tensors (expert.linear_fc1.weight and
expert.linear_fc2.weight) so the assignments occur without tracking gradients.
```
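As a self-contained illustration of the suggested pattern, the sketch below uses plain `nn.ModuleDict`/`nn.Linear` modules as stand-ins for Megatron's expert layers; the `linear_fc1`/`linear_fc2` names mirror the test, but the module structure here is invented for the example.

```python
# Minimal sketch of the reviewed pattern, using plain nn.Linear modules as
# stand-ins for Megatron's expert layers (structure is illustrative only).
import torch
import torch.nn as nn

experts = [
    nn.ModuleDict({
        "linear_fc1": nn.Linear(4, 8),
        "linear_fc2": nn.Linear(8, 4),
    })
    for _ in range(3)
]

# Mutate parameters inside no_grad() instead of touching .data, so the
# in-place fills are not tracked by autograd.
with torch.no_grad():
    for i, expert in enumerate(experts):
        expert["linear_fc1"].weight.fill_(0.1 + i * 0.05)
        expert["linear_fc2"].weight.fill_(0.2 + i * 0.05)

# Each expert now carries a distinct constant weight.
print([round(e["linear_fc1"].weight[0, 0].item(), 2) for e in experts])
# → [0.1, 0.15, 0.2]
```

The behavior is the same as the `.data.fill_()` version; the `no_grad()` form is preferred because `.data` bypasses autograd bookkeeping in ways that can hide bugs outside simple tests.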
**Codecov Report**

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #936      +/-   ##
==========================================
- Coverage   73.09%   72.17%   -0.92%
==========================================
  Files         205      207       +2
  Lines       22301    22656     +355
==========================================
+ Hits        16300    16352      +52
- Misses       6001     6304     +303
```

☔ View full report in Codecov by Sentry.
**What does this PR do?**

**Type of change:** Bug Fix

**Overview:** Fix the MOE layer sync test by initializing the weights in the MOE layer differently, so that amax values differ across local experts.
Link to bug
**Usage**

```python
# Add a code snippet demonstrating how to use this
```

**Testing**
https://github.com/NVIDIA/Model-Optimizer/actions/runs/22414958311
Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit
Release Notes