
Commit 6b0ea4d

Fix MOE layer sync test (#936)
## What does this PR do?

**Type of change:** Bug Fix

**Overview:** Fix the MOE layer sync test by initializing weights differently across experts in the MOE layer, so that experts do not all end up with similar amax values.

[Link to bug](https://github.com/NVIDIA/Model-Optimizer/actions/runs/22288772586/job/64472124733#step:7:851)

## Testing

https://github.com/NVIDIA/Model-Optimizer/actions/runs/22414958311

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes/No
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No

## Summary by CodeRabbit

## Release Notes

* **Tests**
  * Enhanced quantization testing with improved expert weight initialization patterns for expert-based models.

Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
1 parent 4eacb0d commit 6b0ea4d

1 file changed

Lines changed: 6 additions & 0 deletions

File tree

tests/gpu_megatron/torch/quantization/plugins/test_megatron.py

```diff
@@ -765,6 +765,12 @@ def _test_layer_sync_moe_local_experts_amax(ep_size, moe_grouped_gemm, rank, siz
         num_moe_experts=8,
         transformer_impl="modelopt",
     )
+    # Make weight initialization different across experts, otherwise experts will have similar amax values
+    for layer in model.decoder.layers:
+        for i, expert in enumerate(layer.mlp.experts.local_experts):
+            expert.linear_fc1.weight.data.fill_(0.1 + i * 0.05)
+            expert.linear_fc2.weight.data.fill_(0.2 + i * 0.05)
+
     quant_cfg = mtq.FP8_DEFAULT_CFG
     model = mtq.quantize(model, quant_cfg, get_forward(model))
```
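The idea behind the fix can be illustrated without any framework: amax is the maximum absolute value of a weight tensor, so if every expert is initialized identically, every expert reports the same amax and the sync test cannot distinguish them. The sketch below is a hypothetical, dependency-free illustration (plain Python lists standing in for weight tensors; `amax` is a helper defined here, not the ModelOpt API), mirroring the `0.1 + i * 0.05` staggering used in the patch:

```python
def amax(weights):
    """Max absolute value of a flat list of weights (toy stand-in for a tensor amax)."""
    return max(abs(w) for w in weights)

num_experts = 8

# Identical initialization: every expert has the same amax, so a sync test
# that compares amax across experts cannot tell whether sync worked.
same = [[0.1] * 4 for _ in range(num_experts)]
assert len({amax(w) for w in same}) == 1

# Staggered initialization, as in the fix: each expert i is filled with
# 0.1 + i * 0.05, so all experts have distinct amax values.
staggered = [[0.1 + i * 0.05] * 4 for i in range(num_experts)]
assert len({amax(w) for w in staggered}) == num_experts
```

With distinct per-expert amax values before quantization, the test can meaningfully verify that amax synchronization across experts actually changes them.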
