Commit 8eec6d4
Support EP mcore import for TE Spec and Fix mamba moe config (#1342)
### What does this PR do?
Type of change: Bug fix
- Enable EP (expert parallelism) import from HF to MCore when using the TE
spec
- Fix a bug in the Mamba MoE config that failed to skip attention layers
in MCore (MCore names attention layers differently from HF)
- Add a getter for the quant config (used in the MLM modelopt examples to
read quant cfg fields)
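To illustrate the attention-layer fix and the config getter listed above, here is a minimal sketch, not the code from this PR. It assumes quantizer settings are keyed by wildcard patterns matched against module names, that HF names attention modules `self_attn` while MCore names them `self_attention`, and it uses hypothetical helpers `get_cfg_entry` and `is_quantizer_enabled`. None of these names are confirmed by the commit message.

```python
from fnmatch import fnmatch

# Hypothetical pattern -> settings mapping; the real quant config layout may differ.
QUANT_CFG = {
    "*weight_quantizer": {"num_bits": 8, "enable": True},
    "*self_attn*": {"enable": False},       # HF-style attention naming (assumed)
    "*self_attention*": {"enable": False},  # MCore-style attention naming (assumed)
}


def get_cfg_entry(quant_cfg, pattern, default=None):
    """Hypothetical getter: return the settings stored under an exact pattern key."""
    return quant_cfg.get(pattern, default)


def is_quantizer_enabled(quant_cfg, module_name):
    """Apply every matching pattern in order; later matches override earlier ones."""
    enabled = False
    for pattern, settings in quant_cfg.items():
        if fnmatch(module_name, pattern):
            enabled = settings.get("enable", enabled)
    return enabled
```

Under these assumptions, the HF-style pattern `*self_attn*` does not match a name containing `self_attention`, so a config that lists only the HF pattern would leave MCore attention layers quantized.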
### Usage
```shell
# In Megatron-LM/examples/post_training/modelopt
MLM_EXTRA_ARGS="--export-default-te-spec --trust-remote-code --moe-router-dtype fp32" \
  EP=4 HF_MODEL_CKPT=</path/to/hf> MLM_MODEL_SAVE=<save/path> \
  ./convert.sh nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
```
### Testing
<!-- Mention how have you tested your change if applicable. -->
### Before your PR is "*Ready for review*"
Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).
Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors)
(e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(...,
weights_only=False)`, `pickle`, etc.).
- Is this change backward compatible?: ✅ / ❌ / N/A <!--- If ❌, explain
why. -->
- If you copied code from any other sources or added a new PIP
dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ / ❌ / N/A
<!--- Mandatory -->
- Did you write any new necessary tests?: ✅ / ❌ / N/A <!--- Mandatory
for new features or examples. -->
- Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?:
✅ / ❌ / N/A <!--- Only for new features, API changes, critical bug fixes
or backward incompatible changes. -->
### Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Corrected expert-slice assignment so each expert-parallel rank loads
the proper expert slice.
* Improved detection of pipeline-parallel layer indices in submodule
names.
* **Improvements**
* Relaxed constraints between local and global expert counts for
grouped-local-expert imports.
* Added typed helpers for managing quantization configuration entries
and expanded quantizer disable patterns.
* Exporter now accepts an additional hybrid model type when available.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
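The expert-slice and layer-index items in the summary can be pictured with a small sketch. It assumes experts are split evenly and contiguously across expert-parallel ranks and that layer indices appear as `layers.<n>` in dotted submodule names. `local_expert_ids` and `layer_index` are hypothetical helpers written for illustration, not functions from this PR.

```python
import re


def local_expert_ids(num_experts, ep_size, ep_rank):
    """Return the global expert ids owned by one EP rank (even, contiguous split assumed)."""
    if num_experts % ep_size != 0:
        raise ValueError("num_experts must be divisible by ep_size")
    if not 0 <= ep_rank < ep_size:
        raise ValueError("ep_rank out of range")
    per_rank = num_experts // ep_size
    start = ep_rank * per_rank
    return list(range(start, start + per_rank))


# Matches "layers.<n>" as a whole dotted component of a submodule name.
_LAYER_RE = re.compile(r"(?:^|\.)layers\.(\d+)(?:\.|$)")


def layer_index(submodule_name):
    """Extract the layer index from a dotted submodule name, or None if absent."""
    match = _LAYER_RE.search(submodule_name)
    return int(match.group(1)) if match else None
```

With 8 experts and `EP=4`, for example, `local_expert_ids(8, 4, 1)` returns `[2, 3]`, so rank 1 would load only those two experts' weights.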
---------
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>
1 parent 6d33078 commit 8eec6d4
3 files changed
Lines changed: 32 additions & 11 deletions
File tree
- modelopt/torch
  - export
    - plugins
  - quantization
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | 42 | + 1 line |
| 297-299 | 298-304 | - 3 lines, + 7 lines |
| 656-657 | 661-663 | - 2 lines, + 3 lines |
| 659 | 665-667 | - 1 line, + 3 lines |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | 75-79 | + 5 lines |
| 124 | 129 | - 1 line, + 1 line |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 239-242 | 239-250 | - 4 lines, + 12 lines |