Commit 02fa362
authored
Sync MOE layer input quantizer only (#903)
## What does this PR do?
**Type of change:** Bug fix
**Overview:** in MOE layer we currently sync both the weight and input
quantizers so that all experts have the same weight amaxes and
activation amaxes.
VLLM/TRTLLM actually support non-uniform weight amaxes in MOE so we only
need to sync the activation amaxes.
## Usage
<!-- You can potentially add a usage example below. -->
```python
# Add a code snippet demonstrating how to use this
```
## Testing
<!-- Mention how have you tested your change if applicable. -->
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes/No <!--- If No, explain
why. -->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved input quantizer synchronization for Mixture of Experts models
to ensure correct amax value handling across local experts.
* **Documentation**
* Fixed typos and clarified wording in quantization documentation.
* **Tests**
* Added test coverage for Mixture of Experts quantizer synchronization
functionality.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Jennifer Chen <jennifchen@nvidia.com>1 parent 70ffb6f commit 02fa362
File tree
3 files changed
+93
-15
lines changed- modelopt/torch/quantization
- plugins
- tests/gpu_megatron/torch/quantization/plugins
3 files changed
+93
-15
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
96 | 96 | | |
97 | 97 | | |
98 | 98 | | |
99 | | - | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
100 | 104 | | |
101 | 105 | | |
102 | 106 | | |
103 | 107 | | |
104 | 108 | | |
105 | 109 | | |
| 110 | + | |
106 | 111 | | |
107 | 112 | | |
108 | 113 | | |
| |||
114 | 119 | | |
115 | 120 | | |
116 | 121 | | |
117 | | - | |
| 122 | + | |
118 | 123 | | |
119 | 124 | | |
120 | 125 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
575 | 575 | | |
576 | 576 | | |
577 | 577 | | |
578 | | - | |
| 578 | + | |
579 | 579 | | |
580 | | - | |
581 | | - | |
| 580 | + | |
| 581 | + | |
582 | 582 | | |
583 | 583 | | |
584 | 584 | | |
585 | 585 | | |
586 | 586 | | |
587 | 587 | | |
588 | 588 | | |
589 | | - | |
590 | | - | |
| 589 | + | |
| 590 | + | |
591 | 591 | | |
592 | 592 | | |
593 | 593 | | |
594 | 594 | | |
595 | 595 | | |
596 | 596 | | |
597 | | - | |
| 597 | + | |
| 598 | + | |
| 599 | + | |
| 600 | + | |
| 601 | + | |
598 | 602 | | |
599 | 603 | | |
600 | 604 | | |
| |||
Lines changed: 76 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
469 | 469 | | |
470 | 470 | | |
471 | 471 | | |
472 | | - | |
473 | | - | |
474 | | - | |
475 | | - | |
| 472 | + | |
476 | 473 | | |
477 | 474 | | |
478 | 475 | | |
| |||
731 | 728 | | |
732 | 729 | | |
733 | 730 | | |
| 731 | + | |
| 732 | + | |
| 733 | + | |
| 734 | + | |
| 735 | + | |
| 736 | + | |
| 737 | + | |
| 738 | + | |
| 739 | + | |
| 740 | + | |
| 741 | + | |
| 742 | + | |
| 743 | + | |
| 744 | + | |
| 745 | + | |
| 746 | + | |
| 747 | + | |
| 748 | + | |
| 749 | + | |
| 750 | + | |
| 751 | + | |
| 752 | + | |
| 753 | + | |
| 754 | + | |
| 755 | + | |
| 756 | + | |
| 757 | + | |
| 758 | + | |
| 759 | + | |
| 760 | + | |
| 761 | + | |
| 762 | + | |
| 763 | + | |
| 764 | + | |
| 765 | + | |
| 766 | + | |
| 767 | + | |
| 768 | + | |
| 769 | + | |
| 770 | + | |
| 771 | + | |
| 772 | + | |
| 773 | + | |
| 774 | + | |
| 775 | + | |
| 776 | + | |
| 777 | + | |
| 778 | + | |
| 779 | + | |
| 780 | + | |
| 781 | + | |
| 782 | + | |
| 783 | + | |
| 784 | + | |
| 785 | + | |
| 786 | + | |
| 787 | + | |
| 788 | + | |
| 789 | + | |
| 790 | + | |
| 791 | + | |
| 792 | + | |
| 793 | + | |
| 794 | + | |
| 795 | + | |
| 796 | + | |
| 797 | + | |
| 798 | + | |
| 799 | + | |
| 800 | + | |
| 801 | + | |
| 802 | + | |
| 803 | + | |
| 804 | + | |
| 805 | + | |
734 | 806 | | |
735 | 807 | | |
736 | 808 | | |
| |||
811 | 883 | | |
812 | 884 | | |
813 | 885 | | |
814 | | - | |
815 | | - | |
816 | | - | |
817 | 886 | | |
818 | 887 | | |
819 | 888 | | |
| |||
0 commit comments