[For RL] Keep attrs after folding weight and fix empty extra state for Megatron (#779)
## What does this PR do?
**Type of change:** improvement
**Overview:**
- For quantization-aware reinforcement learning, keep the quantization
attrs after folding the rollout model's weights, so they are available
for the next step.
- Minor fix for empty extra state in Megatron.
- Support building a dataloader from a JSONL file, which is useful for
using training data as calibration data. I can separate this into
another PR if necessary.
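The JSONL loading added here can be sketched roughly as follows; `iter_jsonl` and its details are illustrative, not the PR's actual helper:

```python
import gzip
import json


def iter_jsonl(path):
    """Yield one parsed record per non-empty line of a .jsonl or .jsonl.gz file."""
    # Transparently handle gzip-compressed files by extension.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Records produced this way can then be batched into whatever dataloader the calibration loop expects.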
## Usage
`mtq.fold_weight(keep_attrs=True)` keeps the quantizer attrs after
folding weights.
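The intended `keep_attrs` semantics can be illustrated with a toy sketch. This is not the real `mtq.fold_weight`, which operates on quantized PyTorch modules; the `ToyQuantLinear` class, the attribute names, and the fixed scale below are made up for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class ToyQuantLinear:
    """Stand-in for a quantized layer: a weight plus quantizer attributes."""
    weight: float
    quant_attrs: dict = field(default_factory=lambda: {"num_bits": 4, "axis": 0})


def fold_weight(layer, keep_attrs=False):
    """Fold a (fake) quantization scale into the weight.

    With keep_attrs=True, the quantizer attributes survive the fold so a
    later step (e.g. the next RL iteration) can re-quantize the layer.
    """
    scale = 0.5  # pretend this came from the layer's weight quantizer
    layer.weight = layer.weight * scale
    if not keep_attrs:
        layer.quant_attrs = {}  # default behavior: attrs are dropped
    return layer
```

The point of the flag is that folding is no longer a one-way door: with the attrs preserved, quantization state can be rebuilt after the rollout.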
## Testing
<!-- Mention how have you tested your change if applicable. -->
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes <!--- If No, explain why.
-->
- **Did you write any new necessary tests?**: NA
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
No <!--- Only for new features, API changes, critical bug fixes or bw
breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
## Summary by CodeRabbit
* **New Features**
* Added support for loading dataset samples directly from JSONL/JSONL.GZ
files
* Added optional parameter to skip logits return in generation prefill
operations
* Enhanced weight folding operations to optionally preserve quantization
attributes during model optimization
* **Bug Fixes**
* Fixed handling of empty tensor states to prevent deserialization
errors in Megatron module
---------
Signed-off-by: Meng Xin <mxin@nvidia.com>