Commit 9c24e2c
Fix Deepseek transformers model loading (#740)
## What does this PR do?
**Type of change:** Bug fix

**Overview:** For Deepseek models, require the user to pass `trust_remote_code` and use `AutoModelForCausalLM` to load the model.
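As a rough sketch of the behavior described above (the helper names `is_deepseek_ckpt` and `build_load_kwargs` are illustrative, not the repo's actual functions), loading could be guarded like this:

```python
# Hedged sketch: refuse to load Deepseek-family checkpoints unless the
# caller opted into trust_remote_code, since they ship custom modeling code.

def is_deepseek_ckpt(ckpt_path: str) -> bool:
    # Crude path-based heuristic; a real check might inspect config.json instead.
    name = ckpt_path.lower()
    return "deepseek" in name or "kimi" in name

def build_load_kwargs(ckpt_path: str, trust_remote_code: bool) -> dict:
    # Force the user to acknowledge remote code for Deepseek checkpoints.
    if is_deepseek_ckpt(ckpt_path) and not trust_remote_code:
        raise ValueError(
            "Deepseek models require custom modeling code; "
            "re-run with --trust_remote_code"
        )
    return {"trust_remote_code": trust_remote_code}

# The actual load would then dispatch through AutoModelForCausalLM, e.g.:
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained(
#       ckpt_path, **build_load_kwargs(ckpt_path, trust_remote_code)
#   )
```

With this guard in place, a Deepseek path without the flag fails fast with a clear error instead of a confusing loading failure deeper in `transformers`.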
## Testing
```shell
python hf_ptq.py --pyt_ckpt_path <Kimi-K2-Thinking_path> --qformat nvfp4 \
    --export_path <quantized_ckpt> --kv_cache_qformat none --calib_size 64 \
    --trust_remote_code --dataset cnn_dailymail
```
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes/No <!--- If No, explain
why. -->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>

1 parent 68d604d
1 file changed: +6 −5 lines (original lines 352–356 replaced by new lines 352–357; diff content not preserved in this capture)