Commit eb3e6ed
[fix][5875912] Fix autoquant-autodeploy example (#878)
## What does this PR do?
**Type of change:** Bug fix <!-- Use one of the following: Bug fix, new
feature, new example, new tests, documentation. -->
**Overview:** Fixes the autoquant-autodeploy example. See the bug ticket (5875912 in the title) for details.
## Usage
<!-- You can potentially add a usage example below. -->
```python
# Add a code snippet demonstrating how to use this
```
## Testing
Tested with:
```
./scripts/run_auto_quant_and_deploy.sh --hf_ckpt ./models/Qwen/Qwen3-8B --save_quantized_ckpt ./qwen3_8B_autoquant --quant fp8 --effective_bits 10.0
```
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes/No <!--- If No, explain
why. -->
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
Yes/No <!--- Only for new features, API changes, critical bug fixes or
bw breaking changes. -->
## Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
  * Simplified LLM initialization by removing intermediate configuration layer
  * Updated attention backend from triton to flashinfer
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
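
To make the refactor summarized above concrete, here is a minimal, self-contained sketch of what removing an intermediate configuration layer can look like. Every name in it (`HypotheticalConfig`, `HypotheticalLLM`, `attn_backend`, `build_before`, `build_after`) is an illustrative stand-in and not the API touched by this commit; only the backend names `triton` and `flashinfer` come from the summary.

```python
from dataclasses import dataclass


@dataclass
class HypotheticalConfig:
    """Stand-in for an intermediate config object (assumed, not the real API)."""

    model: str
    attn_backend: str = "triton"


class HypotheticalLLM:
    """Stand-in for an LLM wrapper that takes its settings as keyword arguments."""

    def __init__(self, model: str, attn_backend: str = "flashinfer"):
        self.model = model
        self.attn_backend = attn_backend


def build_before(model: str) -> HypotheticalLLM:
    # Old pattern: build a config object first, then unpack it into the LLM.
    cfg = HypotheticalConfig(model=model, attn_backend="triton")
    return HypotheticalLLM(model=cfg.model, attn_backend=cfg.attn_backend)


def build_after(model: str) -> HypotheticalLLM:
    # New pattern: pass settings straight to the constructor, no config layer.
    return HypotheticalLLM(model=model, attn_backend="flashinfer")
```

In this sketch the second form has one fewer object to keep in sync with the constructor's arguments, which is the kind of simplification the summary describes.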
---------
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>

1 parent 10efcb6 commit eb3e6ed
1 file changed, +3 -8 lines changed