Commit 82cf851
[2/N] PTQ skill change for transformers 5.0 (#1229)
### What does this PR do?
Type of change: Improve
**Summary:**

- Update MoE Pattern 2 for transformers 5.0 unified fused experts (`_QuantFusedExperts` auto-detection)
- Add `PIP_CONSTRAINT` workaround and `PYTHONPATH` guidance for NGC containers
- Add pip error diagnostic tip (`ResolutionImpossible` ≠ network failure)
- Remove duplicated warnings across files; single source of truth per topic

**Changes by file:**

| File | Change |
|---|---|
| references/slurm-setup-ptq.md | Container dependency section: PYTHONPATH preferred, PIP_CONSTRAINT workaround, --no-deps fallback |
| references/unsupported-models.md | MoE Pattern 2 updated for transformers 5.0 auto-detection. Pip install advice points to slurm-setup-ptq.md. Pip error diagnostic added |
| SKILL.md | Common Pitfalls simplified: warnings point to references instead of duplicating |

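
The container dependency options in the table can be illustrated with a small sketch. This is not taken from `slurm-setup-ptq.md`; it is one plausible reading of "PYTHONPATH preferred, PIP_CONSTRAINT workaround". The helper name `build_job_env` and the `source_dir` argument are hypothetical. The only fact relied on is that pip treats the `PIP_CONSTRAINT` environment variable as its `--constraint` option.

```python
import os


def build_job_env(source_dir, base_env=None, drop_pip_constraint=False):
    """Return a copy of the environment for a process run inside a container.

    source_dir is a hypothetical path to a source checkout that should take
    precedence over the copy already installed in the image.
    """
    env = dict(os.environ if base_env is None else base_env)

    # Preferred route: put the checkout first on PYTHONPATH, so no pip
    # install is needed at all.
    existing = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = source_dir + (os.pathsep + existing if existing else "")

    # Workaround route: pip reads PIP_CONSTRAINT as --constraint, so removing
    # it from the child environment lifts the pins for that process only.
    if drop_pip_constraint:
        env.pop("PIP_CONSTRAINT", None)
    return env
```

The returned dictionary could be passed as `env=` to `subprocess.run`, which keeps the change scoped to a single command instead of altering the container's global environment.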
### Usage
### Testing
Tested on gemma4 dense and MoE models.
### Before your PR is "*Ready for review*"
Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)
and your commits are signed (`git commit -s -S`).
Make sure you read and follow the [Security Best
Practices](https://github.com/NVIDIA/Model-Optimizer/blob/main/SECURITY.md#security-coding-practices-for-contributors)
(e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(...,
weights_only=False)`, `pickle`, etc.).
- Is this change backward compatible?: ✅ / ❌ / N/A
- If you copied code from any other sources or added a new PIP dependency, did you follow guidance in `CONTRIBUTING.md`: ✅ / ❌ / N/A
- Did you write any new necessary tests?: ✅ / ❌ / N/A
- Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?: ✅ / ❌ / N/A
### Additional Information
## Summary by CodeRabbit
* **Documentation**
  * Clarified Transformers-version checks (prefer config.json) and warned container upgrades can be blocked by PIP_CONSTRAINT; added pointer to remediation.
  * Shortened Docker/NFS guidance by cross-referencing setup docs instead of explicit commands.
  * Reworked SLURM/container workflow to prefer existing images and add an import → pull fallback.
  * Added in-job dependency remediation steps and clarified MoE auto-detection differences and pip conflict troubleshooting.
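
Two of the checks mentioned above lend themselves to a minimal sketch, under stated assumptions. Hugging Face checkpoints normally record the library version they were saved with in the `transformers_version` field of `config.json`; the first helper reads that field. The second helper separates a `ResolutionImpossible` dependency conflict from a network failure. Both function names are hypothetical, and the network marker strings are illustrative guesses, not an exhaustive or official list.

```python
import json
from pathlib import Path


def transformers_major_from_config(model_dir):
    """Return the major transformers version recorded in config.json, or None."""
    config = json.loads((Path(model_dir) / "config.json").read_text())
    version = config.get("transformers_version")
    if not version:
        return None
    head = version.split(".")[0]
    return int(head) if head.isdigit() else None


def classify_pip_failure(output):
    """Roughly classify captured pip output (heuristic, assumed markers)."""
    # ResolutionImpossible means the resolver found conflicting requirements;
    # retrying or changing mirrors will not help.
    if "ResolutionImpossible" in output:
        return "dependency-conflict"
    # Illustrative substrings that usually indicate a connectivity problem.
    network_markers = (
        "Temporary failure in name resolution",
        "Connection refused",
        "Read timed out",
    )
    if any(marker in output for marker in network_markers):
        return "network"
    return "unknown"
```

Note that `transformers_version` describes the version used to save the checkpoint, not the version installed in the container, so it is a hint about what the model needs and not a substitute for checking the environment.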
---------
Signed-off-by: Meng Xin <mxin@nvidia.com>

1 parent 9050188 commit 82cf851
File tree: 3 files changed, +51 -28 lines changed
- .claude/skills/ptq
- references