
Commit 5f028d0

Merge branch 'main' into shengliangx/normalize-yaml-ext
2 parents 8e7ddb9 + b6c6ec3 commit 5f028d0

52 files changed

Lines changed: 2155 additions & 825 deletions


.claude/skills/ptq/SKILL.md

Lines changed: 2 additions & 2 deletions
@@ -124,9 +124,9 @@ Report the path and size to the user.
 
 ## Common Pitfalls
 
-- **Transformers version**: Newer models (e.g., Devstral/ministral3) may require a transformers version not yet in the container. Check `config.json` for `transformers_version` and upgrade if needed. Install ModelOpt first, then upgrade transformers **with** deps (not `--no-deps`) to pull compatible `huggingface_hub`
+- **Transformers version**: New models may need a newer version of transformers than what's installed. Check `config.json` for `transformers_version`. In containers, beware of `PIP_CONSTRAINT` blocking upgrades — see `references/slurm-setup-ptq.md` for workarounds
 - **Gated datasets**: Some calibration datasets require HF authentication. Ensure `HF_TOKEN` is set in the job environment, or use `--dataset cnn_dailymail` as a non-gated alternative
-- **NFS root_squash + Docker**: Docker runs as root, but NFS squashes root to `nobody`. Use `docker run --user $(id -u):$(id -g)`, or `chmod -R a+rwX` on needed directories as a fallback. See `skills/common/slurm-setup.md` section 5
+- **NFS root_squash + Docker**: See `skills/common/slurm-setup.md` section 5
 
 ## References
 
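The `transformers_version` check in the first bullet above can be sketched as a small helper (a minimal illustration, not part of the commit; the function name and the crude numeric version parsing are assumptions):

```python
import json

def _vtuple(v: str):
    # Crude numeric comparison key: "4.57.0.dev0" -> (4, 57, 0).
    return tuple(int(p) for p in v.split(".") if p.isdigit())

def needs_transformers_upgrade(config_path: str, installed: str) -> bool:
    """True if the checkpoint's config.json pins a transformers_version
    newer than the installed one (checkpoints without the field pass)."""
    with open(config_path) as f:
        required = json.load(f).get("transformers_version")
    return required is not None and _vtuple(installed) < _vtuple(required)
```

In practice the installed version would come from `importlib.metadata.version("transformers")`.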

.claude/skills/ptq/references/slurm-setup-ptq.md

Lines changed: 36 additions & 18 deletions
@@ -7,29 +7,54 @@ monitoring), see `skills/common/slurm-setup.md`.
 
 ## 1. Container
 
-Get the recommended image version from `examples/llm_ptq/README.md`, then look for a `.sqsh` file in the workspace and common sibling directories:
+Get the recommended image version from `examples/llm_ptq/README.md`, then look for an existing `.sqsh` file:
 
 ```bash
 ls *.sqsh ../*.sqsh ~/containers/*.sqsh 2>/dev/null
 ```
 
-If you find a `.sqsh` but aren't sure of its version, check it:
+**If a `.sqsh` exists**, use it directly with `--container-image=<path>`. Skip import.
+
+**If no `.sqsh` exists**, import with enroot (caches for subsequent smoke tests and reruns):
 
 ```bash
-srun --container-image=<path/to/container.sqsh> --ntasks=1 bash -c \
-  "pip show tensorrt-llm 2>/dev/null | grep Version || cat /VERSION 2>/dev/null || echo unknown"
+export ENROOT_CACHE_PATH=/path/to/writable/enroot-cache
+export ENROOT_DATA_PATH=/path/to/writable/enroot-data
+mkdir -p "$ENROOT_CACHE_PATH" "$ENROOT_DATA_PATH"
+enroot import --output /path/to/container.sqsh docker://nvcr.io#nvidia/tensorrt-llm/release:<version>
 ```
 
-If no `.sqsh` exists, import it with enroot. Set writable cache paths first — the default `/raid/containers` is often not writable:
+If enroot import fails (e.g., permission errors on lustre), use pyxis inline pull as fallback — pass the NGC URI directly to `--container-image="nvcr.io/nvidia/tensorrt-llm/release:<version>"`. Note this re-pulls on every job.
+
+### Container dependency pitfalls
+
+**New models may need newer transformers** than what's in the container:
 
 ```bash
-export ENROOT_CACHE_PATH=/path/to/writable/enroot-cache
-export ENROOT_DATA_PATH=/path/to/writable/enroot-data
-export TMPDIR=/path/to/writable/tmp
-mkdir -p "$ENROOT_CACHE_PATH" "$ENROOT_DATA_PATH" "$TMPDIR"
+pip install -U transformers
+```
+
+For unlisted models that need unreleased transformers (e.g., from git), see `references/unsupported-models.md` Step A.
+
+**Prefer `PYTHONPATH`** to use the synced ModelOpt source instead of installing inside the container — this avoids risking dependency conflicts (e.g., `pip install -U nvidia-modelopt[hf]` can upgrade PyTorch and break other packages):
+
+```bash
+export PYTHONPATH=/path/to/Model-Optimizer:$PYTHONPATH
+```
+
+If `PYTHONPATH` doesn't work due to missing compiled extensions, fall back to `pip install -e ".[hf]" --no-build-isolation` (run from the Model-Optimizer repo root).
+
+**Watch for pip dependency conflicts** — NGC containers set `PIP_CONSTRAINT` to pin versions, causing `ResolutionImpossible` errors. Unset it first so pip can resolve freely:
+
+```bash
+unset PIP_CONSTRAINT
+pip install -U transformers  # now upgrades and resolves with new deps included
+```
 
-enroot import --output /path/to/container.sqsh \
-  docker://nvcr.io#nvidia/tensorrt-llm/release:<version>
+If that still conflicts, fall back to `--no-deps` (skips new deps — may need to add missing ones manually):
+
+```bash
+pip install -U transformers --no-deps
 ```
 
 ---
@@ -68,10 +93,3 @@ This catches script errors cheaply before using GPU quota on a real run.
 See `skills/common/slurm-setup.md` section 2 for the smoke test partition pattern.
 
 Only submit the full calibration job after the smoke test exits cleanly.
-
----
-
-## 4. PTQ-Specific Notes
-
-- **Gated datasets**: Some calibration datasets (e.g., `nvidia/Nemotron-Post-Training-Dataset-v2`) require HF authentication. Set `HF_TOKEN` in the job environment, or use `--dataset cnn_dailymail` to use a non-gated alternative.
-- **NFS permissions**: Docker + NFS root_squash causes `PermissionError` on output/cache dirs. See `skills/common/slurm-setup.md` section 5 for fixes.
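The `PIP_CONSTRAINT` pitfall described in this file can be made concrete with a small inspection helper (a sketch for illustration only; the function name is hypothetical, it assumes the variable points at a local file, and real constraint files may use richer requirements syntax):

```python
import os

def constrained_packages(environ=None):
    """List the version pins imposed by PIP_CONSTRAINT, or [] if unset.

    An empty result means pip is free to resolve; a non-empty one explains
    why `pip install -U transformers` may fail with ResolutionImpossible
    until the variable is unset.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("PIP_CONSTRAINT", "")
    if not path or not os.path.exists(path):
        return []
    with open(path) as f:
        # Constraint files use requirements syntax; skip comments and blanks.
        return [ln.strip() for ln in f if ln.strip() and not ln.lstrip().startswith("#")]
```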

.claude/skills/ptq/references/unsupported-models.md

Lines changed: 13 additions & 8 deletions
@@ -15,7 +15,11 @@ After download, inspect the model files on the target machine (use `remote_run`)
 
 Write custom scripts locally (in `./workspaces/<model>/scripts/`), then sync to remote before running.
 
-**Then check `config.json`** (on the target machine):
+**Check transformers compatibility** (on the target machine):
+
+First, if README or `config.json` specifies a required transformers version, check if installed version satisfies it. If not, upgrade: `pip install -U "transformers>=<required_version>"`.
+
+Then try loading:
 
 ```bash
 python -c "
@@ -40,16 +44,14 @@ print(type(cfg).__name__)
 
 Read the modeling file and proceed to Step B.
 
-- **Raises `ValueError` / `OSError` (unknown architecture)** → not in the installed transformers. Determine why:
-
-  1. **Check the transformers `main` branch** (not yet released):
+- **Raises `ValueError` / `OSError` (unknown architecture)** → not in the installed transformers. Try `pip install -U transformers` first. If still not found, check the `main` branch:
 
 ```bash
 git clone --depth 1 https://github.com/huggingface/transformers.git /tmp/transformers-main --quiet
 grep -r "class <ArchName>" /tmp/transformers-main/src/transformers/models/
 ```
 
-- **Found** → install with deps: `pip install /tmp/transformers-main`, then re-run `AutoConfig.from_pretrained()`. **Important**: if ModelOpt is already installed, its `[hf]` extras may have pinned an older transformers. Install ModelOpt first, then upgrade transformers **after** (with deps, not `--no-deps`) so compatible `huggingface_hub` and other transitive deps are pulled in.
+- **Found** → `pip install /tmp/transformers-main`, then re-run `AutoConfig`.
 - **Not found** → ask the user: *"The checkpoint uses `<ArchName>` which isn't in released or main-branch transformers. Do you have a private fork or custom modeling code?"*
 
 - **No `config.json`** → not a standard HF checkpoint. List the directory for README or `.py` files. If nothing useful, ask the user for the modeling code.
@@ -131,13 +133,15 @@ class QuantCustomModule(OriginalModule):
 
 ## Pattern 2: MoE Models
 
-**Standard MoE** (per-expert `nn.Linear` in a `ModuleList` with `gate` + `experts`): Auto-detected by `register_sparse_moe_on_the_fly`. No custom code needed — amax sync and calibration coverage are handled automatically.
+**Most MoE models are auto-detected** — ModelOpt handles two common patterns automatically:
+
+- **transformers >= 5.0**: Unified fused experts (`gate_up_proj` + `down_proj` 3D tensors) → auto-detected by `register_fused_experts_on_the_fly`, handled by `_QuantFusedExperts`. Covers Mixtral, Qwen, DeepSeek, Jamba, OlMoE, etc.
+- **transformers < 5.0**: Sequential per-expert `nn.Linear` with `gate` + `experts` → auto-detected by `register_sparse_moe_on_the_fly`.
 
-**Custom MoE** requires patching. Read the model source to understand how expert weights are stored and computed, then find the closest pattern in the plugin (`modelopt/torch/quantization/plugins/huggingface.py`):
+**Custom MoE** (non-standard layout not matching auto-detection) requires patching. Find the closest pattern in the plugin (`modelopt/torch/quantization/plugins/huggingface.py`):
 
 | MoE design | Strategy | Plugin example |
 | --- | --- | --- |
-| Fused weights + per-expert dispatch loop | Expand to per-expert `nn.Linear` | `_QuantQwen35MoeExperts` |
 | Fused weights + `torch.bmm` | Add `TensorQuantizer` around bmm | `_QuantLlama4TextExperts` |
 | Fused weights + functional interception | Intercept matmul ops | `_QuantGptOssExperts` |
 | Fused 2D weights (experts stacked in rows) | Two-level expansion | `_QuantDbrxExpertGLU` |
@@ -343,3 +347,4 @@ tokenizer.save_pretrained(output_path)
 - **Check quantizer summary**: `mtq.print_quant_summary(model)` shows which quantizers are enabled/disabled
 - **Inspect dtypes**: After loading, iterate `model.named_parameters()` and check for unexpected FP8 tensors
 - **Watch for silent disabling**: A misconfigured wildcard pattern can silently disable quantizers — always verify the summary
+- **Read pip errors carefully**: `ResolutionImpossible` means dependency conflict (try `--no-deps`), NOT network failure. Check for `Connection refused`/`Name resolution failed` before concluding network is down
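The last debugging bullet above amounts to a simple triage rule, sketched below (the function name and marker list are illustrative assumptions, not pip API):

```python
# Rough triage of a failed `pip install` run, per the bullet above:
# ResolutionImpossible is a dependency conflict, not a network problem.
NETWORK_MARKERS = (
    "Connection refused",
    "Name resolution failed",
    "Temporary failure in name resolution",
)

def classify_pip_failure(stderr: str) -> str:
    if "ResolutionImpossible" in stderr:
        return "dependency-conflict"  # try --no-deps or unset PIP_CONSTRAINT
    if any(marker in stderr for marker in NETWORK_MARKERS):
        return "network"
    return "unknown"
```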

.github/CODEOWNERS

Lines changed: 3 additions & 0 deletions
@@ -55,3 +55,6 @@ modelopt_recipes @NVIDIA/modelopt-recipes-codeowners
 /examples/vlm_ptq @NVIDIA/modelopt-examples-vlm-codeowners
 /examples/vllm_serve @NVIDIA/modelopt-examples-llm_ptq-codeowners
 /examples/windows @NVIDIA/modelopt-windows-codeowners
+
+# Requirements files are owned by the setup team regardless of location
+requirements*.txt @NVIDIA/modelopt-setup-codeowners
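Because the new `requirements*.txt` pattern has no leading slash, CODEOWNERS applies it to matching filenames in any directory (gitignore-style matching). A minimal sketch of that rule, using an assumed helper name:

```python
import fnmatch
import posixpath

def setup_team_owns(path: str) -> bool:
    """Does the slash-free CODEOWNERS pattern `requirements*.txt` match?

    Patterns without a '/' are compared against the basename at any depth;
    this sketch ignores the other CODEOWNERS pattern forms.
    """
    return fnmatch.fnmatch(posixpath.basename(path), "requirements*.txt")
```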

CONTRIBUTING.md

Lines changed: 12 additions & 2 deletions
@@ -2,6 +2,9 @@
 
 Thanks for your interest in contributing to Model Optimizer (ModelOpt)!
 
+> [!NOTE]
+> Any contributions to this repository are only accepted under the Apache 2.0 license.
+
 ## 🛠️ Setting up your environment
 
 Ensure that Model Optimizer (ModelOpt) is installed in editable mode and that all `dev` optional requirements are installed:
@@ -64,11 +67,18 @@ If you are an external contributor, seek guidance from `@NVIDIA/modelopt-setup-c
 1. A reference link (with commit hash) to the source from which the code was copied.
 1. The original repository's Copyright / License.
 1. The NVIDIA Apache 2.0 Copyright / License header.
+- **Update `SPDX-License-Identifier`:** If the third-party code uses a different license than Apache 2.0, update the `SPDX-License-Identifier` in the NVIDIA header to reflect both licenses using SPDX expression syntax. For example, for MIT-licensed source code:
+
+  ```python
+  # SPDX-License-Identifier: Apache-2.0 AND MIT
+  ```
 
-See [`modelopt/torch/speculative/eagle/utils.py`](./modelopt/torch/speculative/eagle/utils.py)
-for an example of the correct license header format.
+  If the third-party code is also Apache 2.0, no change is needed (`SPDX-License-Identifier: Apache-2.0` remains correct).
+- **Update `LICENSE`:** Add the third-party copyright holder to the appropriate license section in the [`LICENSE`](./LICENSE) file under *Third-Party Software Notices*. If the third-party license is not already listed there, add a new section with the full license text.
 - **Exclude from license pre-commit hook:** Exclude copied files from the license pre-commit hook so it doesn't auto-add the NVIDIA Apache 2.0 license on top of the file. Add the file path to the `exclude` list in the `insert-license` hook in [`.pre-commit-config.yaml`](./.pre-commit-config.yaml).
 
+See [`modelopt/torch/quantization/utils/calib_utils.py`](./modelopt/torch/quantization/utils/calib_utils.py) for an example of the correct license header format.
+
 ## 📝 Writing tests
 
 We use [pytest](https://docs.pytest.org/) for all tests. For any new features / examples, make sure to add tests and that the coverage check in your PR passes. The tests are organized into the following directories:
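The SPDX rule added above reduces to a one-line decision, sketched here (the helper name is hypothetical):

```python
def header_spdx_expression(third_party_license: str) -> str:
    """SPDX expression for the NVIDIA header over adapted third-party code:
    Apache-2.0 alone when the source is also Apache-2.0, otherwise an AND
    expression combining both licenses."""
    if third_party_license == "Apache-2.0":
        return "Apache-2.0"
    return f"Apache-2.0 AND {third_party_license}"
```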

LICENSE

Lines changed: 116 additions & 0 deletions
@@ -199,3 +199,119 @@
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
+
+================================================================================
+Third-Party Software Notices
+================================================================================
+
+Portions of this repository contain code adapted from third-party sources.
+Each component is subject to the terms of its respective license as set out
+below.
+
+--------------------------------------------------------------------------------
+Apache License, Version 2.0
+--------------------------------------------------------------------------------
+
+Portions of this repository were adapted from code originally authored by
+the following copyright holders, licensed under the Apache License, Version 2.0
+(see full license text above):
+
+Copyright 2021 The HuggingFace Inc. team
+Copyright 2022 The HuggingFace Team
+Copyright 2022, Lefebvre Dalloz Services
+Copyright 2022 EleutherAI and the HuggingFace Inc. team
+Copyright 2023 Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, Yann Dubois, Xuechen Li
+Copyright (c) 2024 Heming Xia
+Copyright 2025 The Qwen team, Alibaba Group and the HuggingFace Inc. team
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not
+use these files except in compliance with the License. You may obtain a copy
+of the License at http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed
+under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
+CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+
+--------------------------------------------------------------------------------
+MIT License
+--------------------------------------------------------------------------------
+
+Portions of this repository were adapted from code originally authored by
+the following copyright holders, licensed under the MIT License:
+
+Copyright (c) Andrei Panferov
+Copyright (c) Microsoft Corporation
+Copyright (c) 2020 EleutherAI
+Copyright (c) 2020 Dan Hendrycks
+Copyright (c) 2023 Deep Cognition and Language Research (DeCLaRe) Lab
+Copyright (c) 2023 DeepSeek
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
+--------------------------------------------------------------------------------
+BSD 3-Clause License
+--------------------------------------------------------------------------------
+
+Portions of this repository were adapted from code originally authored by
+the following copyright holders, licensed under the BSD 3-Clause License:
+
+Copyright (c) 2016- Facebook, Inc (Adam Paszke)
+Copyright (c) 2014- Facebook, Inc (Soumith Chintala)
+Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
+Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu)
+Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
+Copyright (c) 2011-2013 NYU (Clement Farabet)
+Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
+Copyright (c) 2006 Idiap Research Institute (Samy Bengio)
+Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
+Copyright (c) 2016-present, Facebook Inc.
+Copyright (c) 2016 Facebook Inc.
+Copyright (c) 2015 Google Inc.
+Copyright (c) 2015 Yangqing Jia
+Copyright 2019-2020 Kakao Brain
+Copyright (c) 2022 Cruise LLC.
+Copyright (c) 2024 Tri Dao.
+Copyright (c) 2021, 2023-2024 Arm Limited and/or its affiliates
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice,
+   this list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.
+
+3. Neither the names of Facebook, Deepmind Technologies, NYU, NEC Laboratories
+   America and IDIAP Research Institute nor the names of its contributors may
+   be used to endorse or promote products derived from this software without
+   specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.