
Commit 0667f13 (parent: 60e654c)

update launcher guide

Signed-off-by: Meng Xin <mxin@nvidia.com>

2 files changed (36 additions & 69 deletions)

File tree:

- .claude/skills/ptq/SKILL.md

`.claude/skills/ptq/SKILL.md` (3 additions & 1 deletion)
```diff
@@ -44,7 +44,9 @@ All format definitions: `modelopt/torch/quantization/config.py`.
 
 ## Step 4 — Run PTQ
 
-**Goal: checkpoint on disk** (`.safetensors` + `config.json`). Always smoke test first (`--calib_size 4`), then full calibration.
+**Goal: checkpoint on disk** (`.safetensors` + `config.json`).
+
+**IMPORTANT — sequential smoke test**: Run a smoke test first with `--calib_size 4` (or `CALIB_SIZE: "4"` in YAML). Wait for it to complete and verify it succeeded. Only then run the full calibration (`--calib_size 512`).
 
 **Which path?** Based on step 1:
 
```

Second file: 33 additions & 68 deletions
````diff
@@ -1,113 +1,78 @@
 # Using the ModelOpt Launcher for PTQ
 
-The launcher (`tools/launcher/`) handles SLURM, Docker, and local execution. Read `tools/launcher/CLAUDE.md` for full documentation. This guide covers PTQ-specific usage.
+The launcher (`tools/launcher/`) handles SLURM and Docker execution. Read `tools/launcher/CLAUDE.md` for full docs.
 
 ## Quick Start
 
 ```bash
 cd tools/launcher
-uv run launch.py --yaml <config.yaml> --yes
+uv run launch.py --yaml <config.yaml> --yes                   # SLURM (SLURM_HOST set)
+uv run launch.py --yaml <config.yaml> hf_local=<cache> --yes  # Local Docker
 ```
 
-## Writing a PTQ Config
+## HF Transformers PTQ Config
 
-### For supported models (typed task)
-
-Use the `MegatronLMQuantizeTask` for clean configs:
+The launcher provides `common/hf_ptq/hf_ptq.sh` which wraps `hf_ptq.py`. Configure via environment variables:
 
 ```yaml
 job_name: <Model>_<Format>
 pipeline:
   task_0:
-    _target_: common.megatron_lm.quantize.task.MegatronLMQuantizeTask
-    config:
-      model: <HuggingFace model ID>
-      quant_cfg: <QUANT_CFG name, e.g., NVFP4_DEFAULT_CFG>
-      tp: <tensor parallelism>
-      calib_dataset: abisee/cnn_dailymail
-      calib_size: 512
-      hf_local: /hf-local/
+    script: common/hf_ptq/hf_ptq.sh
+    environment:
+      - HF_MODEL: <HuggingFace model ID, e.g. Qwen/Qwen3-0.6B>
+      - QFORMAT: <format, e.g. nvfp4, fp8, int4_awq>
+      - CALIB_SIZE: "512"
+      - EXPORT_PATH: /scratchspace/exported_model
 slurm_config:
   _factory_: "slurm_factory"
   nodes: 1
-  ntasks_per_node: <tp>
-  gpus_per_node: <tp>
+  ntasks_per_node: 1
+  gpus_per_node: <num_gpus>
 ```
 
-Available `quant_cfg` values — check `modelopt/torch/quantization/config.py` for the full list.
-
-### For custom scripts (raw SandboxTask)
-
-When using a custom PTQ script (e.g., unsupported models):
+Extra `hf_ptq.py` flags can be passed via `args`:
 
 ```yaml
-job_name: <Model>_custom_ptq
-pipeline:
-  task_0:
-    script: <path_to_your_script.sh>
     args:
-      - --model <model_path>
-      - --output <output_path>
-    environment:
-      - HF_TOKEN: <token>
-      - CUDA_VISIBLE_DEVICES: "0"
-    slurm_config:
-      _factory_: "slurm_factory"
-      nodes: 1
-      ntasks_per_node: 1
-      gpus_per_node: 1
+      - --batch_size 2
+      - --trust_remote_code
 ```
````
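For concreteness, a filled-in config of the new form might look like the fragment below. The model, format, flag, and GPU count are illustrative values assembled from the placeholders in the diff, not a tested recipe.

```yaml
# Illustrative example only; substitute your own model and format.
job_name: Qwen3-0.6B_nvfp4
pipeline:
  task_0:
    script: common/hf_ptq/hf_ptq.sh
    environment:
      - HF_MODEL: Qwen/Qwen3-0.6B
      - QFORMAT: nvfp4
      - CALIB_SIZE: "512"
      - EXPORT_PATH: /scratchspace/exported_model
    args:
      - --trust_remote_code
slurm_config:
  _factory_: "slurm_factory"
  nodes: 1
  ntasks_per_node: 1
  gpus_per_node: 1
```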

61-
Place custom scripts in `tools/launcher/common/` so the packager includes them.
62-
63-
## SLURM vs Local
64-
65-
The launcher auto-detects based on environment variables:
66-
67-
| Variable | Purpose | Example |
68-
|----------|---------|---------|
69-
| `SLURM_HOST` | Login node for SSH submission | `cluster-login.example.com` |
70-
| `SLURM_ACCOUNT` | SLURM account | `my_account` |
71-
| `SLURM_PARTITION` | SLURM partition | `batch` |
72-
| `HF_TOKEN` | HuggingFace token for gated models | `hf_abc...` |
42+
## Output Location
7343

74-
If `SLURM_HOST` is set → SLURM execution. Otherwise → local Docker.
44+
`EXPORT_PATH` controls the path inside the container (default: `/scratchspace/exported_model`). The launcher mounts `/scratchspace` to a host directory automatically — you cannot change the host path.
7545

76-
For local Docker, pass `hf_local=` to specify the model cache:
46+
To find the checkpoint on the host after completion:
7747

7848
```bash
79-
uv run launch.py --yaml <config> hf_local=/mnt/hf-local --yes
49+
find tools/launcher/local_experiments -name "config.json" -path "*/exported_model/*" 2>/dev/null
8050
```
8151

82-
## GPU Sizing Guide
52+
## SLURM vs Local Docker
8353

84-
| Model size | TP | GPUs | Nodes |
85-
|------------|-----|------|-------|
86-
| < 15B | 1 | 1 | 1 |
87-
| 15B-40B | 2-4 | 2-4 | 1 |
88-
| 40B-100B | 4-8 | 4-8 | 1 |
89-
| 100B+ | 8+ | 8+ | 2+ (use FSDP2 or multi-node) |
54+
| Condition | Mode | Invocation |
55+
| --- | --- | --- |
56+
| `SLURM_HOST` env var set | SLURM | `uv run launch.py --yaml <cfg> --yes` |
57+
| `hf_local=` passed | Local Docker | `uv run launch.py --yaml <cfg> hf_local=<cache> --yes` |
9058

91-
## Dry Run and Debug
59+
For SLURM, also set `SLURM_ACCOUNT` and optionally `SLURM_HF_LOCAL`.
9260

93-
Preview what the launcher will do without running:
61+
## Known Issues
9462

95-
```bash
96-
uv run launch.py --yaml <config> --dryrun --yes -v
97-
```
63+
- **UID mapping in Docker**: May cause `getpwuid` failures. Add `USER=user` and `LOGNAME=user` to environment.
64+
- **Megatron-LM submodule**: Only needed for `MegatronLMQuantizeTask` (Megatron models). HF PTQ via `common/hf_ptq/hf_ptq.sh` does not require it.
9865

99-
Export resolved config:
66+
## Dry Run
10067

10168
```bash
102-
uv run launch.py --yaml <config> --to-yaml resolved.yaml
69+
uv run launch.py --yaml <config> --dryrun --yes -v
10370
```
10471

105-
## Example Configs
106-
107-
Check `tools/launcher/examples/` for working configs:
72+
## Examples
10873

10974
```bash
11075
ls tools/launcher/examples/
11176
```
11277

113-
Copy and modify the closest match for your model.
78+
Copy and modify the closest match.
