Improve step 4 routing and use nvidia-smi for GPU detection

mxinO · mxinO · commit 60e654c7b60a · 2026-03-25T06:44:39.000Z
Signed-off-by: Meng Xin <mxin@nvidia.com>
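
The step 4 routing this commit adds to `ptq/SKILL.md` can be read as a small decision function. The sketch below is only an illustration of that rule: the function name `choose_ptq_path` and its argument values (`slurm`, `docker-gpu`, `bare-gpu`, `yes`) are hypothetical and do not appear in the skill files, and the `unknown` fallback is an assumption for inputs the routing list does not cover.

```bash
# Hypothetical helper: maps step 1 detection results to a step 4 path.
#   $1 = detected environment: "slurm", "docker-gpu", or "bare-gpu"
#   $2 = "yes" if the model is supported, anything else otherwise
choose_ptq_path() {
  ptq_env="$1"
  ptq_supported="$2"

  # An unsupported model goes to the custom-script path in any environment.
  if [ "$ptq_supported" != "yes" ]; then
    echo "4C"
    return 0
  fi

  case "$ptq_env" in
    slurm|docker-gpu) echo "4B" ;;       # Launcher
    bare-gpu)         echo "4A" ;;       # Direct
    *)                echo "unknown" ;;  # assumption: not covered by the routing list
  esac
}
```

For example, `choose_ptq_path bare-gpu yes` selects the direct path, while any environment combined with an unsupported model selects 4C. The model check comes first because the routing list applies 4C regardless of environment.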
diff --git a/.claude/skills/common/environment-setup.md b/.claude/skills/common/environment-setup.md
@@ -44,9 +44,11 @@ Run these checks on the **target machine** (local, or via SSH if remote):
 ```bash
 which srun sbatch 2>/dev/null && echo "SLURM"
 docker info 2>/dev/null | grep -qi nvidia && echo "Docker+GPU"
-python -c "import torch; print(torch.cuda.device_count(), 'GPUs')" 2>/dev/null
+nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null
 ```
 
+Use `nvidia-smi` for GPU detection. It is more reliable than `torch.cuda`, which only works when the Python environment has CUDA-enabled PyTorch installed.
+
 ### Execution context summary
 
 After detection, you should know which row you're in:
diff --git a/.claude/skills/ptq/SKILL.md b/.claude/skills/ptq/SKILL.md
@@ -46,7 +46,13 @@ All format definitions: `modelopt/torch/quantization/config.py`.
 
 **Goal: checkpoint on disk** (`.safetensors` + `config.json`). Always smoke test first (`--calib_size 4`), then full calibration.
 
-### 4A — Direct: supported model
+**Which path?** Based on step 1:
+
+- SLURM or Docker+GPU detected → **4B (Launcher)**
+- Bare GPU, no Docker/SLURM → **4A (Direct)**
+- Unsupported model (any env) → **4C (Custom script)**
+
+### 4A — Direct: supported model (bare GPU, no Docker/SLURM)
 
 ```bash
 pip install --no-build-isolation "nvidia-modelopt[hf]"