NVIDIA-NeMo · marta-sd · May 22, 2026 · May 21, 2026
diff --git a/packages/nemo-evaluator-launcher/.claude/skills/launching-evals/SKILL.md b/packages/nemo-evaluator-launcher/.claude/skills/launching-evals/SKILL.md
@@ -58,9 +58,9 @@ The complete evaluation workflow is divided into the following steps you should
 # Key Facts
 
 - Benchmark-specific info learned during launching/analyzing evals should be added to `references/benchmarks/`
-- **PPP** = Slurm account (the `account` field in cluster_config.yaml). When the user says "change PPP to X", update the account value (e.g., `coreai_dlalgo_compeval` → `coreai_dlalgo_llm`).
+- **Slurm account**: the `account` field in `cluster_config.yaml`. When the user asks to change it (some teams call this a "PPP"), update the value (e.g., `<account_name>` → `<new_account_name>`).
 - **Slurm job pairs**: NEL (nemo-evaluator-launcher) submits paired Slurm jobs — a RUNNING job + a PENDING restart job (for when the 4h walltime expires). Never cancel the pending restart jobs — they are expected and necessary.
-- **HF cache requirement**: For configs with `HF_HUB_OFFLINE=1`, models must be pre-downloaded to the HF cache on each cluster before launching. **Before running a model on a new cluster, always ask the user if the model is already cached there.** If not, on the cluster login node: `python3 -m venv hf_cli && source hf_cli/bin/activate && pip install huggingface_hub` then `HF_HOME=/lustre/fsw/portfolios/coreai/users/<username>/cache/huggingface hf download <model>`. Without this, vLLM will fail with `LocalEntryNotFoundError`.
+- **HF cache requirement**: For configs with `HF_HUB_OFFLINE=1`, models must be pre-downloaded to the HF cache on each cluster before launching. **Before running a model on a new cluster, always ask the user if the model is already cached there.** If not, on the cluster login node: `python3 -m venv hf_cli && source hf_cli/bin/activate && pip install huggingface_hub` then `HF_HOME=<your_hf_cache_dir> hf download <model>`, where `<your_hf_cache_dir>` is typically on a shared filesystem accessible from compute nodes (e.g., a `/lustre/...` mount on multi-node clusters, or `~/.cache/huggingface` for single-node setups). Without this, vLLM will fail with `LocalEntryNotFoundError`.
 - **`data_parallel_size` is per node**: `dp_size=1` with `num_nodes=8` means 8 model instances total (one per node), load-balanced by haproxy. Do NOT interpret `dp_size` as the global replica count.
 - **`payload_modifier` interceptor**: The `params_to_remove` list (e.g. `[max_tokens, max_completion_tokens]`) strips those fields from the outgoing payload, intentionally lifting output length limits so reasoning models can think as long as they need.
 - **Auto-export git workaround**: The export container (`python:3.12-slim`) lacks `git`. When installing the launcher from a git URL, set `auto_export.launcher_install_cmd` to install git first (e.g., `apt-get update -qq && apt-get install -qq -y git && pip install "nemo-evaluator-launcher[all] @ git+...#subdirectory=packages/nemo-evaluator-launcher"`).
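
Two of the key facts in this hunk (per-node `data_parallel_size` and the `params_to_remove` list) can be sketched in a few lines of Python. This is only an illustration of the arithmetic and the field-stripping behavior described above, not the launcher's or the interceptor's actual code; `total_model_instances` and `strip_params` are hypothetical helper names, and the example request payload is made up.

```python
def total_model_instances(data_parallel_size: int, num_nodes: int) -> int:
    """data_parallel_size counts replicas per node, so the global count is the product."""
    return data_parallel_size * num_nodes


def strip_params(payload: dict, params_to_remove: list[str]) -> dict:
    """Return a copy of the payload without the listed top-level fields.

    Hypothetical stand-in for what the text says params_to_remove does.
    """
    return {key: value for key, value in payload.items() if key not in params_to_remove}


# dp_size=1 on 8 nodes gives 8 instances in total, one per node.
print(total_model_instances(data_parallel_size=1, num_nodes=8))  # 8

# Illustrative request; only the length-limit field is dropped.
request = {"model": "<model>", "max_tokens": 1024, "messages": []}
print(strip_params(request, ["max_tokens", "max_completion_tokens"]))
# {'model': '<model>', 'messages': []}
```

Names in `params_to_remove` that are absent from the payload (here `max_completion_tokens`) are simply ignored in this sketch.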