Commit c0a32a7
[skills] evaluation: prefer checkpoint_path over hf_model_handle
hf_model_handle is not reliably mounted at /checkpoint in current NEL: with
only hf_model_handle set, `vllm serve /checkpoint` makes vLLM treat the
literal '/checkpoint' as an HF repo id and the deploy dies with
`HFValidationError: Repo id must use alphanumeric chars ... : '/checkpoint'`.
Document preferring checkpoint_path (download the HF model to the cluster via
snapshot_download first) in the evaluation SKILL Step 3 and example_eval.yaml.
Hit while running a BF16 baseline (Qwen/Qwen3.5-9B) for an NVFP4 comparison.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
1 parent f11770d commit c0a32a7
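The workflow the commit message describes (download the model to the cluster first, then point checkpoint_path at the resulting directory) might look roughly like the sketch below. It is not taken from the changed files: the target directory and both helper names (`download_checkpoint`, `is_local_checkpoint`) are hypothetical, and the `config.json` check is only an assumed heuristic for "this looks like a local HF checkpoint". `snapshot_download` is the `huggingface_hub` function named in the message.

```python
import os


def download_checkpoint(repo_id: str, local_dir: str) -> str:
    """Download a Hugging Face model snapshot into local_dir and return its path."""
    # Imported lazily so the pre-flight check below works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


def is_local_checkpoint(path: str) -> bool:
    """Return True if path is an existing directory that contains a config.json.

    Assumed heuristic: a path that fails this check is likely to be
    interpreted as a Hub repo id rather than a local checkpoint.
    """
    return os.path.isdir(path) and os.path.isfile(os.path.join(path, "config.json"))


if __name__ == "__main__":
    # Hypothetical cluster location; adjust to your own storage layout.
    target_dir = "/shared/models/Qwen3.5-9B"
    path = download_checkpoint("Qwen/Qwen3.5-9B", target_dir)
    if not is_local_checkpoint(path):
        raise SystemExit(f"{path} does not look like a local checkpoint")
    # Use this directory as checkpoint_path in the evaluation config.
    print(f"checkpoint_path: {path}")
```

Running the pre-flight check before deployment surfaces a missing or empty directory early, instead of as an `HFValidationError` at serve time.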
2 files changed
Lines changed: 12 additions & 3 deletions
First file (diff content not preserved): hunk at original lines 77-82; two lines added as new lines 80-81.
Lines changed: 10 additions & 3 deletions
Second file (diff content not preserved): hunk at original lines 23-31; original lines 26-28 replaced by new lines 26-35.