Commit 222f06b

Simplify alternative instructions for nemo evaluator
Signed-off-by: jrausch <jrausch@nvidia.com>
1 parent 7b88e66 commit 222f06b

1 file changed

Lines changed: 5 additions & 15 deletions

examples/puzzletron/evaluation/nemo_evaluator_instructions.md

@@ -3,16 +3,9 @@
 > **Recommended approach:** Use lm-eval for direct evaluation without a
 > deployment server. See the main [README](../README.md#evaluation) for details.
 
-This document describes an alternative evaluation flow using NeMo Evaluator.
-It deploys the checkpoint as a local OpenAI-compatible completions endpoint
-and runs evaluation against it.
+Evaluate AnyModel checkpoints by deploying a local OpenAI-compatible completions endpoint and running benchmarks against it.
 
-## Prerequisites
-
-- NeMo container (e.g. `nemo:25.11`) with NeMo Evaluator and NeMo Export-Deploy
-- Ray (`pip install -r examples/puzzletron/requirements.txt`)
-
-## 1. Deploy the model (2 GPUs example)
+**1. Deploy the model (2 GPUs example):**
 
 ```bash
 # Install the AnyModel-patched deployable (first time only: backs up the original)
@@ -29,18 +22,15 @@ python /opt/Export-Deploy/scripts/deploy/nlp/deploy_ray_hf.py \
 --trust_remote_code --port 8083 --device_map "auto" --cuda_visible_devices "0,1"
 ```
 
-Adjust GPU counts and `cuda_visible_devices` to match your node.
-
-## 2. Run MMLU
+**2. Run MMLU:**
 
 ```bash
 eval-factory run_eval \
 --eval_type mmlu \
 --model_id anymodel-hf \
 --model_type completions \
 --model_url http://0.0.0.0:8083/v1/completions/ \
---output_dir examples/puzzletron/evals/mmlu_anymodel \
---overrides "config.params.task=mmlu,config.params.extra.tokenizer=path/to/checkpoint,config.params.extra.tokenizer_backend=huggingface"
+--output_dir examples/puzzletron/evals/mmlu_anymodel
 ```
 
-For a quick debug run, add `,config.params.limit_samples=5` to `--overrides`.
+For a quick debug run, add `--overrides "config.params.limit_samples=5"`.
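
Since the evaluation step depends on the deployed endpoint being reachable, it may help to send one request before launching `eval-factory`. The following is a minimal sketch, not part of the commit. It assumes the server accepts the standard OpenAI completions request fields (`model`, `prompt`, `max_tokens`, `temperature`) and returns the usual `choices[0].text` shape. The URL and model id are copied from the `--model_url` and `--model_id` flags in the diff; the names `build_request` and `smoke_test` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Assumption: these mirror the eval-factory flags shown in the diff;
# the deployed server may expect a different model name.
ENDPOINT = "http://0.0.0.0:8083/v1/completions/"
MODEL_ID = "anymodel-hf"


def build_request(prompt, max_tokens=8, endpoint=ENDPOINT, model=MODEL_ID):
    """Build (but do not send) an OpenAI-style completions POST request."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(prompt="The capital of France is", timeout=30.0):
    """Send one request and return the text of the first completion.

    Assumes an OpenAI-compatible response body: {"choices": [{"text": ...}]}.
    """
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server from step 1 running, calling `print(smoke_test())` should print a short completion; a connection error or an unexpected response shape suggests the endpoint is not ready for the MMLU run.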
