Once the model is ready, you can evaluate it using the [Language Model Evaluation Harness](https://pypi.org/project/lm-eval/). For example, run the following to evaluate the model on the [Massive Multitask Language Understanding](https://huggingface.co/datasets/cais/mmlu) benchmark.
### Local Evaluation with NeMo Evaluator (AnyModel)
AnyModel checkpoints are currently supported via the patched NeMo Evaluator deployable
in [`examples/puzzletron/evaluation/`](./examples/puzzletron/evaluation/). This deploys a local OpenAI-style completions endpoint that evaluation can be run against.
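Because the endpoint speaks the OpenAI completions protocol, it can also be queried directly. A minimal sketch of a request body follows; the URL and model id here are assumptions, not values taken from the deployable, so check the deployment logs for the actual address:

```python
import json

# Sketch of a request body for the local OpenAI-style completions endpoint.
# The URL and model id are assumptions (check the deployment logs for the
# real address and served model name).
url = "http://localhost:8000/v1/completions"  # hypothetical address
payload = {
    "model": "anymodel",  # hypothetical model id
    "prompt": "The capital of France is",
    "max_tokens": 8,
}
body = json.dumps(payload)
print(body)

# With the endpoint running, the request can be sent with urllib:
#   import urllib.request
#   req = urllib.request.Request(
#       url, data=body.encode(), headers={"Content-Type": "application/json"}
#   )
#   print(urllib.request.urlopen(req).read().decode())
```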
> **Note:** This flow requires Ray. If it is missing, install it in the container/venv:
>
>```bash
> pip install ray
>```
**Deploy the model locally on an interactive node (2-GPU example):**