Commit d945cb2

[Create PR]:
- Add args description to readme
1 parent 8b7343f commit d945cb2

1 file changed

Lines changed: 23 additions & 5 deletions

examples/mmlu_benchmark/README.md

@@ -16,11 +16,10 @@ Full evaluation results look like:
 ================================================================================
 Results Summary (Evaluated Tasks)
 ================================================================================
-Total tasks: 100
-Correct: 35
-Accuracy (on anchor points): 0.3500
-Accuracy norm (on anchor points): 0.3500
-Built predictions tensor with shape: (1, 100, 31)
+Total tasks: 14042
+Correct: 8291
+Accuracy (on anchor points): 0.5904
+Accuracy norm (on anchor points): 0.5904
 ```
 
 ## Run with DISCO (predicted full-benchmark score)
@@ -39,3 +38,22 @@ DISCO Predicted Full Benchmark Accuracy:
 ----------------------------------------
 Model 0: 0.606739
 ```
+
+## Arguments
+
+| Argument | Description | Default |
+|----------|-------------|---------|
+| `--model_id` | HuggingFace model identifier (e.g. `meta-llama/Llama-2-7b-hf`) | *(required)* |
+| `--data_path` | Path to MMLU prompts JSON file or Hugging Face dataset repo id | `arubique/flattened-MMLU` |
+| `--anchor_points_path` | Path to anchor points pickle file; if set, only anchor tasks are evaluated ||
+| `--output_dir` | Directory to save results | `./results` |
+| `--predictions_path` | Path to save predictions tensor as pickle (for DISCO) ||
+| `--limit` | Limit number of tasks to evaluate (for testing) ||
+| `--batch_size` | Batch size for evaluation (reserved for future use) | `1` |
+| `--device` | Device to run model on (e.g. `cuda:0`, `cpu`) | `cuda:0` |
+| `--num_workers` | Number of parallel workers for task execution | `1` |
+| `--disco_model_path` | If set, run DISCO prediction; path to `.pkl`, `.npz`, or Hugging Face repo id ||
+| `--disco_transform_path` | Path to DISCO PCA transform `.pkl` or `.npz` (for local DISCO model when using `--pca`) ||
+| `--pca` | PCA dimension for DISCO embeddings ||
+| `--pad_to_size` | Pad predictions to this size with -inf ||
+| `--use_lmeval_batching` | Use lm-evaluation-harness-style batching for exact numerical match with DISCO repo | off |
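
For readers who want to see how the documented flags and defaults fit together, here is a minimal `argparse` sketch that mirrors the Arguments table added in this commit. It is not the repository's actual parser. The function name `build_parser`, the `int` types, the use of `None` for empty Default cells, and the `store_true` handling of `--use_lmeval_batching` are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the README's Arguments table.

    Flag names and documented defaults come from the table; value types and
    the None defaults for empty cells are assumptions.
    """
    p = argparse.ArgumentParser(description="MMLU benchmark example (illustrative sketch)")
    p.add_argument("--model_id", required=True,
                   help="HuggingFace model identifier")
    p.add_argument("--data_path", default="arubique/flattened-MMLU",
                   help="MMLU prompts JSON file or Hugging Face dataset repo id")
    p.add_argument("--anchor_points_path", default=None,
                   help="Anchor points pickle; if set, only anchor tasks are evaluated")
    p.add_argument("--output_dir", default="./results",
                   help="Directory to save results")
    p.add_argument("--predictions_path", default=None,
                   help="Where to save the predictions tensor as pickle (for DISCO)")
    p.add_argument("--limit", type=int, default=None,  # int is an assumption
                   help="Limit number of tasks to evaluate (for testing)")
    p.add_argument("--batch_size", type=int, default=1,
                   help="Batch size (reserved for future use)")
    p.add_argument("--device", default="cuda:0",
                   help="Device to run the model on, e.g. cuda:0 or cpu")
    p.add_argument("--num_workers", type=int, default=1,
                   help="Number of parallel workers for task execution")
    p.add_argument("--disco_model_path", default=None,
                   help="If set, run DISCO prediction (.pkl, .npz, or HF repo id)")
    p.add_argument("--disco_transform_path", default=None,
                   help="DISCO PCA transform .pkl or .npz (local model with --pca)")
    p.add_argument("--pca", type=int, default=None,  # int is an assumption
                   help="PCA dimension for DISCO embeddings")
    p.add_argument("--pad_to_size", type=int, default=None,  # int is an assumption
                   help="Pad predictions to this size with -inf")
    # The table lists the default as "off"; a boolean switch is assumed here.
    p.add_argument("--use_lmeval_batching", action="store_true",
                   help="Use lm-evaluation-harness-style batching")
    return p


if __name__ == "__main__":
    # Parse an explicit list so the demo does not depend on the real command line.
    args = build_parser().parse_args(
        ["--model_id", "meta-llama/Llama-2-7b-hf", "--limit", "10"]
    )
    print(args.data_path, args.limit, args.use_lmeval_batching)
```

Under these assumptions, only `--model_id` is mandatory, and every flag with an empty Default cell stays unset (`None`) unless passed explicitly.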
