Commit 4727635
fix(llm_eval): migrate lm_eval_hf.py to lm-eval >= 0.4.10 HarnessCLI
lm-eval 0.4.10 replaced lm_eval.__main__.{setup_parser, parse_eval_args}
with a HarnessCLI-based interface in lm_eval._cli, breaking the script's
import. Drive HarnessCLI directly: extend the run subparser with the
ModelOpt args, then move them out of the namespace into args.model_args
so EvaluatorConfig.from_cli does not reject them. Bump pinned lm-eval
versions in examples/llm_eval and examples/puzzletron requirements,
and add an end-to-end test that runs lm_eval_hf.py against a tiny Qwen3 model.
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
1 parent e2d29c8 commit 4727635
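
The approach in the commit message (extend the run subparser with extra arguments, then move them out of the namespace into args.model_args) can be sketched with plain argparse. This is a minimal illustration, not the code from this commit: the parser is a stand-in rather than lm-eval's HarnessCLI, the option names in EXTRA_KEYS are hypothetical examples, and model_args is assumed to arrive as a comma-separated key=value string.

```python
import argparse

# Hypothetical extra option names, used for illustration only.
EXTRA_KEYS = ("quant_cfg", "calib_size")


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in CLI with a `run` subcommand (not lm-eval's real parser)."""
    parser = argparse.ArgumentParser(prog="harness")
    sub = parser.add_subparsers(dest="command", required=True)
    run = sub.add_parser("run")
    run.add_argument("--model", default="hf")
    run.add_argument("--model_args", default="")
    run.add_argument("--tasks", default="")
    # Extension step: register the extra options on the existing subparser.
    run.add_argument("--quant_cfg", default=None)
    run.add_argument("--calib_size", type=int, default=None)
    return parser


def fold_extras(args: argparse.Namespace) -> argparse.Namespace:
    """Move the extra options out of the namespace and into args.model_args."""
    # Assumption: model_args is a "key=value,key=value" string.
    model_args = dict(
        item.split("=", 1) for item in args.model_args.split(",") if item
    )
    for key in EXTRA_KEYS:
        # vars(args) is the namespace's own dict, so pop() deletes the attribute.
        value = vars(args).pop(key, None)
        if value is not None:
            model_args[key] = value
    args.model_args = model_args
    return args


if __name__ == "__main__":
    argv = ["run", "--model_args", "pretrained=foo", "--quant_cfg", "FP8_DEFAULT_CFG"]
    print(fold_extras(build_parser().parse_args(argv)))
```

Once the extras are folded in, the namespace holds only the options the downstream config builder knows about, so a strict consumer has no unknown attributes to reject.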
4 files changed
Lines changed: 74 additions & 35 deletions
File tree
- examples
  - llm_eval
  - puzzletron
- tests/examples/llm_eval