Commit 3b8c1c2
fix: try local eval before slow /evaluate endpoint in evaluate_dense (#245)
51% of TRL training time was wasted on 5050 evaluate timeouts (180s × 3
retries = 9 min per evaluation), whereas the local evaluation via
evaluate_checks_local takes ~5s.
Fix: when task config has checks defined, try local eval FIRST. Only
fall through to the slow /evaluate endpoint when no local checks exist.
This eliminates the 9-minute timeout for custom YAML tasks that define
their own checks.
Before: evaluate() [9 min] → if 0.0 → local [5s]
After: local [5s] → if no checks → evaluate() [9 min]
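The reordering above can be sketched as follows. This is a minimal illustration, not the committed code: the diff body is not shown here, so the function name `evaluate_dense_sketch`, the `"checks"` key, and the two injected callables are hypothetical stand-ins for `evaluate_checks_local` and the `/evaluate` call.

```python
from typing import Any, Callable, Mapping, Sequence

# Worst case for the remote path, taken from the commit message:
# 180 s timeout x 3 retries = 540 s = 9 min.
REMOTE_WORST_CASE_SECONDS = 180 * 3


def evaluate_dense_sketch(
    task_config: Mapping[str, Any],
    local_eval: Callable[[Sequence[Any]], float],
    remote_eval: Callable[[], float],
) -> float:
    """Try local checks first; use the slow endpoint only when none exist.

    `local_eval` stands in for evaluate_checks_local and `remote_eval`
    for the /evaluate endpoint call (both signatures are assumptions).
    """
    checks = task_config.get("checks")  # hypothetical config key
    if checks:
        # Fast path (~5 s per the commit message): the task defines its own checks.
        return local_eval(checks)
    # No local checks defined: fall through to the slow endpoint.
    return remote_eval()


# Usage with stub evaluators that record whether the remote path ran.
remote_calls = []


def stub_remote() -> float:
    remote_calls.append(1)
    return 0.25


def stub_local(checks: Sequence[Any]) -> float:
    return len(checks) / 4


with_checks = evaluate_dense_sketch({"checks": ["a", "b"]}, stub_local, stub_remote)
without_checks = evaluate_dense_sketch({}, stub_local, stub_remote)
```

With this ordering, a task that defines checks never touches the endpoint, so it cannot hit the 9-minute timeout; only tasks without checks pay the remote cost.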
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent d8c6187 commit 3b8c1c2
2 files changed
Lines changed: 90 additions & 22 deletions
First file (name and diff content not captured): 26 additions, 22 deletions
@@ -602,24 +602,18 @@
@@ -628,15 +622,25 @@
Second file (name and diff content not captured): 64 additions
@@ -222,3 +222,67 @@