Add mmath500:en, fix inspect backend for multilingual tasks#7

Merged
dzautner merged 2 commits into main from daniel/translate-prompts on Apr 15, 2026
@dzautner

Adds English MATH-500 eval (mmath500:en) alongside the existing Finnish one, and fixes the inspect backend so it can actually load multilingual tasks.

  • mmath500:en task config using HuggingFaceH4/MATH-500 with Qwen3.5-9B scorer (no reasoning)
  • Translated prompt templates for maime and mmath500
  • Deprecation warning on old math_500 task
  • Fix inspect backend hardcoding load_multilingual=False and custom_tasks=None — now respects the CLI flags like all other backends do
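The backend fix in the last bullet can be illustrated with a minimal sketch. It is not the code from this PR: `RegistryOptions`, `inspect_backend_before`, and `inspect_backend_after` are hypothetical names, and only the two options `load_multilingual` and `custom_tasks` come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegistryOptions:
    """Hypothetical container for the options a task registry is built with."""

    load_multilingual: bool = False
    custom_tasks: Optional[str] = None


def inspect_backend_before(load_multilingual: bool, custom_tasks: Optional[str]) -> RegistryOptions:
    # Behaviour described as the bug: the caller's values are ignored and
    # the options are hardcoded, so multilingual/custom tasks never load.
    return RegistryOptions(load_multilingual=False, custom_tasks=None)


def inspect_backend_after(load_multilingual: bool, custom_tasks: Optional[str]) -> RegistryOptions:
    # Behaviour described as the fix: forward whatever the CLI passed in.
    return RegistryOptions(load_multilingual=load_multilingual, custom_tasks=custom_tasks)
```

With the hardcoded version, a request for a multilingual task such as mmath500:en reaches the registry with `load_multilingual=False`, so the task cannot be found regardless of the flags given on the command line.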

Daniel Zautner added 2 commits April 15, 2026 11:33

  • Add English MATH-500 under multilingual task definitions (mmath500:en) with configurable scorer model via env vars, unifying naming with mmath500:fi. Deprecate the original math_500 task with a warning.
  • The inspect backend hardcoded load_multilingual=False and custom_tasks=None, making it impossible to run multilingual or custom tasks through it.
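A rough sketch of the two ideas in the first commit, an env-var-configurable scorer and a deprecation warning on the old task name, might look like the following. The environment variable name `MMATH500_SCORER_MODEL` and both helper functions are assumptions made for illustration; the actual variable names used in this PR are not shown in the conversation.

```python
import os
import warnings

# Scorer named in the PR description; used here as the fallback value.
DEFAULT_SCORER = "Qwen3.5-9B"


def scorer_model(env=None) -> str:
    """Return the scorer model, preferring a (hypothetical) env var override."""
    env = os.environ if env is None else env
    return env.get("MMATH500_SCORER_MODEL", DEFAULT_SCORER)


def resolve_task_name(name: str) -> str:
    """Warn when the old task name is requested; the name itself is unchanged."""
    if name == "math_500":
        warnings.warn(
            "math_500 is deprecated; consider mmath500:en instead",
            DeprecationWarning,
            stacklevel=2,
        )
    return name
```

Reading the scorer from the environment keeps the task definition itself unchanged when a different judge model is needed, and emitting a `DeprecationWarning` rather than raising lets existing math_500 runs keep working while pointing users at the new name.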
@dzautner
Author

this also has the changes from the pr i sent upstream at huggingface#1199

@dzautner dzautner merged commit f9ab622 into main Apr 15, 2026
3 of 5 checks passed
2 participants