Recipes

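| Model | Method |
| --- | --- |
| DeepSeek-R1-Distill-Qwen-1.5B | Best-of-N w/ original decoding |
| DeepSeek-R1-Distill-Qwen-1.5B | Best-of-N w/ CyclicReflex |
| DeepSeek-R1-Distill-Qwen-1.5B | Beam search w/ original decoding |
| DeepSeek-R1-Distill-Qwen-1.5B | Beam search w/ CyclicReflex |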

Testing

Each approach can be launched by specifying the associated YAML file, for example:

export CONFIG=recipes/DeepSeek-R1-Distill-Qwen-1.5B/best_of_n_cyclical.yaml

python scripts/test_time_compute.py $CONFIG --dataset_name=HuggingFaceH4/MATH-500 --dataset_split=train
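
To run every recipe for a model rather than a single one, the launch command can be generated per YAML file. The following is a minimal sketch, not part of this repository: `build_commands` is a hypothetical helper, and it assumes each recipe is a `*.yaml` file directly inside the model's recipe directory and accepts the same `--dataset_name` and `--dataset_split` flags as the command shown here. It only prints the commands (a dry run), so nothing is executed.

```python
import shlex
from pathlib import Path


def build_commands(recipe_dir,
                   dataset_name="HuggingFaceH4/MATH-500",
                   dataset_split="train"):
    """Return one argv list per YAML recipe in recipe_dir, sorted by file name.

    Hypothetical helper: assumes every *.yaml file in recipe_dir is a valid
    config for scripts/test_time_compute.py.
    """
    commands = []
    for config in sorted(Path(recipe_dir).glob("*.yaml")):
        commands.append([
            "python",
            "scripts/test_time_compute.py",
            str(config),
            f"--dataset_name={dataset_name}",
            f"--dataset_split={dataset_split}",
        ])
    return commands


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for argv in build_commands("recipes/DeepSeek-R1-Distill-Qwen-1.5B"):
        print(shlex.join(argv))
```

Each printed line can be pasted into a shell, or the argv lists can be passed to `subprocess.run` once the output looks right.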

Extracting the MATH-500 accuracy numbers

To compute the final evaluation numbers, we use a fork of the Qwen2.5-Math evaluation repo. Follow the installation and usage instructions in that fork to obtain accuracies on MATH-500.