Commit 4217765

update README

beanbun committed
1 parent 39e2352 commit 4217765

1 file changed

Lines changed: 9 additions & 0 deletions

README.md

@@ -99,7 +99,16 @@ Here is post-training result which **over 50% SFT data** comes from GraphGen and
 | Math | AIME24 | **20.6** | 16.7 |
 | | AIME25 | **22.7** | 7.2 |
 
+### RLVR
+We applied reinforcement learning directly to the Qwen2.5-7B base model without any prior SFT. Here are the results.
+| Domain | Dataset | Ours | Qwen2.5-7B-Instruct (baseline) |
+|:---------:|:---------------------------------------------------------:|:--------:|:------------------------------:|
+| Plant | SeedBench | **66.8** | 51.5 |
+| law | LawBench | **55.2** | 54.76 |
+| Medicine | MedQA | **87.1** | 80.7 |
+| General | BBH | **55.3** | 49.6 |
 
+More details can be found at `examples/generate/generate_masked_fill_in_blank_qa`
 
 ## ⚙️ Support List
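
The absolute gains implied by the RLVR table added in this diff can be checked with a few lines of Python. This is a minimal sketch and not part of the commit: the scores are copied from the added rows, while the names `scores` and `absolute_gains` are hypothetical and chosen only for illustration.

```python
# Scores copied from the RLVR table added in this commit:
# benchmark -> (ours, Qwen2.5-7B-Instruct baseline)
scores = {
    "SeedBench": (66.8, 51.5),
    "LawBench": (55.2, 54.76),
    "MedQA": (87.1, 80.7),
    "BBH": (55.3, 49.6),
}


def absolute_gains(table):
    """Return ours-minus-baseline per benchmark, rounded to 2 decimals."""
    return {name: round(ours - base, 2) for name, (ours, base) in table.items()}


gains = absolute_gains(scores)
for name, gain in gains.items():
    # Print each benchmark with a signed point difference.
    print(f"{name}: {gain:+.2f}")
```

On these numbers the difference is largest on SeedBench and smallest on LawBench.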
