InternScience · ChenZiHong-Gavin · Apr 13, 2026
diff --git a/README.md b/README.md
@@ -99,7 +99,16 @@ Here is post-training result which **over 50% SFT data** comes from GraphGen and
 |   Math    |                          AIME24                           | **20.6** |              16.7              |
 |           |                          AIME25                           | **22.7** |              7.2               |
 
+### RLVR
+We applied reinforcement learning with verifiable rewards (RLVR) directly to the Qwen2.5-7B base model, without any prior SFT. The table below compares the resulting model with the Qwen2.5-7B-Instruct baseline.
+|  Domain   |                          Dataset                          |   Ours   | Qwen2.5-7B-Instruct (baseline) |
+|:---------:|:---------------------------------------------------------:|:--------:|:------------------------------:|
+|   Plant   | [SeedBench](https://github.com/open-sciencelab/SeedBench) | **66.8** |              51.5              |
+|    Law    |                           LawBench                        | **55.2** |              54.76             |
+|  Medicine |                            MedQA                          | **87.1** |              80.7              |
+|  General  |                             BBH                           | **55.3** |              49.6              |
 
+For more details, see [`examples/generate/generate_masked_fill_in_blank_qa`](./examples/generate/generate_masked_fill_in_blank_qa).
 
 ## ⚙️ Support List