Fix link in README for ByteDance Seed paper

ChenZiHong-Gavin · web-flow · commit 34c5d79a8f5a · 2026-03-11T19:19:01.000+08:00
diff --git a/README.md b/README.md
@@ -75,7 +75,7 @@ After data generation, you can use [LLaMA-Factory](https://github.com/hiyouga/LL
 ## Effectiveness of GraphGen
 ### Pretrain
 
-Inspired by Kimi-K2's [technical report](https://arxiv.org/pdf/2507.20534) (Improving Token Utility with Rephrasing)  and ByteDance Seed's [Reformulation for Pretraining Data Augmentation](https://arxiv.org/abs/2507.15752) (MGA framework), GraphGen added a **rephrase pipeline** — using LLM-driven reformulation to generate diverse variants of the same corpus instead of redundant repetition.
+Inspired by Kimi-K2's [technical report](https://arxiv.org/pdf/2507.20534) (Improving Token Utility with Rephrasing)  and ByteDance Seed's [Reformulation for Pretraining Data Augmentation](https://arxiv.org/abs/2502.04235) (MGA framework), GraphGen added a **rephrase pipeline** — using LLM-driven reformulation to generate diverse variants of the same corpus instead of redundant repetition.
 
 **Setup:** Qwen3-0.6B trained from scratch on [SlimPajama-6B](https://huggingface.co/datasets/DKYoon/SlimPajama-6B).