Commit 879a3da

Update authors in das post under
1 parent 2731b77 commit 879a3da

1 file changed

Lines changed: 2 additions & 0 deletions

content/posts/das.md

@@ -15,6 +15,8 @@ Author: Vikranth Srivatsa, Yiying Zhang
 
 *This is joint work with [Together AI](https://www.together.ai/).*
 
+*Paper by Zelei Shao, Vikranth Srivatsa, Sanjana Srivastava, Qingyang Wu, Alpay Ariyak, Xiaoxia Wu, Ameen Patel, Jue Wang, Percy Liang, Tri Dao, Ce Zhang, Yiying Zhang, Ben Athiwaratkun, Chenfeng Xu, and Junxiong Wang.*
+
 **TLDR**: Reinforcement learning (RL) post-training spends most of its wall-clock time in the "rollout" phase generating answers, and a few very long generations dominate every training step. We designed [DAS [MLSys '26]](https://arxiv.org/abs/2511.13841), a distribution-aware speculative decoding framework that speeds up RL rollouts without changing what the model learns. DAS uses a training-free drafter that continually rebuilds itself from recent rollouts and spends its speculation budget on the long generations that set the pace, cutting **rollout time by up to 50%** while keeping the training curve identical to the baseline.
 
 ## RL Post-Training Is Bottlenecked by the Rollout
