Request for Agentic RL training details (thank you for open-sourcing!)

I’m trying to reproduce the Agentic RL part, but some of the training details are not fully clear from the current docs.

**Would you consider open-sourcing the training code/details for the Agentic RL part to make it easier for others to reproduce and build on your results?**

Thanks again for the great open-source release!