Summary
Add a user guide documenting best practices for using the ROCK Job system to collect Agent trajectories for model distillation training.
Motivation
Trajectory distillation (using a strong Teacher model's behavioral data to train a weaker Student model) is a common use case for the Job system, but there is currently no documentation covering this workflow end-to-end.
Scope
- User guide document (Chinese + English) under docs/versioned_docs/version-1.7.x/User Guides/
- Example code and config template under examples/trajectory_distillation/
- Covers: quick start (end-to-end runnable example), configuration details, trajectory data reference (result.json + trajectory.json), advanced usage (async mode, rejection sampling, DPO pairs)
- Validated against a real ROCK deployment with swe-agent on SWE-bench