
Commit 69c62eb: "documentation revision"

Parent: 3778979

6 files changed: 27 additions & 26 deletions

.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -161,6 +161,7 @@ flash_attn-2.8.*.whl
 tutorial/example_deep_finance/prepare_data/*
 tutorial/example_deep_finance/judge/analytical_sufficiency/*
 tutorial/example_deep_finance/output_report/*
+tutorial/opencode_build_countdown_agent/countdown_dataset
 dataset_gsm8k/*

 .dockerignore
```

README.md

Lines changed: 7 additions & 7 deletions

````diff
@@ -37,7 +37,7 @@ Let's begin with the simplest example: a math agent with a tool call. This is a

 Let's begin with the simplest AgentJet Swarm example: also a math agent. In this case, you can use any GPU-less laptop to train the model remotely.

-1. Start swarm server and begin swarm overwatch: `ajet-swarm start` and `ajet-swarm overwatch`.
+1. Start swarm server and begin swarm overwatch: `ajet-swarm start` and `ajet-swarm overwatch`. (Alternative: if you are a fan of docker, use our [prebuilt docker image here](docs/en/ajet-swarm-docker.md) without setting up dependencies)
 2. From your laptop (or swarm server localhost), run [this simple script](https://github.com/modelscope/AgentJet/blob/main/tutorial/example_math_swarm/math.py) to begin training:
 ```python
 AJET_SWARM_URL="http://swarm-server-ip:10086" python ./tutorial/example_math_swarm/math.py
@@ -49,12 +49,12 @@ Let's begin with the simplest AgentJet Swarm example: also a math agent. In this

 ## ✈️ Features

-We aim to build a easy-to-learn Agent tuner that unlock more possibilities for agent developers:
+We aim to build an easy-to-learn Agent tuner that unlocks more possibilities for agent developers:

 - **Easy and Friendly**. AgentJet helps you tune models behind your agent workflows easily, optimizing your agents for top performance with minimal effort.
 - **Rich Tutorial Library**. AgentJet provides a rich library of [examples](https://github.com/modelscope/AgentJet/tree/main/tutorial) as tutorials.
-- **Swarm Training**. [This unique feature](https://modelscope.github.io/AgentJet/en/swarm_intro_blog_english/) of AgentJet opens many possibilities: deploying distributed & self-healing rollout workers, **non-shared-parameter multi-agent** training, **multi-runtime & multi-task cocktail** training. And just like Tinker, you can use AgentJet Swarm to train **models even on **GPU-less laptop(s)**.
-- **Efficient and Scalable**. AgentJet uses [verl] as the default backbone (`--backbone=verl`). However, we also support trinity as alternative backbone, accelerating your tuning process via fully asynchronous RFT.
+- **Swarm Training**. [This unique feature](https://modelscope.github.io/AgentJet/en/swarm_intro_blog_english/) of AgentJet opens many possibilities: deploying distributed & self-healing rollout workers, **non-shared-parameter multi-agent** training, **multi-runtime & multi-task cocktail** training. And just like Tinker, you can use AgentJet Swarm to train models even on **GPU-less laptop(s)**.
+- **Efficient and Scalable**. AgentJet uses [verl] as the default backbone (`--backbone=verl`). However, we also support trinity as an alternative backbone, accelerating your tuning process via fully asynchronous RFT.
 - **Flexible and Fast**. AgentJet supports [multi-agent workflows](https://modelscope.github.io/AgentJet/en/workflow/) and adopts a context merging technique, accelerating training by 1.5x to 10x when the workflow involves multi-turn (or multi-agent) conversations.
 - **Reliability and Reproducibility**. Our team keeps track of framework performance across multiple [tasks + major-git-version + training-backbones](https://benchmark.agentjet.top/) (under construction, still gathering data, coming soon).

@@ -129,13 +129,13 @@ The internal system orchestrates several specialized modules to handle the compl
 * **Task Runner**: Executes the Agent workflow and calculates rewards.
 * **Model Tuner**: Forwards inference requests from the workflow to the LLM engine.
 * **Context Tracker**: Monitors LLM calls and automatically merges shared-history timelines to improve training efficiency by **1.5x to 10x**.
-* **Swarm Server**: A data interchange center that accept OpenAI-like requests and engine instructions, activated only in AgentJet Swarm mode.
+* **Swarm Server**: A data interchange center that accepts OpenAI-like requests and engine instructions, activated only in AgentJet Swarm mode.

 #### 3. Swarm Architecture

-When enabled swarm training mode, an additional component will be activated:
+When swarm training mode is enabled, an additional component will be activated:

-* **Swarm Data Interchange Server**: Maintains HTTP service, listen to swarm instructions and openai compatible requests. Establishing high-speed zmq communication channel to coordinate other modules.
+* **Swarm Data Interchange Server**: Maintains HTTP service, listens to swarm instructions and OpenAI compatible requests. Establishes a high-speed zmq communication channel to coordinate other modules.

 <div align="center">
 <img width="400" alt="image" src="https://serve.gptacademic.cn/publish/shared/Image/arch.jpg"/>
````
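Editor's note: the "1.5x to 10x" context-merging claim in the README hunk above can be made concrete with a toy calculation. This is an illustrative sketch of the idea only, not AgentJet's actual Context Tracker code; the numbers are made up.

```python
# Why merging shared-history timelines saves compute: in a multi-turn rollout,
# each turn's prompt contains the full conversation so far. Training each turn
# as a separate sample re-encodes the shared prefix over and over; merging the
# turns into one timeline (with loss masks per turn) encodes it once.

def naive_tokens(turns):
    # One training sample per turn: every history is encoded in full.
    return sum(len(history) for history in turns)

def merged_tokens(turns):
    # One merged timeline: the final turn's history already contains all turns.
    return len(turns[-1])

# Toy 3-turn conversation; each entry is the full token history at that turn.
turns = [
    list(range(100)),   # turn 1: 100 tokens
    list(range(250)),   # turn 2: shares turn 1's 100-token prefix
    list(range(400)),   # turn 3: shares turn 2's 250-token prefix
]
print(naive_tokens(turns) / merged_tokens(turns))  # prints 1.875
```

With longer conversations or more agents sharing history, the ratio grows, which is consistent with the README's 1.5x~10x range.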

docs/en/ajet-swarm-docker.md

Lines changed: 4 additions & 4 deletions

````diff
@@ -10,7 +10,7 @@ This guide explains how to launch the **AgentJet Swarm Server** inside a Docker
 | Requirement | Detail |
 |---|---|
 | Docker | With GPU support (`nvidia-container-toolkit`) |
-| AgentJet Docker image | `ajet:latest` (built from the AgentJet repository) |
+| AgentJet Docker image | `ghcr.io/modelscope/agentjet:main` (built from the AgentJet repository) |
 | LLM model weights | Downloaded locally (e.g., `Qwen2.5-7B-Instruct`) |


@@ -26,7 +26,7 @@ docker run --rm -it \
   -p 10086:10086 \
   --gpus=all \
   --shm-size=32GB \
-  ajet:latest \
+  ghcr.io/modelscope/agentjet:main \
   bash -c "(ajet-swarm overwatch) & (NO_COLOR=1 LOGURU_COLORIZE=NO ajet-swarm start &>/workspace/log/swarm_server.log)"
 ```

@@ -48,7 +48,7 @@ And when completed, you will see a interface like this, which means the deployme
 | `-v /path/to/host/Qwen/Qwen2.5-7B-Instruct:/Qwen/Qwen2.5-7B-Instruct` | **Model mount** — mounts your local model weights directory into the container. The path inside the container must match the `model` field you configure in your training job. |
 | `-v ./swarmlog:/workspace/log` | **Log mount** — mounts a local `./swarmlog` directory to persist server logs outside the container. The VERL training log is written here. |
 | `-p 10086:10086` | **Port mapping** — exposes port `10086` so that Swarm Clients on other machines can reach the server via `http://<server-ip>:10086`. |
-| `ajet:latest` | The AgentJet Docker image. |
+| `ghcr.io/modelscope/agentjet:main` | The AgentJet Docker image. |
 | `bash -c "..."` | Runs two processes concurrently inside the container (see below). |


@@ -91,7 +91,7 @@ docker run --rm -it \
   -p 10086:10086 \
   --gpus=all \
   --shm-size=32GB \
-  ajet:latest \
+  ghcr.io/modelscope/agentjet:main \
   bash -c "(ajet-swarm overwatch) & (NO_COLOR=1 LOGURU_COLORIZE=NO ajet-swarm start &>/workspace/log/swarm_server.log)"
 ```
````
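Editor's note: since the docker guide above maps port `10086` for remote Swarm Clients, a client laptop may want to wait until that port is actually accepting connections before launching a training script. The helper below is a generic readiness probe written for this walkthrough, not an AgentJet API; AgentJet itself may already handle connection retries.

```python
# Generic TCP readiness probe: poll host:port until a connection succeeds or a
# deadline passes. Useful before pointing a client at http://<server-ip>:10086.
import socket
import time

def wait_for_port(host: str, port: int, timeout_s: float = 60.0,
                  interval_s: float = 2.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            time.sleep(interval_s)  # server not up yet; retry until the deadline
    return False

# Example (hypothetical host name): wait_for_port("swarm-server-ip", 10086)
```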

docs/index.md

Lines changed: 5 additions & 5 deletions

```diff
@@ -17,7 +17,7 @@
 <h3>Get Started with Ease</h3>
 </div>
 <p class="card-desc">
-AgentJet simplifies the process of tuning the models that power your agent workflows. It supports nearly all major agent frameworks (e.g. <b>agentscope</b>, <b>langchain</b>), as well as <b>framwork-less</b> agents built from HTTP requests.
+AgentJet simplifies the process of tuning the models that power your agent workflows. It supports nearly all major agent frameworks (e.g. <b>agentscope</b>, <b>langchain</b>), as well as <b>framework-less</b> agents built from HTTP requests.
 </p>
 </a>
 <a href="#example-library" class="feature-card">
@@ -47,7 +47,7 @@
 </div>
 <p class="card-desc">
 Built to support advanced <b>multi-agent</b> and <b>multi-turn</b> LLM workflows,
-AgentJet intergrates timeline-merging algorithms that
+AgentJet integrates timeline-merging algorithms that
 automatically analyze and consolidate each agent's LLM timeline,
 <b>accelerating</b> training speed 1.5x ~ 10x.
 </p>
@@ -58,7 +58,7 @@
 <h3>High Resolution Logging</h3>
 </div>
 <p class="card-desc">
-Log <b>token-level</b> rollout details, capturing token IDs, token <b>loss masks</b>, and token <b>log probabilities</b> with <b>web UI display</b>. This Support workflow development, agent diagnostics, and facilitate research on advanced LLM algorithm studies.
+Log <b>token-level</b> rollout details, capturing token IDs, token <b>loss masks</b>, and token <b>log probabilities</b> with <b>web UI display</b>. This supports workflow development, agent diagnostics, and facilitates research on advanced LLM algorithm studies.
 </p>
 </a>
 <a href="en/installation/" class="feature-card">
@@ -67,7 +67,7 @@
 <h3>Any Training Engine</h3>
 </div>
 <p class="card-desc">
-Support <b>multiple training engines</b> as backbone (<b>VERL</b> and <b>Trinity-RFT</b>). Swarm backbone support will be released soon.
+Supports <b>multiple training engines</b> as backbone (<b>VERL</b> and <b>Trinity-RFT</b>). Swarm backbone support will be released soon.
 Choose from <b>vLLM</b> and <b>SGLang</b> as you wish. Say goodbye to training engine gaps.
 </p>
 </a>
@@ -149,7 +149,7 @@ The internal system orchestrates several specialized modules to handle the compl
 |--------|-------------|
 | **Launcher** | Manages background service processes (Ray, vLLM) and routes the backbone |
 | **Task Rollout** | Bridges LLM engines and manages the Gym environment lifecycle |
-| **Task Runner** | Executes the AgentScope workflow and calculates rewards |
+| **Task Runner** | Executes the agent workflow and calculates rewards |
 | **Model Tuner** | Forwards inference requests from the workflow to the LLM engine |
 | **Context Tracker** | Monitors LLM calls and automatically merges shared-history timelines (1.5x-10x efficiency boost) |
```
tutorial/opencode_build_countdown_agent/agent_roll.py

Lines changed: 3 additions & 3 deletions

```diff
@@ -7,11 +7,11 @@
 This script connects to the AgentJet Swarm server and trains the countdown agent.

 Usage:
-python -m tutorial.countdown_agent.agent_roll
+python -m tutorial.opencode_build_countdown_agent.agent_roll

 Before running:
 1. Start the swarm server: ajet-swarm start
-2. Ensure the dataset is generated: python tutorial/countdown_agent/generate_countdown_dataset.py
+2. Ensure the dataset is generated: python tutorial/opencode_build_countdown_agent/generate_countdown_dataset.py
 3. Update the configuration variables below to match your setup
 """

@@ -32,7 +32,7 @@
 # --------- Configurations that take effect locally -------------
 LOCAL_GRPO_N = 4 # GRPO group size (number of rollouts per task)
 LOCAL_NUM_EPOCH = 100 # Number of training epochs
-LOCAL_DATASET_PATH = "./tutorial/countdown_agent/countdown_dataset/train.jsonl"
+LOCAL_DATASET_PATH = "./tutorial/opencode_build_countdown_agent/countdown_dataset/train.jsonl"
 REMOTE_SWARM_URL = "http://localhost:10086" # Swarm server URL

 # --------- Configurations that take effect remotely (on swarm server) -------------
```
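Editor's note: `LOCAL_GRPO_N` in the hunk above is the GRPO group size, i.e. how many rollouts are sampled per task. The sketch below illustrates the standard GRPO idea of standardizing each rollout's reward against its group; it is a simplified illustration, not AgentJet's internal implementation.

```python
# GRPO group advantages: roll out each task N times, then score each rollout by
# how its reward compares to the group's mean, scaled by the group's std dev.
from statistics import mean, pstdev

def grpo_advantages(group_rewards):
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards) or 1.0  # avoid division by zero for uniform groups
    return [(r - mu) / sigma for r in group_rewards]

# LOCAL_GRPO_N = 4 rollouts of one task: two succeeded (reward 1.0), two failed.
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # → [1.0, -1.0, -1.0, 1.0]
```

Larger groups give a lower-variance baseline per task but multiply rollout cost, which is why it is exposed as a tunable here.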

tutorial/opencode_build_countdown_agent/readme.md

Lines changed: 7 additions & 7 deletions

````diff
@@ -25,7 +25,7 @@ Possible solution: 25 * (6 - 2) = 100
 ## Project Structure

 ```
-tutorial/countdown_agent/
+tutorial/opencode_build_countdown_agent/
 ├── agent_run.py # Intelligent agent execution and reward calculation
 ├── agent_roll.py # Training loop script
 ├── generate_countdown_dataset.py # Dataset generation script
@@ -97,10 +97,10 @@ First, generate the training data:

 ```bash
 cd /root/agentjet
-python tutorial/countdown_agent/generate_countdown_dataset.py
+python tutorial/opencode_build_countdown_agent/generate_countdown_dataset.py
 ```

-This will generate training data in the `tutorial/countdown_agent/countdown_dataset/` directory.
+This will generate training data in the `tutorial/opencode_build_countdown_agent/countdown_dataset/` directory.

 ### Step 2: Start Swarm Server

@@ -124,7 +124,7 @@ Edit the configuration parameters in `agent_roll.py`:
 # Local configuration
 LOCAL_GRPO_N = 4 # GRPO group size
 LOCAL_NUM_EPOCH = 100 # Number of training epochs
-LOCAL_DATASET_PATH = "./tutorial/countdown_agent/countdown_dataset/train.jsonl"
+LOCAL_DATASET_PATH = "./tutorial/opencode_build_countdown_agent/countdown_dataset/train.jsonl"
 REMOTE_SWARM_URL = "http://localhost:10086" # Swarm server URL

 # Remote configuration (effective on the Swarm server)
@@ -138,7 +138,7 @@ REMOTE_TRAIN_MODEL = '/mnt/data_cpfs/model_cache/modelscope/hub/Qwen/Qwen/Qwen2.
 Run the training script:

 ```bash
-python -m tutorial.countdown_agent.agent_roll
+python -m tutorial.opencode_build_countdown_agent.agent_roll
 ```

 ## Training Configuration Instructions
@@ -197,7 +197,7 @@ You can run `agent_run.py` separately to test a single problem:

 ```python
 from ajet.schema.task import Task
-from tutorial.countdown_agent.agent_run import run_agent_and_compute_reward
+from tutorial.opencode_build_countdown_agent.agent_run import run_agent_and_compute_reward

 task = Task(
 main_query="Target: 100, Numbers: [25, 3, 6, 2, 5, 1]",
@@ -217,7 +217,7 @@ task = Task(
 View generated example data:

 ```bash
-cat tutorial/countdown_agent/countdown_dataset/examples.json
+cat tutorial/opencode_build_countdown_agent/countdown_dataset/examples.json
 ```

 ### 3. Mid-process Debugging
````
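Editor's note: the readme's example task ("Target: 100, Numbers: [25, 3, 6, 2, 5, 1]", solution `25 * (6 - 2)`) suggests what a countdown reward check must verify: the expression uses only the given numbers (each at most once) and evaluates to the target. The tutorial's actual reward logic lives in `agent_run.py`; the function below is a hypothetical stand-in written for this note.

```python
# Hypothetical countdown reward check: 1.0 if the expression is valid arithmetic
# over the given number pool and hits the target, else 0.0.
import ast

def uses_only_given_numbers(expr: str, numbers: list) -> bool:
    # Collect every numeric literal and match it against the (multiset) pool.
    tree = ast.parse(expr, mode="eval")
    pool = list(numbers)
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant):
            if node.value in pool:
                pool.remove(node.value)
            else:
                return False
    return True

def countdown_reward(expr: str, target: int, numbers: list) -> float:
    try:
        ok = uses_only_given_numbers(expr, numbers)
        value = eval(compile(ast.parse(expr, mode="eval"), "<expr>", "eval"))
    except (SyntaxError, ZeroDivisionError):
        return 0.0
    return 1.0 if ok and value == target else 0.0

print(countdown_reward("25 * (6 - 2)", 100, [25, 3, 6, 2, 5, 1]))  # → 1.0
```

Note the `eval` here is acceptable for a local sketch but a production reward function would evaluate the AST directly rather than executing arbitrary expressions.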
