Commit c45384c: "update docs"

1 parent 5003207 commit c45384c

20 files changed (+66, -186 lines)

README.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -67,7 +67,7 @@ uv pip install flash_attn==2.8.1 --no-build-isolation --no-cache-dir
 You can start training your first agent with a single command using a pre-configured YAML file. Take the [Math agent](docs/en/example_math_agent.md) as an example:
 
 ```bash
-ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='trinity' --with-ray
+ajet --conf tutorial/example_math_agent/math_agent.yaml
 ```
 
 #### Example Library
````

docs/en/example_app_world.md

Lines changed: 6 additions & 6 deletions
````diff
@@ -43,7 +43,7 @@ export APPWORLD_SCRIPT="bash EnvService/env_sandbox/appworld.sh"
 Run the training script:
 
 ```bash
-ajet --conf tutorial/example_appworld/appworld.yaml --backbone='trinity' --with-ray
+ajet --conf tutorial/example_appworld/appworld.yaml --with-appworld
 ```
 
 <details>
@@ -90,13 +90,13 @@ This section explains how the AppWorld example is assembled: workflow, reward, c
 
 The AgentScope workflow code for the AppWorld example is located at `tutorial/example_appworld/appworld.py`.
 
-The code first defines the AgentScope workflow (set the agent's `model` to `model_tuner`):
+The code first defines the AgentScope workflow (set the agent's `model` to `tuner.as_agentscope_model()`):
 
 ```python
 agent = ReActAgent(
     name="Qwen",
     sys_prompt=first_msg["content"],
-    model=model_tuner,
+    model=tuner.as_agentscope_model(),
     formatter=DashScopeChatFormatter(),
     memory=InMemoryMemory(),
     toolkit=None,
@@ -105,7 +105,7 @@ agent = ReActAgent(
 
 env = workflow_task.gym_env
 
-for step in range(model_tuner.config.ajet.rollout.multi_turn.max_steps):
+for step in range(tuner.config.ajet.rollout.multi_turn.max_steps):
     # agentscope deal with interaction message
     reply_message = await agent(interaction_message)
     # env service protocol
@@ -117,14 +117,14 @@ for step in range(model_tuner.config.ajet.rollout.multi_turn.max_steps):
     # is terminated?
     if terminate:
         break
-    if model_tuner.get_context_tracker().context_overflow:
+    if tuner.get_context_tracker().context_overflow:
         break
 ```
 
 In the above code:
 
 - `env.step`: simulates the gym interface. It takes an action as input and returns a four-tuple `(observation, reward, terminate_flag, info)`.
-- `model_tuner.get_context_tracker().context_overflow`: checks whether the current context window has exceeded the token limit.
+- `tuner.get_context_tracker().context_overflow`: checks whether the current context window has exceeded the token limit.
 
 ### 3.2 Reward
````
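The rollout loop touched by this hunk follows a simple gym-style contract: `env.step(action)` returns a four-tuple `(observation, reward, terminate_flag, info)`, the loop is capped at a maximum step count, and it also breaks when the context tracker reports overflow. A minimal self-contained sketch of that contract, using a hypothetical `ToyEnv` and a stubbed overflow check in place of the real ajet/AgentScope objects:

```python
# Sketch of the gym-style rollout loop. ToyEnv is an illustrative
# stand-in for the real environment service; the real code drives an
# AgentScope ReActAgent rather than picking actions from a list.

class ToyEnv:
    """Gym-style env: step(action) -> (observation, reward, terminate, info)."""

    def __init__(self, target: str):
        self.target = target

    def step(self, action: str):
        done = action == self.target
        reward = 1.0 if done else 0.0
        obs = "solved" if done else "try again"
        return obs, reward, done, {}

def rollout(env, actions, max_steps=8, context_overflow=lambda: False):
    """Interact for at most max_steps, stopping on termination or overflow."""
    total_reward, obs = 0.0, "start"
    for step, action in zip(range(max_steps), actions):
        obs, reward, terminate, info = env.step(action)
        total_reward += reward
        if terminate:            # episode finished
            break
        if context_overflow():   # analogue of the context-overflow check
            break
    return obs, total_reward

obs, total = rollout(ToyEnv("open app"), ["look", "open app", "noop"])
```

Here the second action hits the target, so the loop terminates after two steps with `obs == "solved"` and a total reward of 1.0.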

docs/en/example_countdown.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -84,7 +84,7 @@ See details in `tutorial/example_countdown/countdown.py`. You can create new Age
 self.agent = ReActAgent(
     name="countdown_react_agent",
     sys_prompt=system_prompt,
-    model=model_tuner,
+    model=tuner.as_agentscope_model(),
     formatter=DashScopeChatFormatter(),
     memory=InMemoryMemory(),
     max_iters=2,
````

docs/en/example_frozenlake.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -23,13 +23,13 @@ pip install gymnasium[toy_text]
 Use the provided configuration file to quickly start training:
 
 ```bash
-ajet --conf tutorial/example_frozenlake/frozenlake_easy.yaml --backbone='trinity' --with-ray
+ajet --conf tutorial/example_frozenlake/frozenlake_easy.yaml --backbone='verl'
 ```
 
 To try a harder setting:
 
 ```bash
-ajet --conf tutorial/example_frozenlake/frozenlake_hard.yaml --backbone=trinity --with-ray
+ajet --conf tutorial/example_frozenlake/frozenlake_hard.yaml --backbone='verl'
 ```
 
 <details>
````

docs/en/example_learning_to_ask.md

Lines changed: 2 additions & 0 deletions
````diff
@@ -42,6 +42,8 @@ After preprocessing, you should have: `train.jsonl` and`test.jsonl`。
 #### 2.2 Start Training
 
 ```bash
+ajet --conf tutorial/example_learn2ask/learn2ask.yaml --backbone='verl'
+# or
 ajet --conf tutorial/example_learn2ask/learn2ask.yaml --backbone='trinity' --with-ray
 ```
````

docs/en/example_math_agent.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -35,7 +35,7 @@ python scripts/download_dataset.py --target=openai/gsm8k --path=/the/path/to/sto
 # (optional) recommended cleanup before training
 # ajet --kill="python|ray|vllm"
 
-ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='trinity' --with-ray
+ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='verl'
 ```
 
 ??? tip "Quick Debugging (Optional)"
@@ -135,7 +135,7 @@ self.toolkit.register_tool_function(execute_python_code)
 self.agent = ReActAgent(
     name="math_react_agent",
     sys_prompt=system_prompt,
-    model=model_tuner,  # trainer-managed model wrapper
+    model=tuner.as_agentscope_model(),  # trainer-managed model wrapper
     formatter=DashScopeChatFormatter(),
     toolkit=self.toolkit,
     memory=InMemoryMemory(),
````

docs/en/example_tracing_feedback_loop.md

Lines changed: 2 additions & 1 deletion
````diff
@@ -75,7 +75,8 @@ ajet:
 When everything is ready, start the training with `launcher.py`.
 
 ```bash
-# this launch the demo
+ajet --conf tutorial/example_feedback_tracing/example_feedback_tracing.yaml --backbone='verl'
+# or
 ajet --conf tutorial/example_feedback_tracing/example_feedback_tracing.yaml --backbone='trinity' --with-ray
 ```
````

docs/en/example_werewolves.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -26,7 +26,7 @@ Scenario Overview
 Start training with the following command:
 ```
 # ( ajet --kill="python|ray|vllm" )
-ajet --conf tutorial/example_werewolves/werewolves.yaml --backbone='trinity' --with-ray
+ajet --conf tutorial/example_werewolves/werewolves.yaml --backbone='verl'
 ```
 
 <details>
@@ -72,9 +72,9 @@ When `--backbone=debug`, Ray is disabled. You can use a VSCode `.vscode/launch.j
 At a high level, each training iteration follows this flow:
 - The task reader generates a new game setup (players, role assignments, initial state).
 - The rollout runs the AgentScope workflow to simulate a full game.
-- Agents in `trainable_targets` act using the trainable model (via `model_tuner`), while opponents use the fixed model.
+- Agents in `trainable_targets` act by using the trainable model (via `tuner.as_agentscope_model(...)`), while opponents use the fixed model.
 - The environment produces rewards / outcomes for the episode.
-- Trajectories are collected and passed to the backbone trainer (e.g., `trinity`) to update the trainable model.
+- Trajectories are collected and passed to the backbone trainer (`verl` or `trinity`) to update the trainable model.
 
 ### 3.2 Configuration Details
````
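The per-iteration flow described in the werewolves doc (task reader, rollout, reward, trainer update) can be sketched in plain Python. All names below (`task_reader`, `rollout`, `train_step`) are illustrative stand-ins, not the ajet API; they only mirror the described cycle.

```python
import random

# Hypothetical sketch of one training iteration: generate a game setup,
# roll out an episode, and collect (trajectory, reward) for the trainer.

def task_reader(seed: int) -> dict:
    """Generate a new game setup: players and role assignments."""
    rng = random.Random(seed)
    players = [f"player_{i}" for i in range(6)]
    roles = {p: rng.choice(["werewolf", "villager"]) for p in players}
    return {"players": players, "roles": roles}

def rollout(task: dict):
    """Simulate a full game; in ajet this runs the AgentScope workflow."""
    # A real rollout records (state, action) pairs produced by the
    # trainable agents; here the episode is hard-coded for illustration.
    trajectory = [("night_1", "vote player_2"), ("day_1", "defend")]
    reward = float(len(trajectory))  # stand-in for the episode outcome
    return trajectory, reward

def train_step(buffer: list, seed: int) -> float:
    """One iteration: task -> rollout -> collect for the backbone trainer."""
    task = task_reader(seed)
    trajectory, reward = rollout(task)
    buffer.append((trajectory, reward))  # later consumed by the trainer
    return reward

buffer = []
reward = train_step(buffer, seed=0)
```

After one call, the buffer holds a single `(trajectory, reward)` pair; a backbone trainer such as `verl` or `trinity` would consume batches of these to update the trainable model.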

docs/en/intro.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -51,7 +51,7 @@ We recommend using `uv` for dependency management.
 You can start training your first agent with a single command using a pre-configured YAML file:
 
 ```bash
-ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='trinity' --with-ray
+ajet --conf tutorial/example_math_agent/math_agent.yaml
 ```
 
 !!! example "Learn More"
````

docs/en/introduction.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -51,7 +51,7 @@ uv pip install flash_attn==2.8.1 --no-build-isolation --no-cache-dir
 You can start training your first agent with a single command using a pre-configured YAML file. Take the [Math agent](./example_math_agent.md) as an example:
 
 ```bash
-ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='trinity' --with-ray
+ajet --conf tutorial/example_math_agent/math_agent.yaml
 ```
 
 #### Example Library
````
