
Commit b548324

Merge branch 'main' into dev/shuchang

2 parents d930ffb + 9389d31

File tree: 90 files changed (+4747 −5414 lines)


.gitignore

Lines changed: 2 additions & 0 deletions

@@ -149,3 +149,5 @@ appworld_pack_v2.tar*
 saved_checkpoints
 data
 datasets
+tutorial2
+site

README.md

Lines changed: 55 additions & 60 deletions

@@ -1,36 +1,46 @@
-# AgentJet
+# AgentJet (Beta)

 [![Benchmarking](https://img.shields.io/badge/Benchmarking-0078D4?style=for-the-badge&logo=github)](https://benchmark.agent-matrix.com/)
-[![Docs](https://img.shields.io/badge/Docs-Read%20the%20Guide-0A7ECC?style=for-the-badge&logo=readthedocs&logoColor=white)](docs/en/installation.md)
+[![Docs](https://img.shields.io/badge/Docs-Read%20the%20Documents-0A7ECC?style=for-the-badge&logo=readthedocs&logoColor=white)](https://doc.agentjet.top/AgentJet)
 [![License](https://img.shields.io/badge/License-Apache--2.0-4c1?style=for-the-badge)](LICENSE)
-[![Python](https://img.shields.io/badge/Python-3.10+-3776AB?style=for-the-badge&logo=python&logoColor=white)](docs/en/installation.md#requirements)
+[![Python](https://img.shields.io/badge/Python-3.10+-3776AB?style=for-the-badge&logo=python&logoColor=white)](https://doc.agentjet.top/AgentJet/en/installation#requirements)
+
+<div align="center">
+  <a href="https://doc.agentjet.top/AgentJet" target="_blank">
+    <img width="500" alt="AgentJet" src="docs/agentjet.jpg"/>
+  </a>
+</div>

-**AgentJet (AJet)** is a cutting-edge, user-friendly training framework designed to optimize agents and workflows (built with OpenAI SDK, AgentScope, and even vllm http requests), fine-tuning language model weights behind the scenes.

-Simply provide your Agent workflow, training data, and reward function, and we will be ready to enhance your agents to their optimal performance!
+**AgentJet (AJet)** is a cutting-edge, user-friendly training framework designed to optimize agents and workflows (built with OpenAI SDK, AgentScope, Langchain, or just HTTP requests), fine-tuning language model weights behind the scenes.

+Simply provide your agent **workflow**, training **dataset**, and **reward** function, and **AgentJet** will be ready to enhance your agents to their optimal performance!

-## 💡 Minimum Example
+## 🛩️ Minimum Example

 Let's begin with the simplest example: a math agent with a tool call.

-- First, please check out the [installation guide](docs/en/installation.md) to set up the training environment.
+- First, please check out the [installation guide](https://doc.agentjet.top/AgentJet/en/installation/) to set up the training environment.
 - Then, tune your first model using the minimum example.
 ```bash
-ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='verl' --with-ray
+ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='verl'
+
+# change to --backbone='trinity' if you want to switch to the trinity training engine,
+# or --backbone='debug' if you want to debug with only vLLM
 ```

-## Features
+## 🛩️ Features

 We aim to build an easy-to-learn agent tuner that unlocks more possibilities for agent developers:

 - **Easy and Friendly**. AgentJet helps you tune models behind your agent workflows easily, optimizing your agents for top performance with minimal effort.
 - **Rich Tutorial Library**. AgentJet provides a rich library of [examples](https://github.com/modelscope/AgentJet/tree/main/tutorial) as tutorials.
 - **Efficient and Scalable**. AgentJet uses [verl] as the default backbone (`--backbone=verl`). However, we also support [trinity](https://github.com/modelscope/Trinity-RFT/) as an alternative backbone, accelerating your tuning process via fully asynchronous RFT.
-- **Flexible and Fast**. AgentJet supports [multi-agent workflows](docs/en/workflow.md) and adopts a context merging technique, accelerating training by 1.5x to 20x when the workflow involves multi-turn (or multi-agent) conversations.
-- **Reliability and Reproducibility**. Our team keeps track of framework performance across multiple [tasks + major-git-version + training-backbones](https://benchmark.agent-matrix.com/) (under construction, still gathering data, comming soon).
+- **Flexible and Fast**. AgentJet supports [multi-agent workflows](https://doc.agentjet.top/AgentJet/en/workflow.md) and adopts a context merging technique, accelerating training by 1.5x to 10x when the workflow involves multi-turn (or multi-agent) conversations.
+- **Reliability and Reproducibility**. Our team keeps track of framework performance across multiple [tasks + major-git-version + training-backbones](https://benchmark.agent-matrix.com/) (under construction, still gathering data, coming soon).

 For advanced researchers, AgentJet also provides high-resolution logging and debugging solutions:
 <!-- For advanced researchers, AgentJet provides high-resolution logging and debugging solutions that are, to our knowledge, unprecedented in other prior projects. -->

@@ -40,51 +50,35 @@ For advanced researchers, AgentJet also provides high-resolution logging and deb

 ---

-### 🚀 Quick Start
+### 🛩️ Quick Start

 #### Installation

-We recommend using `uv` for dependency management.
-
-1. **Clone the Repository**:
-   ```bash
-   git clone https://github.com/modelscope/AgentJet.git
-   cd AgentJet
-   ```
-
-2. **Set up Environment**:
-   ```bash
-   uv venv --python=3.10.16 && source .venv/bin/activate
-   uv pip install -e .[trinity]
-   # Note: flash-attn must be installed after other dependencies
-   uv pip install flash_attn==2.8.1 --no-build-isolation --no-cache-dir
-   ```
-
+- **Click here to read the** [**installation guide**](https://doc.agentjet.top/AgentJet/en/installation/).

 #### Run Training

-You can start training your first agent with a single command using a pre-configured YAML file. Take the [Math agent](docs/en/example_math_agent.md) as an example:
+- You can start training your first agent with a single command using a pre-configured YAML file. Take the [Math agent](https://doc.agentjet.top/AgentJet/en/example_math_agent/) as an example:

-```bash
-ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='trinity' --with-ray
-```
+  ```bash
+  ajet --conf tutorial/example_math_agent/math_agent.yaml
+  ```

 #### Example Library

 Explore our rich library of examples to kickstart your journey:

-- 🔢 [**Training a math agent that can write python code**](docs/en/example_math_agent.md).
-- 📱 [**Creating an AppWorld agent using AgentScope and training it**](docs/en/example_app_world.md).
-- 🐺 [**Developing Werewolves RPG agents and training them**](docs/en/example_werewolves.md).
-- 👩🏻‍⚕️ [**Learning to ask questions like a doctor**](docs/en/example_learning_to_ask.md).
-- 🎴 [**Writing a countdown game using AgentScope and solving it**](docs/en/example_countdown.md).
-- 🚶 [**Solving a frozen lake walking puzzle using AgentJet**](docs/en/example_frozenlake.md).
+- 🔢 [**Training a math agent that can write python code**](https://doc.agentjet.top/AgentJet/en/example_math_agent).
+- 📱 [**Creating an AppWorld agent using AgentScope and training it**](https://doc.agentjet.top/AgentJet/en/example_app_world).
+- 🐺 [**Developing Werewolves RPG agents and training them**](https://doc.agentjet.top/AgentJet/en/example_werewolves).
+- 👩🏻‍⚕️ [**Learning to ask questions like a doctor**](https://doc.agentjet.top/AgentJet/en/example_learning_to_ask).
+- 🎴 [**Writing a countdown game using AgentScope and solving it**](https://doc.agentjet.top/AgentJet/en/example_countdown).
+- 🚶 [**Solving a frozen lake walking puzzle using AgentJet**](https://doc.agentjet.top/AgentJet/en/example_frozenlake).

 ---

-### 🧩 Core Concepts
+### 🛩️ Core Concepts

 AgentJet makes agent fine-tuning straightforward by separating the developer interface from the internal execution logic.

@@ -97,9 +91,9 @@ AgentJet makes agent fine-tuning straightforward by separating the developer int

 To optimize an agent, you provide three core inputs:

-* [**Trainable Workflow**](docs/en/workflow.md): Define your agent logic by inheriting the Workflow class, supporting both simple agent setups and advanced multi-agent collaborations.
-* [**Task Reader**](docs/en/data_pipeline.md): Load training tasks from JSONL files, HuggingFace datasets, interactive environments, or auto-generate them from documents.
-* [**Task Judger**](docs/en/task_judger.md): Evaluates agent outputs and assigns rewards to guide training.
+* [**Trainable Workflow**](https://doc.agentjet.top/AgentJet/en/workflow): Define your agent logic by inheriting the Workflow class, supporting both simple agent setups and advanced multi-agent collaborations.
+* [**Task Reader**](https://doc.agentjet.top/AgentJet/en/data_pipeline): Load training tasks from JSONL files, HuggingFace datasets, interactive environments, or auto-generate them from documents.
+* [**Task Judger**](https://doc.agentjet.top/AgentJet/en/task_judger): Evaluates agent outputs and assigns rewards to guide training.

 #### 2. Internal System Architecture

@@ -110,28 +104,29 @@ The internal system orchestrates several specialized modules to handle the compl
 * **Task Rollout**: Bridges LLM engines and manages the Gym environment lifecycle.
 * **Task Runner**: Executes the Agent workflow and calculates rewards.
 * **Model Tuner**: Forwards inference requests from the workflow to the LLM engine.
-* **Context Tracker**: Monitors LLM calls and automatically merges shared-history timelines to improve training efficiency by **3x to 10x**.
+* **Context Tracker**: Monitors LLM calls and automatically merges shared-history timelines to improve training efficiency by **1.5x to 10x**.
+

----

-### 🚦 Navigation
+### 🛩️ Navigation

-* 📖 **Tutorials**: From [Installation](docs/en/installation.md) to [Tuning your first agent](docs/en/tutorial.md) — the essential path for beginners.
-* 🛠️ **Core Components**: Define your [Trainable Workflow](docs/en/workflow.md) and manage [Data](docs/en/data_pipeline.md) and [Reward](docs/en/tune_your_first_agent.md).
-* 💡 **Example**: Check the [Example Library](#example-library) above for real-world cases like [Math](docs/en/example_math_agent.md), [Werewolves game](docs/en/example_werewolves.md) and [Learning to ask task](docs/en/example_learning_to_ask.md).
-* ⚙️ **Deep Dive**: Master advanced [Configuration](docs/en/configuration.md).
+* **Tutorials**: From [Installation](https://doc.agentjet.top/AgentJet/en/installation) to [Tuning your first agent](https://doc.agentjet.top/AgentJet/en/tune_your_first_agent) — the essential path for beginners.
+* **Core Components**: Define your [Trainable Workflow](https://doc.agentjet.top/AgentJet/en/workflow) and manage [Data](https://doc.agentjet.top/AgentJet/en/data_pipeline) and [Reward](https://doc.agentjet.top/AgentJet/en/task_judger).
+* **Example**: Check the [Example Library](#example-library) above for real-world cases like [Math](https://doc.agentjet.top/AgentJet/en/example_math_agent), [Werewolves game](https://doc.agentjet.top/AgentJet/en/example_werewolves) and [Learning to ask task](https://doc.agentjet.top/AgentJet/en/example_learning_to_ask).
+* **Deep Dive**: Master advanced [Configuration](https://doc.agentjet.top/AgentJet/en/configuration).

-## 🗺️ Roadmap
+## 🛩️ Roadmap

 AgentJet is a constantly evolving project. We are planning to add the following features in the near future.

-- [ ] Advanced LLM-based multi-agent reinforcement learning.
-- [ ] Training dataset generation from few-shot samples.
-- [ ] Prompt tuning.
-- [ ] Multi-modal training support.
-- [ ] Cross-process Tuner wrapper to pass though process forking.
-- [ ] Providing training → user feedback → data augmentation → retraining data flywheel example.
-- [ ] Optimize configurations for long-context adaptation on smaller GPUs.
-- [ ] Add LoRA training examples.
-- [ ] Covering LangGraph and AutoGen frameworks.
+| Category | Feature | Status |
+| :--- | :--- | :--- |
+| **Examples** | Covering LangGraph and AutoGen frameworks | Done & Verifying |
+| **Examples** | Add LoRA training examples | Todo |
+| **Infra** | Cross-process Tuner wrapper to pass through process forking | Done & Verifying |
+| **Infra** | Optimize configurations for long-context adaptation on smaller GPUs | In Progress |
+| **Capability** | Prompt tuning | In Progress |
+| **Capability** | Multi-modal training support | Todo |
+| **Capability** | MARL Credit assignment | Todo |
+| **Capability** | Training dataset generation from few-shot samples | Done & Verifying |
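The README above names three core inputs: a trainable workflow, a task reader, and a task judger. A minimal sketch of how the three pieces fit together, using hypothetical stand-in classes — the names and method signatures here are illustrative only, not AgentJet's actual API:

```python
# Hypothetical stand-ins for AgentJet's three core inputs. A real workflow
# would call the model being tuned; here we "solve" tasks with eval().
from dataclasses import dataclass


@dataclass
class Task:
    question: str
    answer: str


class TaskReader:
    """Stand-in reader: a real one would load JSONL/HF datasets."""

    def get_training_tasks(self):
        return [Task("1 + 1", "2"), Task("2 * 3", "6")]


class Workflow:
    """Stand-in trainable workflow: runs the agent logic for one task."""

    def run(self, task: Task) -> str:
        return str(eval(task.question))  # placeholder for an LLM call


class TaskJudger:
    """Stand-in judger: scores an output against the reference answer."""

    def reward(self, task: Task, output: str) -> float:
        return 1.0 if output == task.answer else 0.0


reader, workflow, judger = TaskReader(), Workflow(), TaskJudger()
rewards = [judger.reward(t, workflow.run(t)) for t in reader.get_training_tasks()]
print(rewards)
```

The trainer's job, conceptually, is to turn those per-task rewards into gradient updates for the model behind `Workflow.run`.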

ajet/__init__.py

Lines changed: 1 addition & 1 deletion

@@ -10,7 +10,7 @@
     "WorkflowOutput",
     "AjetTuner",
     "AgentJetJob",
-    "bp",
+    "bp"
 ]

 __version__ = "0.1.0"
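For context, `__all__` (the list edited above — the change only drops a trailing comma, which is cosmetic) controls what `from ajet import *` re-exports. A stdlib-only sketch using a stand-in module name, not the real `ajet` package:

```python
# Demonstrate __all__ semantics with a throwaway in-memory module.
import sys
import types

mod = types.ModuleType("ajet_stub")
exec(
    '__all__ = ["AjetTuner"]\n'
    'AjetTuner = "tuner"\n'
    '_internal = "hidden"',
    mod.__dict__,
)
sys.modules["ajet_stub"] = mod

ns = {}
exec("from ajet_stub import *", ns)  # star-import honors __all__
print(sorted(k for k in ns if not k.startswith("__")))
```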

ajet/backbone/main_verl.py

Lines changed: 2 additions & 3 deletions

@@ -22,6 +22,7 @@
 import hydra
 import ray
 from beast_logger import print_dict
+from loguru import logger
 from omegaconf import OmegaConf
 from verl.trainer.ppo.reward import load_reward_manager
 from verl.utils.device import is_cuda_available

@@ -112,7 +113,7 @@ def run(self, config):
         from omegaconf import OmegaConf
         from verl.utils.fs import copy_to_local

-        print(f"TaskRunner hostname: {socket.gethostname()}, PID: {os.getpid()}")
+        logger.info(f"TaskRunner hostname: {socket.gethostname()}, PID: {os.getpid()}")
         pprint(OmegaConf.to_container(config, resolve=True))
         OmegaConf.resolve(config)

@@ -148,8 +149,6 @@ def run(self, config):
             from verl.workers.fsdp_workers import CriticWorker
         elif use_legacy_worker_impl == "disable":
             from verl.workers.roles import CriticWorker
-
-            print("Using new worker implementation")
         else:
             raise ValueError(f"Invalid use_legacy_worker_impl: {use_legacy_worker_impl}")
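The recurring pattern in this commit is replacing bare `print()` calls with leveled logger calls. The stdlib `logging` module shows the same idea (loguru, used in the commit, offers the same `info`/`warning` interface plus extras such as `logger.success` and colorized sinks); the host name and PID below are illustrative:

```python
# Leveled logging instead of print(): messages carry a severity and a
# configurable format, and can be routed to any sink.
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

log = logging.getLogger("ajet.demo")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False  # keep the demo output out of the root logger

log.info("TaskRunner hostname: example-host, PID: 1234")
log.warning("tensor_parallel_size 4 is greater than available GPUs 2")
print(stream.getvalue(), end="")
```

Unlike `print()`, the warning above can later be filtered, redirected, or silenced by level without touching the call site.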

ajet/backbone/main_vllm.py

Lines changed: 4 additions & 5 deletions

@@ -10,6 +10,7 @@
 from ajet.utils.launch_utils import set_loguru_default_color
 from ajet.schema.logprob import TokenAndProb
 from ajet.utils.core_env_vars import get_runtime_env
+from loguru import logger

 set_loguru_default_color()

@@ -116,12 +117,11 @@ def run(config):
         config.ajet.task_reader,
     )
     tasks = task_reader.get_validation_tasks()
-    print(tasks[:2])
+    logger.info(tasks[:n_task])
     ctx_tracker = parallel_env.rollout(
         tasks=tasks[:n_task], mode="sample", epoch="1"
     )  # "sample" or "validate"
     _ = parallel_env.to_dataproto(ctx_tracker)
-    print("Generated batch output")

@@ -133,7 +133,6 @@ def main(config):
     from omegaconf import OmegaConf

     OmegaConf.resolve(config)
-    print("*" * 20)

     runtime_env = get_runtime_env()
     os.environ.update(runtime_env["env_vars"])

@@ -147,12 +146,12 @@ def companion_launch():
     from ajet.utils.smart_daemon import LaunchCommandWhenAbsent

-    print("Launching companion process for async LLM server...")
+    logger.info("Launching companion process for async LLM server...")
     model_path = config.ajet.model.path
     tensor_parallel_size = config.ajet.debug.debug_tensor_parallel_size
     n_avail_gpus = torch.cuda.device_count()
     if tensor_parallel_size > n_avail_gpus:
-        print(
+        logger.info(
             f"Warning: tensor_parallel_size {tensor_parallel_size} is greater than available GPUs {n_avail_gpus}. Setting tensor_parallel_size to {n_avail_gpus}."
         )
         tensor_parallel_size = n_avail_gpus
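The last hunk above clamps the requested tensor parallel size to the number of visible GPUs. That logic can be isolated as a small pure function — a hypothetical helper, not part of AgentJet's API:

```python
# Cap the tensor parallel degree at the number of available GPUs,
# mirroring the if-branch in companion_launch above.
def clamp_tensor_parallel(tensor_parallel_size: int, n_avail_gpus: int) -> int:
    if tensor_parallel_size > n_avail_gpus:
        # the real code logs a warning here before clamping
        return n_avail_gpus
    return tensor_parallel_size


print(clamp_tensor_parallel(8, 2))  # → 2
print(clamp_tensor_parallel(2, 8))  # → 2
```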

ajet/backbone/trainer_verl.py

Lines changed: 4 additions & 4 deletions

@@ -302,15 +302,15 @@ def check_mutually_exclusive(mbs, mbs_per_gpu, name: str):
         )

         if self.config.algorithm.use_kl_in_reward and config.actor_rollout_ref.actor.use_kl_loss:
-            print("NOTICE: You have both enabled in-reward kl and kl loss.")
+            logger.warning("NOTICE: You have both enabled in-reward kl and kl loss.")

         # critic
         if self.use_critic:
             critic_config = omega_conf_to_dataclass(config.critic)
             critic_config.validate(n_gpus, config.ajet.data.train_batch_size)

         if config.data.get("val_batch_size", None) is not None:
-            print(
+            logger.warning(
                 "WARNING: val_batch_size is deprecated."
                 + " Validation datasets are sent to inference engines as a whole batch,"
                 + " which will schedule the memory themselves."

@@ -322,7 +322,7 @@ def check_mutually_exclusive(mbs, mbs_per_gpu, name: str):
             config.ajet.rollout.temperature > 0
         ), "validation gen temperature should be greater than 0 when enabling do_sample"

-        print("[validate_config] All configuration checks passed successfully!")
+        logger.success("[validate_config] All configuration checks passed successfully!")

     def init_workers(self):
         """Initialize distributed training workers using Ray backend.

@@ -807,7 +807,7 @@ def fit(self):  # noqa: C901
                     or esi_close_to_expiration
                 ):
                     if esi_close_to_expiration:
-                        print("Force saving checkpoint: ESI instance expiration approaching.")
+                        logger.info("Force saving checkpoint: ESI instance expiration approaching.")
                     with marked_timer("save_checkpoint", timing_raw, color="green"):
                         self._save_checkpoint()
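The hunk headers above sit inside `check_mutually_exclusive(mbs, mbs_per_gpu, name)`, a validation helper for micro-batch settings. A plausible sketch of such a check — verl's actual implementation may differ — is:

```python
# Reject configs that set both micro_batch_size and micro_batch_size_per_gpu:
# the two express the same budget at different granularities, so at most one
# may be provided.
def check_mutually_exclusive(mbs, mbs_per_gpu, name: str) -> None:
    if mbs is not None and mbs_per_gpu is not None:
        raise ValueError(
            f"[{name}] micro_batch_size and micro_batch_size_per_gpu "
            "are mutually exclusive; set only one"
        )


check_mutually_exclusive(None, 4, "actor")  # fine: only one is set
```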

ajet/context_tracker/basic_tracker.py

Lines changed: 3 additions & 3 deletions

@@ -1,8 +1,8 @@
+import torch
 import copy
 from collections import defaultdict
 from typing import List, Tuple
-
-import torch
+from loguru import logger


 from ajet.context_tracker.base_tracker import (
     BaseTracker,

@@ -233,7 +233,7 @@ def group_tokenize_multi_group(self):
             sample_arr += [sample]

         if len(sample_arr) > max_num_group:
-            print(f"Warning: allow {max_num_group} groups, but got {len(sample_arr)} groups")
+            logger.warning(f"Warning: allow {max_num_group} groups, but got {len(sample_arr)} groups")
             import random

             sample_arr = random.sample(sample_arr, max_num_group)  # preserve max_num_group groups
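The second hunk above caps the number of sample groups: when more groups accumulate than `max_num_group` allows, a uniform random subset is kept via `random.sample`. A self-contained sketch of that behavior (the seed and group names are illustrative):

```python
# random.sample draws max_num_group distinct groups without replacement,
# discarding the rest — a simple downsampling strategy.
import random

random.seed(0)  # deterministic only for this illustration
all_groups = [f"group_{i}" for i in range(8)]
max_num_group = 4

sample_arr = list(all_groups)
if len(sample_arr) > max_num_group:
    sample_arr = random.sample(sample_arr, max_num_group)  # preserve max_num_group groups
print(sample_arr)
```

Note that `random.sample` does not preserve the original order of the kept groups; if ordering mattered, a sorted index sample would be needed instead.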

ajet/default_config/ajet_default.yaml

Lines changed: 1 addition & 1 deletion

@@ -7,7 +7,7 @@ ajet:


   # the experimental reverse proxy feature that allows `tuner.as_oai_baseurl_apikey` feature
-  enable_experimental_reverse_proxy: True
+  enable_experimental_reverse_proxy: False

   model:
     # which model should be trained
