Commit 9e73442

Merge remote-tracking branch 'origin/main' into dev/shuchang_newjudge
2 parents efa7fac + 6133583

57 files changed

Lines changed: 4185 additions & 550 deletions


README.md

Lines changed: 42 additions & 24 deletions
@@ -12,24 +12,39 @@
 </div>
 
 
-**AgentJet (AJet)** is a cutting-edge, user-friendly training framework designed to optimize agents and workflows (built with OpenAI SDK, AgentScope, Langchain, or just HTTP requests), fine-tuning language model weights behind the scenes.
+**AgentJet (AJet)** is a cutting-edge, user-friendly agent RL training framework designed to optimize agents and agentic workflows (supporting any agent built with the OpenAI SDK, AgentScope, LangChain, or raw HTTP requests), fine-tuning LLM weights to enhance model performance.
 
-Simply provide your agent **workflow**, training **dataset**, and **reward** function, and **AgentJet** will be ready to enhance your agents to their optimal performance!
+**AgentJet (AJet)** has a fully-distributed **swarm training** capability, which means you can **run `ajet-swarm start` on your GPU server(s) and then train agents from your laptop(s)**! Simply provide your agent workflow, training dataset, and reward function, and AgentJet will be ready to go!
 
 
 
-## ✈️ Minimum Example
+## ✈️ Fast Introduction
 
-Let's begin with the simplest example: a math agent with a tool call.
+### Classic Mode
 
-- First, please check out the [installation guide](https://modelscope.github.io/AgentJet/en/installation/) to set up the training environment.
-- Then, tune your first model using the minimum example.
-```python
-ajet --conf tutorial/example_math_agent/math_agent.yaml --backbone='verl'
+Let's begin with the simplest example: a math agent with a tool call. This is a simple, centralized training setup.
 
-# change to --backbone='trinity' if you want to switch to trinity training engine;
-# or --backbone='debug' if you want to debug with only vLLM
-```
+1. Please check out the [installation guide](https://modelscope.github.io/AgentJet/en/installation/) to set up the training environment.
+2. Tune your first model using the minimum example:
+```bash
+ajet --conf ./tutorial/example_math_agent/math_agent.yaml --backbone='verl'
+```
+<div align="center">
+<img width="640" alt="image" src="https://serve.gptacademic.cn/publish/shared/Image/classic+swarm+revise.jpg"/>
+</div>
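For orientation, the referenced `math_agent.yaml` is not reproduced in this diff; a minimal config in its spirit might look like the sketch below. The key paths are inferred from the fields that `ajet/copilot/job.py` assigns (`ajet.model.path`, `ajet.trainer_common.n_gpus_per_node`, `ajet.trainer_common.algorithm.adv_estimator`, `ajet.rollout.num_repeat`, `ajet.data.train_batch_size`, `ajet.rollout.user_workflow`); the concrete values are placeholders, not the tutorial's actual content.

```yaml
# Hypothetical sketch; key paths inferred from ajet/copilot/job.py, values invented.
ajet:
  model:
    path: Qwen/Qwen2.5-7B-Instruct        # placeholder model
  trainer_common:
    n_gpus_per_node: 8
    algorithm:
      adv_estimator: grpo                 # placeholder estimator name
  rollout:
    num_repeat: 8                         # rollout group size (renamed from grpo_n in this commit)
    user_workflow: path.to.math_workflow  # placeholder workflow reference
  data:
    train_batch_size: 32
```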
+
+### Swarm Mode
+
+Let's begin with the simplest AgentJet Swarm example: also a math agent. In this case, you can use any GPU-less laptop to train the model remotely.
+
+1. Start the swarm server and begin swarm overwatch: `ajet-swarm start` and `ajet-swarm overwatch`.
+2. From your laptop (or the swarm server's localhost), run [this simple script](https://github.com/modelscope/AgentJet/blob/main/tutorial/example_math_swarm/math.py) to begin training:
+```bash
+AJET_SWARM_URL="http://swarm-server-ip:10086" python ./tutorial/example_math_swarm/math.py
+```
+<div align="center">
+<img width="600" alt="image" src="https://github.com/user-attachments/assets/41ed1e71-8b18-4c4c-b5e2-833399317337"/>
+</div>
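The command above passes the swarm endpoint through the `AJET_SWARM_URL` environment variable. A self-contained sketch of how a client script might resolve it (the variable name and port come from the README example; the function name and the localhost fallback are illustrative assumptions):

```python
import os

def resolve_swarm_url(default: str = "http://localhost:10086") -> str:
    """Return the swarm server endpoint, preferring the AJET_SWARM_URL env var."""
    return os.environ.get("AJET_SWARM_URL", default)

# Mimic the README invocation: AJET_SWARM_URL="http://swarm-server-ip:10086" python math.py
os.environ["AJET_SWARM_URL"] = "http://swarm-server-ip:10086"
print(resolve_swarm_url())  # http://swarm-server-ip:10086
```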
 
 
 ## ✈️ Features
@@ -38,7 +53,8 @@ We aim to build a easy-to-learn Agent tuner that unlock more possibilities for a
 
 - **Easy and Friendly**. AgentJet helps you tune models behind your agent workflows easily, optimizing your agents for top performance with minimal effort.
 - **Rich Tutorial Library**. AgentJet provides a rich library of [examples](https://github.com/modelscope/AgentJet/tree/main/tutorial) as tutorials.
-- **Efficient and Scalable**. AgentJet uses [verl] as the default backbone (`--backbone=verl`). However, we also support [trinity](https://github.com/modelscope/Trinity-RFT/) as alternative backbone, accelerating your tuning process via fully asynchronous RFT.
+- **Swarm Training**. [This unique feature](https://modelscope.github.io/AgentJet/en/swarm_intro_blog_english/) of AgentJet opens many possibilities: deploying distributed and self-healing rollout workers, **non-shared-parameter multi-agent** training, and **multi-runtime, multi-task cocktail** training. And just like Tinker, you can use AgentJet Swarm to train models even on **GPU-less laptop(s)**.
+- **Efficient and Scalable**. AgentJet uses [verl](https://github.com/volcengine/verl) as the default backbone (`--backbone=verl`). However, we also support [trinity](https://github.com/modelscope/Trinity-RFT/) as an alternative backbone, accelerating your tuning process via fully asynchronous RFT.
 - **Flexible and Fast**. AgentJet supports [multi-agent workflows](https://modelscope.github.io/AgentJet/en/workflow/) and adopts a context merging technique, accelerating training by 1.5x to 10x when the workflow involves multi-turn (or multi-agent) conversations.
 - **Reliability and Reproducibility**. Our team keeps track of framework performance across multiple [tasks + major-git-version + training-backbones](https://benchmark.agentjet.top/) (under construction, still gathering data, coming soon).
 
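The 1.5x-to-10x figure attributed to context merging can be sanity-checked with a toy cost model: in a multi-turn rollout, each LLM call's context is a prefix of the next call's context, so one merged timeline with per-token loss masks can replace all the per-call training samples. The code below is my illustration of that accounting, not AgentJet's actual implementation:

```python
def is_prefix(short, long):
    """True if `short` is a prefix of `long` (token-ID lists)."""
    return len(short) <= len(long) and long[: len(short)] == short

def merge_savings(timelines):
    """Return (naive_tokens, merged_tokens) for prefix-chained call contexts."""
    timelines = sorted(timelines, key=len)
    # Context merging only applies when the calls share history.
    assert all(is_prefix(a, b) for a, b in zip(timelines, timelines[1:]))
    naive = sum(len(t) for t in timelines)   # encode every call separately
    merged = len(timelines[-1])              # encode the longest timeline once
    return naive, merged

# Three calls of a 3-turn conversation with 100, 220, and 350 context tokens.
calls = [list(range(100)), list(range(220)), list(range(350))]
naive, merged = merge_savings(calls)
print(naive, merged, round(naive / merged, 2))  # 670 350 1.91
```

The ratio grows with the number of turns, which is consistent with deeply multi-turn (or multi-agent) workflows benefiting the most.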
@@ -48,6 +64,11 @@ For advanced researchers, AgentJet also provides high-resolution logging and deb
 - **High-Resolution Logging**: AgentJet allows users to save and inspect token-level rollout details, recording token IDs, token loss masks, and even token logprobs to facilitate workflow development and agent diagnostics.
 - **Fast Debugging**: AgentJet also provides the `--backbone=debug` option for the best debugging experience, shortening your wait period from minutes to seconds after code changes and enabling breakpoint debugging in IDEs.
 
+<div align="center">
+<img width="600" alt="image" src="https://serve.gptacademic.cn/publish/shared/Image/ai-generated-1771873242388.jpg"/>
+</div>
+
+
 ---
 
 ### ✈️ Quick Start
@@ -56,13 +77,6 @@ For advanced researchers, AgentJet also provides high-resolution logging and deb
 
 - **Click here to read the** [**installation guide**](https://modelscope.github.io/AgentJet/en/installation/).
 
-#### Run Training
-
-- You can start training your first agent with a single command using a pre-configured YAML file. Take the [Math agent](https://modelscope.github.io/AgentJet/en/example_math_agent/) as an example:
-
-```bash
-ajet --conf tutorial/example_math_agent/math_agent.yaml
-```
 
 #### Example Library
 
@@ -75,6 +89,11 @@ Explore our rich library of examples to kickstart your journey:
 - 🎴 [**Writing a countdown game using AgentScope and solving it**](https://modelscope.github.io/AgentJet/en/example_countdown).
 - 🚶 [**Solving a frozen lake walking puzzle using AgentJet**](https://modelscope.github.io/AgentJet/en/example_frozenlake).
 
+Explore our automated benchmarking system at [https://benchmark.agentjet.top/](https://benchmark.agentjet.top/):
+<div align="center">
+<img width="600" alt="image" src="https://serve.gptacademic.cn/publish/shared/Image/benchmark.gif"/>
+</div>
+
 
 ---
 
@@ -105,6 +124,7 @@ The internal system orchestrates several specialized modules to handle the compl
 * **Task Runner**: Executes the Agent workflow and calculates rewards.
 * **Model Tuner**: Forwards inference requests from the workflow to the LLM engine.
 * **Context Tracker**: Monitors LLM calls and automatically merges shared-history timelines to improve training efficiency by **1.5x to 10x**.
+* **Swarm Server**: A data interchange center that accepts OpenAI-like requests and engine instructions, activated only in AgentJet Swarm mode.
 
 
 
@@ -122,14 +142,11 @@ AgentJet is a constantly evolving project. We are planning to add the following
 
 | Category | Feature | Status |
 | :--- | :--- | :--- |
-| **Examples** | Covering LangGraph and AutoGen frameworks | Done & Verifying |
 | **Examples** | Add LoRA training examples | Todo |
-| **Infra** | Cross-process Tuner wrapper to pass though process forking | Done & Verifying |
 | **Infra** | Optimize configurations for long-context adaptation on smaller GPUs | In Progress |
-| **Capability** | Prompt tuning | In Progress |
 | **Capability** | Multi-modal training support | Todo |
 | **Capability** | MARL Credit assignment | Todo |
-| **Capability** | Training dataset generation from few-shot samples | Done & Verifying |
+| **Capability** | Training dataset generation from few-shot samples | Todo |
 
 
 ## ✈️ Citation
@@ -152,8 +169,9 @@ If you use AgentJet in your research, please cite:
 
 ---
 <div align="center">
+This project is under active development; we need your help to make it shine! <br/>
 
-[⭐ Star Us](https://github.com/modelscope/AgentJet) · [Report Bug](https://github.com/modelscope/AgentJet/issues) · [Request Feature](https://github.com/modelscope/AgentJet/issues)
+[⭐ Star Us](https://github.com/modelscope/AgentJet) · [✈️ Report Bug](https://github.com/modelscope/AgentJet/issues) · [✈️ Request Feature](https://github.com/modelscope/AgentJet/issues)
 </div>
 
 
ajet/copilot/job.py

Lines changed: 7 additions & 3 deletions
@@ -12,7 +12,6 @@
 from types import SimpleNamespace
 from typing import Any, Callable, Union
 
-import ray
 import yaml
 from loguru import logger
 
@@ -45,15 +44,17 @@ def __init__(
         project_name="ajet-swarm",
         experiment_name="test",
         n_gpu_for_infer: int | None = None,  # only for trinity backbone
-        grpo_n: int = 8,
+        num_repeat: int = 8,
         batch_size: int = 32,
         swarm_mode: bool = True,
+        sample_collection_method: str = "rollout_until_finish_enough_tasks",
         *kwargs,
     ) -> None:
         self.backbone = backbone
         self.exp_dir = DEFAULT_DIR
         self.project_name = project_name
         self.exp_name = experiment_name
+        self.sample_collection_method = sample_collection_method
         if swarm_mode:
             default_yaml = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', "default_config/ajet_ts_default.yaml"))
         else:
@@ -66,8 +67,10 @@ def __init__(
         self.config.ajet.model.path = model
         self.config.ajet.trainer_common.n_gpus_per_node = n_gpu
         self.config.ajet.trainer_common.algorithm.adv_estimator = algorithm
-        self.config.ajet.rollout.num_repeat = grpo_n
+        self.config.ajet.rollout.num_repeat = num_repeat
         self.config.ajet.data.train_batch_size = batch_size
+        self.config.ajet.enable_swarm_mode = swarm_mode
+        self.config.ajet.swarm_mode_sample_collection_method = sample_collection_method
         if n_gpu_for_infer is None and backbone == "trinity":
             raise ValueError("Please specify `n_gpu_for_infer` (n_gpu_for_infer < n_gpu) for trinity backbone.")
         if (n_gpu_for_infer is not None) and backbone == "verl":
@@ -134,6 +137,7 @@ def set_data(
         return self
 
     def tune(self, *args, **kwargs) -> "AgentJetJob":
+        import ray
         ast_cfg = self.config.ajet
         if not ast_cfg.rollout or not ast_cfg.rollout.user_workflow:
             raise ValueError("Workflow must be set via set_workflow before tuning.")
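The diff above also moves `import ray` from module scope into `tune()`. This is the deferred-import pattern: merely importing the module stays fast and Ray-free, and only processes that actually call `tune()` pay the import cost. A generic, self-contained sketch of the pattern, with a stdlib module standing in for the heavy dependency:

```python
class Job:
    """Minimal stand-in showing the deferred-import pattern from this commit."""

    def tune(self):
        # Import the heavy dependency only when it is actually needed,
        # so `import this_module` never pulls it in.
        import json  # stand-in for a heavy library such as ray
        return json.dumps({"status": "tuning"})

print(Job().tune())  # {"status": "tuning"}
```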
