Commit 72bc4f7 (2 parents 4e762c4 + b43e2df): add compare table

2 files changed: 166 additions & 60 deletions

README.md: 101 additions & 39 deletions
@@ -10,7 +10,6 @@
[![Anaconda-Server Badge](https://anaconda.org/openrl/openrl/badges/downloads.svg)](https://anaconda.org/openrl/openrl)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Hits-of-Code](https://hitsofcode.com/github/OpenRL-Lab/openrl?branch=main)](https://hitsofcode.com/github/OpenRL-Lab/openrl/view?branch=main)
[![codecov](https://codecov.io/gh/OpenRL-Lab/openrl_release/branch/main/graph/badge.svg?token=4FMEYMR83U)](https://codecov.io/gh/OpenRL-Lab/openrl_release)

@@ -31,7 +30,8 @@
OpenRL-v0.1.3 was updated on Aug 18, 2023.

The main branch is the latest version of OpenRL and is under active development. If you just want to try OpenRL, you can switch to the stable branch.

## Welcome to OpenRL

@@ -41,37 +41,44 @@
Crafting Reinforcement Learning Frameworks with Passion, Your Valuable Insights Welcome. <br><br>
</div>

OpenRL is an open-source general reinforcement learning research framework that supports training for various tasks
such as single-agent, multi-agent, offline RL, self-play, and natural language.
Developed on PyTorch, OpenRL aims to provide a simple-to-use, flexible, efficient, and sustainable platform for the reinforcement learning research community.

Currently, the features supported by OpenRL include:

- A **simple-to-use universal interface** that supports training for all tasks/environments
- Support for both single-agent and multi-agent tasks
- Support for offline RL training with an expert dataset
- Support for self-play training
- Reinforcement learning training support for natural language tasks (such as dialogue)
- Support for [Arena](https://openrl-docs.readthedocs.io/en/latest/arena/index.html), which allows convenient evaluation of various agents in a competitive environment
- Importing models and datasets from [Hugging Face](https://huggingface.co/)
- Support for models such as LSTM, GRU, Transformer, etc.
- Multiple training acceleration methods, including automatic mixed precision training and data collection with a half-precision policy network
- User-defined training models, reward models, training data, and environment support
- Support for [gymnasium](https://gymnasium.farama.org/) environments
- Support for [Callbacks](https://openrl-docs.readthedocs.io/en/latest/callbacks/index.html), which can be used to implement various functions such as logging, saving, and early stopping
- Dictionary observation space support
- Popular visualization tools such as [wandb](https://wandb.ai/) and [tensorboardX](https://tensorboardx.readthedocs.io/en/latest/index.html) are supported
- Serial or parallel environment training while ensuring consistent results in both modes
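The early-stopping callback idea mentioned above can be sketched in plain, framework-agnostic Python. Note that `StopTrainingOnReward` and its `on_step` hook are hypothetical names for illustration only, not OpenRL's actual Callback API:

```python
# Toy sketch of an early-stopping callback (hypothetical names,
# NOT OpenRL's actual Callback API).
class StopTrainingOnReward:
    """Ask the training loop to stop once the mean episode reward reaches a threshold."""

    def __init__(self, reward_threshold: float):
        self.reward_threshold = reward_threshold

    def on_step(self, mean_reward: float) -> bool:
        # Return False to signal the training loop to stop.
        return mean_reward < self.reward_threshold


def train_loop(mean_rewards, callback):
    """Consume per-iteration mean rewards until the callback asks to stop; return iterations run."""
    steps = 0
    for r in mean_rewards:
        steps += 1
        if not callback.on_step(r):
            break
    return steps


# Stops at the first iteration whose mean reward reaches 195.
steps = train_loop([10.0, 50.0, 120.0, 200.0, 210.0], StopTrainingOnReward(195.0))
print(steps)  # 4
```

The same hook-based shape extends naturally to logging or checkpoint-saving callbacks.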

@@ -82,6 +89,7 @@
- Compliant with Black code style guidelines and type checking

Algorithms currently supported by OpenRL (for more details, please refer to [Gallery](./Gallery.md)):

- [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347)
- [Dual-clip PPO](https://arxiv.org/abs/1912.09729)
- [Multi-agent PPO (MAPPO)](https://arxiv.org/abs/2103.01955)
@@ -96,6 +104,7 @@
- [Deep Deterministic Policy Gradient (DDPG)](https://arxiv.org/abs/1509.02971)

Environments currently supported by OpenRL (for more details, please refer to [Gallery](./Gallery.md)):

- [Gymnasium](https://gymnasium.farama.org/)
- [MuJoCo](https://github.com/deepmind/mujoco)
- [PettingZoo](https://pettingzoo.farama.org/)
@@ -110,10 +119,12 @@
- [Super Mario Bros](https://github.com/Kautenja/gym-super-mario-bros)
- [Gym Retro](https://github.com/openai/retro)

This framework has undergone multiple iterations by the [OpenRL-Lab](https://github.com/OpenRL-Lab) team, which has applied it in academic research.
It has now become a mature reinforcement learning framework.

OpenRL-Lab will continue to maintain and update OpenRL, and we welcome everyone to join our [open-source community](./CONTRIBUTING.md) to contribute to the development of reinforcement learning.

For more information about OpenRL, please refer to the [documentation](https://openrl-docs.readthedocs.io/en/latest/).
@@ -122,6 +133,7 @@
- [Welcome to OpenRL](#welcome-to-openrl)
- [Outline](#outline)
- [Why OpenRL?](#why-openrl)
- [Installation](#installation)
- [Use Docker](#use-docker)
- [Quick Start](#quick-start)
@@ -130,13 +142,33 @@
- [Feedback and Contribution](#feedback-and-contribution)
- [Maintainers](#maintainers)
- [Supporters](#supporters)
  - [&#8627; Contributors](#-contributors)
  - [&#8627; Stargazers](#-stargazers)
  - [&#8627; Forkers](#-forkers)
- [Citing OpenRL](#citing-openrl)
- [License](#license)
- [Acknowledgments](#acknowledgments)

## Why OpenRL?

Here we provide a table comparing OpenRL with other popular RL libraries.
OpenRL employs a modular design and high-level abstraction, allowing users to accomplish training for various tasks through a unified and user-friendly interface.

| Library | NLP/RLHF | Multi-agent | Self-Play Training | Offline RL | Bilingual Document |
|:---:|:---:|:---:|:---:|:---:|:---:|
| **[OpenRL](https://github.com/OpenRL-Lab/openrl)** | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3) | :x: | :x: | :x: | :x: | :x: |
| [Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
| [DI-engine](https://github.com/opendilab/DI-engine/) | :x: | :heavy_check_mark: | not fully supported | :heavy_check_mark: | :heavy_check_mark: |
| [Tianshou](https://github.com/thu-ml/tianshou) | :x: | not fully supported | not fully supported | :heavy_check_mark: | :heavy_check_mark: |
| [MARLlib](https://github.com/Replicable-MARL/MARLlib) | :x: | :heavy_check_mark: | not fully supported | :x: | :x: |
| [MAPPO Benchmark](https://github.com/marlbenchmark/on-policy) | :x: | :heavy_check_mark: | :x: | :x: | :x: |
| [RL4LMs](https://github.com/allenai/RL4LMs) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
| [trlx](https://github.com/CarperAI/trlx) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
| [trl](https://github.com/huggingface/trl) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
| [TimeChamber](https://github.com/inspirai/TimeChamber) | :x: | :x: | :heavy_check_mark: | :x: | :x: |
## Installation

Users can directly install OpenRL via pip:

@@ -164,30 +196,36 @@
After installation, users can check the version of OpenRL through the command line:

```bash
openrl --version
```

**Tips**: No installation required, try OpenRL online through Colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/15VBA-B7AJF8dBazzRcWAxJxZI7Pl9m-g?usp=sharing)

## Use Docker

OpenRL currently provides Docker images with and without GPU support.
If the user's computer does not have an NVIDIA GPU, they can pull an image without GPU support using the following command:

```bash
sudo docker pull openrllab/openrl-cpu
```

If the user wants to accelerate training with a GPU, they can pull the GPU image using the following command:

```bash
sudo docker pull openrllab/openrl
```

After successfully pulling the image, users can run OpenRL's Docker image using the following commands:

```bash
# Without GPU acceleration
sudo docker run -it openrllab/openrl-cpu
# With GPU acceleration
sudo docker run -it --gpus all --net host openrllab/openrl
```

Once inside the Docker container, users can check OpenRL's version and then run test cases using these commands:

```bash
# Check the OpenRL version inside the Docker container
openrl --version
```
@@ -197,40 +235,47 @@

## Quick Start

OpenRL provides a simple and easy-to-use interface for beginners in reinforcement learning.
Below is an example of using the PPO algorithm to train the `CartPole` environment:

```python
# train_ppo.py
from openrl.envs.common import make
from openrl.modules.common import PPONet as Net
from openrl.runners.common import PPOAgent as Agent

env = make("CartPole-v1", env_num=9)  # Create an environment and set the environment parallelism to 9.
net = Net(env)  # Create the neural network.
agent = Agent(net)  # Initialize the agent.
agent.train(total_time_steps=20000)  # Start training; run the environments for a total of 20,000 steps.
```
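The `env_num=9` argument above runs nine environment copies, and OpenRL promises consistent results whether environments run serially or in parallel. The intuition behind that guarantee can be sketched with a framework-agnostic toy (the `rollout_*` helpers are illustrative, not OpenRL code): when each environment owns its own seeded random stream, the order in which environments are stepped does not change the collected data.

```python
import random

# Toy sketch (NOT OpenRL code): each "environment" draws rewards from
# its own seeded RNG, so serial and lockstep collection agree exactly.
def make_envs(n, base_seed=0):
    return [random.Random(base_seed + i) for i in range(n)]

def rollout_serial(n, steps):
    # Finish env 0 entirely, then env 1, and so on.
    return [[rng.random() for _ in range(steps)] for rng in make_envs(n)]

def rollout_parallel(n, steps):
    # Step all envs "in lockstep", one step at a time.
    envs = make_envs(n)
    data = [[] for _ in range(n)]
    for _ in range(steps):
        for i, rng in enumerate(envs):
            data[i].append(rng.random())
    return data

print(rollout_serial(9, 5) == rollout_parallel(9, 5))  # True
```

Real vectorized-environment implementations add subtleties (auto-resets, batched observations), but per-environment random streams are the core of mode-independent results.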
Training an agent using OpenRL requires only four simple steps:
**Create Environment** => **Initialize Model** => **Initialize Agent** => **Start Training**!

For a well-trained agent, users can also easily test it:

```python
# train_ppo.py
from openrl.envs.common import make
from openrl.modules.common import PPONet as Net
from openrl.runners.common import PPOAgent as Agent

agent = Agent(Net(make("CartPole-v1", env_num=9)))  # Initialize the trainer.
agent.train(total_time_steps=20000)
# Create an environment for testing, set the parallelism of the environment to 9, and set the rendering mode to group_human.
env = make("CartPole-v1", env_num=9, render_mode="group_human")
agent.set_env(env)  # The agent requires an interactive environment.
obs, info = env.reset()  # Reset the environment to obtain the initial observations and environment info.
while True:
    action, _ = agent.act(obs)  # The agent predicts the next action based on the observations.
    # The environment takes one step according to the action and returns the next observation, reward, done flags, and info.
    obs, r, done, info = env.step(action)
    if any(done):
        break
env.close()  # Close the test environment.
```
Executing the above code on a regular laptop takes only **a few seconds** to complete the training. Below is a visualization of the trained agent:

@@ -240,55 +285,69 @@

**Tips:** Users can also quickly train the `CartPole` environment by executing a command in the terminal:

```bash
openrl --mode train --env CartPole-v1
```

For training tasks such as multi-agent and natural language processing, OpenRL also provides a similarly simple and easy-to-use interface.

For information on how to perform multi-agent training, set hyperparameters for training, load training configurations, use wandb, save GIF animations, etc., please refer to:

- [Multi-Agent Training Example](https://openrl-docs.readthedocs.io/en/latest/quick_start/multi_agent_RL.html)

For information on natural language task training, loading models/datasets from Hugging Face, customizing training models/reward models, etc., please refer to:

- [Dialogue Task Training Example](https://openrl-docs.readthedocs.io/en/latest/quick_start/train_nlp.html)

For more information about OpenRL, please refer to the [documentation](https://openrl-docs.readthedocs.io/en/latest/).

## Gallery

To help users become familiar with the framework, we provide more examples and demos of using OpenRL in the [Gallery](./Gallery.md).
Users are also welcome to contribute their own training examples and demos to the Gallery.

## Projects Using OpenRL

We have listed research projects that use OpenRL in the [OpenRL Project](./Project.md).
If you are using OpenRL in your research project, you are welcome to join this list.

## Feedback and Contribution

- If you have any questions or find bugs, you can check or ask in the [Issues](https://github.com/OpenRL-Lab/openrl/issues).
- Join the QQ group: [OpenRL Official Communication Group](docs/images/qq.png)

<div align="center">
<a href="docs/images/qq.png"><img width="250px" height="auto" src="docs/images/qq.png"></a>
</div>

- Join the [Slack](https://join.slack.com/t/openrlhq/shared_invite/zt-1tqwpvthd-Eeh0IxQ~DIaGqYXoW2IUQg) group to discuss OpenRL usage and development with us.
- Join the [Discord](https://discord.gg/guvAS2up) group to discuss OpenRL usage and development with us.
- Send an e-mail to: [huangshiyu@4paradigm.com](huangshiyu@4paradigm.com)
- Join the [GitHub Discussions](https://github.com/orgs/OpenRL-Lab/discussions).

The OpenRL framework is still under continuous development and documentation.
We welcome you to join us in making this project better:

- How to contribute code: read the [Contributors' Guide](./CONTRIBUTING.md)
- [OpenRL Roadmap](https://github.com/OpenRL-Lab/openrl/issues/2)

## Maintainers

At present, OpenRL is maintained by the following maintainers:

- [Shiyu Huang](https://huangshiyu13.github.io/) ([@huangshiyu13](https://github.com/huangshiyu13))
- Wenze Chen ([@Chen001117](https://github.com/Chen001117))
- Yiwen Sun ([@YiwenAI](https://github.com/YiwenAI))

We welcome more contributors to join our maintenance team (send an e-mail to [huangshiyu@4paradigm.com](huangshiyu@4paradigm.com) to apply to join the OpenRL team).

## Supporters
@@ -310,6 +369,7 @@
## Citing OpenRL

If our work has been helpful to you, please feel free to cite us:

```latex
@misc{openrl2023,
    title={OpenRL},
@@ -325,9 +385,11 @@
[![Star History Chart](https://api.star-history.com/svg?repos=OpenRL-Lab/openrl&type=Date)](https://star-history.com/#OpenRL-Lab/openrl&Date)

## License

OpenRL is released under the Apache 2.0 license.

## Acknowledgments

The development of the OpenRL framework has drawn on the strengths of other reinforcement learning frameworks:

- Stable-baselines3: https://github.com/DLR-RM/stable-baselines3
