The main branch is the latest version of OpenRL, which is under active development. If you just want to have a try with OpenRL, you can switch to the stable branch.
## Welcome to OpenRL
Crafting Reinforcement Learning Frameworks with Passion. Your Valuable Insights Are Welcome. <br><br>
</div>

OpenRL is an open-source general reinforcement learning research framework that supports training for various tasks such as single-agent, multi-agent, offline RL, self-play, and natural language. Built on PyTorch, OpenRL aims to provide a simple-to-use, flexible, efficient, and sustainable platform for the reinforcement learning research community.

Currently, the features supported by OpenRL include:

- A **simple-to-use universal interface** that supports training for all tasks/environments
- Support for both single-agent and multi-agent tasks
- Support for offline RL training with expert datasets
- Support for self-play training
- Reinforcement learning training support for natural language tasks (such as dialogue)
- Support for [Arena](https://openrl-docs.readthedocs.io/en/latest/arena/index.html), which allows convenient evaluation of various agents in a competitive environment
- Importing models and datasets from [Hugging Face](https://huggingface.co/)
- Support for models such as LSTM, GRU, Transformer, etc.
- Multiple training acceleration methods, including automatic mixed precision training and data collection with a half-precision policy network
- User-defined training models, reward models, training data, and environment support
- Support for [gymnasium](https://gymnasium.farama.org/) environments
- Support for [Callbacks](https://openrl-docs.readthedocs.io/en/latest/callbacks/index.html), which can be used to implement various functions such as logging, saving, and early stopping
- Dictionary observation space support
- Popular visualization tools such as [wandb](https://wandb.ai/) and [tensorboardX](https://tensorboardx.readthedocs.io/en/latest/index.html) are supported
- Serial or parallel environment training while ensuring consistent results in both modes
- Compliant with Black Code Style guidelines and type checking

Algorithms currently supported by OpenRL (for more details, please refer to [Gallery](./Gallery.md)):
After installation, users can check the version of OpenRL through the command line:

```bash
openrl --version
```

**Tips**: No installation required; try OpenRL online through [Colab](https://colab.research.google.com/drive/15VBA-B7AJF8dBazzRcWAxJxZI7Pl9m-g?usp=sharing).

## Use Docker

OpenRL currently provides Docker images with and without GPU support. If the user's computer does not have an NVIDIA GPU, they can obtain an image without the GPU plugin using the following command:

```bash
sudo docker pull openrllab/openrl-cpu
```

If the user wants to accelerate training with a GPU, they can obtain the GPU image using the following command:

```bash
sudo docker pull openrllab/openrl
```

After successfully pulling the image, users can run OpenRL's Docker image using the following commands:

```bash
# Without GPU acceleration
sudo docker run -it openrllab/openrl-cpu
# With GPU acceleration
sudo docker run -it --gpus all --net host openrllab/openrl
```

Once inside the Docker container, users can check OpenRL's version and then run test cases using these commands:
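
For example (a minimal smoke test that reuses the CLI commands shown elsewhere in this README):

```bash
# Check the installed OpenRL version.
openrl --version
# Run a quick training test case on CartPole.
openrl --mode train --env CartPole-v1
```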

The following example trains a PPO agent on `CartPole-v1` and then tests it. The setup lines (imports, environment, network, and agent) follow OpenRL's quick-start interface:

```python
from openrl.envs.common import make
from openrl.modules.common import PPONet as Net
from openrl.runners.common import PPOAgent as Agent

env = make("CartPole-v1", env_num=9)  # Create the environment, with 9 parallel copies.
net = Net(env)  # Create the neural network.
agent = Agent(net)  # Initialize the agent.
agent.train(total_time_steps=20000)  # Start training for 20,000 total steps.

agent.set_env(env)  # The agent requires an interactive environment.
obs, info = env.reset()  # Initialize the environment to obtain initial observations and environmental information.
while True:
    action, _ = agent.act(obs)  # The agent predicts the next action based on environmental observations.
    # Step the environment: get the next observation, reward, done flags, and environmental information.
    obs, r, done, info = env.step(action)
    if any(done):
        break
env.close()  # Close the test environment.
```
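
To watch the trained agent, the test environment can be recreated with a rendering mode before calling `agent.set_env(env)` (a small variation on the snippet above; the `group_human` render mode displays the parallel environments together):

```python
# Recreate the test environment with rendering enabled.
env = make("CartPole-v1", render_mode="group_human", env_num=9, asynchronous=True)
```
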
Executing the above code on a regular laptop only takes **a few seconds**
to complete the training. Below shows the visualization of the agent:
**Tips:** Users can also quickly train the `CartPole` environment by executing a command line in the terminal.
```bash
openrl --mode train --env CartPole-v1
```

For training tasks such as multi-agent and natural language processing, OpenRL also provides a similarly simple and easy-to-use interface.

For information on how to perform multi-agent training, set hyperparameters for training, load training configurations, use wandb, save GIF animations, etc., please refer to:

- [Multi-Agent Training Example](https://openrl-docs.readthedocs.io/en/latest/quick_start/multi_agent_RL.html)
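
Multi-agent training reuses the same `make`/`Net`/`Agent` interface shown above; only the environment changes. A minimal sketch, assuming the MPE `simple_spread` environment id and an illustrative step budget (see the linked example for the authoritative version):

```python
from openrl.envs.common import make
from openrl.modules.common import PPONet as Net
from openrl.runners.common import PPOAgent as Agent

# Assumed environment id: "simple_spread" (MPE cooperative navigation).
env = make("simple_spread", env_num=100)
net = Net(env)
agent = Agent(net)
agent.train(total_time_steps=5000000)  # Step budget is illustrative.
```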

For information on natural language task training, loading models/datasets on Hugging Face, customizing training models/reward models, etc., please refer to:

- [Dialogue Task Training Example](https://openrl-docs.readthedocs.io/en/latest/quick_start/train_nlp.html)
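
Dialogue training follows the same pattern, with task-specific settings supplied through a configuration file. A minimal sketch, assuming the `daily_dialog` environment id and an illustrative `nlp_ppo.yaml` config (the linked example is authoritative):

```python
from openrl.configs.config import create_config_parser
from openrl.envs.common import make
from openrl.modules.common import PPONet as Net
from openrl.runners.common import PPOAgent as Agent

cfg_parser = create_config_parser()
cfg = cfg_parser.parse_args(["--config", "nlp_ppo.yaml"])  # Assumed config file name.
env = make("daily_dialog", env_num=2, asynchronous=True, cfg=cfg)  # Assumed env id.
net = Net(env, cfg=cfg, device="cuda")  # Assumes a CUDA-capable GPU.
agent = Agent(net)
agent.train(total_time_steps=5000000)  # Step budget is illustrative.
```
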
For more information about OpenRL, please refer to the [documentation](https://openrl-docs.readthedocs.io/en/latest/).
## Gallery

To help users become familiar with the framework, we provide more examples and demos of using OpenRL in the [Gallery](./Gallery.md).
Users are also welcome to contribute their own training examples and demos to the Gallery.
## Projects Using OpenRL
We have listed research projects that use OpenRL in the [OpenRL Project](./Project.md).
If you are using OpenRL in your research project, you are also welcome to join this list.
## Feedback and Contribution

- If you have any questions or find bugs, you can check or ask in the [Issues](https://github.com/OpenRL-Lab/openrl/issues).
- Join the QQ group: [OpenRL Official Communication Group](docs/images/qq.png)
- Join the [slack](https://join.slack.com/t/openrlhq/shared_invite/zt-1tqwpvthd-Eeh0IxQ~DIaGqYXoW2IUQg) group to discuss OpenRL usage and development with us.
- Join the [Discord](https://discord.gg/guvAS2up) group to discuss OpenRL usage and development with us.
- Send an e-mail to: [huangshiyu@4paradigm.com](mailto:huangshiyu@4paradigm.com)
- Join the [GitHub Discussion](https://github.com/orgs/OpenRL-Lab/discussions).

The OpenRL framework is still under continuous development, and its documentation is still being written.
We welcome you to join us in making this project better:
- How to contribute code: Read the [Contributors' Guide](./CONTRIBUTING.md)