Commit 3266544

updates
Signed-off-by: cmunley1 <cmunley@nvidia.com>
1 parent 8d01a30 commit 3266544

5 files changed

Lines changed: 9 additions & 3 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -158,8 +158,8 @@ The Dataset column links to publicly available datasets (e.g., on HuggingFace).
 | Arc Agi | knowledge | Solve puzzles designed to test intelligence. See https://arcprize.org/arc-agi. | Improve puzzle-solving capabilities. | - || - | <a href='resources_servers/arc_agi/configs/arc_agi.yaml'>arc_agi.yaml</a> | - |
 | Aviary | agent | Multi-hop question answering on the HotPotQA dataset with Wikipedia search | Improve knowledge and agentic capability ||| Apache 2.0 | <a href='resources_servers/aviary/configs/hotpotqa_aviary.yaml'>hotpotqa_aviary.yaml</a> | - |
 | Aviary | math | GSM8k benchmark with calculator tool | Test math and agentic capability ||| Apache 2.0 | <a href='resources_servers/aviary/configs/gsm8k_aviary.yaml'>gsm8k_aviary.yaml</a> | - |
-| Base Gymnasium | other | Base class for Gymnasium-style servers. Not a standalone server. | - | - | - | - | <a href='resources_servers/base_gymnasium/configs/base_gymnasium.yaml'>base_gymnasium.yaml</a> | - |
-| Blackjack | games | Blackjack. Model hits or stands. Reward +1 win, 0 draw, -1 loss/bust. | - | - | - | - | <a href='resources_servers/blackjack/configs/blackjack.yaml'>blackjack.yaml</a> | - |
+| Base Gymnasium | other | Base class for Gymnasium-style servers. Not a standalone server. | Reusable base class for step/reset style environments | - | - | - | <a href='resources_servers/base_gymnasium/configs/base_gymnasium.yaml'>base_gymnasium.yaml</a> | - |
+| Blackjack | games | Blackjack. Model hits or stands. Reward +1 win, 0 draw, -1 loss/bust. | Example gymnasium-style multi-step environment | - | - | - | <a href='resources_servers/blackjack/configs/blackjack.yaml'>blackjack.yaml</a> | - |
 | Calendar | agent | Multi-turn calendar scheduling dataset. User states events and constraints in natural language; model schedules events to satisfy all constraints. | Improve multi-turn instruction following capabilities ||| Apache 2.0 | <a href='resources_servers/calendar/configs/calendar.yaml'>calendar.yaml</a> | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-agent-calendar_scheduling'>Nemotron-RL-agent-calendar_scheduling</a> |
 | Calendar | agent | Multi-turn calendar scheduling dataset. User states events and constraints in natural language; model schedules events to satisfy all constraints. | Improve multi-turn instruction following capabilities ||| Creative Commons Attribution 4.0 International | <a href='resources_servers/calendar/configs/calendar_v2.yaml'>calendar_v2.yaml</a> | <a href='https://huggingface.co/datasets/nvidia/Nemotron-RL-Instruction-Following-Calendar-v2'>Nemotron-RL-Instruction-Following-Calendar-v2</a> |
 | Circle Click | other | Click on circles in images | - | - | - | - | <a href='resources_servers/circle_click/configs/circle_click.yaml'>circle_click.yaml</a> | - |

resources_servers/base_gymnasium/README.md

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@ Base class (`GymnasiumServer`) for gymnasium-style resources servers. Not a stan
 from resources_servers.base_gymnasium import GymnasiumServer
 ```
 
-See `docs/resources-server/gymnasium-api.md` for usage.
+See `docs/resources-server/gymnasium-api.md` for usage and `resources_servers/blackjack/` for a working example.

resources_servers/base_gymnasium/configs/base_gymnasium.yaml

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ base_gymnasium:
 domain: other
 verified: false
 description: Base class for Gymnasium-style servers. Not a standalone server.
+value: Reusable base class for step/reset style environments
 base_gymnasium_agent:
 responses_api_agents:
 gymnasium_agent:

resources_servers/blackjack/README.md

Lines changed: 4 additions & 0 deletions
@@ -12,6 +12,10 @@ Example data provided in `data/example.jsonl` (system prompt only, no verifier_m
 ng_run "+config_paths=[resources_servers/blackjack/configs/blackjack.yaml,responses_api_models/vllm_model/configs/vllm_model.yaml]"
 ```
 
+## Data
+
+Each game is generated on the fly during `reset()`, so every row in `example.jsonl` is identical. To create more data, duplicate the row. Each rollout gets a fresh random deal. Use `num_repeats` in the YAML config or the `+num_repeats` CLI flag to control how many games per row.
+
 ## Collect rollouts
 
 ```bash

resources_servers/blackjack/configs/blackjack.yaml

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ blackjack:
 domain: games
 verified: false
 description: Blackjack. Model hits or stands. Reward +1 win, 0 draw, -1 loss/bust.
+value: Example gymnasium-style multi-step environment
 blackjack_gymnasium_agent:
 responses_api_agents:
 gymnasium_agent:
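
The blackjack README change in this commit says each game is dealt during `reset()`, and the config describes the reward as +1 for a win, 0 for a draw, and -1 for a loss or bust. The Python sketch below illustrates that step/reset pattern. It is an illustration under stated assumptions, not the repository's code: `ToyBlackjackEnv`, `play_episode`, the observation keys, the zero reward on non-terminal hits, and the simplified rules (infinite deck, dealer draws to 17) are all hypothetical and do not reflect the `GymnasiumServer` API.

```python
import random


class ToyBlackjackEnv:
    """Hypothetical stand-in for a step/reset environment (not GymnasiumServer).

    Simplified rules (assumptions): infinite deck, no splits or doubles,
    dealer draws until reaching 17 or more.
    """

    CARDS = [2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11]  # 11 is an ace

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.player = []
        self.dealer = []
        self.done = True

    def _draw(self):
        return self.rng.choice(self.CARDS)

    @staticmethod
    def hand_value(cards):
        total = sum(cards)
        aces = cards.count(11)
        # Count an ace as 1 instead of 11 while the hand would otherwise bust.
        while total > 21 and aces:
            total -= 10
            aces -= 1
        return total

    def _observation(self):
        return {
            "player_total": self.hand_value(self.player),
            "dealer_showing": self.dealer[0],
        }

    def reset(self):
        """Deal a fresh random game; nothing is read from a dataset row."""
        self.player = [self._draw(), self._draw()]
        self.dealer = [self._draw(), self._draw()]
        self.done = False
        return self._observation()

    def step(self, action):
        """Apply "hit" or "stand" and return (observation, reward, done)."""
        if self.done:
            raise RuntimeError("call reset() before step()")
        if action == "hit":
            self.player.append(self._draw())
            if self.hand_value(self.player) > 21:
                self.done = True
                return self._observation(), -1, True  # bust
            return self._observation(), 0, False  # assumed: no reward mid-game
        if action == "stand":
            while self.hand_value(self.dealer) < 17:
                self.dealer.append(self._draw())
            self.done = True
            player = self.hand_value(self.player)
            dealer = self.hand_value(self.dealer)
            if dealer > 21 or player > dealer:
                reward = 1
            elif player == dealer:
                reward = 0
            else:
                reward = -1
            return self._observation(), reward, True
        raise ValueError(f"unknown action: {action!r}")


def play_episode(env, policy):
    """Run one game to completion and return its final reward."""
    obs = env.reset()
    done = False
    reward = 0
    while not done:
        obs, reward, done = env.step(policy(obs))
    return reward


if __name__ == "__main__":
    env = ToyBlackjackEnv(seed=0)
    hit_below_17 = lambda obs: "hit" if obs["player_total"] < 17 else "stand"
    rewards = [play_episode(env, hit_below_17) for _ in range(1000)]
    print("mean reward:", sum(rewards) / len(rewards))
```

Because `reset()` draws new cards on every call, replaying the same input row produces a different game each time. That matches the README's advice to use `num_repeats` or duplicated rows to get more games instead of writing distinct data rows.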
