feat: Add agent skills for NeMo Gym by lbliii · Pull Request #1061 · NVIDIA-NeMo/Gym

lbliii · 2026-04-13T22:38:53Z

Summary

- Adds 7 agent skills following the agentskills.io spec: gym-review, gym-debug, gym-profile, gym-config, gym-data, gym-scaffold-agent, and updates to add-benchmark.
- Each skill includes evals/evals.json with 3 assertion-based evals (21 total) and a chains.yaml for multi-step workflows.
- gym-review is a reference implementation with a standalone deterministic Python checker (scripts/review.py), self-contained anti-pattern/fix-pattern references, and portable eval fixtures; it works without the NeMo Gym repo.
- Skills cover the full contributor workflow: data prep, config composition, server scaffolding, code review, debugging, and reward profiling.

Test plan

- Run `python .claude/skills/gym-review/scripts/review.py .claude/skills/gym-review/evals/files/` and verify the expected findings per fixture.
- Verify that the skills load correctly in Claude Code (check that the / menu shows all 7).
- Run with-skill vs. without-skill evals per the agentskills.io eval spec.

Supersedes #1060 (closed due to force-push restriction on original branch for DCO fix).

🤖 Generated with Claude Code

…, config, data, and scaffolding

Seven spec-compliant agent skills with evals, references, and a deterministic review script. gym-review is the S-tier reference implementation with a standalone Python checker (scripts/review.py), self-contained anti-pattern and fix-pattern references, and portable eval fixtures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

cmunley1 · 2026-04-13T23:06:02Z

+      {"role": "user", "content": "Problem statement here"}
+    ]
+  },
+  "verifier_metadata": {


This should have agent_ref.
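The check requested here can be sketched as a small validator. This is a minimal illustration, not code from the PR: it assumes each JSONL row is a JSON object that should carry a top-level `agent_ref` key alongside `verifier_metadata`. The helper name `rows_missing_agent_ref` and the sample rows are hypothetical, and the real NeMo Gym schema may require more than key presence.

```python
import json


def rows_missing_agent_ref(jsonl_text: str) -> list[int]:
    """Return 1-based line numbers of JSONL rows that lack an `agent_ref` key."""
    missing = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        row = json.loads(line)
        # Assumption: `agent_ref` is a top-level key of each row.
        if "agent_ref" not in row:
            missing.append(lineno)
    return missing


# Hypothetical sample data: the second row omits `agent_ref`.
sample = "\n".join([
    json.dumps({"agent_ref": {"name": "my_agent"}, "verifier_metadata": {}}),
    json.dumps({"verifier_metadata": {}}),
])
print(rows_missing_agent_ref(sample))  # [2]
```

A presence check like this only catches the omission the reviewer pointed out; validating the contents of `agent_ref` would need the actual schema.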

cmunley1 · 2026-04-13T23:06:49Z

+
+```bash
+# Validate example data (required before PR submission)
+ng_prepare_data "+config_paths=[resources_servers/my_benchmark/configs/my_benchmark.yaml]" \


This should include a part on train_preparation mode and agent_ref.

…ures

Every skill now has:
- references/ with portable documentation (config patterns, JSONL schema, error patterns, diagnostic fields, metrics guide, agent patterns)
- evals/files/ with bundled test fixtures (sample configs, rollouts, JSONL data, agent code with intentional bugs, clean implementations)
- Updated evals referencing fixtures instead of repo paths

Removed "S-tier" language from README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

The secrets detector flagged placeholder values in sample_env_config.yaml. These are intentional example values, not real secrets. Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

New skill covering env.yaml setup, config validation, server launch, health checking, smoke testing, and rollout collection. Fills the operational gap between having a configured benchmark and profiled results. Also adds a new "run" chain (gym-config > gym-run > gym-profile) and inserts gym-run into existing chains (new-benchmark, validate, external-integration). Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

- Add agent_ref to JSONL schema examples and documentation
- Add train_preparation mode to data validation step
- Replace GitLab-first with HuggingFace-first for dataset registry (GitLab kept as internal-only fallback)
- Update references/schema.md to match

Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

cmunley1 · 2026-04-13T23:31:56Z

+
+## Step 5: Multi-environment training
+
+To run multiple environments simultaneously, compose multiple config files:


This should maybe also mention the NeMo RL config for multi-environment training, since we don't use ng_run for training; we use the NeMo RL config.

CRITICAL: ng_reward_profile uses +materialized_inputs_jsonl_fpath=, not +input_jsonl_fpath= (fixed in gym-profile and gym-run).

Also:
- gym-data: add license valid values enum, num_repeats field, artifact_fpath optionality
- gym-run: add prompt_config and upload_rollouts_to_wandb params
- gym-scaffold-agent: document self.server_client, self.config, aggregate_metrics() override

Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
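A mix-up like this one can be guarded against mechanically. The following is a rough sketch, not part of the PR: a hypothetical helper `find_wrong_profile_flag` that scans documentation text for `ng_reward_profile` lines still passing `+input_jsonl_fpath=`. The two override names come from the commit message above; the sample paths are made up, and the scan assumes each command fits on a single line.

```python
import re

# Matches only the wrong override. `+materialized_inputs_jsonl_fpath=` does not
# match because "+" is followed by "materialized", not "input".
WRONG_FLAG = re.compile(r"\+input_jsonl_fpath=")


def find_wrong_profile_flag(text: str) -> list[int]:
    """Return 1-based line numbers where ng_reward_profile gets +input_jsonl_fpath=."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if "ng_reward_profile" in line and WRONG_FLAG.search(line):
            hits.append(lineno)
    return hits


# Hypothetical documentation snippet: line 1 uses the wrong override.
doc = (
    "ng_reward_profile +input_jsonl_fpath=data/example.jsonl\n"
    "ng_reward_profile +materialized_inputs_jsonl_fpath=data/example.jsonl\n"
)
print(find_wrong_profile_flag(doc))  # [1]
```

Commands split across lines with a trailing backslash would slip past this check, so it is a first-pass filter rather than a complete lint rule.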

lbliii · 2026-04-13T23:53:49Z

Superseded by #1062 — rebased with DCO sign-off (force-push was blocked on the original branch).

lbliii and others added 4 commits April 13, 2026 18:38

fix: Add NVIDIA copyright headers to Python files

72696da

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

fix: Resolve ruff lint errors and formatting in skill files

bc4421f

Remove unused variables (F841), unused imports (F401), sort imports (I001), and apply ruff formatting to all Python files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

docs: Add README for agent skills directory

1281df8

Covers all 7 skills, 5 chains, skill structure, evaluation method (with-skill vs baseline), grading, and portability notes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

cmunley1 reviewed Apr 13, 2026

Comment thread .claude/skills/gym-data/SKILL.md Outdated

lbliii and others added 4 commits April 13, 2026 19:09

cmunley1 reviewed Apr 13, 2026

lbliii and others added 4 commits April 13, 2026 19:34

refactor: Pre-compile regexes and cache full_text in review.py, fix c…

d03e4f7

…ross-skill links Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
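The refactor named in this commit title follows a common pattern. The sketch below is not the code from scripts/review.py; it only illustrates the general technique of compiling patterns once at import time and caching a joined `full_text` string. The `SourceFile` class, the rule names, and the patterns are all hypothetical.

```python
import re
from functools import cached_property

# Compiled once at import time instead of on every check call.
# Rule names and patterns are illustrative only.
PATTERNS = {
    "bare-except": re.compile(r"^\s*except\s*:", re.MULTILINE),
    "print-call": re.compile(r"\bprint\("),
}


class SourceFile:
    def __init__(self, lines: list[str]):
        self.lines = lines

    @cached_property
    def full_text(self) -> str:
        # Joined once on first access, then reused by every rule.
        return "\n".join(self.lines)

    def findings(self) -> list[str]:
        """Return the names of all rules whose pattern occurs in the file."""
        return [name for name, pat in PATTERNS.items() if pat.search(self.full_text)]


src = SourceFile(["try:", "    run()", "except:", "    pass"])
print(src.findings())  # ['bare-except']
```

With `cached_property`, the join happens once per file no matter how many rules read `full_text`, which is the saving that a change of this kind usually targets.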

chore: Add DCO sign-off remediation for prior commit

e23d6de

This commit retroactively signs off commit d03e4f7: Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

chore: DCO remediation

d046ff9

I, Lawrence Lane <llane@nvidia.com>, hereby add my Signed-off-by to this commit: d03e4f7 Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>

lbliii mentioned this pull request Apr 13, 2026

feat: Add agent skills for NeMo Gym #1062

Open

6 tasks

lbliii closed this Apr 13, 2026

lbliii wants to merge 12 commits into main from lbliii/prague-v1

2 participants

