Closed
…, config, data, and scaffolding Seven spec-compliant agent skills with evals, references, and a deterministic review script. gym-review is the S-tier reference implementation with a standalone Python checker (scripts/review.py), self-contained anti-pattern and fix-pattern references, and portable eval fixtures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
Remove unused variables (F841), unused imports (F401), sort imports (I001), and apply ruff formatting to all Python files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
Covers all 7 skills, 5 chains, skill structure, evaluation method (with-skill vs baseline), grading, and portability notes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
cmunley1
reviewed
Apr 13, 2026
| {"role": "user", "content": "Problem statement here"} | ||
| ] | ||
| }, | ||
| "verifier_metadata": { |
Contributor
This should have agent_ref.
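A minimal sketch of how example data could be checked for this, assuming each JSONL row is a JSON object and that agent_ref sits at the top level next to verifier_metadata. The placement and the placeholder value of agent_ref are assumptions, not taken from the schema, and rows_missing_key is a hypothetical helper.

```python
import json


def rows_missing_key(jsonl_text, key="agent_ref"):
    """Return 1-based line numbers of JSONL rows lacking a top-level `key`."""
    missing = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        row = json.loads(line)
        if key not in row:
            missing.append(lineno)
    return missing


# Placeholder rows: the agent_ref value shown here is illustrative only.
sample = "\n".join([
    json.dumps({"agent_ref": {"name": "my_agent"}, "verifier_metadata": {}}),
    json.dumps({"verifier_metadata": {}}),
])
print(rows_missing_key(sample))
```

The second sample row omits agent_ref, so it is the only one reported.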
cmunley1
reviewed
Apr 13, 2026
| ```bash | ||
| # Validate example data (required before PR submission) | ||
| ng_prepare_data "+config_paths=[resources_servers/my_benchmark/configs/my_benchmark.yaml]" \ |
Contributor
This should have a part on train_preparation mode and agent_ref.
cmunley1
reviewed
Apr 13, 2026
…ures
Every skill now has:
- references/ with portable documentation (config patterns, JSONL schema, error patterns, diagnostic fields, metrics guide, agent patterns)
- evals/files/ with bundled test fixtures (sample configs, rollouts, JSONL data, agent code with intentional bugs, clean implementations)
- Updated evals referencing fixtures instead of repo paths
Removed "S-tier" language from README.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
The secrets detector flagged placeholder values in sample_env_config.yaml. These are intentional example values, not real secrets. Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
New skill covering env.yaml setup, config validation, server launch, health checking, smoke testing, and rollout collection. Fills the operational gap between having a configured benchmark and profiled results. Also adds a new "run" chain (gym-config > gym-run > gym-profile) and inserts gym-run into existing chains (new-benchmark, validate, external-integration). Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
- Add agent_ref to JSONL schema examples and documentation
- Add train_preparation mode to data validation step
- Replace GitLab-first with HuggingFace-first for dataset registry (GitLab kept as internal-only fallback)
- Update references/schema.md to match
Signed-off-by: Lawrence Lane <llane@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
cmunley1
reviewed
Apr 13, 2026
| ## Step 5: Multi-environment training | ||
| To run multiple environments simultaneously, compose multiple config files: |
Contributor
This should maybe also mention the NeMo RL config for multi-env training, since we don't use ng_run for training; we use the NeMo RL config.
CRITICAL: ng_reward_profile uses +materialized_inputs_jsonl_fpath=, not +input_jsonl_fpath=; fixed in gym-profile and gym-run.
Also:
- gym-data: add license valid values enum, num_repeats field, artifact_fpath optionality
- gym-run: add prompt_config and upload_rollouts_to_wandb params
- gym-scaffold-agent: document self.server_client, self.config, aggregate_metrics() override
Signed-off-by: Lawrence Lane <llane@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
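The override mix-up described in this commit is the kind of thing a small text scan can catch in skill docs. The sketch below is an illustration only: find_wrong_override is a hypothetical helper, the rollouts.jsonl path is a placeholder, and it only inspects commands written on a single line (backslash-continued commands would need joining first).

```python
WRONG = "+input_jsonl_fpath="
RIGHT = "+materialized_inputs_jsonl_fpath="


def find_wrong_override(text):
    """Return 1-based line numbers where ng_reward_profile is paired with WRONG."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # RIGHT does not contain WRONG as a substring, so correct lines never match.
        if "ng_reward_profile" in line and WRONG in line:
            hits.append(lineno)
    return hits


# Placeholder commands: the file name is illustrative only.
sample = "\n".join([
    "ng_reward_profile +input_jsonl_fpath=rollouts.jsonl",
    "ng_reward_profile +materialized_inputs_jsonl_fpath=rollouts.jsonl",
])
print(find_wrong_override(sample))
```

Only the first sample line is flagged; the second already uses the override named in the commit.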
…ross-skill links Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit retroactively signs off commit d03e4f7: Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
I, Lawrence Lane <llane@nvidia.com>, hereby add my Signed-off-by to this commit: d03e4f7 Signed-off-by: Lawrence Lane <llane@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Lawrence Lane <llane@nvidia.com>
Contributor
Author
Superseded by #1062: rebased with DCO sign-off (force-push was blocked on the original branch).
Summary
- … evals/evals.json with 3 assertion-based evals (21 total) and a chains.yaml for multi-step workflows
- … (scripts/review.py), self-contained anti-pattern/fix-pattern references, and portable eval fixtures; works without the NeMo Gym repo

Test plan
- … python .claude/skills/gym-review/scripts/review.py .claude/skills/gym-review/evals/files/ and verify expected findings per fixture
- … /menu shows all 7)

Supersedes #1060 (closed due to force-push restriction on original branch for DCO fix).
🤖 Generated with Claude Code