
feat: port standalone trainer robustness to TRL #238

Merged: abrichr merged 1 commit into main from feat/trl-robustness-parity on Mar 29, 2026

@abrichr commented Mar 29, 2026

Summary

  • DiagnosticsCallback: Logs loss, |loss|, grad_norm, and reward in scientific notation at each training step, matching the diagnostic output of the standalone trainer that operators rely on for debugging
  • 19 robustness tests: Cover the health check (pass/fail/empty), corrupt screenshot retry (succeed/all-fail), stuck detection (break/no-false-positive), truncation warning, diagnostics callback logging, and empty rollout result shape
  • Builds on #236 ("fix: critical TRL trainer bugs": wrong prompt, ignored task_ids, DSL parsing), which ported the core robustness features (health check, corrupt retry, stuck detection, truncation warning) to trl_rollout.py
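
For readers unfamiliar with the diagnostic format, here is a minimal sketch of what a callback like this might look like. It is not the code from trl_callbacks.py. The class name `DiagnosticsCallbackSketch`, the helper `format_diagnostics`, the logger name, and the log keys (`loss`, `grad_norm`, `reward`) are assumptions for illustration. The `on_log(args, state, control, logs=None, **kwargs)` signature mirrors the `transformers.TrainerCallback` hook, but the sketch is duck-typed so it runs without torch or transformers installed.

```python
import logging

logger = logging.getLogger("trl_diagnostics")  # hypothetical logger name


def format_diagnostics(step, logs):
    """Render loss, |loss|, grad_norm and reward in scientific notation."""
    parts = [f"step={step}"]
    loss = logs.get("loss")
    if loss is not None:
        parts.append(f"loss={loss:.4e}")
        # |loss| makes tiny negative and tiny positive losses easy to compare.
        parts.append(f"|loss|={abs(loss):.4e}")
    for key in ("grad_norm", "reward"):  # assumed log keys
        value = logs.get(key)
        if value is not None:
            parts.append(f"{key}={value:.4e}")
    return " ".join(parts)


class DiagnosticsCallbackSketch:
    """Duck-typed stand-in for a transformers.TrainerCallback subclass."""

    def on_log(self, args, state, control, logs=None, **kwargs):
        # Skip logging events that carry no metrics.
        if not logs:
            return
        logger.info(format_diagnostics(state.global_step, logs))
```

Scientific notation keeps very small values (for example a loss near 1e-4) readable where fixed-point formatting would round them to zero.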

Test plan

  • 46 TRL-related tests pass (19 new + 27 existing)
  • Tests are "light": no torch/transformers imports, and all external dependencies are mocked
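
As an illustration of the "light" style, the sketch below tests stuck detection and corrupt screenshot retry using only the standard library. The helpers `is_stuck` and `screenshot_with_retry`, the `env.screenshot()` method, and the rule that an empty payload counts as corrupt are all hypothetical. The real logic lives in trl_rollout.py and may differ.

```python
from unittest.mock import MagicMock


def is_stuck(frames, window=3):
    """Hypothetical helper: True when the last `window` frames are identical."""
    if len(frames) < window:
        return False
    recent = frames[-window:]
    return all(frame == recent[0] for frame in recent)


def screenshot_with_retry(env, retries=3):
    """Hypothetical helper: retry until env.screenshot() returns non-empty bytes."""
    for _ in range(retries):
        data = env.screenshot()
        if data:
            return data
    return None  # every attempt returned an empty (corrupt) payload


def test_stuck_detection_breaks():
    assert is_stuck([b"a", b"b", b"b", b"b"])


def test_stuck_no_false_positive():
    assert not is_stuck([b"a", b"b", b"a", b"b"])
    assert not is_stuck([b"a", b"a"])  # fewer frames than the window


def test_corrupt_retry_succeeds():
    env = MagicMock()  # stands in for the real environment client
    env.screenshot.side_effect = [b"", b"", b"\x89PNG"]
    assert screenshot_with_retry(env) == b"\x89PNG"
    assert env.screenshot.call_count == 3


def test_corrupt_retry_all_fail():
    env = MagicMock()
    env.screenshot.side_effect = [b"", b"", b""]
    assert screenshot_with_retry(env) is None
```

Because the environment is a `MagicMock`, these tests need no GPU, model weights, or running VM, which keeps them fast enough for every CI run.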

🤖 Generated with Claude Code

- Add DiagnosticsCallback to trl_callbacks.py: logs loss, |loss|,
  grad_norm, reward in scientific notation (matches standalone trainer
  diagnostic output)
- Register DiagnosticsCallback in trl_wrapper.py alongside TelemetryCallback
- Add test_trl_robustness.py: 19 tests covering health check, corrupt
  screenshot retry, stuck detection, truncation warning, diagnostics
  callback, and empty rollout result shape

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@abrichr abrichr merged commit d7896d5 into main Mar 29, 2026
1 check passed