Commit 6a38956

abrichr and claude authored
test: add 10 TRL parity tests for deprecation readiness (#241)
Adds tests/test_trl_parity.py with 25 test cases covering the 10 areas
identified in docs/STANDALONE_VS_TRL_COMPARISON.md as needed before the
standalone GRPO trainer can be deprecated:

1. Constrained decoding: Outlines generator build + ACTION_REGEX
2. Constrained decoding ImportError: returns None, not silent success
3. Prompt format identity: TRL imports SYSTEM_PROMPT from standalone
4. DSL round-trip parsing: CLICK, TYPE, WAIT, DONE via parse_action_json
5. Thought-prefix parsing: "Thought: ...\nAction: DSL" format
6. Unsloth loading: FastVisionModel.from_pretrained + get_peft_model
7. LoRA checkpoint resume: lora_checkpoint passed through config
8. HookBridge on_step_complete: callback fires with correct args
9. HookBridge unused hooks: on_before_collect/on_rollout_complete stored
10. _AgentOutput schema: Pydantic validation, JSON schema, roundtrip

All tests are light (no torch/transformers/trl imports), use unittest.mock,
and pass with [dev] deps only.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
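For readers unfamiliar with the "light" testing style described above, here is a minimal sketch of how areas 2 and 8 might be exercised with unittest.mock alone. It is not taken from tests/test_trl_parity.py: `build_generator_sketch`, `HookBridgeSketch`, `step_end`, and the `(step, reward)` callback arguments are hypothetical stand-ins, and the real helpers and signatures in the repository may differ.

```python
import sys
from unittest import mock


def build_generator_sketch():
    """Hypothetical stand-in: return None when Outlines cannot be imported."""
    try:
        import outlines  # noqa: F401  (optional dependency)
    except ImportError:
        return None
    return object()  # placeholder for a real constrained-decoding generator


class HookBridgeSketch:
    """Hypothetical stand-in that forwards a step event to a user callback."""

    def __init__(self, on_step_complete=None):
        self.on_step_complete = on_step_complete

    def step_end(self, step, reward):
        # Assumed argument shape; the real hook may pass different values.
        if self.on_step_complete is not None:
            self.on_step_complete(step, reward)


def test_missing_outlines_returns_none():
    # A None entry in sys.modules makes "import outlines" raise ImportError,
    # so the missing-dependency path runs without uninstalling anything.
    with mock.patch.dict(sys.modules, {"outlines": None}):
        assert build_generator_sketch() is None


def test_step_callback_receives_args():
    callback = mock.Mock()
    bridge = HookBridgeSketch(on_step_complete=callback)
    bridge.step_end(3, 0.5)
    callback.assert_called_once_with(3, 0.5)
```

Both functions follow the pytest naming convention and import nothing heavier than the standard library, which is the property the commit message attributes to the real test file.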
1 parent 114ad0e commit 6a38956

1 file changed

Lines changed: 573 additions & 0 deletions
