Squashed from origin/r3-delta (tip 5c94833, which extends the earlier
3799bda with 'Support branched routed expert deltas' for cases where
the routed-experts payload diverges across siblings in a group).
Adapts delta replay to main's deferred routed-experts chunk concat:
first step starts at 0; extended steps use prefix_len - 1; row 0 fills
the boundary, remaining rows append as the new suffix. Bumps router
wheel pin to local-path. Bumps deps/verifiers gitlink to d39cc5876.
Co-Authored-By: S1ro1 <matej.sirovatka@gmail.com>
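
The replay offsets described in the commit message can be sketched as follows. This is a minimal illustration, not the code in this commit: `replay_delta`, the `Row` type, and the list-of-lists representation are hypothetical, and the sketch assumes that "first step" means nothing has been replayed yet.

```python
from typing import List

Row = List[int]  # hypothetical: routed expert ids for one token position


def replay_delta(prefix: List[Row], start: int, rows: List[Row]) -> List[Row]:
    """Apply one delta to the rows replayed so far and return the new sequence.

    Assumed convention (from the commit message): the first step starts at 0,
    and every extended step starts at prefix_len - 1.
    """
    expected = 0 if not prefix else len(prefix) - 1
    if start != expected:
        raise ValueError(f"delta starts at {start}, expected {expected}")
    if not rows:
        raise ValueError("delta carries no rows")
    # prefix[:start] keeps everything before the boundary position.
    # rows[0] then lands on the boundary, and rows[1:] become the new suffix.
    return prefix[:start] + rows


# First step: nothing replayed yet, so the delta starts at 0.
first = replay_delta([], 0, [[1, 2], [3, 4], [0, 0]])

# Extended step: starts at prefix_len - 1, so row 0 overwrites the boundary row.
extended = replay_delta(first, len(first) - 1, [[5, 6], [7, 8]])
```

Under these assumptions, an extended step with `k` rows grows the sequence by `k - 1` positions, because its first row replaces the boundary row rather than appending.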
skills/configs/SKILL.md
4 additions & 0 deletions
@@ -70,6 +70,10 @@ For rollout debugging, enable trainer-side token export under `trainer.experimen

 Leave it unset for normal training. When enabled, it exports every sequence from each exporting rank.

+## RLM SWE harness args
+
+For `rlm_swe` / `rlm-swe` configs using the composable RLM harness, use current harness kwargs such as `rlm_max_turns`, `rlm_exec_timeout`, `rlm_max_depth`, `summarize_at_tokens`, `rlm_ref`, `local_checkout`, `append_to_system_prompt`, and `rlm_tools`. Do not use the stale `rlm_max_turns_in_context` key with the composable harness; it is not accepted by `rlm_harness`.
+
 ## Key files

 - `packages/prime-rl-configs/src/prime_rl/` — config classes under `configs/`; `utils/config.py` re-exports `BaseConfig` and `cli`
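
The `rlm_swe` guidance added in this diff could be enforced with a small pre-flight check. The sketch below is hypothetical and not part of the repository: `find_stale_kwargs`, the plain-dict input, and the example values are assumptions. The kwarg names come from the added paragraph, which introduces them with "such as", so the current list may not be exhaustive and the check only flags the known stale key.

```python
from typing import Any, Dict, List

# Kwargs the added SKILL.md paragraph names as current (possibly not exhaustive).
CURRENT_KWARGS = {
    "rlm_max_turns",
    "rlm_exec_timeout",
    "rlm_max_depth",
    "summarize_at_tokens",
    "rlm_ref",
    "local_checkout",
    "append_to_system_prompt",
    "rlm_tools",
}

# Keys the paragraph says the composable harness does not accept.
STALE_KWARGS = {"rlm_max_turns_in_context"}


def find_stale_kwargs(harness_kwargs: Dict[str, Any]) -> List[str]:
    """Return the stale keys present in a harness kwargs mapping, sorted."""
    return sorted(key for key in harness_kwargs if key in STALE_KWARGS)


# Hypothetical kwargs; the values are illustrative, not recommended settings.
example = {"rlm_max_turns": 20, "rlm_max_turns_in_context": 5}
stale = find_stale_kwargs(example)
if stale:
    print(f"stale harness kwargs: {stale}")
```

Unknown keys are deliberately not rejected here, since only the stale key is documented as unsupported.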