Streamline harness install spec names #1541
Approvability
Verdict: Needs human review
1 blocking correctness issue found. While mostly mechanical renames, this PR includes structural changes to how version configuration flows through harness classes. Additionally, an unresolved review comment identifies a bug where user-configured program versions could be silently overwritten by defaults.
2fdf0ae to 48ac9e8
Dismissing prior approval to re-evaluate 48ac9e8
Dismissing prior approval to re-evaluate 4027dce
4027dce to 2a8e281
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 2a8e281.
2a8e281 to 1c81dc8

Summary
- Make `version` a first-class harness config field via base `vf.HarnessConfig.version`, alongside fields like `max_turns`
- Add `Harness.load_program_config(config)` so harnesses can resolve command programs from harness-owned config
- Remove install-version fields from nested `program` config for OpenCode, MiniSWEAgent, Pi, and Terminus2; configs now use `[eval.harness] version = "..."`
- Update tests and docs to read resolved setup via `harness.program_config`
- Rename RLM fields to `repo_url`, `ref`, `tools`, `exec_timeout`, and `max_depth`
Tests
- `uv run --no-sync ruff format`
- `uv run --no-sync ruff check --fix`
- `UV_FROZEN=1 uv run --no-sync pre-commit run ty --hook-stage pre-push --all-files`
- `uv run --no-sync pytest tests/test_v1_harbor_cli.py tests/test_v1_mini_swe_agent.py tests/test_composable_env.py`
- `uv run --no-sync pytest tests/test_opencode_harbor.py tests/test_imports.py tests/test_v1_config_extension.py`
Note
Rename harness install spec fields from package/release to version across all harnesses
- Standardizes install specs across harnesses (`opencode`, `mini_swe_agent`, `pi`, `terminus_2`, `rlm`) by replacing harness-specific field names (`package`, `release`, `rlm_repo_ref`, etc.) with a uniform `version` field on each harness config.
- Adds a `version: str | None` field to the base `HarnessConfig` dataclass and a `load_program_config` hook on `Harness` so subclasses can inject version into `program.resolve(version=...)`.
- Drops the `rlm_` prefix from fields on `RLMProgramConfig` and `rlm_harness` (e.g. `rlm_tools` → `tools`, `rlm_repo_url` → `repo_url`) for consistency.
- Risk: existing configs that use the old field names (`package`, `release`, `rlm_tools`, `rlm_repo_url`, etc.) will break; no changes to runtime behavior or installed artifacts.
Macroscope summarized 1c81dc8.
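The hook pattern summarized above can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: `ProgramConfig`, `ExampleHarness`, the `"latest"` default, and the constructor wiring are all hypothetical stand-ins. Only the names `HarnessConfig.version`, `max_turns`, `load_program_config`, `program_config`, and `resolve(version=...)` come from the PR text.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProgramConfig:
    """Hypothetical stand-in for a command program's install spec."""

    version: str = "latest"  # assumed default, not taken from the PR

    def resolve(self, version: str | None = None) -> ProgramConfig:
        # Prefer the harness-level version when one is given;
        # otherwise keep whatever this program config already holds.
        if version is not None:
            return ProgramConfig(version=version)
        return self


@dataclass
class HarnessConfig:
    max_turns: int = 10  # illustrative value only
    version: str | None = None
    program: ProgramConfig = field(default_factory=ProgramConfig)


class Harness:
    def __init__(self, config: HarnessConfig) -> None:
        self.config = config
        # Resolved setup is exposed on the harness, mirroring the
        # `harness.program_config` attribute mentioned in the summary.
        self.program_config = self.load_program_config(config)

    def load_program_config(self, config: HarnessConfig) -> ProgramConfig:
        # Base behaviour in this sketch: resolve without a version override.
        return config.program.resolve()


class ExampleHarness(Harness):
    """Hypothetical subclass that injects the top-level version."""

    def load_program_config(self, config: HarnessConfig) -> ProgramConfig:
        return config.program.resolve(version=config.version)


harness = ExampleHarness(HarnessConfig(version="1.2.3"))
print(harness.program_config.version)
```

In this sketch an unset `version` leaves the program's own value in place, which is one way to avoid a default silently replacing a user-supplied value.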
Note
Medium Risk
Renames public harness and RLM config/TOML fields without backward-compatible aliases, so existing eval configs may silently use defaults; rollout behavior otherwise follows the same install/resolve paths covered by updated tests.
Overview
Standardizes harness configuration around a shared `version` field for agent installs (OpenCode, Pi, mini-SWE-agent, Terminus 2) and drops per-harness names like `release`, `package`, and `harbor_package`. Harness classes now override `load_program_config` so that top-level `version` is passed into each program's `resolve()`, and tests/docs read resolved setup via `harness.program_config` instead of nested program config alone.
RLM and composable RLM helpers rename prefixed knobs to shorter names (`repo_url`/`ref`/`tools`/`exec_timeout`/`max_depth`, etc.). `HarnessConfig` also gains an optional `version` hook for the base loader pattern.
Breaking: configs or code still using the old field names will not populate the new fields (no aliases in the diff).
Reviewed by Cursor Bugbot for commit 1c81dc8. Bugbot is set up for automated code reviews on this repo.
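Because the renames ship without aliases, a config that still uses an old key would fall back to defaults without any error. A small pre-flight check along these lines could surface that early. This is a sketch under assumptions: `find_legacy_fields` is a hypothetical helper, the mapping lists only renames named in the summaries above, and it assumes the keys sit in one flat table already parsed into a dict.

```python
from __future__ import annotations

# Old field name -> new field name, limited to renames stated in the
# PR summaries. The real set of renamed fields may be larger.
LEGACY_TO_NEW = {
    "package": "version",
    "release": "version",
    "harbor_package": "version",
    "rlm_tools": "tools",
    "rlm_repo_url": "repo_url",
}


def find_legacy_fields(table: dict) -> dict[str, str]:
    """Return the legacy keys present in `table`, mapped to their new names."""
    return {key: LEGACY_TO_NEW[key] for key in table if key in LEGACY_TO_NEW}


# Example: a harness table that still uses the old `release` key.
stale = find_legacy_fields({"release": "v1", "max_turns": 5})
for old, new in stale.items():
    print(f"config key '{old}' is no longer read; use '{new}' instead")
```

Running such a check before loading an eval config turns a silent fallback to defaults into a visible warning.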