feat: operator-only test-rung route to verify a rung works #65

Merged
mabry1985 merged 1 commit into main from feat/test-rung-route
Jul 3, 2026

Conversation

@mabry1985

Contributor

Problem

Companion to protoAgent#1749 (coder.solve()'s new force_rung). Verifying that fusion (rung 4) actually works required contriving a task hard enough to fail greedy, best-of-k, and tree-search first, because fusion is not reached until those rungs have failed. That is impractical for a quick sanity check.

Fix

POST /api/plugins/project_board/features/{id}/test-rung — runs exactly ONE named rung (greedy/best-of-k/tree-search/fusion) against a feature's real acceptance tests, in a throwaway worktree that's always reaped — never promoted, no PR opened, no board state touched.

POST /api/plugins/project_board/features/bd-7/test-rung
{"rung": "fusion"}

→ {"rung": "fusion", "passed": true, "gens_spent": 2, "candidates_tried": 2, "note": "...", "verdict_output": "3 passed in 0.4s"}
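For operators who prefer a script over curl, the call above could be wrapped roughly as follows. This is a minimal sketch, not code from this PR: the base URL, the helper names `build_test_rung_request` and `run_test_rung`, and the exact rung strings accepted by the route are assumptions taken from the example request.

```python
import json
import urllib.request

# Rung names as written in this PR description; the exact strings the
# route accepts are an assumption.
VALID_RUNGS = ("greedy", "best-of-k", "tree-search", "fusion")


def build_test_rung_request(base_url: str, feature_id: str, rung: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the test-rung route."""
    if rung not in VALID_RUNGS:
        raise ValueError(f"unknown rung: {rung!r}")
    url = f"{base_url.rstrip('/')}/api/plugins/project_board/features/{feature_id}/test-rung"
    body = json.dumps({"rung": rung}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_test_rung(base_url: str, feature_id: str, rung: str) -> dict:
    """Send the request and decode the JSON verdict."""
    req = build_test_rung_request(base_url, feature_id, rung)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running at a hypothetical `http://localhost:8000`, `run_test_rung("http://localhost:8000", "bd-7", "fusion")` would return a dict shaped like the response shown above.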

coder_seam.test_rung() is deliberately separate from dispatch() — that function's contract (promote the winner, raise SolveExhausted on exhaustion) is shaped for the board's real per-feature build; mixing test semantics into it would risk the real dispatch path.
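The always-reap contract can be illustrated with a `try`/`finally` around the rung run. This is a sketch of the pattern only, not the body of `coder_seam.test_rung()`: it uses a temporary directory as a stand-in for the git worktree, and `RungResult`, `run_rung_in_throwaway_dir`, and the `solve_one` callback are hypothetical names.

```python
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class RungResult:
    rung: str
    passed: bool


def run_rung_in_throwaway_dir(rung: str, solve_one: Callable[[Path, str], bool]) -> RungResult:
    """Run one rung in a scratch directory that is removed no matter what."""
    # Stand-in for creating a throwaway worktree (assumption: the real code
    # uses a git worktree rather than a plain temp directory).
    workdir = Path(tempfile.mkdtemp(prefix="test-rung-"))
    try:
        passed = solve_one(workdir, rung)
        return RungResult(rung=rung, passed=passed)
    finally:
        # Reap on pass, on fail, and on exception. Nothing is promoted.
        shutil.rmtree(workdir, ignore_errors=True)
```

Because the cleanup sits in `finally`, an exception raised by the solver still propagates to the caller, but only after the scratch directory is gone.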

Where this does NOT go — no @tool wrapper

Deliberately kept off the agent-facing tool surface. This repo already draws exactly this boundary: board_create_feature/board_mark_ready/board_list/board_retro/board_create_epic are the only 5 @tool-wrapped functions the board's own lead agent can call — /features/{id}/cancel and DELETE /features/{id} are HTTP-only, operator-reachable, with no tool. test-rung follows the same rule: the board's own lead agent has no way to call it.
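The boundary described here can be pictured as a registry that only decorated functions enter. The sketch below is illustrative and does not reproduce this repo's `@tool` implementation: `AGENT_TOOLS`, the `tool` decorator body, and `operator_test_rung` are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry of functions the lead agent is allowed to call.
AGENT_TOOLS: Dict[str, Callable[..., object]] = {}


def tool(fn: Callable[..., object]) -> Callable[..., object]:
    """Hypothetical stand-in for @tool: registers fn as agent-callable."""
    AGENT_TOOLS[fn.__name__] = fn
    return fn


@tool
def board_list() -> list:
    """Agent-facing: one of the wrapped board functions named above."""
    return []


def operator_test_rung(feature_id: str, rung: str) -> dict:
    """Operator-only handler: reachable over HTTP, deliberately not wrapped."""
    return {"feature": feature_id, "rung": rung}
```

Under this model the agent can only discover what is in `AGENT_TOOLS`, so an undecorated handler stays out of its reach without any extra permission check.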

Refactor

Extracted loop.py's _resolve_delegate into coder_seam.resolve_delegate (module-level) so the new route and the real dispatch path share one lookup instead of two copies.

Tests

269 passed (was 260; +9 in test_coder_seam.py for test_rung's always-reap/pass/fail/exception/fusion-forwarding behavior, +9 in test_api.py for the route's validation gates — unknown rung, unknown feature, no acceptance criteria, no coder plugin, no test command, missing delegate, missing fusion delegate — plus the happy path and a 400-not-500 on a solve() failure).
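The gate-then-solve shape that these route tests exercise might look like the following framework-free sketch. It is an assumption-laden illustration, not the code in api.py: only the 400 on a solve() failure comes from this PR; the other status codes, the `acceptance_criteria` field name, and the `handle_test_rung` signature are hypothetical, and only three of the gates are shown.

```python
from typing import Any, Callable, Dict, Tuple

# Rung names as written in the PR description (exact strings are an assumption).
VALID_RUNGS = frozenset({"greedy", "best-of-k", "tree-search", "fusion"})


def handle_test_rung(
    feature_id: str,
    payload: Dict[str, Any],
    features: Dict[str, Dict[str, Any]],
    solve: Callable[[Dict[str, Any], str], Dict[str, Any]],
) -> Tuple[int, Dict[str, Any]]:
    """Return (status, body) for a test-rung request."""
    rung = payload.get("rung")
    if rung not in VALID_RUNGS:
        return 400, {"detail": f"unknown rung: {rung!r}"}

    feature = features.get(feature_id)
    if feature is None:
        # 404 is an assumed status; the PR does not state it.
        return 404, {"detail": f"unknown feature: {feature_id}"}

    if not feature.get("acceptance_criteria"):
        return 400, {"detail": "feature has no acceptance criteria"}

    try:
        result = solve(feature, rung)
    except Exception as exc:
        # Diagnostic endpoint: surface solver failures as 400, not 500.
        return 400, {"detail": f"solve failed: {exc}"}
    return 200, result
```

Running the cheap checks before the solver means a malformed request never reaches solve() at all.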

Gate: ruff check . && ruff format --check . && pytest -q — all green.

Version bumped 0.28.0 → 0.29.0 (new capability).

🤖 Generated with Claude Code

Companion to protoAgent#1749 (coder.solve()'s force_rung). Verifying fusion
actually works required contriving a task hard enough to fail greedy,
best-of-k, AND tree-search first -- impractical for a quick check.

Add POST /api/plugins/project_board/features/{id}/test-rung: runs exactly
ONE named rung against a feature's real acceptance tests, in a throwaway
worktree that is ALWAYS reaped -- never promoted, no PR opened, no board
state touched. coder_seam.test_rung() is deliberately separate from
dispatch() -- that function's contract (promote the winner, raise
SolveExhausted on exhaustion) is shaped for the board's real per-feature
build; mixing test semantics into it would risk the real dispatch path.

Deliberately NO @tool wrapper -- the board's own lead agent has no way to
call this itself, the same boundary this router already draws around
/features/{id}/cancel and DELETE /features/{id} (both operator-only,
neither exposed as an agent tool).

Extracted loop.py's _resolve_delegate into coder_seam.resolve_delegate (a
module-level function) so api.py's new route and loop.py's real dispatch
path share one lookup instead of two copies.

269 passed (was 260; +9 in test_coder_seam.py for test_rung's
always-reap/pass/fail/exception/fusion-forwarding behavior, +9 in
test_api.py for the route's validation gates + happy path + 400-not-500
on a solve failure).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@protoquinn

protoquinn Bot commented Jul 3, 2026

👀 Quinn is reviewing — verdict (PASS / WARN / FAIL) + findings to follow.

@protoquinn protoquinn Bot left a comment

QA Audit — PR #65 | feat: operator-only test-rung route to verify a rung works

VERDICT: WARN (preliminary — CI still queued; no blocking findings from diff review)


CI Status

  • test: queued ⏳ — terminal verdict deferred until CI completes

Diff Review

  • New POST /features/{fid}/test-rung route with comprehensive validation gates (rung whitelist, feature existence, acceptance criteria, coder plugin, test command, delegate resolution) — clean and consistent with existing cancel/delete boundary
  • resolve_delegate() extracted from loop.py into coder_seam.py as a module-level function — eliminates duplication between dispatch and test-rung paths. Clean
  • test_rung() in coder_seam.py has an explicit always-reap contract. The full implementation body is truncated in the diff summary, so the always-reap logic (worktree cleanup, no promotion) could not be line-by-line verified — see Gap below

Observations

  • LOW: clawpatch structural review unavailable (HTTP 502 on checkout cache). Diff-based review only
  • MEDIUM/Gap: full test_rung() body and new test files (test_coder_seam.py +9, test_api.py +9) were truncated beyond the 200-line diff limit. The always-reap logic, _solve/_budget_cls plumbing, and test assertions were not line-by-line reviewed. Author claims 269 passing tests — CI will confirm
  • No HIGH or CRITICAL findings from the visible diff. The except Exception → HTTP 400 pattern is deliberate and appropriate for a diagnostic endpoint that wraps an unpredictable coder.solve() call
  • The @tool-wrapper exclusion is correctly applied — consistent with /cancel and DELETE /features/{id}

— Quinn, QA Engineer

@protoquinn

protoquinn Bot commented Jul 3, 2026

Submitted COMMENT review on #65.

@mabry1985 mabry1985 merged commit ce91ad1 into main Jul 3, 2026
1 check passed
@mabry1985 mabry1985 deleted the feat/test-rung-route branch July 3, 2026 21:19