Skip to content

AUTO/YOLO agent pipeline: fix CI + code review via claude -p#25

Merged
bearmug merged 1 commit into
mainfrom
feat/agent-pipeline
Mar 17, 2026
Merged

AUTO/YOLO agent pipeline: fix CI + code review via claude -p#25
bearmug merged 1 commit into
mainfrom
feat/agent-pipeline

Conversation

@bearmug
Copy link
Copy Markdown
Contributor

@bearmug bearmug commented Mar 17, 2026

Summary

Implements the full agent pipeline for PR autopilot modes (AUTO/YOLO):

  • Fix CI — when checks fail, spawn claude -p to analyze failures and push fixes
    • AUTO: --permission-mode acceptEdits --allowedTools "Bash Edit Write Read Glob Grep"
    • YOLO: --permission-mode bypassPermissions
  • Code review — when checks pass, spawn claude -p in read-only mode to review the diff
    • Outputs structured JSON findings (critical/important/minor)
  • Fix review — when Critical/Important issues found, spawn claude -p to fix them
    • Max 2 review+fix cycles, then stops

Pipeline state machine

checks_failing → spawnFixCI → pushes fix → checks re-run
checks_passing → spawnCodeReview → parse findings
has_issues → spawnFixReview → pushes fixes → re-review
clean → AUTO: wait for approval | YOLO: auto-merge

Safety

  • AgentRunning field prevents double-spawn (only one agent per PR at a time)
  • 15-minute timeout per agent, 20-minute stale state auto-clear
  • $5/$3/$5 budget caps per agent invocation
  • Review state resets on check regression (full pipeline re-runs after CI fixes)
  • Clone to /tmp for isolation — never touches local repos

TUI

  • Agent status with elapsed time in PR zoom panel
  • Review findings by severity with file:line locations
  • Accumulated cost display in PR header

Test plan

  • All daemon tests pass (14 new model tests + 11 agent tests)
  • All TUI tests pass
  • Track a PR with failing CI in AUTO → verify agent spawns
  • After green checks → verify code review runs
  • YOLO mode → verify full pipeline + auto-merge

Implements the agent pipeline for PR autopilot modes:

1. Fix CI (hammer): when checks fail, spawn `claude -p` with the failing check
   details. AUTO uses --permission-mode acceptEdits with explicit tool whitelist;
   YOLO uses --permission-mode bypassPermissions.

2. Code review: when checks pass, spawn `claude -p` in read-only mode to review
   the diff. Outputs structured JSON findings (critical/important/minor).

3. Fix review: when Critical or Important issues are found, spawn `claude -p` to
   address them, then re-review (max 2 cycles).

Pipeline state tracked on TrackedPR: AgentRunning (mutex-like string preventing
double-spawn), ReviewState, ReviewFindings, ReviewCycle, AgentCostUSD.

Stale agent timeout (20min) clears crashed processes. Review state resets on
check regression so the full pipeline re-runs after CI fixes.

TUI shows: agent status with elapsed time, review findings by severity with
file:line locations, accumulated cost in PR header.

New file: daemon/internal/pr/agent.go — claudeBin, cloneForAgent, command
builders, spawn functions, parseReviewOutput, agentComplete callback.
@bearmug bearmug merged commit 947821e into main Mar 17, 2026
1 check passed
@bearmug bearmug deleted the feat/agent-pipeline branch March 17, 2026 22:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant