feat(review): support additional reviewer harnesses #2306

Open
anirudh5harma wants to merge 2 commits into AgentWrapper:main from anirudh5harma:feat/reviewer-harness-support

anirudh5harma commented Jun 30, 2026

Summary

Reviewer sessions can now use the two additional harnesses requested in #2298, including automatic reuse of a worker's harness when no reviewer override is configured. Both adapters reuse the existing worker launchers while applying reviewer-specific read-only controls, and Project Settings exposes the expanded reviewer choices.

Validation

  • npm run lint
  • cd backend && go build ./... && go vet ./...
  • cd backend && go test -race ./...
  • cd frontend && npm run typecheck
  • cd frontend && npm test (40 files, 371 tests)
  • cd frontend && npx vite build --config vite.renderer.config.ts
  • Live Settings flow against an isolated daemon: selected a new reviewer, saved it through the UI, and verified the persisted project config over the daemon API.
  • Verified the installed CLI resolves the inline read-only permission policy exactly as authored.

Post-Deploy Monitoring & Validation

  • After review launches, search the daemon logs for the strings "no reviewer adapter", "reviewer command", and "reviewer runtime".
  • Healthy signal: configured reviews enter the running state, open a reviewer pane, and submit a verdict without modifying the worker checkout.
  • Failure signal: repeated launch errors, permission prompts that stall the pane, or a reviewer unable to post/submit its result. Roll back this PR if those failures reproduce with a supported local installation.
  • Validation window: first week after release; owner: Agent Orchestrator maintainers.

Closes #2298.


Compound Engineering
GPT-5

Testing Codex and OpenCode reviewer harnesses end-to-end

Concrete recipe for running a review with each new harness against a live daemon, requested in this comment.

Prerequisites

  • codex and/or opencode CLIs installed and on $PATH (verify with codex --version / opencode --version)
  • AO daemon running (ao start or the desktop app)
  • A worker session with at least one open PR — this is the review target

1. Set the reviewer harness for a project

Two ways:

Settings UI: Project → Settings → Reviewers → Default reviewer agent. The Select now lists codex and opencode alongside claude-code.

CLI: write the config directly.

ao project set-config <project-id> --config-json '{
  "worker":       {"agent": "codex"},
  "orchestrator": {"agent": "claude-code"},
  "reviewers":    [{"harness": "codex"}]
}'
# or "harness": "opencode"

Verify it persisted:

ao project get <project-id> --json | jq '.config.reviewers'
# → [{"harness":"codex"}]

2. Trigger a review on the worker's open PR

SESSION=<worker-session-id>
curl -X POST "http://127.0.0.1:${AO_PORT:-3001}/api/v1/sessions/${SESSION}/reviews/trigger"

The daemon resolves the reviewer harness via ProjectConfig.ResolveReviewerHarness, picks the matching adapter (reviewer/codex or reviewer/opencode), and spawns a reviewer pane in the same workspace.
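The resolution step can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the struct shapes, the `resolveReviewerHarness` function, and the error text are hypothetical, modeled on the config JSON in step 1 and on the Summary's note that the worker's harness is reused when no reviewer override is configured.

```go
package main

import "fmt"

// ReviewerConfig and ProjectConfig are hypothetical shapes modeled on the
// JSON passed to `ao project set-config`; the real structs may differ.
type ReviewerConfig struct {
	Harness string
}

type ProjectConfig struct {
	WorkerAgent string
	Reviewers   []ReviewerConfig
}

// supported lists the harnesses this PR describes as valid reviewer choices.
var supported = map[string]bool{"claude-code": true, "codex": true, "opencode": true}

// resolveReviewerHarness returns the first configured reviewer harness, or
// falls back to the worker's harness when no override is set.
func resolveReviewerHarness(cfg ProjectConfig) (string, error) {
	harness := cfg.WorkerAgent
	if len(cfg.Reviewers) > 0 && cfg.Reviewers[0].Harness != "" {
		harness = cfg.Reviewers[0].Harness
	}
	if !supported[harness] {
		return "", fmt.Errorf("unsupported reviewer harness %q", harness)
	}
	return harness, nil
}

func main() {
	// No reviewer override: the worker's harness is reused.
	h, err := resolveReviewerHarness(ProjectConfig{WorkerAgent: "codex"})
	fmt.Println(h, err)
}
```

Rejecting unknown names in the resolver mirrors the Validate table rows listed under CI coverage, so a bad config would fail before any pane is spawned.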

3. Verify the reviewer pane launched with the right sandbox

Codex — the adapter inserts --sandbox read-only and forwards AO location env vars via Codex's -c shell_environment_policy.set.*. Inspect the live process:

ps -axo pid,command | grep '[c]odex' | grep review
# Look for: --sandbox read-only
# Look for: -c shell_environment_policy.set.AO_PORT="3001"
#           -c shell_environment_policy.set.AO_DATA_DIR="/Users/.../.ao"
#           -c shell_environment_policy.set.AO_RUN_FILE="/Users/.../.ao/running.json"

Attempt an edit inside the codex pane (e.g. :!echo x >> README.md) — Codex refuses because the sandbox is read-only. git diff, gh api, and ao review submit remain available.
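For readers who want to see how such an argument list could be assembled, here is a small sketch. The `reviewArgs` helper is hypothetical and is not the adapter's actual code; it only reproduces the shape visible in the `ps` output above (the sandbox flag first, one `-c shell_environment_policy.set.*` override per AO_* variable, the prompt last). Quoting values with Go's `%q` is an assumption about how the override values are encoded.

```go
package main

import (
	"fmt"
	"sort"
)

// reviewArgs is a hypothetical helper: it places --sandbox read-only and one
// -c shell_environment_policy.set.* override per variable before the prompt.
func reviewArgs(env map[string]string, prompt string) []string {
	args := []string{"--sandbox", "read-only"}

	// Sort the keys so the argument order is deterministic and testable.
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		// %q wraps the value in double quotes (an assumed encoding).
		args = append(args, "-c", fmt.Sprintf("shell_environment_policy.set.%s=%q", k, env[k]))
	}
	return append(args, prompt)
}

func main() {
	env := map[string]string{"AO_PORT": "3001"}
	for _, a := range reviewArgs(env, "review the open PR") {
		fmt.Println(a)
	}
}
```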

OpenCode — the adapter injects the permission policy via the OPENCODE_CONFIG_CONTENT env var. Inspect:

ps -axEo pid,command,env | grep '[o]pencode' | grep -o 'OPENCODE_CONFIG_CONTENT=[^ ]*'
# → JSON: "*":"deny","read":"allow","glob":"allow","grep":"allow",
#         "bash":{"*":"deny","gh api *":"allow","git diff*":"allow",
#                 "git log*":"allow","git show*":"allow","git status*":"allow",
#                 "ao review submit *":"allow"}

Inside the OpenCode pane: read/grep/glob and the listed bash commands work; every other tool and bash command is denied.
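The policy above could be serialized into the environment entry along these lines. This is a sketch under stated assumptions: `readOnlyPolicy` and `reviewerEnv` are hypothetical names, and the enclosing `"permission"` key is an assumption, since the output above shows only the inner rules.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// readOnlyPolicy mirrors the rules listed in the ps output above:
// deny by default, allow read-only tools and a short bash allowlist.
func readOnlyPolicy() map[string]interface{} {
	return map[string]interface{}{
		"*":    "deny",
		"read": "allow",
		"glob": "allow",
		"grep": "allow",
		"bash": map[string]string{
			"*":                  "deny",
			"gh api *":           "allow",
			"git diff*":          "allow",
			"git log*":           "allow",
			"git show*":          "allow",
			"git status*":        "allow",
			"ao review submit *": "allow",
		},
	}
}

// reviewerEnv returns a KEY=VALUE entry suitable for exec.Cmd.Env.
// The "permission" wrapper key is an assumption, not taken from the PR.
func reviewerEnv() (string, error) {
	raw, err := json.Marshal(map[string]interface{}{"permission": readOnlyPolicy()})
	if err != nil {
		return "", err
	}
	return "OPENCODE_CONFIG_CONTENT=" + string(raw), nil
}

func main() {
	entry, err := reviewerEnv()
	if err != nil {
		panic(err)
	}
	fmt.Println(entry)
}
```

Passing the policy inline through the environment, rather than writing a config file, keeps the reviewer's restrictions out of the shared workspace that the worker also uses.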

4. Verify the verdict round-trip

The reviewer pane runs ao review submit --review-id <id> --reviews <json>. The daemon then writes the verdict and emits a notification. Check the most recent review on the session:

ao session get "$SESSION" --json | jq '.reviews[-1]'
# → {"id":"...", "verdict":"approved" | "changes_requested", "harness":"codex", ...}
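To script this check instead of reading the JSON by eye, something like the following could work. It is a sketch only: the `review` struct covers just the three fields shown above, and `checkReview` is a hypothetical helper rather than part of the AO codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// review holds only the fields visible in the example output above;
// the real payload carries more.
type review struct {
	ID      string `json:"id"`
	Verdict string `json:"verdict"`
	Harness string `json:"harness"`
}

// checkReview decodes one review object and confirms that it came from the
// expected harness and carries one of the two verdicts shown above.
func checkReview(raw []byte, wantHarness string) error {
	var r review
	if err := json.Unmarshal(raw, &r); err != nil {
		return fmt.Errorf("decode review: %w", err)
	}
	if r.Harness != wantHarness {
		return fmt.Errorf("harness is %q, want %q", r.Harness, wantHarness)
	}
	switch r.Verdict {
	case "approved", "changes_requested":
		return nil
	}
	return fmt.Errorf("unexpected verdict %q", r.Verdict)
}

func main() {
	raw := []byte(`{"id":"r1","verdict":"approved","harness":"codex"}`)
	fmt.Println(checkReview(raw, "codex"))
}
```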

Unit + build coverage already in CI

  • backend/internal/adapters/reviewer/codex/codex_test.go:
    • TestReviewCommandUsesReadOnlySandbox — asserts --sandbox read-only is inserted before the prompt argument and each AO_* env var becomes a -c shell_environment_policy.set.* entry
    • TestReviewMessageReturnsTaskPrompt — confirms a follow-up message reuses the same task prompt
  • backend/internal/adapters/reviewer/opencode/opencode_test.go:
    • TestReviewCommandUsesReadOnlyPermissionPolicy — asserts the exact policy JSON travels via the OPENCODE_CONFIG_CONTENT env var and the system prompt is folded into the prompt
    • TestReviewMessageReturnsTaskPrompt
  • backend/internal/domain/projectconfig_test.go: Validate table rows reject configs that name an unsupported reviewer harness and accept the two new ones
  • frontend/src/renderer/components/ProjectSettingsForm.test.tsx — Reviewer <Select> round-trips codex and opencode through save

Run locally:

go test ./backend/internal/adapters/reviewer/codex/... \
        ./backend/internal/adapters/reviewer/opencode/... \
        ./backend/internal/domain/... -race -v
cd frontend && npm test -- ProjectSettingsForm

anirudh5harma force-pushed the feat/reviewer-harness-support branch from ac03b8c to c0049ae Compare June 30, 2026 10:08

anirudh5harma left a comment

Author

The live run exposed daemon-environment inheritance and interactive-PTY input issues, which were corrected in df4e3df. After the fix, local lint, build, vet, race tests, frontend typecheck, 40 test files / 371 tests, and the production renderer build passed.

@neversettle17-101

Collaborator

Can you please add testing details of running codex and opencode agent harness for reviewer?

@anirudh5harma

Author

@neversettle17-101 added a full end-to-end testing recipe to the PR description under "Testing Codex and OpenCode reviewer harnesses end-to-end". Covers:

  • Setting the harness per project via Settings UI or ao project set-config
  • Triggering a review via POST /api/v1/sessions/{sessionId}/reviews/trigger
  • Verifying Codex runs with --sandbox read-only + the three shell_environment_policy.set.AO_* overrides
  • Verifying OpenCode picks up the inline permission policy via OPENCODE_CONFIG_CONTENT
  • Confirming the verdict round-trip via ao review submit / ao session get
  • Existing unit + integration test mapping

Let me know if you'd like a screen recording of the live runs as well.

Development

Successfully merging this pull request may close these issues.

Support codex and opencode as agent harness for reviewer sessions.