feat(approval): rebase HITL plan-approval gate (supersedes #50) by AbirAbbas · Pull Request #66 · Agent-Field/SWE-AF

AbirAbbas · 2026-05-07T16:39:16Z

Summary

Rebases the human-in-the-loop plan-approval gate from #50 onto current main. PR #50 was 34 commits behind and conflicting; rather than try to resolve those conflicts mechanically, this re-applies just the HITL bits cleanly on a fresh branch.

When HAX_API_KEY is set on the SWE-AF service, every build() pauses between Phase 1 (plan + git_init) and Phase 2 (execute) and posts the plan to hax-sdk for human review. The reviewer can approve, request changes (which re-runs Architect → Tech Lead → Sprint Planner with the feedback, bounded by cfg.max_plan_revision_iterations), or reject. Without HAX_API_KEY, behavior is unchanged.
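For readers who want the control flow at a glance, the gate described above can be sketched roughly as follows. This is an illustration, not the code in swe_af/app.py: `Decision`, `run_approval_gate`, `request_review` and `replan` are hypothetical names, and returning a failure once the revision cap is hit is an assumption about what "bounded" means here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    # Hypothetical shape of a reviewer response; the real hax-sdk payload may differ.
    status: str  # "approved", "request_changes", "rejected", ...
    feedback: str = ""


def run_approval_gate(
    plan: str,
    request_review: Callable[[str], Decision],
    replan: Callable[[str, str], str],
    max_plan_revision_iterations: int = 2,
    api_key: Optional[str] = None,
) -> tuple:
    """Return (proceed, final_plan, reason) for the Phase 1 -> Phase 2 checkpoint."""
    if not api_key:
        # No HAX_API_KEY: the gate is skipped and the build proceeds unchanged.
        return True, plan, "gate skipped"

    revisions = 0
    while True:
        decision = request_review(plan)  # stands in for create_request + app.pause
        if decision.status == "approved":
            return True, plan, "approved"
        if decision.status == "request_changes":
            if revisions >= max_plan_revision_iterations:
                # Assumption: hitting the cap ends the build rather than looping on.
                return False, plan, "revision limit reached"
            # Stands in for re-running Architect -> Tech Lead -> Sprint Planner.
            plan = replan(plan, decision.feedback)
            revisions += 1
            continue
        # rejected, expired, error, or anything unrecognised: do not execute.
        return False, plan, decision.status
```

Passing the review and replan steps in as callables keeps the loop testable without a live hax-sdk or control plane.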

What's different from #50

Cleanly rebased on main: preserves _run_ci_gate, permission_mode, env-driven runtime defaults, build-id workspace scoping, and all the post-PR CI / resolve() machinery that landed after #50 (feat: HITL SDK integration) was cut.
Drops swe_af/approval.py (467 lines of dead POC code that wasn't imported anywhere — superseded by the inline app.pause() integration).
Keeps agentfield>=0.1.77 (don't downgrade — the HITL block depends on app.pause() which arrived in 0.1.77+).
Splits the work into four focused commits instead of three POC commits.

Architecture

Workspace-broadcast by default. Omitting userId on create_request sends to all active workspace members (existing hax-sdk behavior). AGENTFIELD_APPROVAL_USER_ID overrides to a specific Hub user.
Resume via control plane, not in-process callback. SWE-AF calls client.create_request(webhook_url=<CP>/api/v1/webhooks/approval-response) and then await app.pause(...). The CP transitions the execution to waiting, and when hax-sdk fires the webhook it resumes the paused future. No aiohttp callback server in the agent process — this means the agent stays stateless and the resume path survives container restarts.
State persistence. .artifacts/approval_state.json records the current decision/feedback/request_id/revision, so a crashed-and-restarted agent can be reasoned about externally.
Auto-enabled via env. No new flag: set HAX_API_KEY to engage the gate, leave it unset to skip. Matches the UX of #50 (feat: HITL SDK integration).
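As a rough sketch of the state-persistence point above, the snippet below writes and reads .artifacts/approval_state.json with the four fields named in this description. The helper names are hypothetical, and the write-to-temp-then-rename step is an assumption about how one might keep the file readable after a crash, not a description of what the PR does.

```python
import json
import os
import tempfile
from pathlib import Path


def save_approval_state(artifacts_dir, *, decision, feedback, request_id, revision):
    """Persist the current approval state so it can be inspected externally."""
    path = Path(artifacts_dir) / "approval_state.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    state = {
        "decision": decision,
        "feedback": feedback,
        "request_id": request_id,
        "revision": revision,
    }
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash mid-write never leaves a half-written JSON file behind.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh, indent=2)
    os.replace(tmp_name, path)
    return path


def load_approval_state(artifacts_dir):
    """Return the persisted state as a dict, or None if nothing was written yet."""
    path = Path(artifacts_dir) / "approval_state.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```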

Known follow-ups (not in this PR)

plan-review-v2 template not in hax-sdk yet. hax-sdk strictly validates type against its template registry and currently has no renderer for plan-review-v2. A SWE-AF build with HAX_API_KEY set will fail at client.create_request until that template ships in hax-sdk. Tracked separately.
First-responder-wins claim semantics. Workspace broadcast already works, but two reviewers can race a response. Adding atomic claimed_by/claimed_at to hax-sdk is a separate PR there.
github-buddy issue-comment polish. Surfacing the hax review URL on the originating GitHub issue while waiting is a small follow-up in github-buddy.
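To make the first-responder-wins follow-up concrete, here is one possible shape for an atomic claim, written in Python against an in-memory SQLite table purely for illustration. The table layout and the `make_store` / `try_claim` helpers are hypothetical; hax-sdk's real storage and API may look nothing like this. The point is only that a conditional update lets exactly one reviewer win.

```python
import sqlite3
from datetime import datetime, timezone


def make_store():
    """Create a throwaway in-memory table standing in for the request store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE requests (id TEXT PRIMARY KEY, claimed_by TEXT, claimed_at TEXT)"
    )
    return conn


def try_claim(conn, request_id, reviewer):
    """Claim a request for `reviewer`; return True only for the first claimant."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            # The WHERE clause makes the check and the write a single statement,
            # so a second reviewer's update matches zero rows.
            "UPDATE requests SET claimed_by = ?, claimed_at = ? "
            "WHERE id = ? AND claimed_by IS NULL",
            (reviewer, now, request_id),
        )
    return cur.rowcount == 1
```

With this shape, a losing reviewer gets a clean False back and can be shown who already claimed the request.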

Test plan

python3 -c 'from swe_af import app' — module loads without error
Local: with HAX_API_KEY unset, run a build, confirm Phase 1.5 logs HAX_API_KEY=NOT SET and skips the gate
Local: with HAX_API_KEY set and hax-sdk's plan-review-v2 template available, run a build, confirm app.pause suspends the execution and the request shows up in the hub
Local: approve via hub, confirm execution resumes into Phase 2
Local: request_changes, confirm Architect → Tech Lead → Sprint Planner re-run and a new approval request is posted
Local: reject, confirm BuildResult comes back with success=False and summary carries the rejection reason
Railway end-to-end on the test deployment

🤖 Generated with Claude Code

Both unlock the HITL plan-approval gate landing in a follow-up commit: hax-sdk is the human-approval client SWE-AF posts plans to, and python-dotenv lets containerised builds pick up HAX_API_KEY (and any future env-driven config) from a mounted .env without baking it into the image.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

max_plan_revision_iterations bounds the human "request changes" loop so a reviewer can't accidentally pin a build forever; default 2 mirrors the existing tech-lead review cap. approval_expires_in_hours is the wall-clock ceiling on the hax-sdk request, default 72h so reviews can span a weekend without the request expiring out from under the reviewer. Both fields are inert until the HITL block in app.py reads them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
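A minimal sketch of the two settings and the weekend arithmetic behind the 72h default. The `ApprovalConfig` class and its `expires_at` helper are hypothetical; only the field names and defaults come from the commit message, and the real config object in SWE-AF may be structured differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ApprovalConfig:
    # Field names and defaults as stated in the commit message above.
    max_plan_revision_iterations: int = 2
    approval_expires_in_hours: int = 72

    def expires_at(self, created_at: datetime) -> datetime:
        """Wall-clock deadline for a request created at `created_at`."""
        return created_at + timedelta(hours=self.approval_expires_in_hours)


cfg = ApprovalConfig()
# A request posted on Friday 2026-05-08 at 17:00 UTC stays open until
# Monday 2026-05-11 at 17:00 UTC, so it survives the weekend.
friday = datetime(2026, 5, 8, 17, 0, tzinfo=timezone.utc)
deadline = cfg.expires_at(friday)
```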

…s set

Insert a checkpoint between Phase 1 (plan + git_init) and Phase 2 (execute): when HAX_API_KEY is set in the environment, post the plan to hax-sdk via plan-review-v2 and call agentfield's app.pause(), which transitions the execution to "waiting" on the control plane and blocks until the reviewer responds via webhook. No callback server in-process: the CP owns the resume path, so the agent stays stateless and survives restarts.

On request_changes, re-run Architect → Tech Lead → Sprint Planner with the reviewer's feedback (skipping PM since the PRD/scope is fixed) and re-post for approval, bounded by cfg.max_plan_revision_iterations. On rejected / expired / error, return a failure BuildResult without executing.

Approval requests are workspace-broadcast by default; setting AGENTFIELD_APPROVAL_USER_ID assigns to a specific Hub user. State is persisted to .artifacts/approval_state.json so a crashed-and-restarted agent can be reasoned about.

Also wire load_dotenv() into both swe-planner and swe-fast entry points so HAX_API_KEY (and other env-driven config) is available before Agent() is constructed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
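The env-driven behaviour in this commit message can be sketched as a small pure function. It is illustrative only: `build_approval_request`, the `control_plane_url` argument and the `plan` key are hypothetical, and the dict is not claimed to match the real create_request signature. Only the template type, the webhook path, the userId field and the two env var names come from this PR.

```python
from typing import Mapping, Optional


def build_approval_request(
    env: Mapping[str, str],
    control_plane_url: str,
    plan_markdown: str,
) -> Optional[dict]:
    """Return request fields for the approval gate, or None if the gate is off."""
    if not env.get("HAX_API_KEY"):
        # Gate is disabled: the build goes straight from Phase 1 to Phase 2.
        return None

    payload = {
        "type": "plan-review-v2",
        # The webhook points at the control plane, not at the agent process.
        "webhook_url": control_plane_url.rstrip("/") + "/api/v1/webhooks/approval-response",
        "plan": plan_markdown,  # hypothetical field name for the plan body
    }
    user_id = env.get("AGENTFIELD_APPROVAL_USER_ID")
    if user_id:
        # Only set userId when an override is given; leaving it out keeps the
        # default workspace-broadcast behaviour.
        payload["userId"] = user_id
    return payload
```

Taking `env` as a mapping rather than reading os.environ directly makes the on/off and broadcast/targeted cases easy to exercise in a unit test.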

Replace the hand-listed ANTHROPIC_API_KEY / CLAUDE_CODE_OAUTH_TOKEN / GH_TOKEN passthroughs with env_file: .env. The HITL gate adds HAX_API_KEY (and optionally HAX_SDK_URL, AGENTFIELD_APPROVAL_USER_ID), and growing the hand-listed array every time a new env var is needed loses to just letting docker-compose load the whole .env.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

#68)

PR #66 added hax-sdk to runtime requirements.txt but left requirements-docker.txt unchanged. The Dockerfile installs from requirements-docker.txt, so the deployed image was missing hax-sdk and the HITL approval gate failed at runtime with "No module named 'hax'" whenever HAX_API_KEY was set (swe_af/app.py:622).

Also bumps agentfield floor to >=0.1.77 to match the runtime list (drift left over from PR #62) and adds python-dotenv>=1.0 which was likewise missing from the docker variant.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
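Drift of this kind between the two requirements files could be caught with a small check along these lines. This is a sketch, not something included in either PR: the helper names are hypothetical, the parser only handles simple `name<specifier>` lines, and the older agentfield floor in the usage example is made up for illustration.

```python
import re


def parse_requirements(text):
    """Map normalised package name -> version specifier for simple requirement lines."""
    reqs = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options like -r
            continue
        match = re.match(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$", line)
        if match:
            name = match.group(1).lower().replace("_", "-")
            reqs[name] = match.group(2).strip()
    return reqs


def requirements_drift(runtime_text, docker_text):
    """Return (missing_from_docker, mismatched_specifiers) as sorted name lists."""
    runtime = parse_requirements(runtime_text)
    docker = parse_requirements(docker_text)
    missing = sorted(set(runtime) - set(docker))
    mismatched = sorted(
        name for name in set(runtime) & set(docker) if runtime[name] != docker[name]
    )
    return missing, mismatched


# Illustrative inputs only; the docker-side agentfield floor is invented.
runtime_example = "agentfield>=0.1.77\nhax-sdk\npython-dotenv>=1.0\n"
docker_example = "agentfield>=0.1.70\n"
missing, mismatched = requirements_drift(runtime_example, docker_example)
```

Run in CI against requirements.txt and requirements-docker.txt, a non-empty result would flag the mismatch before an image is built.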

AbirAbbas and others added 4 commits May 7, 2026 12:37

AbirAbbas merged commit f807337 into main May 7, 2026
2 checks passed

AbirAbbas deleted the feat/hitl-rebased branch May 7, 2026 17:43

AbirAbbas mentioned this pull request May 7, 2026

fix(deps): mirror hax-sdk + python-dotenv into Docker requirements #68

Merged

3 tasks

AbirAbbas merged 4 commits into main from feat/hitl-rebased
