feat(approval): rebase HITL plan-approval gate (supersedes #50) #66

Merged: AbirAbbas merged 4 commits into main from feat/hitl-rebased on May 7, 2026
@AbirAbbas
Collaborator

Summary

Rebases the human-in-the-loop plan-approval gate from #50 onto current main. PR #50 was 34 commits behind and conflicting; rather than try to resolve those conflicts mechanically, this re-applies just the HITL bits cleanly on a fresh branch.

When HAX_API_KEY is set on the SWE-AF service, every build() pauses between Phase 1 (plan + git_init) and Phase 2 (execute) and posts the plan to hax-sdk for human review. The reviewer can approve, request changes (which re-runs Architect → Tech Lead → Sprint Planner with the feedback, bounded by cfg.max_plan_revision_iterations), or reject. Without HAX_API_KEY, behavior is unchanged.
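
The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in swe_af/app.py: `Decision`, `run_plan_gate`, `ask_reviewer`, and `replan` are hypothetical names, the real gate awaits app.pause() rather than calling a plain function, and failing the build once the revision cap is hit is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Tuple


@dataclass
class Decision:
    """Hypothetical reviewer response: a status plus optional feedback."""
    status: str  # e.g. "approved", "request_changes", "rejected", "expired"
    feedback: str = ""


def run_plan_gate(
    plan: str,
    ask_reviewer: Callable[[str], Decision],
    replan: Callable[[str, str], str],
    max_revisions: int,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[bool, str, str]:
    """Return (proceed_to_phase_2, final_plan, reason)."""
    env = os.environ if env is None else env
    # Gate is engaged only when HAX_API_KEY is set; otherwise behavior is unchanged.
    if not env.get("HAX_API_KEY"):
        return True, plan, "gate skipped"

    revisions = 0
    while True:
        decision = ask_reviewer(plan)
        if decision.status == "approved":
            return True, plan, "approved"
        if decision.status == "request_changes":
            # Assumption: hitting the cap ends the build as a failure.
            if revisions >= max_revisions:
                return False, plan, "revision limit reached"
            revisions += 1
            plan = replan(plan, decision.feedback)
            continue
        # rejected / expired / error: never execute the plan.
        return False, plan, decision.status
```

With `max_revisions=2`, a reviewer can send the plan back at most twice before the gate stops asking.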

What's different from #50

  • Cleanly rebased on main: preserves _run_ci_gate, permission_mode, env-driven runtime defaults, build-id workspace scoping, and all post-PR CI / resolve() machinery that landed after #50 (feat: HITL SDK integration) was cut.
  • Drops swe_af/approval.py (467 lines of dead POC code that wasn't imported anywhere — superseded by the inline app.pause() integration).
  • Keeps agentfield>=0.1.77 (don't downgrade — the HITL block depends on app.pause() which arrived in 0.1.77+).
  • Splits into 4 focused commits instead of three POC commits.

Architecture

  • Workspace-broadcast by default. Omitting userId on create_request sends to all active workspace members (existing hax-sdk behavior). AGENTFIELD_APPROVAL_USER_ID overrides to a specific Hub user.
  • Resume via control plane, not in-process callback. SWE-AF calls client.create_request(webhook_url=<CP>/api/v1/webhooks/approval-response) and then await app.pause(...). The CP transitions the execution to waiting, and when hax-sdk fires the webhook it resumes the paused future. No aiohttp callback server in the agent process — this means the agent stays stateless and the resume path survives container restarts.
  • State persistence. .artifacts/approval_state.json records the current decision/feedback/request_id/revision, so a crashed-and-restarted agent can be reasoned about externally.
  • Auto-enabled via env. No new flag: set HAX_API_KEY to engage, leave it unset to skip the gate. Matches the UX of #50 (feat: HITL SDK integration).
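
As a rough illustration of the broadcast-by-default rule and the control-plane webhook, the arguments for create_request might be assembled like this. `build_request_kwargs` is a hypothetical helper, and the `user_id` keyword name is an assumption; only `webhook_url` and the webhook path come from the description above.

```python
import os
from typing import Dict, Mapping, Optional

# Path taken from the PR description; the control-plane base URL is supplied by the caller.
WEBHOOK_PATH = "/api/v1/webhooks/approval-response"


def build_request_kwargs(
    control_plane_url: str,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Assemble keyword arguments for an approval request (illustrative only)."""
    env = os.environ if env is None else env
    kwargs = {"webhook_url": control_plane_url.rstrip("/") + WEBHOOK_PATH}

    # Leaving the user out means the request is broadcast to the workspace.
    user_id = env.get("AGENTFIELD_APPROVAL_USER_ID")
    if user_id:
        kwargs["user_id"] = user_id  # assumed keyword name, not confirmed by the source
    return kwargs
```

Because the webhook points at the control plane rather than at the agent, nothing in this payload ties the request to a particular agent process.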

Known follow-ups (not in this PR)

  1. plan-review-v2 template not in hax-sdk yet. hax-sdk strictly validates type against its template registry and currently has no renderer for plan-review-v2. A SWE-AF build with HAX_API_KEY set will fail at client.create_request until that template ships in hax-sdk. Tracked separately.
  2. First-responder-wins claim semantics. Workspace broadcast already works, but two reviewers can race a response. Adding atomic claimed_by/claimed_at to hax-sdk is a separate PR there.
  3. github-buddy issue-comment polish. Surfacing the hax review URL on the originating GitHub issue while waiting is a small follow-up in github-buddy.

Test plan

  • python3 -c 'from swe_af import app' — module loads without error
  • Local: with HAX_API_KEY unset, run a build, confirm Phase 1.5 logs HAX_API_KEY=NOT SET and skips the gate
  • Local: with HAX_API_KEY set and hax-sdk's plan-review-v2 template available, run a build, confirm app.pause suspends the execution and the request shows up in the hub
  • Local: approve via hub, confirm execution resumes into Phase 2
  • Local: request_changes, confirm Architect → Tech Lead → Sprint Planner re-run and a new approval request is posted
  • Local: reject, confirm BuildResult comes back with success=False and summary carries the rejection reason
  • Railway end-to-end on the test deployment

🤖 Generated with Claude Code

AbirAbbas and others added 4 commits May 7, 2026 12:37
Both unlock the HITL plan-approval gate landing in a follow-up commit:
hax-sdk is the human-approval client SWE-AF posts plans to, and python-dotenv
lets containerised builds pick up HAX_API_KEY (and any future env-driven
config) from a mounted .env without baking it into the image.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
max_plan_revision_iterations bounds the human "request changes" loop so a
reviewer can't accidentally pin a build forever; default 2 mirrors the
existing tech-lead review cap. approval_expires_in_hours is the wall-clock
ceiling on the hax-sdk request, default 72h so reviews can span a weekend
without the request expiring out from under the reviewer.

Both fields are inert until the HITL block in app.py reads them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
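
A minimal sketch of how the two fields and their defaults could be modelled. `ApprovalConfig` and `expires_at` are hypothetical names used for illustration; the actual config class in SWE-AF may be shaped differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ApprovalConfig:
    """Illustrative container for the two HITL settings and their defaults."""
    max_plan_revision_iterations: int = 2   # cap on "request changes" rounds
    approval_expires_in_hours: int = 72     # wall-clock ceiling on the request

    def __post_init__(self) -> None:
        if self.max_plan_revision_iterations < 0:
            raise ValueError("max_plan_revision_iterations must be >= 0")
        if self.approval_expires_in_hours <= 0:
            raise ValueError("approval_expires_in_hours must be > 0")

    def expires_at(self, created_at: datetime) -> datetime:
        """Time at which a request created at `created_at` would lapse."""
        return created_at + timedelta(hours=self.approval_expires_in_hours)
```

With the 72h default, a request posted on a Friday at 18:00 stays open until the following Monday at 18:00, which is the weekend-spanning case the commit message describes.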
…s set

Insert a checkpoint between Phase 1 (plan + git_init) and Phase 2 (execute):
when HAX_API_KEY is set in the environment, post the plan to hax-sdk via
plan-review-v2 and call agentfield's app.pause(), which transitions the
execution to "waiting" on the control plane and blocks until the reviewer
responds via webhook. No callback server in-process — the CP owns the
resume path, so the agent stays stateless and survives restarts.

On request_changes, re-run Architect → Tech Lead → Sprint Planner with the
reviewer's feedback (skipping PM since the PRD/scope is fixed) and re-post
for approval, bounded by cfg.max_plan_revision_iterations. On rejected /
expired / error, return a failure BuildResult without executing.

Approval requests are workspace-broadcast by default; setting
AGENTFIELD_APPROVAL_USER_ID assigns to a specific Hub user. State is
persisted to .artifacts/approval_state.json so a crashed-and-restarted
agent can be reasoned about.
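
One way the state file could be written so that a crash never leaves a half-written JSON document is a temp-file-plus-rename pattern. This is a sketch under assumptions: `save_approval_state` is a hypothetical helper, the key names mirror the fields listed in the PR description but are not confirmed, and the real code may simply write the file directly.

```python
import json
import os
import tempfile
from pathlib import Path


def save_approval_state(artifacts_dir, *, decision, feedback, request_id, revision):
    """Write approval_state.json atomically and return its path."""
    path = Path(artifacts_dir) / "approval_state.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    state = {
        "decision": decision,
        "feedback": feedback,
        "request_id": request_id,
        "revision": revision,
    }
    # Write to a sibling temp file, then rename over the target in one step.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh, indent=2)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return path
```

An operator inspecting a restarted agent can then read the file with `json.load` and see which request and revision the build was waiting on.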

Also wire load_dotenv() into both swe-planner and swe-fast entry points so
HAX_API_KEY (and other env-driven config) is available before Agent() is
constructed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the hand-listed ANTHROPIC_API_KEY / CLAUDE_CODE_OAUTH_TOKEN /
GH_TOKEN passthroughs with env_file: .env. The HITL gate adds HAX_API_KEY
(and optionally HAX_SDK_URL, AGENTFIELD_APPROVAL_USER_ID), and growing the
hand-listed array every time a new env var is needed loses to just letting
docker-compose load the whole .env.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@AbirAbbas AbirAbbas merged commit f807337 into main May 7, 2026
2 checks passed
@AbirAbbas AbirAbbas deleted the feat/hitl-rebased branch May 7, 2026 17:43
AbirAbbas added a commit that referenced this pull request May 7, 2026
#68)

PR #66 added hax-sdk to runtime requirements.txt but left
requirements-docker.txt unchanged. The Dockerfile installs from
requirements-docker.txt, so the deployed image was missing hax-sdk
and the HITL approval gate failed at runtime with "No module named
'hax'" whenever HAX_API_KEY was set (swe_af/app.py:622).

Also bumps agentfield floor to >=0.1.77 to match the runtime list
(drift left over from PR #62) and adds python-dotenv>=1.0 which
was likewise missing from the docker variant.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
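
The drift described in this commit (a package present in requirements.txt but absent from requirements-docker.txt) is the kind of thing a small check can catch before deploy. The sketch below is not part of the repository; `package_names` and `missing_from_docker` are hypothetical helpers, and the parsing only handles simple `name<specifier>` lines.

```python
import re


def package_names(requirements_text: str) -> set:
    """Extract normalized package names from simple requirements lines."""
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if not line or line.startswith("-"):   # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Treat runs of -, _ and . as equivalent and ignore case.
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def missing_from_docker(runtime_text: str, docker_text: str) -> list:
    """Packages listed for runtime but absent from the docker variant."""
    return sorted(package_names(runtime_text) - package_names(docker_text))
```

Run against the contents of the two files, a non-empty result would have flagged the missing hax-sdk and python-dotenv entries before the image was built.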