Commit 210ee6e

Merge pull request #11 from aws-samples/pr-iteration
feat(project): add task types for pr review and pr iteration
2 parents 538602a + c03e244 commit 210ee6e

53 files changed

Lines changed: 2833 additions & 197 deletions

Some content is hidden

AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -24,7 +24,7 @@ To get started and understand the developer flow, follow the [Developer guide](.
 - **`scripts/`** (root) — Optional cross-package helpers; **`scripts/ci-build.sh`** runs the full monorepo build (same as CI).
 - **`cdk/`** — CDK app package (`@abca/cdk`): `cdk/src/`, `cdk/test/`, `cdk/cdk.json`, `cdk/tsconfig.json`, `cdk/tsconfig.dev.json`, and `cdk/.eslintrc.json`.
 - **`cli/`** — `@backgroundagent/cli` — CLI tool for interacting with the deployed REST API (see below).
-- **`agent/`** — Python code that runs inside the agent compute environment (entrypoint, server, system prompt, Dockerfile, requirements).
+- **`agent/`** — Python code that runs inside the agent compute environment (entrypoint, server, system prompt, Dockerfile, requirements). The system prompt is refactored into `agent/prompts/` with a shared base template and per-task-type workflow variants (`new_task`, `pr_iteration`, `pr_review`).
 - **`docs/`** — Authoritative Markdown in `guides/` (developer, user, roadmap, prompt) and `design/`; assets in `diagrams/`, `imgs/`. The Starlight docs site lives here (`astro.config.mjs`, `package.json`); `src/content/docs/` is refreshed via `docs/scripts/sync-starlight.mjs`.
 - **`CONTRIBUTING.md`** — Contribution guidelines at the repository root.
 - **`package.json`** (root), **`yarn.lock`** — Yarn workspace root (minimal manifest); dependencies live in **`cdk/`**, **`cli/`**, and **`docs/`** package manifests.
@@ -40,7 +40,7 @@ The `@backgroundagent/cli` package provides the `bgagent` executable for submitt
 - `src/api-client.ts` — HTTP client wrapping `fetch` with auth header injection
 - `src/auth.ts` — Cognito login, token caching (`~/.bgagent/credentials.json`), auto-refresh
 - `src/config.ts` — Read/write `~/.bgagent/config.json`
-- `src/types.ts` — API request/response types (mirrored from `cdk/src/handlers/shared/types.ts`)
+- `src/types.ts` — API request/response types (mirrored from `cdk/src/handlers/shared/types.ts`), including `TaskType` (`new_task` | `pr_iteration` | `pr_review`)
 - `src/format.ts` — Output formatting (table, detail view, JSON)
 - `src/debug.ts` — Verbose/debug logging (`--verbose` flag)
 - `src/errors.ts` — `CliError` and `ApiError` classes

agent/Dockerfile

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ RUN uv sync --frozen --no-dev --directory /app
 # Copy agent code (ARG busts cache so file edits are always picked up)
 ARG CACHE_BUST=0
 COPY entrypoint.py system_prompt.py server.py task_state.py observability.py memory.py /app/
+COPY prompts/ /app/prompts/
 COPY prepare-commit-msg.sh /app/
 COPY test_sdk_smoke.py test_subprocess_threading.py /app/

agent/README.md

Lines changed: 22 additions & 9 deletions
@@ -1,6 +1,6 @@
 # Agent Runtime

-The agent runtime container for ABCA. Each agent instance clones a GitHub repo, works on a task using Claude, and opens a pull request. Runs as a Docker container with two modes:
+The agent runtime container for ABCA. Each agent instance clones a GitHub repo, works on a task using Claude, and delivers a result — a new pull request (`new_task`), updates to an existing PR (`pr_iteration`), or structured review comments on a PR (`pr_review`). Runs as a Docker container with two modes:

 - **Local mode** — batch execution via `run.sh` with AgentCore-matching constraints (2 vCPU, 8 GB RAM)
 - **AgentCore mode** — FastAPI server on port 8080 with `/invocations` and `/ping` endpoints, deployable to AWS Bedrock AgentCore Runtime
@@ -224,6 +224,12 @@ bgagent submit --repo owner/repo --task "update the rfc issue template"
 # Submit with a GitHub issue
 bgagent submit --repo owner/repo --issue 42

+# Iterate on a PR (address review feedback)
+bgagent submit --repo owner/repo --pr 42
+
+# Review a PR (read-only — posts structured review comments)
+bgagent submit --repo owner/repo --review-pr 55
+
 # Submit and wait for completion
 bgagent submit --repo owner/repo --issue 42 --wait
 ```
@@ -252,18 +258,18 @@ The `run.sh` script prints these commands when it starts.

 ## What It Does

-The agent pipeline (shared by both modes):
+The agent pipeline (shared by both modes). Behavior varies by task type (`new_task`, `pr_iteration`, `pr_review`):

 1. **Config validation** — checks required parameters
-2. **Context hydration** — fetches the GitHub issue (title, body, comments) if an issue number is provided
-3. **Prompt assembly** — combines the system prompt (behavioral contract) with the issue context and task description
-4. **Deterministic pre-hooks** — clones repo, creates branch, configures git auth, runs `mise trust`, `mise install`, `mise run build`, and `mise run lint`
+2. **Context hydration** — fetches the GitHub issue (title, body, comments) if an issue number is provided; for `pr_iteration` and `pr_review`, fetches PR context (diff, description, review comments)
+3. **Prompt assembly** — combines the system prompt (behavioral contract, selected by task type from `prompts/`) with the issue/PR context and task description
+4. **Deterministic pre-hooks** — clones repo, creates or checks out branch, configures git auth, runs `mise trust`, `mise install`, `mise run build`, and `mise run lint`
 5. **Agent execution** — invokes the Claude Agent SDK via the `ClaudeSDKClient` class (connect/query/receive_response pattern) in unattended mode. The agent:
    - Understands the codebase
-   - Makes changes, runs tests and linters
-   - Commits and pushes after each unit of work
-   - Creates a pull request with summary, testing notes, and decisions
-6. **Deterministic post-hooks** — verifies `mise run build` and `mise run lint`, ensures a PR exists (creates one if the agent did not)
+   - **`new_task`**: Makes changes, runs tests and linters, commits and pushes after each unit of work, creates a pull request
+   - **`pr_iteration`**: Reads review feedback, addresses it with focused changes, commits and pushes, posts a summary comment on the PR
+   - **`pr_review`**: Analyzes changes read-only (no `Write` or `Edit` tools available), composes structured review findings, posts a batch review via the GitHub Reviews API
+6. **Deterministic post-hooks** — verifies `mise run build` and `mise run lint`, ensures a PR exists (creates one if the agent did not). For `pr_review`, build status is informational only and the commit/push steps are skipped.
 7. **Metrics** — returns duration, disk usage, turn count, cost, and PR URL
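
The per-task-type differences in the steps above can be condensed into a small lookup table. The following is a sketch for orientation only: `TaskBehavior`, `BEHAVIOR`, and `has_edit_tools` are hypothetical names that do not appear in the repository, and the flags are a reading of this commit's README and `agent/entrypoint.py` changes rather than an authoritative specification. The tool lists mirror the `allowed_tools` change in `agent/entrypoint.py`.

```python
from dataclasses import dataclass

# Tool lists as they appear in the allowed_tools change in agent/entrypoint.py.
READ_ONLY_TOOLS = ("Bash", "Read", "Glob", "Grep", "WebFetch")
READ_WRITE_TOOLS = ("Bash", "Read", "Write", "Edit", "Glob", "Grep", "WebFetch")


@dataclass(frozen=True)
class TaskBehavior:
    """Hypothetical summary record (not a type from the repository)."""

    allowed_tools: tuple[str, ...]
    pushes_commits: bool      # post-hooks commit and push the work
    creates_pr: bool          # post-hooks may open a new pull request
    build_gates_result: bool  # a build regression counts against the task


BEHAVIOR = {
    "new_task": TaskBehavior(READ_WRITE_TOOLS, True, True, True),
    "pr_iteration": TaskBehavior(READ_WRITE_TOOLS, True, False, True),
    "pr_review": TaskBehavior(READ_ONLY_TOOLS, False, False, False),
}


def has_edit_tools(task_type: str) -> bool:
    """True when the agent is handed the Write or Edit tool."""
    tools = BEHAVIOR[task_type].allowed_tools
    return "Write" in tools or "Edit" in tools
```

Note that `Bash` stays in the `pr_review` tool list, so as far as this diff shows, the read-only property rests on the prompt and on skipping the commit/push post-hooks, not on the tool list alone.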

 ## Metrics
@@ -322,9 +328,16 @@ agent/
 ├── task_state.py  Best-effort DynamoDB task status (no-op if TASK_TABLE_NAME unset)
 ├── observability.py  OpenTelemetry helpers (e.g. AgentCore session id)
 ├── memory.py  Optional memory / episode integration for the agent
+├── prompts/  Per-task-type system prompt workflows
+│   ├── __init__.py  Prompt registry — assembles base template + workflow for each task type
+│   ├── base.py  Shared base template (environment, rules, placeholders)
+│   ├── new_task.py  Workflow for new_task (create branch, implement, open PR)
+│   ├── pr_iteration.py  Workflow for pr_iteration (read feedback, address, push)
+│   └── pr_review.py  Workflow for pr_review (read-only analysis, structured review comments)
 ├── system_prompt.py  Behavioral contract (PRD Section 11)
 ├── prepare-commit-msg.sh  Git hook (Task-Id / Prompt-Version trailers on commits)
 ├── run.sh  Build + run helper for local/server mode with AgentCore constraints
+├── tests/  pytest unit tests for pure functions and prompt assembly
 ├── test_sdk_smoke.py  Diagnostic: minimal SDK smoke test (ClaudeSDKClient → CLI → Bedrock)
 └── test_subprocess_threading.py  Diagnostic: subprocess-in-background-thread verification
 ```
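
The contents of `agent/prompts/` are not visible in this view, so the following is only a guess at the shape of a registry that "assembles base template + workflow for each task type". Everything here is an assumption made for illustration: the template text, the `{workflow}` slot, and the names `BASE_TEMPLATE`, `WORKFLOWS`, and `get_system_prompt_sketch`. The one grounded detail is that the real `get_system_prompt` raises `ValueError` for an unknown task type, since `agent/entrypoint.py` catches exactly that.

```python
# Hypothetical base template; the real one lives in agent/prompts/base.py.
BASE_TEMPLATE = (
    "You are working on {repo_url} for task {task_id}.\n"
    "Workflow:\n"
    "{workflow}\n"
)

# Hypothetical per-task-type workflow snippets.
WORKFLOWS = {
    "new_task": "Create a branch, implement the change, open a pull request.",
    "pr_iteration": "Read feedback on PR #{pr_number}, address it, push.",
    "pr_review": "Analyze PR #{pr_number} read-only and post review comments.",
}


def get_system_prompt_sketch(task_type: str) -> str:
    """Return the base template with the workflow for task_type filled in.

    Other placeholders ({repo_url}, {task_id}, {pr_number}) are left in place
    so the caller can substitute them later with str.replace.
    """
    try:
        workflow = WORKFLOWS[task_type]
    except KeyError:
        raise ValueError(f"unknown task_type: {task_type!r}") from None
    return BASE_TEMPLATE.replace("{workflow}", workflow)
```

Using `str.replace` instead of `str.format` keeps the remaining placeholders untouched, which matches the substitution style used in `_build_system_prompt`.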

agent/entrypoint.py

Lines changed: 127 additions & 20 deletions
@@ -31,6 +31,7 @@
 import memory as agent_memory
 import task_state
 from observability import task_span
+from prompts import get_system_prompt
 from system_prompt import SYSTEM_PROMPT

 # ---------------------------------------------------------------------------
@@ -39,6 +40,9 @@

 AGENT_WORKSPACE = os.environ.get("AGENT_WORKSPACE", "/workspace")

+# Task types that operate on an existing pull request.
+PR_TASK_TYPES = frozenset(("pr_iteration", "pr_review"))
+

 def resolve_github_token() -> str:
     """Resolve GitHub token from Secrets Manager or environment variable.
@@ -77,6 +81,9 @@ def build_config(
     dry_run: bool = False,
     task_id: str = "",
     system_prompt_overrides: str = "",
+    task_type: str = "new_task",
+    branch_name: str = "",
+    pr_number: str = "",
 ) -> dict:
     """Build and validate configuration from explicit parameters.

@@ -94,6 +101,9 @@ def build_config(
         "max_turns": max_turns,
         "max_budget_usd": max_budget_usd,
         "system_prompt_overrides": system_prompt_overrides,
+        "task_type": task_type,
+        "branch_name": branch_name,
+        "pr_number": pr_number,
     }

     errors = []
@@ -103,7 +113,10 @@ def build_config(
         errors.append("github_token is required")
     if not config["aws_region"]:
         errors.append("aws_region is required for Bedrock")
-    if not config["issue_number"] and not config["task_description"]:
+    if config["task_type"] in PR_TASK_TYPES:
+        if not config["pr_number"]:
+            errors.append("pr_number is required for pr_iteration/pr_review task type")
+    elif not config["issue_number"] and not config["task_description"]:
         errors.append("Either issue_number or task_description is required")

     if errors:
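
The validation rule introduced by this hunk can be exercised in isolation. The sketch below lifts the branch into a standalone function; `validate_task_inputs` is a hypothetical name, and the other checks in `build_config` (`github_token`, `aws_region`, and any not shown in the diff) are deliberately left out.

```python
PR_TASK_TYPES = frozenset(("pr_iteration", "pr_review"))


def validate_task_inputs(
    task_type: str = "new_task",
    issue_number: str = "",
    task_description: str = "",
    pr_number: str = "",
) -> list[str]:
    """Return the task-input errors the hunk above would collect."""
    errors = []
    if task_type in PR_TASK_TYPES:
        # PR tasks need a PR number; issue/description are optional.
        if not pr_number:
            errors.append("pr_number is required for pr_iteration/pr_review task type")
    elif not issue_number and not task_description:
        errors.append("Either issue_number or task_description is required")
    return errors


# Usage: a PR review without a PR number is rejected,
# while a PR iteration needs neither an issue nor a description.
print(validate_task_inputs("pr_review"))
print(validate_task_inputs("pr_iteration", pr_number="42"))
```

One consequence worth noting: an unrecognized `task_type` falls into the `elif` branch and is validated like `new_task`. As far as the visible hunks show, it is only flagged later, when `_build_system_prompt` logs an error and falls back to the default prompt.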
@@ -303,15 +316,19 @@ def setup_repo(config: dict) -> dict:
     repo_dir = f"{AGENT_WORKSPACE}/{config['task_id']}"
     setup: dict[str, str | list[str] | bool] = {"repo_dir": repo_dir, "notes": []}

-    # Derive branch slug from issue title or task description
-    title = ""
-    if config.get("issue"):
-        title = config["issue"]["title"]
-    if not title:
-        title = config["task_description"]
-    slug = slugify(title)
-    branch = f"bgagent/{config['task_id']}/{slug}"
-    setup["branch"] = branch
+    if config.get("task_type") in PR_TASK_TYPES and config.get("branch_name"):
+        branch = config["branch_name"]
+        setup["branch"] = branch
+    else:
+        # Derive branch slug from issue title or task description
+        title = ""
+        if config.get("issue"):
+            title = config["issue"]["title"]
+        if not title:
+            title = config["task_description"]
+        slug = slugify(title)
+        branch = f"bgagent/{config['task_id']}/{slug}"
+        setup["branch"] = branch

     # Mark the repo directory as safe for git. On persistent session storage
     # the mount may be owned by a different UID than the container user,
@@ -343,9 +360,22 @@ def setup_repo(config: dict) -> dict:
         cwd=repo_dir,
     )

-    # Create branch
-    log("SETUP", f"Creating branch: {branch}")
-    run_cmd(["git", "checkout", "-b", branch], label="create-branch", cwd=repo_dir)
+    # Branch setup
+    if config.get("task_type") in PR_TASK_TYPES and config.get("branch_name"):
+        log("SETUP", f"Checking out existing PR branch: {branch}")
+        run_cmd(
+            ["git", "fetch", "origin", branch],
+            label="fetch-pr-branch",
+            cwd=repo_dir,
+        )
+        run_cmd(
+            ["git", "checkout", "-b", branch, f"origin/{branch}"],
+            label="checkout-pr-branch",
+            cwd=repo_dir,
+        )
+    else:
+        log("SETUP", f"Creating branch: {branch}")
+        run_cmd(["git", "checkout", "-b", branch], label="create-branch", cwd=repo_dir)

     # Trust mise config files in the cloned repo (required before mise install)
     run_cmd(
@@ -402,7 +432,11 @@ def setup_repo(config: dict) -> dict:
         setup["lint_before"] = True

     # Detect default branch
-    setup["default_branch"] = detect_default_branch(config["repo_url"], repo_dir)
+    # For PR tasks (pr_iteration, pr_review): use base_branch from orchestrator if available
+    if config.get("task_type") in PR_TASK_TYPES and config.get("base_branch"):
+        setup["default_branch"] = config["base_branch"]
+    else:
+        setup["default_branch"] = detect_default_branch(config["repo_url"], repo_dir)

     # Install prepare-commit-msg hook for code attribution
     _install_commit_hook(repo_dir)
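
The branch handling that `setup_repo` gains in this commit comes down to two decisions: which branch name to use, and which git commands set it up. The sketch below separates them into pure functions so they can be checked without a clone. `choose_branch`, `branch_commands`, and `slugify_sketch` are hypothetical names; in particular `slugify_sketch` is a stand-in, because the real `slugify` is not shown in the diff.

```python
import re

PR_TASK_TYPES = frozenset(("pr_iteration", "pr_review"))


def slugify_sketch(text: str, max_len: int = 40) -> str:
    """Stand-in for the repository's slugify (its real rules are not shown)."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "task"


def choose_branch(config: dict) -> tuple[str, bool]:
    """Return (branch, is_existing_pr_branch), mirroring the if/else in the diff."""
    if config.get("task_type") in PR_TASK_TYPES and config.get("branch_name"):
        return config["branch_name"], True
    title = (config.get("issue") or {}).get("title") or config.get("task_description", "")
    return f"bgagent/{config['task_id']}/{slugify_sketch(title)}", False


def branch_commands(branch: str, is_existing: bool) -> list[list[str]]:
    """Git commands for the chosen branch, as issued in the diff."""
    if is_existing:
        return [
            ["git", "fetch", "origin", branch],
            ["git", "checkout", "-b", branch, f"origin/{branch}"],
        ]
    return [["git", "checkout", "-b", branch]]
```

Both conditions in the diff test `branch_name` as well as the task type, so a PR task that arrives without `branch_name` takes the `else` path and gets a freshly derived `bgagent/...` branch instead of the PR's branch.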
@@ -620,6 +654,10 @@ def ensure_pr(
 ) -> str | None:
     """Check if a PR exists for the branch; if not, create one.

+    For ``new_task``: creates a new PR if needed.
+    For ``pr_iteration``: pushes commits, then resolves the existing PR URL.
+    For ``pr_review``: resolves the existing PR URL without pushing (read-only).
+
     Returns the PR URL, or None if there are no commits beyond the default
     branch or PR creation failed. ``build_passed`` and ``lint_passed`` control
     the verification status shown in the PR body.
@@ -628,6 +666,40 @@ def ensure_pr(
     branch = setup["branch"]
     default_branch = setup.get("default_branch", "main")

+    # PR iteration/review: skip PR creation — just resolve existing PR URL
+    if config.get("task_type") in PR_TASK_TYPES:
+        if config.get("task_type") == "pr_iteration":
+            if not ensure_pushed(repo_dir, branch):
+                log("WARN", "Failed to push commits before resolving PR URL")
+        else:
+            log("POST", "pr_review task — skipping push (read-only)")
+        log("POST", f"{config.get('task_type')} — returning existing PR URL")
+        result = subprocess.run(
+            [
+                "gh",
+                "pr",
+                "view",
+                branch,
+                "--repo",
+                config["repo_url"],
+                "--json",
+                "url",
+                "-q",
+                ".url",
+            ],
+            cwd=repo_dir,
+            capture_output=True,
+            text=True,
+            timeout=60,
+        )
+        if result.returncode == 0 and result.stdout.strip():
+            pr_url = result.stdout.strip()
+            log("POST", f"Existing PR: {pr_url}")
+            return pr_url
+        stderr_msg = result.stderr.strip() if result.stderr else "(no stderr)"
+        log("WARN", f"Could not resolve existing PR URL (rc={result.returncode}): {stderr_msg}")
+        return None
+
     # Check if the agent already created a PR for this branch
     log("POST", "Checking for existing PR...")
     result = subprocess.run(
@@ -1274,10 +1346,15 @@ def _on_stderr(line: str) -> None:
     else:
         log("WARN", "claude CLI not found on PATH")

+    if config.get("task_type") == "pr_review":
+        allowed_tools = ["Bash", "Read", "Glob", "Grep", "WebFetch"]
+    else:
+        allowed_tools = ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "WebFetch"]
+
     options = ClaudeAgentOptions(
         model=config["anthropic_model"],
         system_prompt=system_prompt,
-        allowed_tools=["Bash", "Read", "Write", "Edit", "Glob", "Grep", "WebFetch"],
+        allowed_tools=allowed_tools,
         permission_mode="bypassPermissions",
         cwd=cwd,
         max_turns=config["max_turns"],
@@ -1482,7 +1559,13 @@ def _build_system_prompt(
     overrides: str,
 ) -> str:
     """Assemble the system prompt with task-specific values and memory context."""
-    system_prompt = SYSTEM_PROMPT.replace("{repo_url}", config["repo_url"])
+    task_type = config.get("task_type", "new_task")
+    try:
+        system_prompt = get_system_prompt(task_type)
+    except ValueError:
+        log("ERROR", f"Unknown task_type {task_type!r} — falling back to default system prompt")
+        system_prompt = SYSTEM_PROMPT
+    system_prompt = system_prompt.replace("{repo_url}", config["repo_url"])
     system_prompt = system_prompt.replace("{task_id}", config["task_id"])
     system_prompt = system_prompt.replace("{workspace}", AGENT_WORKSPACE)
     system_prompt = system_prompt.replace("{branch_name}", setup["branch"])
@@ -1513,6 +1596,14 @@ def _build_system_prompt(
         memory_context_text = "\n".join(mc_parts)
     system_prompt = system_prompt.replace("{memory_context}", memory_context_text)

+    # Substitute PR-specific placeholders
+    pr_number_val = config.get("pr_number", "")
+    if pr_number_val:
+        system_prompt = system_prompt.replace("{pr_number}", str(pr_number_val))
+    elif "{pr_number}" in system_prompt:
+        log("WARN", "System prompt contains {pr_number} placeholder but no pr_number in config")
+        system_prompt = system_prompt.replace("{pr_number}", "(unknown)")
+
     # Append Blueprint system_prompt_overrides after all placeholder
     # substitutions (avoids double-substitution if overrides contain
     # template placeholders like {repo_url}).
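
The `{pr_number}` handling above follows a "substitute, or warn and use a visible fallback" pattern. A minimal sketch of that pattern, generalized to any set of placeholders, is shown below. The generalization is an assumption: in the diff only `{pr_number}` gets the fallback treatment, and `substitute_placeholders` is a hypothetical helper, not a function in `agent/entrypoint.py`.

```python
def substitute_placeholders(
    template: str,
    values: dict[str, str],
    fallback: str = "(unknown)",
) -> tuple[str, list[str]]:
    """Replace {name} tokens; return the text and the names that had no value.

    A name with an empty value is only reported as missing if its token
    actually occurs in the template, matching the elif branch in the diff.
    """
    missing = []
    result = template
    for name, value in values.items():
        token = "{" + name + "}"
        if value:
            result = result.replace(token, str(value))
        elif token in result:
            missing.append(name)
            result = result.replace(token, fallback)
    return result, missing


# Usage: the caller decides how to report the missing names (the diff logs a WARN).
text, missing = substitute_placeholders(
    "Review PR #{pr_number} in {repo_url}",
    {"pr_number": "", "repo_url": "owner/repo"},
)
print(text)     # Review PR #(unknown) in owner/repo
print(missing)  # ['pr_number']
```

Plain `str.replace` leaves any other braces in the prompt alone, and appending the overrides after this step, as the comment in the diff explains, keeps tokens inside the overrides from being substituted a second time.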
@@ -1628,6 +1719,9 @@ def run_task(
     system_prompt_overrides: str = "",
     prompt_version: str = "",
     memory_id: str = "",
+    task_type: str = "new_task",
+    branch_name: str = "",
+    pr_number: str = "",
 ) -> dict:
     """Run the full agent pipeline and return a result dict.

@@ -1652,6 +1746,9 @@ def run_task(
         aws_region=aws_region,
         task_id=task_id,
         system_prompt_overrides=system_prompt_overrides,
+        task_type=task_type,
+        branch_name=branch_name,
+        pr_number=pr_number,
     )

     log("TASK", f"Task ID: {config['task_id']}")
@@ -1678,6 +1775,8 @@ def run_task(
         prompt = hydrated_context["user_prompt"]
         if hydrated_context.get("issue"):
             config["issue"] = hydrated_context["issue"]
+        if hydrated_context.get("resolved_base_branch"):
+            config["base_branch"] = hydrated_context["resolved_base_branch"]
         if hydrated_context.get("truncated"):
             log("WARN", "Context was truncated by orchestrator token budget")
     else:
@@ -1765,8 +1864,11 @@ def run_task(

     # Post-hooks
     with task_span("task.post_hooks") as post_span:
-        # Safety net: commit any uncommitted tracked changes
-        safety_committed = ensure_committed(setup["repo_dir"])
+        # Safety net: commit any uncommitted tracked changes (skip for read-only tasks)
+        if config.get("task_type") == "pr_review":
+            safety_committed = False
+        else:
+            safety_committed = ensure_committed(setup["repo_dir"])
         post_span.set_attribute("safety_net.committed", safety_committed)

         build_passed = verify_build(setup["repo_dir"])
@@ -1810,8 +1912,13 @@ def run_task(
     # Default True = assume build was green before, so a post-agent
     # failure IS counted as a regression (conservative).
     build_before = setup.get("build_before", True)
-    build_ok = build_passed or not build_before
-    if not build_passed and not build_before:
+    if config.get("task_type") == "pr_review":
+        build_ok = True  # Review task — build status is informational only
+        if not build_passed:
+            log("INFO", "pr_review: build failed — informational only, not gating")
+    else:
+        build_ok = build_passed or not build_before
+    if not build_passed and not build_before and config.get("task_type") != "pr_review":
         log(
             "WARN",
             "Post-agent build failed, but build was already failing before "

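The build-gating rule in the final hunk is small enough to check as a truth table. The sketch below restates it as a pure function; `is_build_ok` is a hypothetical name, and the logging from the diff is omitted.

```python
def is_build_ok(task_type: str, build_passed: bool, build_before: bool = True) -> bool:
    """Restate the gating rule from the last hunk of agent/entrypoint.py.

    build_before defaults to True, as in the diff: if the baseline is unknown,
    a failing post-agent build is treated as a regression.
    """
    if task_type == "pr_review":
        # Review tasks never change code, so the build result is informational.
        return True
    # Otherwise a failure only counts if the build was green before the agent ran.
    return build_passed or not build_before


for task_type in ("new_task", "pr_iteration", "pr_review"):
    for passed in (True, False):
        for before in (True, False):
            print(task_type, passed, before, is_build_ok(task_type, passed, before))
```

For `new_task` and `pr_iteration` the only failing combination is a build that passed before the agent ran and fails afterwards; for `pr_review` every combination is accepted.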