diff --git a/AGENTS.md b/AGENTS.md
Lines changed: 2 additions & 2 deletions
diff --git a/agent/Dockerfile b/agent/Dockerfile
Lines changed: 1 addition & 0 deletions
diff --git a/agent/README.md b/agent/README.md
Lines changed: 22 additions & 9 deletions
diff --git a/agent/entrypoint.py b/agent/entrypoint.py
Lines changed: 127 additions & 20 deletions
@@ -24,7 +24,7 @@ To get started and understand the developer flow, follow the [Developer guide](.
 - **`scripts/`** (root) — Optional cross-package helpers; **`scripts/ci-build.sh`** runs the full monorepo build (same as CI).
 - **`cdk/`** — CDK app package (`@abca/cdk`): `cdk/src/`, `cdk/test/`, `cdk/cdk.json`, `cdk/tsconfig.json`, `cdk/tsconfig.dev.json`, and `cdk/.eslintrc.json`.
 - **`cli/`** — `@backgroundagent/cli` — CLI tool for interacting with the deployed REST API (see below).
-- **`agent/`** — Python code that runs inside the agent compute environment (entrypoint, server, system prompt, Dockerfile, requirements).
+- **`agent/`** — Python code that runs inside the agent compute environment (entrypoint, server, system prompt, Dockerfile, requirements). The system prompt lives in `agent/prompts/`, split into a shared base template and per-task-type workflow variants (`new_task`, `pr_iteration`, `pr_review`).
 - **`docs/`** — Authoritative Markdown in `guides/` (developer, user, roadmap, prompt) and `design/`; assets in `diagrams/`, `imgs/`. The Starlight docs site lives here (`astro.config.mjs`, `package.json`); `src/content/docs/` is refreshed via `docs/scripts/sync-starlight.mjs`.
 - **`CONTRIBUTING.md`** — Contribution guidelines at the repository root.
 - **`package.json`** (root), **`yarn.lock`** — Yarn workspace root (minimal manifest); dependencies live in **`cdk/`**, **`cli/`**, and **`docs/`** package manifests.
@@ -40,7 +40,7 @@ The `@backgroundagent/cli` package provides the `bgagent` executable for submitt
 - `src/api-client.ts` — HTTP client wrapping `fetch` with auth header injection
 - `src/auth.ts` — Cognito login, token caching (`~/.bgagent/credentials.json`), auto-refresh
 - `src/config.ts` — Read/write `~/.bgagent/config.json`
-- `src/types.ts` — API request/response types (mirrored from `cdk/src/handlers/shared/types.ts`)
+- `src/types.ts` — API request/response types (mirrored from `cdk/src/handlers/shared/types.ts`), including `TaskType` (`new_task` | `pr_iteration` | `pr_review`)
 - `src/format.ts` — Output formatting (table, detail view, JSON)
 - `src/debug.ts` — Verbose/debug logging (`--verbose` flag)
 - `src/errors.ts` — `CliError` and `ApiError` classes
 
@@ -51,6 +51,7 @@ RUN uv sync --frozen --no-dev --directory /app
 # Copy agent code (ARG busts cache so file edits are always picked up)
 ARG CACHE_BUST=0
 COPY entrypoint.py system_prompt.py server.py task_state.py observability.py memory.py /app/
+COPY prompts/ /app/prompts/
 COPY prepare-commit-msg.sh /app/
 COPY test_sdk_smoke.py test_subprocess_threading.py /app/
 
 
@@ -1,6 +1,6 @@
 # Agent Runtime
 
-The agent runtime container for ABCA. Each agent instance clones a GitHub repo, works on a task using Claude, and opens a pull request. Runs as a Docker container with two modes:
+The agent runtime container for ABCA. Each agent instance clones a GitHub repo, works on a task using Claude, and delivers a result — a new pull request (`new_task`), updates to an existing PR (`pr_iteration`), or structured review comments on a PR (`pr_review`). Runs as a Docker container with two modes:
 
 - **Local mode** — batch execution via `run.sh` with AgentCore-matching constraints (2 vCPU, 8 GB RAM)
 - **AgentCore mode** — FastAPI server on port 8080 with `/invocations` and `/ping` endpoints, deployable to AWS Bedrock AgentCore Runtime
@@ -224,6 +224,12 @@ bgagent submit --repo owner/repo --task "update the rfc issue template"
 # Submit with a GitHub issue
 bgagent submit --repo owner/repo --issue 42
 
+# Iterate on a PR (address review feedback)
+bgagent submit --repo owner/repo --pr 42
+
+# Review a PR (read-only — posts structured review comments)
+bgagent submit --repo owner/repo --review-pr 55
+
 # Submit and wait for completion
 bgagent submit --repo owner/repo --issue 42 --wait
 ```
@@ -252,18 +258,18 @@ The `run.sh` script prints these commands when it starts.
 
 ## What It Does
 
-The agent pipeline (shared by both modes):
+The agent pipeline is shared by both modes; behavior varies by task type (`new_task`, `pr_iteration`, `pr_review`):
 
 1. **Config validation** — checks required parameters
-2. **Context hydration** — fetches the GitHub issue (title, body, comments) if an issue number is provided
-3. **Prompt assembly** — combines the system prompt (behavioral contract) with the issue context and task description
-4. **Deterministic pre-hooks** — clones repo, creates branch, configures git auth, runs `mise trust`, `mise install`, `mise run build`, and `mise run lint`
+2. **Context hydration** — fetches the GitHub issue (title, body, comments) if an issue number is provided; for `pr_iteration` and `pr_review`, fetches PR context (diff, description, review comments)
+3. **Prompt assembly** — combines the system prompt (behavioral contract, selected by task type from `prompts/`) with the issue/PR context and task description
+4. **Deterministic pre-hooks** — clones repo, creates or checks out branch, configures git auth, runs `mise trust`, `mise install`, `mise run build`, and `mise run lint`
 5. **Agent execution** — invokes the Claude Agent SDK via the `ClaudeSDKClient` class (connect/query/receive_response pattern) in unattended mode. The agent:
    - Understands the codebase
-   - Makes changes, runs tests and linters
-   - Commits and pushes after each unit of work
-   - Creates a pull request with summary, testing notes, and decisions
-6. **Deterministic post-hooks** — verifies `mise run build` and `mise run lint`, ensures a PR exists (creates one if the agent did not)
+   - **`new_task`**: Makes changes, runs tests and linters, commits and pushes after each unit of work, creates a pull request
+   - **`pr_iteration`**: Reads review feedback, addresses it with focused changes, commits and pushes, posts a summary comment on the PR
+   - **`pr_review`**: Analyzes changes read-only (no `Write` or `Edit` tools available), composes structured review findings, posts a batch review via the GitHub Reviews API
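+6. **Deterministic post-hooks** — verifies `mise run build` and `mise run lint`. For `new_task`, ensures a PR exists (creates one if the agent did not). For `pr_iteration`, pushes any remaining commits and resolves the existing PR URL instead of creating a PR. For `pr_review`, resolves the existing PR URL only: build status is informational and the commit/push steps are skipped.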
 7. **Metrics** — returns duration, disk usage, turn count, cost, and PR URL
 
 ## Metrics
@@ -322,9 +328,16 @@ agent/
 ├── task_state.py        Best-effort DynamoDB task status (no-op if TASK_TABLE_NAME unset)
 ├── observability.py     OpenTelemetry helpers (e.g. AgentCore session id)
 ├── memory.py            Optional memory / episode integration for the agent
+├── prompts/             Per-task-type system prompt workflows
+│   ├── __init__.py      Prompt registry — assembles base template + workflow for each task type
+│   ├── base.py          Shared base template (environment, rules, placeholders)
+│   ├── new_task.py      Workflow for new_task (create branch, implement, open PR)
+│   ├── pr_iteration.py  Workflow for pr_iteration (read feedback, address, push)
+│   └── pr_review.py     Workflow for pr_review (read-only analysis, structured review comments)
 ├── system_prompt.py     Behavioral contract (PRD Section 11)
 ├── prepare-commit-msg.sh Git hook (Task-Id / Prompt-Version trailers on commits)
 ├── run.sh               Build + run helper for local/server mode with AgentCore constraints
+├── tests/               pytest unit tests for pure functions and prompt assembly
 ├── test_sdk_smoke.py    Diagnostic: minimal SDK smoke test (ClaudeSDKClient → CLI → Bedrock)
 └── test_subprocess_threading.py  Diagnostic: subprocess-in-background-thread verification
 ```
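
The `prompts/` layout above describes a registry that joins a shared base template with a per-task-type workflow. The diff does not show `prompts/__init__.py`, so the following is only a minimal sketch of how such a registry might look. The template text, the `{workflow}` placeholder, and the `WORKFLOWS` mapping are assumptions for illustration; only the `get_system_prompt` name and the `ValueError` on an unknown task type come from the `entrypoint.py` changes.

```python
# Hypothetical stand-in for prompts/base.py (the real template is not shown in the diff).
BASE_TEMPLATE = (
    "You are working on {repo_url} (task {task_id}).\n"
    "{workflow}"
)

# Hypothetical stand-ins for prompts/new_task.py, pr_iteration.py and pr_review.py.
WORKFLOWS = {
    "new_task": "Create a branch, implement the change, and open a pull request.",
    "pr_iteration": "Read the review feedback on PR #{pr_number}, address it, and push.",
    "pr_review": "Analyze PR #{pr_number} read-only and post structured review comments.",
}


def get_system_prompt(task_type: str) -> str:
    """Return the base template with the workflow for task_type spliced in."""
    try:
        workflow = WORKFLOWS[task_type]
    except KeyError:
        # entrypoint.py catches ValueError and falls back to SYSTEM_PROMPT.
        raise ValueError(f"Unknown task_type: {task_type!r}") from None
    # str.replace (rather than str.format) leaves the remaining {placeholders}
    # intact so that entrypoint.py can substitute them later.
    return BASE_TEMPLATE.replace("{workflow}", workflow)
```

Using `str.replace` here mirrors the substitution style already used in `_build_system_prompt`, where placeholders such as `{repo_url}` and `{pr_number}` are filled in one at a time.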
 
@@ -31,6 +31,7 @@
 import memory as agent_memory
 import task_state
 from observability import task_span
+from prompts import get_system_prompt
 from system_prompt import SYSTEM_PROMPT
 
 # ---------------------------------------------------------------------------
@@ -39,6 +40,9 @@
 
 AGENT_WORKSPACE = os.environ.get("AGENT_WORKSPACE", "/workspace")
 
+# Task types that operate on an existing pull request.
+PR_TASK_TYPES = frozenset(("pr_iteration", "pr_review"))
+
 
 def resolve_github_token() -> str:
     """Resolve GitHub token from Secrets Manager or environment variable.
@@ -77,6 +81,9 @@ def build_config(
     dry_run: bool = False,
     task_id: str = "",
     system_prompt_overrides: str = "",
+    task_type: str = "new_task",
+    branch_name: str = "",
+    pr_number: str = "",
 ) -> dict:
     """Build and validate configuration from explicit parameters.
 
@@ -94,6 +101,9 @@ def build_config(
         "max_turns": max_turns,
         "max_budget_usd": max_budget_usd,
         "system_prompt_overrides": system_prompt_overrides,
+        "task_type": task_type,
+        "branch_name": branch_name,
+        "pr_number": pr_number,
     }
 
     errors = []
@@ -103,7 +113,10 @@ def build_config(
         errors.append("github_token is required")
     if not config["aws_region"]:
         errors.append("aws_region is required for Bedrock")
-    if not config["issue_number"] and not config["task_description"]:
+    if config["task_type"] in PR_TASK_TYPES:
+        if not config["pr_number"]:
+            errors.append("pr_number is required for pr_iteration/pr_review task type")
+    elif not config["issue_number"] and not config["task_description"]:
         errors.append("Either issue_number or task_description is required")
 
     if errors:
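
The validation rule in this hunk can be read in isolation as a small pure function. The sketch below restates it outside `build_config`. The function name `validate_task_inputs` and its signature are hypothetical, while the constant and the two error messages are copied from the hunk.

```python
PR_TASK_TYPES = frozenset(("pr_iteration", "pr_review"))


def validate_task_inputs(
    task_type: str,
    issue_number: str = "",
    task_description: str = "",
    pr_number: str = "",
) -> list[str]:
    """Return the task-input errors that build_config would collect."""
    errors = []
    if task_type in PR_TASK_TYPES:
        # PR tasks need a PR number; an issue or description is optional.
        if not pr_number:
            errors.append("pr_number is required for pr_iteration/pr_review task type")
    elif not issue_number and not task_description:
        # new_task keeps the original rule: one of the two must be present.
        errors.append("Either issue_number or task_description is required")
    return errors
```

Note that a PR task with a `pr_number` but no description passes validation, which matches the `bgagent submit --repo owner/repo --pr 42` example in the README changes.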
@@ -303,15 +316,19 @@ def setup_repo(config: dict) -> dict:
     repo_dir = f"{AGENT_WORKSPACE}/{config['task_id']}"
     setup: dict[str, str | list[str] | bool] = {"repo_dir": repo_dir, "notes": []}
 
-    # Derive branch slug from issue title or task description
-    title = ""
-    if config.get("issue"):
-        title = config["issue"]["title"]
-    if not title:
-        title = config["task_description"]
-    slug = slugify(title)
-    branch = f"bgagent/{config['task_id']}/{slug}"
-    setup["branch"] = branch
+    if config.get("task_type") in PR_TASK_TYPES and config.get("branch_name"):
+        branch = config["branch_name"]
+        setup["branch"] = branch
+    else:
+        # Derive branch slug from issue title or task description
+        title = ""
+        if config.get("issue"):
+            title = config["issue"]["title"]
+        if not title:
+            title = config["task_description"]
+        slug = slugify(title)
+        branch = f"bgagent/{config['task_id']}/{slug}"
+        setup["branch"] = branch
 
     # Mark the repo directory as safe for git.  On persistent session storage
     # the mount may be owned by a different UID than the container user,
@@ -343,9 +360,22 @@ def setup_repo(config: dict) -> dict:
         cwd=repo_dir,
     )
 
-    # Create branch
-    log("SETUP", f"Creating branch: {branch}")
-    run_cmd(["git", "checkout", "-b", branch], label="create-branch", cwd=repo_dir)
+    # Branch setup
+    if config.get("task_type") in PR_TASK_TYPES and config.get("branch_name"):
+        log("SETUP", f"Checking out existing PR branch: {branch}")
+        run_cmd(
+            ["git", "fetch", "origin", branch],
+            label="fetch-pr-branch",
+            cwd=repo_dir,
+        )
+        run_cmd(
+            ["git", "checkout", "-b", branch, f"origin/{branch}"],
+            label="checkout-pr-branch",
+            cwd=repo_dir,
+        )
+    else:
+        log("SETUP", f"Creating branch: {branch}")
+        run_cmd(["git", "checkout", "-b", branch], label="create-branch", cwd=repo_dir)
 
     # Trust mise config files in the cloned repo (required before mise install)
     run_cmd(
@@ -402,7 +432,11 @@ def setup_repo(config: dict) -> dict:
         setup["lint_before"] = True
 
     # Detect default branch
-    setup["default_branch"] = detect_default_branch(config["repo_url"], repo_dir)
+    # For PR tasks (pr_iteration, pr_review): use base_branch from orchestrator if available
+    if config.get("task_type") in PR_TASK_TYPES and config.get("base_branch"):
+        setup["default_branch"] = config["base_branch"]
+    else:
+        setup["default_branch"] = detect_default_branch(config["repo_url"], repo_dir)
 
     # Install prepare-commit-msg hook for code attribution
     _install_commit_hook(repo_dir)
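
The branch selection spread across the `setup_repo` hunks can be summarized as one decision: reuse the PR branch when one is supplied, otherwise derive a fresh `bgagent/...` branch. The sketch below is an illustration only. `resolve_branch` is a hypothetical helper, and the `slugify` shown here is a simplified stand-in for the real function, whose implementation is not part of this diff.

```python
import re

PR_TASK_TYPES = frozenset(("pr_iteration", "pr_review"))


def slugify(text: str, max_len: int = 40) -> str:
    """Simplified stand-in for the real slugify (assumption, not the original)."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len].strip("-") or "task"


def resolve_branch(config: dict) -> tuple[str, bool]:
    """Return (branch, is_existing) following the rules in setup_repo."""
    if config.get("task_type") in PR_TASK_TYPES and config.get("branch_name"):
        # Existing PR branch: fetched and checked out, not created.
        return config["branch_name"], True
    # Otherwise derive the slug from the issue title, then the task description.
    title = (config.get("issue") or {}).get("title") or config.get("task_description", "")
    return f"bgagent/{config['task_id']}/{slugify(title)}", False
```

A PR task that arrives without a `branch_name` falls through to the derived-branch path, which is also what the two `if ... and config.get("branch_name")` conditions in the hunks imply.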
@@ -620,6 +654,10 @@ def ensure_pr(
 ) -> str | None:
     """Check if a PR exists for the branch; if not, create one.
 
+    For ``new_task``: creates a new PR if needed.
+    For ``pr_iteration``: pushes commits, then resolves the existing PR URL.
+    For ``pr_review``: resolves the existing PR URL without pushing (read-only).
+
     Returns the PR URL, or None if there are no commits beyond the default
     branch or PR creation failed. ``build_passed`` and ``lint_passed`` control
     the verification status shown in the PR body.
@@ -628,6 +666,40 @@ def ensure_pr(
     branch = setup["branch"]
     default_branch = setup.get("default_branch", "main")
 
+    # PR iteration/review: skip PR creation — just resolve existing PR URL
+    if config.get("task_type") in PR_TASK_TYPES:
+        if config.get("task_type") == "pr_iteration":
+            if not ensure_pushed(repo_dir, branch):
+                log("WARN", "Failed to push commits before resolving PR URL")
+        else:
+            log("POST", "pr_review task — skipping push (read-only)")
+        log("POST", f"{config.get('task_type')} — returning existing PR URL")
+        result = subprocess.run(
+            [
+                "gh",
+                "pr",
+                "view",
+                branch,
+                "--repo",
+                config["repo_url"],
+                "--json",
+                "url",
+                "-q",
+                ".url",
+            ],
+            cwd=repo_dir,
+            capture_output=True,
+            text=True,
+            timeout=60,
+        )
+        if result.returncode == 0 and result.stdout.strip():
+            pr_url = result.stdout.strip()
+            log("POST", f"Existing PR: {pr_url}")
+            return pr_url
+        stderr_msg = result.stderr.strip() if result.stderr else "(no stderr)"
+        log("WARN", f"Could not resolve existing PR URL (rc={result.returncode}): {stderr_msg}")
+        return None
+
     # Check if the agent already created a PR for this branch
     log("POST", "Checking for existing PR...")
     result = subprocess.run(
@@ -1274,10 +1346,15 @@ def _on_stderr(line: str) -> None:
     else:
         log("WARN", "claude CLI not found on PATH")
 
+    if config.get("task_type") == "pr_review":
+        allowed_tools = ["Bash", "Read", "Glob", "Grep", "WebFetch"]
+    else:
+        allowed_tools = ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "WebFetch"]
+
     options = ClaudeAgentOptions(
         model=config["anthropic_model"],
         system_prompt=system_prompt,
-        allowed_tools=["Bash", "Read", "Write", "Edit", "Glob", "Grep", "WebFetch"],
+        allowed_tools=allowed_tools,
         permission_mode="bypassPermissions",
         cwd=cwd,
         max_turns=config["max_turns"],
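
The tool restriction for `pr_review` amounts to dropping `Write` and `Edit` from the list passed to `ClaudeAgentOptions`. A minimal sketch of that selection as a standalone function follows. `allowed_tools_for` and the two constants are hypothetical names, and the tool lists are copied from the hunk.

```python
ALL_TOOLS = ["Bash", "Read", "Write", "Edit", "Glob", "Grep", "WebFetch"]
FILE_MUTATING_TOOLS = frozenset(("Write", "Edit"))


def allowed_tools_for(task_type: str) -> list[str]:
    """Return the tool list for a task type; pr_review loses Write and Edit."""
    if task_type == "pr_review":
        return [tool for tool in ALL_TOOLS if tool not in FILE_MUTATING_TOOLS]
    return list(ALL_TOOLS)
```

Because `Bash` stays in the list, this restriction alone probably does not make a review run strictly read-only; the skipped commit and push steps in the post-hooks appear to act as the second guard.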
@@ -1482,7 +1559,13 @@ def _build_system_prompt(
     overrides: str,
 ) -> str:
     """Assemble the system prompt with task-specific values and memory context."""
-    system_prompt = SYSTEM_PROMPT.replace("{repo_url}", config["repo_url"])
+    task_type = config.get("task_type", "new_task")
+    try:
+        system_prompt = get_system_prompt(task_type)
+    except ValueError:
+        log("ERROR", f"Unknown task_type {task_type!r} — falling back to default system prompt")
+        system_prompt = SYSTEM_PROMPT
+    system_prompt = system_prompt.replace("{repo_url}", config["repo_url"])
     system_prompt = system_prompt.replace("{task_id}", config["task_id"])
     system_prompt = system_prompt.replace("{workspace}", AGENT_WORKSPACE)
     system_prompt = system_prompt.replace("{branch_name}", setup["branch"])
@@ -1513,6 +1596,14 @@ def _build_system_prompt(
             memory_context_text = "\n".join(mc_parts)
     system_prompt = system_prompt.replace("{memory_context}", memory_context_text)
 
+    # Substitute PR-specific placeholders
+    pr_number_val = config.get("pr_number", "")
+    if pr_number_val:
+        system_prompt = system_prompt.replace("{pr_number}", str(pr_number_val))
+    elif "{pr_number}" in system_prompt:
+        log("WARN", "System prompt contains {pr_number} placeholder but no pr_number in config")
+        system_prompt = system_prompt.replace("{pr_number}", "(unknown)")
+
     # Append Blueprint system_prompt_overrides after all placeholder
     # substitutions (avoids double-substitution if overrides contain
     # template placeholders like {repo_url}).
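
The ordering described in the comment above (substitute placeholders first, append overrides last) can be illustrated with a small sketch. `build_prompt` is a hypothetical helper, not the real `_build_system_prompt`. It simplifies by applying the `(unknown)` fallback to every empty value, whereas the hunk applies it only to `{pr_number}`, and the blank-line separator before the overrides is an assumption.

```python
def build_prompt(template: str, values: dict[str, str], overrides: str = "") -> str:
    """Fill {name} placeholders, then append overrides without substituting them."""
    prompt = template
    for name, value in values.items():
        # Simplification: every empty value falls back to "(unknown)".
        prompt = prompt.replace("{" + name + "}", str(value) if value else "(unknown)")
    if overrides:
        # Appended after substitution, so placeholders inside overrides stay literal.
        prompt += "\n\n" + overrides
    return prompt


filled = build_prompt(
    "Repo {repo_url}, PR #{pr_number}",
    {"repo_url": "owner/repo", "pr_number": ""},
    overrides="Keep {repo_url} literal",
)
```

Here `filled` keeps the `{repo_url}` text inside the overrides untouched, which is the double-substitution hazard the comment in the hunk refers to.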
@@ -1628,6 +1719,9 @@ def run_task(
     system_prompt_overrides: str = "",
     prompt_version: str = "",
     memory_id: str = "",
+    task_type: str = "new_task",
+    branch_name: str = "",
+    pr_number: str = "",
 ) -> dict:
     """Run the full agent pipeline and return a result dict.
 
@@ -1652,6 +1746,9 @@ def run_task(
         aws_region=aws_region,
         task_id=task_id,
         system_prompt_overrides=system_prompt_overrides,
+        task_type=task_type,
+        branch_name=branch_name,
+        pr_number=pr_number,
     )
 
     log("TASK", f"Task ID: {config['task_id']}")
@@ -1678,6 +1775,8 @@ def run_task(
                     prompt = hydrated_context["user_prompt"]
                     if hydrated_context.get("issue"):
                         config["issue"] = hydrated_context["issue"]
+                    if hydrated_context.get("resolved_base_branch"):
+                        config["base_branch"] = hydrated_context["resolved_base_branch"]
                     if hydrated_context.get("truncated"):
                         log("WARN", "Context was truncated by orchestrator token budget")
                 else:
@@ -1765,8 +1864,11 @@ def run_task(
 
             # Post-hooks
             with task_span("task.post_hooks") as post_span:
-                # Safety net: commit any uncommitted tracked changes
-                safety_committed = ensure_committed(setup["repo_dir"])
+                # Safety net: commit any uncommitted tracked changes (skip for read-only tasks)
+                if config.get("task_type") == "pr_review":
+                    safety_committed = False
+                else:
+                    safety_committed = ensure_committed(setup["repo_dir"])
                 post_span.set_attribute("safety_net.committed", safety_committed)
 
                 build_passed = verify_build(setup["repo_dir"])
@@ -1810,8 +1912,13 @@ def run_task(
             # Default True = assume build was green before, so a post-agent
             # failure IS counted as a regression (conservative).
             build_before = setup.get("build_before", True)
-            build_ok = build_passed or not build_before
-            if not build_passed and not build_before:
+            if config.get("task_type") == "pr_review":
+                build_ok = True  # Review task — build status is informational only
+                if not build_passed:
+                    log("INFO", "pr_review: build failed — informational only, not gating")
+            else:
+                build_ok = build_passed or not build_before
+            if not build_passed and not build_before and config.get("task_type") != "pr_review":
                 log(
                     "WARN",
                     "Post-agent build failed, but build was already failing before "