Commit a365bf5

feat: add GitHub Action for automated PR review via AgentCore Harness (#934)
* feat: add GitHub Action for automated PR review via AgentCore Harness

  Adds a workflow that reviews PRs using Bedrock AgentCore Harness. The harness runs an AI agent in an isolated microVM with gh, git, and pre-cloned repos that fetches PR diffs and posts review comments.

  Workflow:
  - Triggers on PR open/reopen for agentcore-cli-devs team members
  - Supports manual workflow_dispatch for any PR URL
  - Adds/removes ai-reviewing label during review
  - Authenticates via GitHub OIDC to assume AWS role

  Files:
  - .github/workflows/pr-ai-review.yml — main workflow
  - .github/scripts/python/harness_review.py — harness invocation script
  - .github/scripts/python/harness_config.py — config from env vars
  - .github/scripts/models/ — local boto3 service model (InvokeHarness not yet in standard boto3)

  Required secrets:
  - HARNESS_AWS_ROLE_ARN — IAM role ARN for OIDC
  - HARNESS_ACCOUNT_ID — AWS account ID
  - HARNESS_ID — Harness ID

* refactor: replace local service model with raw HTTP + SigV4 signing

  Eliminates the 220KB bundled service model by using direct HTTP requests with SigV4 authentication to invoke the harness endpoint. No extra dependencies needed — urllib3, SigV4Auth, and EventStreamBuffer are all part of botocore/boto3.

  Rejected: invoke_agent_runtime API | server rejects harness ARNs with ResourceNotFoundException
  Confidence: high
  Scope-risk: moderate

* refactor: inline harness config into review script

  Remove separate harness_config.py — env vars are read directly in harness_review.py. One less file to maintain, config is still driven entirely by environment variables set in the GitHub workflow.

* refactor: extract invoke_harness helper for cleaner main flow

* refactor: simplify config and improve script readability

  - Replace HARNESS_ACCOUNT_ID + HARNESS_ID with single HARNESS_ARN env var
  - Extract prompts into separate .md files in .github/scripts/prompts/
  - Extract stream parsing into print_stream() function
  - Add close_group() helper to deduplicate ::group:: bookkeeping

* refactor: separate event parsing from display logic

  Extract parse_events() generator to handle binary stream decoding, keeping print_stream() focused on formatting and log groups.

* docs: add explanatory comments to harness review functions

* refactor: derive region from HARNESS_ARN instead of separate env var

  Eliminates HARNESS_REGION env var — the region is extracted from the ARN directly, so there's no risk of a mismatch causing confusing SigV4 auth errors.

* chore: rename label to agentcore-harness-reviewing

* refactor: move auth check to job level so entire review is skipped early

  Split into authorize + ai-review jobs. The ai-review job only runs if the PR author is authorized (team member or write access) or if triggered via workflow_dispatch. Removes repeated if conditions from every step.

* chore: exclude AI prompt templates from prettier

  Prompt markdown files use intentional formatting that prettier would reflow, breaking the prompt structure.
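The commit message says the region is now derived from HARNESS_ARN rather than from a separate env var. As a rough sketch of that idea with explicit validation added, something like the following could work. The helper name `region_from_arn`, its error messages, and the example ARN are hypothetical and not part of this commit, which uses a plain `split(":")[3]`.

```python
def region_from_arn(arn: str) -> str:
    """Return the region field of an ARN shaped like
    arn:aws:bedrock-agentcore:{region}:{account}:harness/{id}.

    Hypothetical helper for illustration; not part of the commit.
    """
    # Split into at most 6 fields so colons inside the resource part stay intact.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    region = parts[3]
    if not region:
        raise ValueError(f"ARN has no region field: {arn!r}")
    return region


# Example with a made-up account ID and harness ID.
example_arn = "arn:aws:bedrock-agentcore:us-west-2:123456789012:harness/example-id"
print(region_from_arn(example_arn))
```

Failing early with a ValueError on a malformed ARN would surface a misconfigured secret before any request is signed, instead of an IndexError or a SigV4 error later on.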
1 parent 12275c3 commit a365bf5

5 files changed

Lines changed: 407 additions & 0 deletions

.github/scripts/prompts/review.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+Review this GitHub PR: {pr_url}
+
+You have tools to fetch the PR diff, read files, search the web, and post comments on the PR.
+
+You have these repos cloned locally for context:
+- /opt/workspace/agentcore-cli — aws/agentcore-cli
+- /opt/workspace/agentcore-l3-cdk-constructs — aws/agentcore-l3-cdk-constructs
+
+Before reviewing, read all existing comments on the PR to understand what has already been discussed. Do not repeat or re-post issues that have already been raised in existing comments.
+
+Review the PR. If there are any serious issues that require code changes before merging, post a comment on the PR for each issue explaining the problem. If there are multiple ways to fix an issue, list the options so the author can choose. Skip style nits and minor suggestions — only flag things that actually need to change.
+
+If all serious issues have already been raised in existing comments, or if you found no new issues, post a single comment on the PR saying it looks good to merge (or that all issues have already been flagged).
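The review prompt above carries a single `{pr_url}` placeholder, which the script fills with `str.format`. The sketch below illustrates one consequence of that choice: if the template ever gained other literal braces, `str.format` would fail, whereas a targeted replace would not. The helper `fill_prompt`, the JSON-like line, and the URL are hypothetical and only serve the illustration.

```python
# Template text abbreviated from review.md; the JSON-like line is a
# hypothetical addition used only to show the brace problem.
template = "Review this GitHub PR: {pr_url}\n"
template_with_braces = template + 'Reply with JSON like {"ok": true}\n'

pr_url = "https://github.com/example-org/example-repo/pull/1"  # made-up URL

# str.format works as long as {pr_url} is the only brace pair.
print(template.format(pr_url=pr_url))

# With extra braces, str.format raises because it tries to treat
# {"ok": true} as another replacement field.
try:
    template_with_braces.format(pr_url=pr_url)
except (KeyError, ValueError, IndexError) as exc:
    print(f"format failed: {type(exc).__name__}")


def fill_prompt(text: str, url: str) -> str:
    """Substitute only the {pr_url} placeholder, leaving other braces alone."""
    return text.replace("{pr_url}", url)


print(fill_prompt(template_with_braces, pr_url))
```

With the template as committed, both approaches give the same result; the difference only matters if braces are added to the prompt later.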

.github/scripts/prompts/system.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# AgentCore CLI Development Workspace
+
+This workspace contains two repos for developing and testing the AgentCore CLI.
+
+## Repositories
+
+### agentcore-cli/ (`aws/agentcore-cli`)
+
+The terminal experience for creating, developing, and deploying AI agents to AgentCore. Node.js/TypeScript CLI built with Ink (React-based TUI).
+
+### agentcore-l3-cdk-constructs/ (`aws/agentcore-l3-cdk-constructs`)
+
+AWS CDK L3 constructs for declaring and deploying AgentCore infrastructure. Used by agentcore-cli to vend CDK projects when users run `agentcore create`.
+
+## How they relate
+
+`agentcore-cli` is the main product. It vends CDK projects using constructs from `agentcore-l3-cdk-constructs`.
+
+## Testing with a bundled distribution
+
+Run `npm run bundle` in `agentcore-cli/` to create a tar distribution that includes the packaged `agentcore-l3-cdk-constructs`. You can then install it globally with `npm install -g <path-to-tar>` to test the CLI end-to-end.
Lines changed: 213 additions & 0 deletions
@@ -0,0 +1,213 @@
+"""Invoke Bedrock AgentCore Harness to review a GitHub PR.
+
+Reads PR_URL from the environment. Streams harness output to stdout.
+Uses raw HTTP with SigV4 signing — no custom service model needed.
+"""
+
+import json
+import os
+import sys
+import time
+import uuid
+
+import boto3
+from botocore.auth import SigV4Auth
+from botocore.awsrequest import AWSRequest
+from botocore.eventstream import EventStreamBuffer
+from urllib.parse import quote
+import urllib3
+
+# ANSI color codes
+CYAN = "\033[36m"
+YELLOW = "\033[33m"
+GREEN = "\033[32m"
+RED = "\033[31m"
+DIM = "\033[2m"
+RESET = "\033[0m"
+
+SCRIPTS_DIR = os.path.join(os.path.dirname(__file__), "..")
+
+
+def read_prompt(filename):
+    """Read a prompt template from the prompts directory."""
+    path = os.path.join(SCRIPTS_DIR, "prompts", filename)
+    with open(path) as f:
+        return f.read()
+
+
+def invoke_harness(harness_arn, body, region):
+    """Send a SigV4-signed request to the harness invoke endpoint. Returns a streaming response.
+
+    InvokeHarness is not in standard boto3, so we call the REST API directly.
+    boto3 is only used to resolve AWS credentials (from env vars, OIDC, etc.)
+    and sign the request with SigV4. The response is an AWS binary event stream.
+    """
+    session = boto3.Session(region_name=region)
+    credentials = session.get_credentials().get_frozen_credentials()
+    url = f"https://bedrock-agentcore.{region}.amazonaws.com/harnesses/invoke?harnessArn={quote(harness_arn, safe='')}"
+    request = AWSRequest(method="POST", url=url, data=body, headers={
+        "Content-Type": "application/json",
+        "Accept": "application/vnd.amazon.eventstream",
+    })
+    SigV4Auth(credentials, "bedrock-agentcore", region).add_auth(request)
+    return urllib3.PoolManager().urlopen(
+        "POST", url, body=body,
+        headers=dict(request.headers),
+        preload_content=False,
+        timeout=urllib3.Timeout(connect=10, read=600),
+    )
+
+
+def parse_events(http_response):
+    """Yield (event_type, payload) tuples from the harness binary event stream.
+
+    The response arrives as raw bytes in AWS binary event stream format.
+    EventStreamBuffer reassembles complete events from the 4KB chunks,
+    and we decode each event's JSON payload before yielding it.
+    """
+    event_buffer = EventStreamBuffer()
+    for chunk in http_response.stream(4096):
+        event_buffer.add_data(chunk)
+        for event in event_buffer:
+            if event.headers.get(":message-type") == "exception":
+                payload = json.loads(event.payload.decode("utf-8"))
+                print(f"\n{RED}ERROR: {payload}{RESET}", file=sys.stderr)
+                sys.exit(1)
+            event_type = event.headers.get(":event-type", "")
+            if event.payload:
+                yield event_type, json.loads(event.payload.decode("utf-8"))
+
+
+def print_stream(http_response):
+    """Display harness events with GitHub Actions log groups.
+
+    The harness streams events as the agent works:
+      contentBlockStart — a new block begins (text or tool call)
+      contentBlockDelta — incremental chunks of text or tool input JSON
+      contentBlockStop — block complete, we now have full tool input to display
+      messageStop — agent finished
+      internalServerException — server error
+
+    Tool calls are wrapped in ::group::/::endgroup:: for collapsible sections
+    in the GitHub Actions log UI. Agent reasoning text is printed inline in dim.
+    """
+    start_time = time.time()
+    iteration = 0
+    tool_name = None
+    tool_input = ""
+    tool_start = 0.0
+    in_group = False
+    had_text = False
+
+    def close_group():
+        nonlocal in_group
+        if in_group:
+            print("::endgroup::", flush=True)
+            in_group = False
+
+    for event_type, payload in parse_events(http_response):
+
+        if event_type == "contentBlockStart":
+            start = payload.get("start", {})
+            if "toolUse" in start:
+                tool_name = start["toolUse"].get("name", "unknown")
+                tool_input = ""
+                tool_start = time.time()
+                iteration += 1
+
+        elif event_type == "contentBlockDelta":
+            delta = payload.get("delta", {})
+            if "text" in delta:
+                close_group()
+                print(flush=True)
+                print(f"{DIM}{delta['text']}{RESET}", end="", flush=True)
+                had_text = True
+            if "toolUse" in delta:
+                tool_input += delta["toolUse"].get("input", "")
+
+        elif event_type == "contentBlockStop":
+            if tool_name:
+                elapsed = time.time() - tool_start
+                try:
+                    parsed = json.loads(tool_input)
+                except (json.JSONDecodeError, TypeError):
+                    parsed = tool_input
+
+                close_group()
+                if had_text:
+                    print("\n", flush=True)
+                    had_text = False
+
+                cmd = parsed.get("command") if isinstance(parsed, dict) else None
+                header = f"{CYAN}[{iteration}]{RESET} {YELLOW}{tool_name}{RESET} {DIM}({elapsed:.1f}s){RESET}"
+                if cmd:
+                    header += f": $ {cmd}"
+
+                print(f"::group::{header}", flush=True)
+                in_group = True
+
+                if isinstance(parsed, dict):
+                    for k, v in parsed.items():
+                        if k != "command":
+                            print(f" {DIM}{k}:{RESET} {str(v)[:300]}", flush=True)
+
+                tool_name = None
+                tool_input = ""
+
+        elif event_type == "messageStop":
+            close_group()
+            if payload.get("stopReason") == "end_turn":
+                total = time.time() - start_time
+                print(f"\n\n{GREEN}{'=' * 50}", flush=True)
+                print(f" Done ({int(total // 60)}m {int(total % 60)}s)", flush=True)
+                print(f"{'=' * 50}{RESET}", flush=True)
+
+        elif event_type == "internalServerException":
+            close_group()
+            print(f"\n{RED}ERROR: {payload}{RESET}", file=sys.stderr)
+            sys.exit(1)
+
+    close_group()
+    total = time.time() - start_time
+    print(f"\n{GREEN}Review complete.{RESET} {DIM}({iteration} tool calls, {int(total)}s total){RESET}")
+
+
+# --- Main ---
+
+# All config comes from environment variables (set via GitHub secrets/workflow)
+MODEL_ID = os.environ.get("HARNESS_MODEL_ID", "us.anthropic.claude-opus-4-7")
+HARNESS_ARN = os.environ.get("HARNESS_ARN", "")
+PR_URL = os.environ.get("PR_URL", "")
+
+for name, val in [("HARNESS_ARN", HARNESS_ARN), ("PR_URL", PR_URL)]:
+    if not val:
+        print(f"{RED}ERROR: {name} environment variable is required{RESET}", file=sys.stderr)
+        sys.exit(1)
+
+# Extract region from the ARN (arn:aws:bedrock-agentcore:{region}:{account}:harness/{id})
+REGION = HARNESS_ARN.split(":")[3]
+SESSION_ID = str(uuid.uuid4()).upper()
+
+print(f"{CYAN}Session:{RESET} {SESSION_ID}")
+print(f"{CYAN}PR:{RESET} {PR_URL}")
+print(f"{CYAN}Harness:{RESET} {HARNESS_ARN}")
+print()
+
+SYSTEM_PROMPT = read_prompt("system.md")
+REVIEW_PROMPT = read_prompt("review.md").format(pr_url=PR_URL)
+
+request_body = json.dumps({
+    "runtimeSessionId": SESSION_ID,
+    "systemPrompt": [{"text": SYSTEM_PROMPT}],
+    "messages": [{"role": "user", "content": [{"text": REVIEW_PROMPT}]}],
+    "model": {"bedrockModelConfig": {"modelId": MODEL_ID}},
+})
+
+http_response = invoke_harness(HARNESS_ARN, request_body, REGION)
+
+if http_response.status != 200:
+    error = http_response.read().decode("utf-8")
+    print(f"{RED}ERROR: HTTP {http_response.status}: {error}{RESET}", file=sys.stderr)
+    sys.exit(1)
+
+print_stream(http_response)
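The tool-call bookkeeping in `print_stream` (start a block, append input fragments, parse on stop) can be hard to follow alongside the printing and timing code. The sketch below isolates just that accumulation as a pure function over `(event_type, payload)` pairs. The helper `collect_tool_calls` and the sample events are hypothetical, shaped after the events the script handles, and are not part of this commit.

```python
import json


def collect_tool_calls(events):
    """Return a list of (tool_name, parsed_input) from (event_type, payload) pairs.

    Mirrors the accumulation in print_stream, minus printing and timing.
    Hypothetical helper for illustration; not part of the commit.
    """
    calls = []
    tool_name = None
    tool_input = ""
    for event_type, payload in events:
        if event_type == "contentBlockStart":
            start = payload.get("start", {})
            if "toolUse" in start:
                tool_name = start["toolUse"].get("name", "unknown")
                tool_input = ""
        elif event_type == "contentBlockDelta":
            delta = payload.get("delta", {})
            if "toolUse" in delta:
                # Tool input arrives as fragments of one JSON document.
                tool_input += delta["toolUse"].get("input", "")
        elif event_type == "contentBlockStop" and tool_name:
            try:
                parsed = json.loads(tool_input)
            except json.JSONDecodeError:
                parsed = tool_input  # keep the raw text if it is not valid JSON
            calls.append((tool_name, parsed))
            tool_name = None
            tool_input = ""
    return calls


# Made-up events shaped like the ones print_stream handles.
sample_events = [
    ("contentBlockStart", {"start": {"toolUse": {"name": "shell"}}}),
    ("contentBlockDelta", {"delta": {"toolUse": {"input": '{"command": "gh pr '}}}),
    ("contentBlockDelta", {"delta": {"toolUse": {"input": 'diff 1"}'}}}),
    ("contentBlockStop", {}),
    ("messageStop", {"stopReason": "end_turn"}),
]
print(collect_tool_calls(sample_events))
```

Keeping the accumulation separate from the display code in this way would make the parsing testable without a live harness response, in the same spirit as the commit's split between `parse_events` and `print_stream`.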
