Skip to content

Commit d912ad2

Browse files
isadeksclaudebgagentkrokoko
authored
feat(notifications): preview-deploy screenshot pipeline (provider-agnostic) (#241)
* feat(screenshot): preview-deploy screenshot pipeline (no stack wiring yet) Lambda + AgentCore Browser plumbing for capturing screenshots of preview deployments. Provider-agnostic — listens for GitHub deployment_status events from any source (Vercel, Amplify Hosting, Netlify, GitHub Actions custom CD). This commit lands the handler / construct code only. Stack wiring follows in the next commit. * feat(screenshot): GitHubScreenshotIntegration construct + stack wiring - New `GitHubScreenshotIntegration` construct (mirrors `LinearIntegration`): bundles the screenshot bucket, dedup table, signing-secret placeholder, receiver Lambda, processor Lambda, and the API Gateway route. cdk-nag suppressions added inline (HMAC auth instead of Cognito; AgentCore Browser sessions have no per-resource ARN; Secrets Manager rotation is owned by GitHub). - Wired into `agent.ts` after the LinearIntegration block. Reuses the existing `githubTokenSecret` (the processor uses ABCA's main GitHub token to look up which PR a deploy SHA belongs to and post the screenshot comment — no new credential). - Three new stack outputs: `GitHubWebhookUrl`, `GitHubWebhookSecretArn`, `ScreenshotBucketName`. - Bumped agent.test.ts table count from 13 to 14 to account for the new dedup table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(screenshot): suppress AwsSolutions-S2 on the public-read screenshot bucket cdk-nag's S2 fires on any bucket that has `blockPublicPolicy: false` even when the policy is intentionally permissive. Add the suppression with the same rationale as S1/S5 — public reads are required by GitHub Markdown renderers and Linear `imageUploadFromUrl`, and the read grant is prefix-scoped to `screenshots/*`. Caught when the first deploy attempt aborted at synth-time on the new GitHubScreenshotIntegration construct. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(screenshot): private S3 bucket + CloudFront distribution The first deploy attempt failed at CFN-execute time on the bucket policy: s3:PutBucketPolicy ... because public policies are prevented by the BlockPublicPolicy setting in S3 Block Public Access. Account-level Block Public Access is on for this AWS account, which overrides per-bucket BPA settings. Disabling it would change the security posture of the whole account, so route around the constraint with the AWS-recommended pattern: private S3 + CloudFront with Origin Access Control. Changes: - `ScreenshotBucket` is now `BLOCK_ALL` BPA, no public bucket policy. Adds a `cloudfront.Distribution` whose origin is the bucket via `S3BucketOrigin.withOriginAccessControl`. The distribution policy is scoped to the CloudFront service principal only, so account-level BPA accepts it. - Processor reads `SCREENSHOT_PUBLIC_HOST` (the CloudFront domain) instead of building an S3 URL. PR comments now embed `https://<dist>.cloudfront.net/screenshots/...` URLs. - New stack output `ScreenshotCloudFrontDomain`. - Bucket-level S2/S5 suppressions removed (no longer applicable — bucket is private). Distribution gets CFR1/CFR2/CFR3/CFR4/CFR7 suppressions with rationales. Heads up on deploy time: CloudFront distributions take 5-15 min to provision on first create. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(waf): exempt /v1/github/webhook from CRS like /v1/linear/webhook The CommonRuleSet was 403'ing GitHub deployment_status webhooks before the request reached our Lambda — the deployment payload contains absolute Vercel preview URLs in the body, which trips GenericRFI_BODY. Mirror the Linear webhook exemption: the GitHub webhook path is HMAC-verified in the Lambda, parsed as strict JSON, never interpolated into SQL/HTML, and rate-limited by the priority-3 rule. CRS still applies to every other route. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(screenshot): read environment_url from deployment_status, not deployment GitHub's `deployment_status` webhook puts the deployed URL on the *status* object, not the deployment itself. The deployment object is immutable per (sha, environment); the status changes through the deploy lifecycle (`pending` → `success`) and carries the URL only once the deploy finishes. Symptom: receiver kept short-circuiting `success` events from Vercel with `{ok: true, skipped_no_url: true}` because we read the wrong field. Verified by inspecting the webhook delivery payload via `gh api .../deliveries/<id> --jq .request.payload.deployment_status` — URL was there all along. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(agentcore-browser): use ws package for SigV4-signed WebSocket handshake Node 24's global WebSocket (from undici) does NOT support arbitrary HTTP headers on the upgrade request — passing them as the second arg gets silently ignored. AgentCore Browser's WSS handshake requires SigV4-signed Authorization + X-Amz-* headers, so the connection was opening but then getting rejected, which surfaced as an empty `error` event ("AgentCore Browser WebSocket error: "). Switch to the `ws` package which natively supports `options.headers`. Also add an `unexpected-response` handler so HTTP-level handshake failures (403, 400) surface with status codes instead of empty errors. Smoke verified locally — the ws-based path opens cleanly against example.com and Vercel preview URLs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(agentcore-browser): SigV4-presign WSS URL instead of signing headers Lambda runtime returned a 403 on the WSS upgrade despite well-formed SigV4 headers — `ws` rewrites the Host header during the upgrade GET, which invalidates the canonical-request signature we computed against the original Host. This works locally because Node's tooling on macOS keeps the original Host through the handshake, but the Lambda runtime's TLS stack normalizes differently. Switch to query-parameter SigV4 (presigned URL): SignatureV4.presign returns a wss://...?X-Amz-Algorithm=...&X-Amz-Signature=... URL where the auth lives in the URL itself, so any Host-header rewriting downstream doesn't break the signature. Smoke verified locally — presigned URL connects cleanly to AgentCore Browser and the screenshot pipeline runs end-to-end (6.3s, valid PNG, captures example.com correctly). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(iam): grant bedrock-agentcore:* to the screenshot processor The minimal IAM I shipped earlier (`StartBrowserSession`, `StopBrowserSession`, `GetBrowserSession`, `UpdateBrowserStream`) wasn't enough — the WSS automation-stream connect requires an additional `ConnectBrowserAutomationStream`-flavored action that isn't in the public CLI command list. Lambda invocations were opening sessions cleanly but 403'ing on the WSS upgrade. Widen to `bedrock-agentcore:*` to unblock the e2e flow. Followup: scope back down to the specific connect action once it's documented or surfaced via CloudTrail decoded-message-on-deny. Smoke verified: PR #1 on isadeks/vercel-abca-linear now receives a screenshot comment within ~7s of the deployment_status webhook. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(screenshot): also post screenshot comment to linked Linear issue Extends the screenshot processor to find a Linear issue via the PR's title/body and post the same image comment there. Approach (no GSI write-back needed): - Regex-extract Linear identifier (e.g. `ABCA-42`) from PR title/body. These are present whether the agent put them there (`task_description` carries the identifier) or Linear's own GitHub integration auto-injected the back-reference on PR open. - Scan `LinearWorkspaceRegistryTable` for `status=active` workspaces. Per-workspace, query Linear's `issueVcsBranchSearch` (which accepts the human-readable identifier) and accept the first exact-match hit. - Post the markdown image comment via the existing `postIssueComment` helper from Phase 2.0b. The Linear post is best-effort — if the registry table isn't wired, the identifier doesn't extract, or the lookup misses, the GitHub PR comment still lands. New env var `LINEAR_WORKSPACE_REGISTRY_TABLE_NAME` is optional on the processor; the construct only sets it when the prop is provided. CDK: `GitHubScreenshotIntegrationProps` gains an optional `linearWorkspaceRegistryTable`. When provided, the processor's IAM grows: ReadData on the registry, GetSecretValue+PutSecretValue on `bgagent-linear-oauth-*`. `agent.ts` wires `linearIntegration.workspaceRegistryTable` into the screenshot construct. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(screenshot): retry PR lookup to handle deploy-before-PR race Some providers (Vercel, Netlify) post deployment_status faster than the agent can run `gh pr create`. Retry the GitHub PR-lookup with backoff so the screenshot finds the open PR rather than dropping the event when the timing is reversed. * fix(linear): silent label gate + default to 'abca' to stop unlabeled-issue comment spam Move the trigger-label check ahead of every user-facing comment path in the Linear webhook processor, and switch the default trigger label from 'bgagent' to 'abca'. An unlabeled issue is now a true no-op: no comment, no reaction, no createTaskCore, no DDB writes — regardless of whether the project is onboarded. Why: workspace webhooks fire workspace-wide. A single un-onboarded team in the same Linear workspace produced 47 identical "❌ project isn't onboarded" comments on GRO-783 in 5 minutes because every Issue event (create/update/label-change) hit the not-onboarded gate before the label gate. With the gate order flipped, only issues that explicitly opt in via the trigger label can ever generate user-facing feedback. Per-project label_filter override is still respected — the project mapping lookup now happens once, before the label gate, instead of after. Tests: two new regression tests pin the spam scenario (unlabeled issue in a non-onboarded project, and unlabeled issue with no projectId) to zero side effects. Full CDK suite (89 suites / 1572 tests) passes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(screenshots): add the screenshot pipeline guide Adds the operator walkthrough for wiring up the AgentCore-Browser preview-deploy screenshot pipeline. * feat(github): bgagent github webhook-info + set-webhook-secret Mirrors the Linear webhook-info pattern so docs and onboarding don't have to embed stack-specific URLs or copy-paste aws CLI invocations. Two subcommands: - `webhook-info` — read-only. Reads GitHubWebhookUrl + GitHubWebhookSecretArn from the CFN stack outputs and prints values to paste into a GitHub repo's webhook config (Settings → Webhooks → Add webhook). Includes the event-type ('Deployment statuses') and content-type guidance that operators consistently miss. - `set-webhook-secret` — interactive PutSecretValue against the stack output ARN. Replaces the cargo-cult `aws secretsmanager put-secret- value` operators were copy-pasting from the screenshot setup notes. Warns before overwriting an existing real secret (heuristic: a CDK- seeded JSON placeholder starts with `{`; a real GitHub secret won't). No CDK changes — both stack outputs were already there. Pure CLI add. * docs/code(screenshots): de-Vercel-ize the screenshot pipeline The pipeline was always provider-agnostic — it listens for GitHub deployment_status events, which Vercel, AWS Amplify, Netlify, and any GitHub-Actions-driven CD pipeline all post. Code comments, inline strings, and the setup guide referenced Vercel as if it were the only supported path; this commit aligns the surfacing with what the code actually does. Code: - Linear comment body: "after the Vercel preview deploy finished" → "after the deploy finished" (the GitHub PR comment already said this; just the Linear path was inconsistent) - Webhook receiver doc-comment + envelope interface comment: drop Vercel-only language; explain that the `environment` filter (`SCREENSHOT_TARGET_ENVIRONMENT` env var) is configurable per- provider, with a table of common values - Processor PR-race comment: explain that the gap is also seen on Netlify/Amplify, not unique to Vercel - AgentCore Browser comment: drop Vercel-specific phrasing on "what we don't try to be clever about" - GitHubScreenshotIntegration construct prop docstring: explain the per-provider env-name conventions Docs: - Rename VERCEL_SETUP_GUIDE.md → DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md - Lead with a "works with any provider that posts deployment_status" table (Vercel / Amplify / Netlify / GitHub Actions custom CD, with "out-of-the-box?" yes/no per provider) - Keep Vercel as the worked example since it's what we smoke-tested, but add a "skip Steps 1-2" callout for non-Vercel providers - New "Configuring for non-Vercel providers" section with the SCREENSHOT_TARGET_ENVIRONMENT override pointer - Replace 4a/4b's CFN-output spelunking with `bgagent github webhook-info` + `bgagent github set-webhook-secret` (commands shipped in 1c1b618) - Troubleshooting: mention that 401 "Invalid signature" is the set-webhook-secret-mismatch case - Sync registration: register as DEPLOY_PREVIEW_SCREENSHOTS_GUIDE in sync-starlight.mjs route map + the explicit mirror call; added to astro.config.mjs sidebar after the PAK runbook No CDK structural changes — the construct prop, env-var, and code behaviour were already provider-agnostic. Pure surfacing fix. * docs(screenshots): drop redundant Step 3 + condescending hardening preamble Step 3 (repo onboarding + Linear project mapping) duplicated work the Prerequisites section already establishes ('Linear OAuth installed for at least one workspace'). If the user followed the Linear setup guide, both are done. If they didn't, Step 4's smoke test fails fast and the troubleshooting routes them back. Net: 30 lines of doc gone, no information lost. Renumbered Step 4 → 3 and Step 5 → 4 (and the 4a/b/c → 3a/b/c sub-steps). Also dropped the 'demo configuration optimizes for "look, it works" rather than security posture' framing on the production-hardening section. The list of followups stands on its own; the framing reads as condescending toward someone reaching the bottom of the guide. * docs(screenshots): drop 'followup' framing — describe gaps as current state Public docs that say 'followup' read as commitments to do that work. Reframe gaps as current limitations with neutral language: - 'Production hardening (followups)' → 'Production hardening considerations'; bullets describe what to think about, not what ABCA promises to ship - Netlify table row: 'followup to support pattern matching' → '⚠ workable today only by picking one specific PR's environment string; broader pattern matching isn't shipped' - Vercel auth callout: 'tracked as a followup' → 'currently not implemented' - Non-Vercel providers table: drop 'followup #96 covers prefix routing' reference (issue numbers don't belong in user-facing docs) Net: same information, no implicit roadmap commitments. * docs(screenshots): de-Linear-ize — Linear is opt-in, not required The screenshot pipeline only needs GitHub. Linear-side posting was phrased as a hard requirement throughout the guide because the demo flow happens to use Linear, but a non-Linear team gets a perfectly useful integration: screenshots land on GitHub PRs, the Linear lookup silently no-ops. Reframings: - Lead-in: 'on both the open GitHub PR AND the linked Linear issue' → 'on the open GitHub PR. If you also have Linear configured, the same screenshot is posted to the linked Linear issue as a bonus.' Plus a note on the gating (LinearWorkspaceRegistryTable having active rows is what flips the Linear path on). - 'How it works': step 4 (Linear post) marked optional with the silent-skip behaviour spelled out - Architecture comment: 'GitHub PR comment + Linear issue comment' → '... (+ Linear issue comment if linked)' - Prerequisites: Linear OAuth marked optional with rationale - Smoke test: rewritten as PR-driven by default ('open any PR on the configured repo'), with Linear-driven path as a follow-on paragraph ('If you also have Linear configured...') - Troubleshooting: 'Linear is best-effort' → 'opt-in and best- effort', explicit note that skipping is normal without Linear * feat(screenshot): hide URL behind 'preview link' label in comments GitHub PR comment now reads 'From [preview link](url)' and Linear comment reads '[Preview link](url)' instead of pasting the bare URL. Cleaner visual when the same comment is posted on both surfaces. * docs(screenshots): add USER_GUIDE / COST_MODEL / ROADMAP coverage Closes the doc gaps from the screenshot feature followup list: - USER_GUIDE.md: new 'Preview-deploy screenshots (optional)' subsection under Notifications, points at DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md. - COST_MODEL.md: 'Optional: deploy-preview screenshots' table covering AgentCore Browser session, Lambda processor, S3, CloudFront line items (~$0.01 per screenshot, dominated by Browser session time). - ROADMAP.md: marks the feature shipped under Notification plane with a one-line description of the trigger model and post-deploy latency. Mirrors regenerated via docs/scripts/sync-starlight.mjs. * docs(linear): clarify teammate-onboarding handshake The 'Inviting teammates' section was missing the prerequisite that the teammate needs their own ABCA account (Cognito user + configured CLI) before they can redeem a Linear invite code. New flow walks through: Admin: invite-user (Cognito) → invite-user <slug> (Linear) Teammate: configure --from-bundle → login → linear link <code> with cross-references to USER_GUIDE.md's 'Joining an existing deployment' for the Cognito-side details. Also corrects the stale 'auto-links the person running the wizard' claim — setup now offers an inline picker (opt-in by admin), not an automatic mapping. * fix(github-cli): de-Vercel-ize webhook-info / set-webhook-secret strings Last batch of stale 'Vercel' framing in CLI command output, missed in the original de-Vercel-ize sweep. Provider-agnostic now: webhook-info header reads 'preview-deploy screenshot pipeline', the closing note lists Vercel/Amplify/Netlify/GitHub Actions as example providers, and the smoke-test instruction says 'push to a PR-attached branch' rather than 'trigger a Vercel preview deploy'. No behaviour change; pure copy. * fix(github-cli): replace template literal with single quotes (eslint mutation) The local build's eslint --fix step rewrote a no-interpolation template literal to single quotes; CI's 'Fail build on mutation' guard caught that the mutation wasn't committed. Apply the fix. * fix(screenshot): krokoko PR-241 review — scope IAM + cosmetic Vercel mention Closes #94 (the existing 'scope IAM down from bedrock-agentcore:*' followup task). Addresses krokoko's PR #241 review: 1. (BLOCKING per review #1) IAM action wildcard — narrow bedrock-agentcore:* to the three calls the screenshot processor actually makes: - StartBrowserSession (control plane, public CLI command) - StopBrowserSession (control plane, public CLI command) - ConnectBrowserAutomationStream (SigV4-presigned WSS dial; not in the public CLI list but verified live against the deployed dev stack — IAM accepts the action name) Resource wildcard remains because AgentCore Browser sessions are ephemeral with no stable ARN; the IAM5 suppression on the construct already documents that. Previous behaviour granted every AgentCore action surface (memory, runtime, gateway, identity, code-interpreter) the screenshot path doesn't use. Tightening to the call set leaves a precise audit surface; if a future API change needs another action, IAM denies with the action name in CloudTrail and we add it explicitly. 2. (NIT per review #7) Stale 'Vercel' wording on ScreenshotBucketName CfnOutput description, plus an adjacent comment in agent.ts that said 'Vercel-style preview deploys'. Both replaced with provider-agnostic phrasing — the pipeline listens for any provider that posts deployment_status (Vercel, Amplify, Netlify, GitHub Actions custom CD). No behavioural change in either fix. * fix(screenshot): krokoko PR-241 review — WS leak + commit-pulls guard Two non-blocking review nits: - agentcore-browser.ts: failure paths in the WS open-promise (error, unexpected-response, timeout) now call ws.terminate() so a hung handshake doesn't leak the underlying TCP connection per failed attempt. - github-webhook-processor.ts: findPullRequestForSha now guards the GitHub commit-pulls JSON parse and the Array.isArray contract. A transient 2xx HTML body or malformed payload would have crashed the un-DLQ'd processor; treat non-array as no-PR. * docs(screenshot): krokoko PR-241 review — reconcile WAF rationale CloudWatch BlockedRequests metric for TaskApiWebAcl (us-east-1) shows SizeRestrictions_BODY fired 2x on 2026-05-21, matching commit 36e8d14's smoke-test window 1:1. GenericRFI_BODY has never fired on this WebACL — the original commit message + scope-down comment + screenshots guide blamed RFI when the actual blocker was the 8 KB body-size limit (the deployment_status payload carries workflow run history + deploy URLs + deployment metadata). The code is correct as written (only excludes SizeRestrictions_BODY); only the rationale was wrong. Updated the inline comment in task-api.ts and the WAF-exemption section of the screenshots guide; the guide wording also overstated the scope ('excluded from the CommonRuleSet') when only one CRS rule is exempted, so reworded to make clear LFI/RFI/XSS/SQLi still evaluate. Starlight mirror re-synced. * fix(linear): revert DEFAULT_LABEL_FILTER to 'bgagent'; scope PR-241 to screenshot pipeline Per krokoko PR-241 review item #2: the bgagent->abca rename is unrelated to the screenshot pipeline and was bundled into this PR by accident. Issue #285 owns landing the rename in its own PR with all four sites (processor / CLI / setup guide / mapping doc) aligned in one commit. Restores DEFAULT_LABEL_FILTER to 'bgagent' so PR #241 only carries the screenshot pipeline. Test fixture in linear-webhook-processor.test.ts updated from 'lbl-abca'/'abca' to 'lbl-bgagent'/'bgagent' to match the restored default; behavior under test is unchanged. * fix(screenshot): theagenticguy PR-241 review — blockers + nits + tests Closes the principal-architect review on PR #241. B1. Shared wall-clock deadline. github-webhook-processor.ts now threads one TOTAL_BUDGET_MS=110s deadline across findPullRequestForShaWithRetry + captureScreenshot + S3 PUT + comment POST. PR lookup is capped so the screenshot half is guaranteed at least MIN_CAPTURE_BUDGET_MS=15s; if PR lookup eats the budget we fail fast with budget_exhausted instead of starting an AgentCore session that's already doomed. Lambda timeout stays 120s; the 10s headroom covers SDK retries + shutdown grace. Construct comment updated to reflect the actual math. B2. Empty-secret HMAC fail-open closed. getGitHubWebhookSecret now returns null when SecretString is empty or whitespace-only (logs error, drops cache entry). verifyGitHubSignature adds defense-in-depth early-return when secret is empty. New table-driven test covers the exact attacker shape: HMAC('', body) signature against an empty stored secret. B3. Stale public-read comments reworded. github-screenshot-integration.ts topology JSDoc + field doc + inline all said 'Public-read screenshot S3 bucket'; the bucket has been BlockPublicAccess.BLOCK_ALL the whole time. Reworded to 'Private bucket; served via CloudFront OAC.' Non-blocking nits. - SSRF allowlist on environment_url (https-only, no literal-IP / localhost / link-local / loopback) — new shared module src/handlers/shared/screenshot-url.ts. - High-entropy 64-bit suffix on screenshot S3 keys (screenshots/<owner>_<repo>/<sha>-<id>-<suffix>.png) so the public CloudFront URL is unguessable from the public PR. - Single GitHubDeploymentStatusPayload + validateDeploymentStatusPayload hoisted to src/handlers/shared/github-deployment-status.ts; receiver and processor share one validation contract (closes the gap where the receiver admitted payloads missing deployment.sha that the processor would drop). - AbortController on the GitHub commit-pulls fetch (5s per-attempt cap; caller still owns the wall-clock). - Replaced delays.indexOf-based retry log with an indexed loop (broken if delays repeats). - replaceAll('/', '_') for repo slug. - Scoped nextCdpId to runCdpScreenshot (was module-global). - Stale 'Page.frameStoppedLoading' docstring fixed (code waits on Page.loadEventFired). - IAM PutSecretValue intent comment added (Linear refresh-token rotation; not a typo). - Type-narrowed CDP result accessors instead of `as` casts in agentcore-browser.ts. - Promoted warn → error with tagged event_id on screenshot.* paths (pr_lookup_exhausted, capture_failed, s3_put_failed, pr_comment_post_failed) so CloudWatch metric filters / alarms can fire on what was previously invisible. IAM scope rationale (with Service Authorization Reference cite). The reference confirms ConnectBrowserAutomationStream is a real published IAM action under bedrock-agentcore. Scope is narrowed to the three actions the handler calls; resource is '*' because the two Connect*Stream actions declare no resource types or condition keys (must be granted on Resource:'*'), and Browser sessions are ephemeral anyway. Source link added in both the construct comment and the screenshots guide. Doc fixes. - DEPLOY_PREVIEW_SCREENSHOTS_GUIDE.md hardening section: was 'IAM grants bedrock-agentcore:*' (stale); now describes the scope-down with the auth-reference citation, plus SSRF and screenshot-URL enumerability mitigations. - step 3c reference from '4b' → '3b' (the secret is generated in 3b, not 4b which doesn't exist). Tests (4 new files, 58 tests). - github-webhook-verify.test.ts: table-driven verifier including the exact empty-secret attack shape that B2 fixes. - screenshot-url.test.ts: buildScreenshotKey shape + entropy + replaceAll; isAllowedScreenshotUrl with public hosts, schemes, IPv4/IPv6 literals, localhost, RFC1918, IMDS. - linear-issue-lookup.test.ts: extractLinearIdentifier including the back-to-back-call test that pins the g-flag lastIndex reset. - screenshot-bucket.test.ts: synth-time assertion that BlockPublicAccess is all-true so a future 'simplify' can't drop public-access protection. Full handler suite still green: 1387/1387 across 70 suites. * fix(screenshot): preserve env_url skip-path; update tests for budget+key suffix After cherry-picking the theagenticguy review fixes onto linear-vercel: - github-webhook.ts: validateDeploymentStatusPayload was 400-rejecting payloads missing environment_url, but GitHub fires success events without the URL during intermediate states. Restore the explicit 200-skip on env_url BEFORE validate so we don't make GitHub retry. - Test fixture updates for: - captureScreenshot now called with { timeoutMs } (B1 budget) - S3 key now ends with -<16hex>.png (high-entropy suffix nit) * fix(screenshot): encode markdown URL + sync stale key-layout comments (#240) krokoko PR #241 round-3 review: - Finding 1 (security): percent-encode `(`/`)` in the payload-derived `environment_url` before interpolating it into the PR/Linear comment markdown. A clean-hostname URL like `https://preview.vercel.app/x)](https://evil/a.png)` passes `isAllowedScreenshotUrl` yet breaks out of the `](…)` link and injects content into a comment posted under ABCA's token (reachable in fork-PR configs). New `encodeMarkdownUrl` helper + unit tests. - Finding 3 (cosmetic): screenshot-bucket.ts and the deploy guide still described the old `screenshots/<repo>/<sha>.png` layout; updated both to the actual `screenshots/<owner>_<repo>/<sha>-<deploymentId>-<16hex>.png`. Removed the unused `SCREENSHOT_KEY_PREFIX` export. Findings 2 (IPv6 unique-local) and 4 (validateDeploymentStatusPayload coverage) folded into #286 and #275 respectively per the reviewer's routing. * fix(screenshot): reject all IPv6 literals + cover the deployment-status validator (#240) krokoko PR #241 round-3 findings 2 and 4, brought onto #240 because the code they touch (screenshot-url.ts, the shared github-deployment-status validator) lives only on this branch — #275 predates the #240 refactor that extracted them, and #286 is an issue with no code branch. - Finding 2 (security): isAllowedScreenshotUrl enumerated IPv6 ranges (::1, fe80:, ::ffff:) and missed unique-local fc00::/7 (e.g. [fc00::1]) and NAT64. Replace with a blanket reject of any IPv6 literal — a bracketed host or a `:` in the host — since preview URLs are always DNS names. Also documents that the WHATWG parser normalizes integer-form IPv4 (2130706433, 0x7f000001) to dotted-quad, so the existing regex catches those too. New table-driven tests for unique-local, NAT64, IPv4-mapped, and integer IPv4; screenshot-url.ts now at 100% branch coverage (closes the old ::ffff: gap). - Finding 4 (test gap): validateDeploymentStatusPayload had zero coverage. New github-deployment-status.test.ts — happy path, each missing/empty required field, absent nested objects, and the id:0 type-vs-truthiness edge. Validator at 100%. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: bgagent <bgagent@noreply.github.com> Co-authored-by: Alain Krok <alkrok@amazon.com>
1 parent 14820c8 commit d912ad2

35 files changed

Lines changed: 3635 additions & 45 deletions

cdk/package.json

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -17,24 +17,29 @@
1717
"@aws-cdk/aws-bedrock-agentcore-alpha": "2.257.0-alpha.0",
1818
"@aws-cdk/aws-bedrock-alpha": "2.257.0-alpha.0",
1919
"@aws-cdk/mixins-preview": "2.257.0-alpha.0",
20+
"@aws-crypto/sha256-js": "^5.2.0",
2021
"@aws-sdk/client-bedrock-agentcore": "^3.1046.0",
2122
"@aws-sdk/client-bedrock-runtime": "^3.1021.0",
2223
"@aws-sdk/client-dynamodb": "^3.1021.0",
2324
"@aws-sdk/client-ecs": "^3.1021.0",
2425
"@aws-sdk/client-lambda": "^3.1021.0",
2526
"@aws-sdk/client-s3": "^3.1021.0",
2627
"@aws-sdk/client-secrets-manager": "^3.1021.0",
28+
"@aws-sdk/credential-provider-node": "^3.972.29",
2729
"@aws-sdk/lib-dynamodb": "^3.1021.0",
2830
"@aws-sdk/s3-presigned-post": "^3.1021.0",
2931
"@aws-sdk/s3-request-presigner": "^3.1021.0",
3032
"@aws/durable-execution-sdk-js": "^1.1.0",
3133
"@cedar-policy/cedar-wasm": "4.8.2",
34+
"@smithy/protocol-http": "^5.3.12",
35+
"@smithy/signature-v4": "^5.3.14",
3236
"aws-cdk-lib": "^2.257.0",
3337
"cdk-nag": "^2.38.2",
3438
"constructs": "^10.3.0",
35-
"pdf-parse": "^1.1.1",
3639
"js-yaml": "^4.1.1",
37-
"ulid": "^3.0.2"
40+
"pdf-parse": "^1.1.1",
41+
"ulid": "^3.0.2",
42+
"ws": "^8.18.0"
3843
},
3944
"devDependencies": {
4045
"@aws-cdk/integ-runner": "2.199.0",
@@ -46,6 +51,7 @@
4651
"@types/js-yaml": "^4.0.9",
4752
"@types/node": "^20",
4853
"@types/pdf-parse": "^1.1.4",
54+
"@types/ws": "^8.5.13",
4955
"@typescript-eslint/eslint-plugin": "^8",
5056
"@typescript-eslint/parser": "^8",
5157
"aws-cdk": "^2",
Lines changed: 292 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,292 @@
1+
/**
2+
* MIT No Attribution
3+
*
4+
* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
5+
*
6+
* Permission is hereby granted, free of charge, to any person obtaining a copy of
7+
* the Software without restriction, including without limitation the rights to
8+
* use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
9+
* the Software, and to permit persons to whom the Software is furnished to do so.
10+
*
11+
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
12+
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
13+
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
14+
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
15+
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
16+
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
17+
* SOFTWARE.
18+
*/
19+
20+
import * as path from 'path';
21+
import { ArnFormat, Duration, RemovalPolicy, Stack } from 'aws-cdk-lib';
22+
import * as apigw from 'aws-cdk-lib/aws-apigateway';
23+
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
24+
import * as iam from 'aws-cdk-lib/aws-iam';
25+
import { Architecture, Runtime } from 'aws-cdk-lib/aws-lambda';
26+
import * as lambda from 'aws-cdk-lib/aws-lambda-nodejs';
27+
import * as secretsmanager from 'aws-cdk-lib/aws-secretsmanager';
28+
import { NagSuppressions } from 'cdk-nag';
29+
import { Construct } from 'constructs';
30+
import { ScreenshotBucket } from './screenshot-bucket';
31+
32+
/**
33+
* Properties for GitHubScreenshotIntegration construct.
34+
*/
35+
export interface GitHubScreenshotIntegrationProps {
36+
/** The existing REST API to add the GitHub webhook route to. */
37+
readonly api: apigw.RestApi;
38+
39+
/**
40+
* Existing GitHub PAT secret. The processor reuses ABCA's main GitHub
41+
* token to (a) look up which PR a deploy SHA belongs to via the
42+
* Commits API, and (b) post the screenshot comment on that PR.
43+
* No new GitHub credential is provisioned by this construct.
44+
*/
45+
readonly githubTokenSecret: secretsmanager.ISecret;
46+
47+
/**
48+
* Optional — when provided, the processor also tries to post the
49+
* screenshot to a linked Linear issue. Resolved from the GitHub PR
50+
* title/body via a Linear-identifier regex (e.g. `ABCA-42`), then
51+
* looked up across all `status='active'` workspaces in the registry
52+
* via Linear's `issueVcsBranchSearch` GraphQL.
53+
*/
54+
readonly linearWorkspaceRegistryTable?: dynamodb.ITable;
55+
56+
/**
57+
* Removal policy for the dedup table + screenshot bucket. Defaults
58+
* to DESTROY so dev stacks don't accumulate orphans on `cdk destroy`.
59+
*/
60+
readonly removalPolicy?: RemovalPolicy;
61+
62+
/**
63+
* Override for the GitHub deployment `environment` value we
64+
* screenshot. Different providers use different conventions:
65+
* `Preview` (Vercel's per-PR label, the default), branch names
66+
* (Amplify Hosting), `Deploy Preview <PR#>` (Netlify), or whatever
67+
* your GitHub Actions workflow passes. Set this when your provider
68+
* uses a different name and you want per-PR-only screenshots.
69+
* @default 'Preview'
70+
*/
71+
readonly screenshotTargetEnvironment?: string;
72+
}
73+
74+
/**
75+
* CDK construct that adds the GitHub-deployment-status → screenshot →
76+
* PR-comment pipeline.
77+
*
78+
* Topology mirrors `LinearIntegration`:
79+
* - Receiver Lambda (HMAC-verifies, dedups, async-invokes processor)
80+
* - Async processor Lambda (drives AgentCore Browser, uploads PNG,
81+
* posts the PR comment)
82+
* - Dedup DynamoDB table (1h TTL — covers GitHub's 5-attempt retry
83+
* window with slack)
84+
* - Webhook signing-secret (Secrets Manager placeholder; populated
85+
* manually when the operator pastes GitHub's value into the secret)
86+
* - Private screenshot S3 bucket; served anonymously via CloudFront OAC
87+
* - API Gateway route `POST /v1/github/webhook`
88+
*
89+
* Inbound-only adapter — there's no outbound polling or stream
90+
* consumer, just the webhook → screenshot → comment fan-out.
91+
*/
92+
export class GitHubScreenshotIntegration extends Construct {
93+
/** Private bucket; served via CloudFront OAC. Hosts the screenshot PNGs. */
94+
public readonly screenshotBucket: ScreenshotBucket;
95+
96+
/**
97+
* GitHub webhook signing secret — placeholder. The operator pastes
98+
* GitHub's signing-secret value here after configuring the webhook
99+
* in the demo repo's settings; the secret is otherwise empty.
100+
*/
101+
public readonly webhookSecret: secretsmanager.Secret;
102+
103+
/** Webhook dedup table (composite key = `repo#deployment_id#status_id`). */
104+
public readonly webhookDedupTable: dynamodb.Table;
105+
106+
/** Webhook receiver Lambda (HMAC verifier + dispatcher). */
107+
public readonly webhookFn: lambda.NodejsFunction;
108+
109+
/** Async processor Lambda (browser + S3 + PR comment). */
110+
public readonly webhookProcessorFn: lambda.NodejsFunction;
111+
112+
constructor(scope: Construct, id: string, props: GitHubScreenshotIntegrationProps) {
113+
super(scope, id);
114+
115+
const removalPolicy = props.removalPolicy ?? RemovalPolicy.DESTROY;
116+
117+
// --- Screenshot bucket (private; served via CloudFront with OAC) ---
118+
this.screenshotBucket = new ScreenshotBucket(this, 'ScreenshotBucket', {
119+
removalPolicy,
120+
});
121+
122+
// --- Webhook signing secret (operator-populated placeholder) ---
123+
this.webhookSecret = new secretsmanager.Secret(this, 'WebhookSecret', {
124+
description: 'GitHub deployment-status webhook signing secret — populate manually after configuring the GitHub webhook',
125+
removalPolicy,
126+
});
127+
128+
// --- Dedup table ---
129+
this.webhookDedupTable = new dynamodb.Table(this, 'WebhookDedupTable', {
130+
partitionKey: { name: 'dedup_key', type: dynamodb.AttributeType.STRING },
131+
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
132+
timeToLiveAttribute: 'ttl',
133+
pointInTimeRecoverySpecification: { pointInTimeRecoveryEnabled: true },
134+
removalPolicy,
135+
});
136+
137+
const handlersDir = path.join(__dirname, '..', 'handlers');
138+
const commonBundling: lambda.BundlingOptions = {
139+
externalModules: ['@aws-sdk/*'],
140+
};
141+
142+
// --- Async processor (browser + S3 + comment) ---
143+
// Lambda timeout: 120s. The handler enforces a single wall-clock
144+
// deadline of 110s (TOTAL_BUDGET_MS in github-webhook-processor.ts)
145+
// shared across PR-lookup retry + screenshot capture + S3 PUT +
146+
// comment POST, so no individual sub-step can run past the Lambda
147+
// timeout even on the worst-case path. The 10s headroom covers
148+
// SDK auto-retries + the runtime's shutdown grace so a hard timeout
149+
// never severs an in-flight comment-post. (theagenticguy PR-241
150+
// review item B1: previous comment under-counted the 35s retry
151+
// ladder that runs before captureScreenshot's 60s budget.)
152+
this.webhookProcessorFn = new lambda.NodejsFunction(this, 'WebhookProcessorFn', {
153+
entry: path.join(handlersDir, 'github-webhook-processor.ts'),
154+
handler: 'handler',
155+
runtime: Runtime.NODEJS_24_X,
156+
architecture: Architecture.ARM_64,
157+
timeout: Duration.seconds(120),
158+
memorySize: 512,
159+
environment: {
160+
SCREENSHOT_BUCKET_NAME: this.screenshotBucket.bucket.bucketName,
161+
SCREENSHOT_PUBLIC_HOST: this.screenshotBucket.distribution.domainName,
162+
GITHUB_TOKEN_SECRET_ARN: props.githubTokenSecret.secretArn,
163+
...(props.linearWorkspaceRegistryTable && {
164+
LINEAR_WORKSPACE_REGISTRY_TABLE_NAME: props.linearWorkspaceRegistryTable.tableName,
165+
}),
166+
},
167+
bundling: commonBundling,
168+
});
169+
170+
this.screenshotBucket.bucket.grantPut(this.webhookProcessorFn);
171+
props.githubTokenSecret.grantRead(this.webhookProcessorFn);
172+
173+
// Optional Linear feedback path. Wired only when a registry table
174+
// is provided. The processor scans the registry for active
175+
// workspaces, then per-workspace looks up the OAuth token from
176+
// Secrets Manager (`bgagent-linear-oauth-*` prefix, written by
177+
// `bgagent linear setup`).
178+
//
179+
// PutSecretValue is intentional, not a typo: resolveLinearOauthToken
180+
// rotates Linear's refresh token in place when it expires (the same
181+
// pattern as the orchestrator role). This is a write grant by
182+
// design — see linear-oauth-resolver.ts.
183+
if (props.linearWorkspaceRegistryTable) {
184+
props.linearWorkspaceRegistryTable.grantReadData(this.webhookProcessorFn);
185+
this.webhookProcessorFn.addToRolePolicy(new iam.PolicyStatement({
186+
actions: ['secretsmanager:GetSecretValue', 'secretsmanager:PutSecretValue'],
187+
resources: [
188+
Stack.of(this).formatArn({
189+
service: 'secretsmanager',
190+
resource: 'secret',
191+
arnFormat: ArnFormat.COLON_RESOURCE_NAME,
192+
resourceName: 'bgagent-linear-oauth-*',
193+
}),
194+
],
195+
}));
196+
}
197+
198+
// AgentCore Browser session lifecycle + automation-stream connect.
199+
// Action set scoped to the three calls the handler actually makes;
200+
// resource is `*` because Browser sessions are ephemeral and the
201+
// two `Connect*Stream` data-plane actions in the AWS Service
202+
// Authorization Reference for `bedrock-agentcore` declare no
203+
// resource types or condition keys (they require Resource:"*"
204+
// anyway). cdk-nag IAM5 suppression annotates the resource
205+
// wildcard.
206+
//
207+
// Source: AWS Service Authorization Reference for bedrock-agentcore,
208+
// https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrockagentcore.html
209+
// - bedrock-agentcore:StartBrowserSession (Write)
210+
// - bedrock-agentcore:StopBrowserSession (Write)
211+
// - bedrock-agentcore:ConnectBrowserAutomationStream (Read; no resource scoping)
212+
this.webhookProcessorFn.addToRolePolicy(new iam.PolicyStatement({
213+
actions: [
214+
'bedrock-agentcore:StartBrowserSession',
215+
'bedrock-agentcore:StopBrowserSession',
216+
'bedrock-agentcore:ConnectBrowserAutomationStream',
217+
],
218+
resources: ['*'],
219+
}));
220+
221+
// --- Webhook receiver (verify, dedup, dispatch) ---
222+
this.webhookFn = new lambda.NodejsFunction(this, 'WebhookFn', {
223+
entry: path.join(handlersDir, 'github-webhook.ts'),
224+
handler: 'handler',
225+
runtime: Runtime.NODEJS_24_X,
226+
architecture: Architecture.ARM_64,
227+
timeout: Duration.seconds(10),
228+
environment: {
229+
GITHUB_WEBHOOK_SECRET_ARN: this.webhookSecret.secretArn,
230+
GITHUB_WEBHOOK_DEDUP_TABLE_NAME: this.webhookDedupTable.tableName,
231+
GITHUB_WEBHOOK_PROCESSOR_FUNCTION_NAME: this.webhookProcessorFn.functionName,
232+
...(props.screenshotTargetEnvironment && {
233+
SCREENSHOT_TARGET_ENVIRONMENT: props.screenshotTargetEnvironment,
234+
}),
235+
},
236+
bundling: commonBundling,
237+
});
238+
239+
this.webhookSecret.grantRead(this.webhookFn);
240+
this.webhookDedupTable.grantReadWriteData(this.webhookFn);
241+
this.webhookProcessorFn.grantInvoke(this.webhookFn);
242+
243+
// --- API Gateway route ---
244+
const githubResource = props.api.root.addResource('github');
245+
const webhookResource = githubResource.addResource('webhook');
246+
const webhookMethod = webhookResource.addMethod(
247+
'POST',
248+
new apigw.LambdaIntegration(this.webhookFn),
249+
{ authorizationType: apigw.AuthorizationType.NONE },
250+
);
251+
252+
NagSuppressions.addResourceSuppressions(webhookMethod, [
253+
{
254+
id: 'AwsSolutions-APIG4',
255+
reason: 'GitHub webhook endpoint authenticates via X-Hub-Signature-256 HMAC, not Cognito — required by GitHub webhook protocol.',
256+
},
257+
{
258+
id: 'AwsSolutions-COG4',
259+
reason: 'GitHub webhook endpoint authenticates via X-Hub-Signature-256 HMAC, not Cognito — required by GitHub webhook protocol.',
260+
},
261+
]);
262+
263+
NagSuppressions.addResourceSuppressions(this.webhookFn, [
264+
{
265+
id: 'AwsSolutions-IAM4',
266+
reason: 'AWSLambdaBasicExecutionRole is the standard managed policy for Lambda CloudWatch Logs writes.',
267+
},
268+
{
269+
id: 'AwsSolutions-IAM5',
270+
reason: 'DynamoDB grants from CDK helpers expand to table-arn/index/* wildcards; receiver only writes to the dedup table.',
271+
},
272+
], true);
273+
274+
NagSuppressions.addResourceSuppressions(this.webhookProcessorFn, [
275+
{
276+
id: 'AwsSolutions-IAM4',
277+
reason: 'AWSLambdaBasicExecutionRole is the standard managed policy for Lambda CloudWatch Logs writes.',
278+
},
279+
{
280+
id: 'AwsSolutions-IAM5',
281+
reason: 'AgentCore Browser sessions are ephemeral and have no per-resource ARN; the data-plane API requires wildcards. S3 PutObject uses CDK grant helpers that expand to bucket/* wildcards.',
282+
},
283+
], true);
284+
285+
NagSuppressions.addResourceSuppressions(this.webhookSecret, [
286+
{
287+
id: 'AwsSolutions-SMG4',
288+
reason: 'GitHub webhook signing-secret rotation is owned by GitHub (operator regenerates on the GitHub side and pastes the new value here). No automated rotation Lambda needed.',
289+
},
290+
]);
291+
}
292+
}

0 commit comments

Comments
 (0)