Intelligent AWS Site-to-Site VPN Tunnel Investigation with Amazon DevOps Agent by rahulnarya · Pull Request #15 · aws-samples/sample-aws-genai-ops-demos

rahulnarya · 2026-04-21T00:29:30Z

Summary

Adds a new demo under observability/vpn-tunnel-investigation-devops-agent/ that deploys a fully self-contained Site-to-Site VPN environment and lets users inject 10 realistic failure scenarios to watch Amazon DevOps Agent automatically investigate each one.

What's included

CloudFormation template (2 VPCs, VPN, SNS, webhook Lambda)
Libreswan (IPsec) + GoBGP (BGP) on Amazon Linux 2023
10 failure scenarios: 5 IKE, 3 BGP, 1 route withdrawal, 1 throughput
4 CloudWatch alarms (per-tunnel, throughput, route-withdrawn)
MCP server (Lambda + API Gateway) for business context enrichment
Full README with Quick Start, architecture diagram, and troubleshooting

Testing

All 10 scenarios tested against a live environment.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

bquintas · 2026-04-23T18:08:17Z

PR Review Feedback

Thanks for the VPN tunnel investigation demo — great work overall! I've been testing it end-to-end and have some feedback below.

1. Please split this into two PRs

This PR mixes the new VPN demo with changes to the EKS demo (deploy-all.sh, deploy-all.ps1, README.md), the shared scripts (check-prerequisites.sh), and the docs landing page (docs/). These are independent changes with different owners/reviewers.

Could you split this into:

PR A: The VPN tunnel investigation demo (everything under observability/vpn-tunnel-investigation-devops-agent/)
PR B: The EKS deploy script fixes (assumed-role ARN conversion), shared script changes, and docs landing page update

This way the VPN demo and EKS fixes can be reviewed and merged independently.

2. `setup-devops-agent.sh` — Webapp URL not output

In step 4 (Enable Operator App), the script does:

OPERATOR_URL=$(aws devops-agent enable-operator-app ... --query 'operatorApp.url' --output text)

The enable-operator-app API does not return a url field — it only returns the IAM role ARN. So $OPERATOR_URL will always be empty, and the final summary shows a blank "Operator App" line.

The URL follows the pattern https://<agent-space-id>.aidevops.global.app.aws/. The script should construct it from $AGENT_SPACE_ID instead of trying to query it from the API response.
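A minimal sketch of that construction, assuming the URL pattern quoted above holds for every agent space. The function name and the character check on the ID are illustrative and are not taken from the script under review.

```python
import re

# Domain taken from the pattern quoted in the review comment; treat it as an
# observed convention rather than a documented API contract.
OPERATOR_APP_DOMAIN = "aidevops.global.app.aws"


def operator_app_url(agent_space_id: str) -> str:
    """Build the Operator App URL from an agent space ID."""
    space_id = agent_space_id.strip()
    # Assumption: IDs consist of letters, digits, and hyphens. Rejecting
    # anything else makes an empty or malformed ID fail loudly instead of
    # printing a blank "Operator App" line in the summary.
    if not re.fullmatch(r"[A-Za-z0-9-]+", space_id):
        raise ValueError(f"unexpected agent space ID: {agent_space_id!r}")
    return f"https://{space_id}.{OPERATOR_APP_DOMAIN}/"
```

The same string interpolation works in the bash script; the point is only that the value comes from the agent space ID, not from the enable-operator-app response.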

3. Webhook and MCP registration — "Operator App" vs "AWS DevOps Agent console"

The README and setup script both direct users to the Operator App to create webhooks and register MCP servers. However, these configuration tasks are done in the AWS DevOps Agent console (the management console at console.aws.amazon.com/devops-agent/), not the Operator App (which is the investigation web app at *.aidevops.global.app.aws).

Affected locations:

setup-devops-agent.sh step 5: "Open the Operator App URL above" → should say AWS DevOps Agent console
README step 2.5: "Open the Operator App URL in your browser" → same
README step 4c: "Open the Operator App URL (from step 2)" → same

4. README step 4c — Missing `x-api-key` header instruction

When registering the MCP server in the DevOps Agent console, users are asked for a header name in addition to the endpoint URL and API key value. The README doesn't mention this.

The API Gateway is configured with ApiKeyRequired: true, so the header should be:

Header name: x-api-key
Header value: (the API key retrieved via CLI)

Please add this to step 4c.
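To illustrate what the console does with those two fields, here is a hedged sketch that prepares (but does not send) a request carrying the key. The endpoint URL and function name are placeholders, not values from the demo.

```python
import urllib.request


def build_mcp_request(endpoint_url: str, api_key: str) -> urllib.request.Request:
    """Prepare a POST request that carries the API Gateway key."""
    if not api_key:
        raise ValueError("API key is empty; retrieve it via the CLI first")
    request = urllib.request.Request(endpoint_url, method="POST")
    # With ApiKeyRequired: true, API Gateway expects the key in the
    # x-api-key header. HTTP header names are case-insensitive.
    request.add_header("x-api-key", api_key)
    return request
```

A request built without this header would be rejected by API Gateway before it reaches the Lambda, which is why the README needs to name the header explicitly.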

5. Quick Start — wrong clone URL

The README Quick Start section has:

git clone https://github.com/aws-samples/aws-site-to-site-vpn-devops-agent-demo.git

This should point to the actual repo:

git clone https://github.com/aws-samples/sample-aws-genai-ops-demos.git

Let me know if you have questions on any of these. Happy to re-review once updated!

rahulnarya · 2026-04-23T20:16:05Z

Hi bquintas,

Thank you for the review. Here's my plan for each item:

1. Split into two PRs

This PR only contains files under observability/vpn-tunnel-investigation-devops-agent/ (15 files, all new). The EKS/shared/docs commits on the branch are by @bllecoq and were already on main when I forked. That said, I can see how having it under observability/ next to the EKS demo creates confusion.

I'm going to:
Move the demo to networking/vpn-tunnel-investigation-devops-agent/
Rebase onto latest upstream/main

This should make the PR completely independent from the EKS demo.

2. Operator App URL not output:

I confirmed the enable-operator-app API doesn't return a URL. I'll construct it from the agent space ID: https://${AGENT_SPACE_ID}.aidevops.global.app.aws/home

3. "Operator App" vs "AWS DevOps Agent console":

I'll fix all three locations to say "AWS DevOps Agent console" and update the navigation paths to match the actual console layout (Capabilities → Webhooks, Capabilities → MCP Server). The Operator App URL will still be in the script output for users to access investigations later.

4. Missing x-api-key header:

I'll rewrite step 4c to walk through the actual console flow:

Register: name, endpoint URL, API Key auth (header: x-api-key)
Add to agent space: select the 3 tools, save

5. Wrong clone URL:

I'll fix it to point to sample-aws-genai-ops-demos with the correct cd path into the networking/ folder.

Working on these now, will push an update shortly.

…pt, fix README
- Move demo from observability/ to networking/ (independent from EKS demo)
- Rebase onto latest upstream/main
- Fix setup-devops-agent.sh: construct Operator URL from agent space ID, point webhook creation to AWS Console with direct URL
- Fix README: correct clone URL, webhook instructions (console not Operator App), MCP registration with full wizard walkthrough (API Key auth, x-api-key header, tool selection), note about MCP webhook not needed
- Add root README entry for networking/ folder

rahulnarya · 2026-04-23T21:13:31Z

Hi bquintas,

I am ready for re-review. Thank you!

bquintas · 2026-04-24T07:16:34Z

6. CloudFormation template — AZ compatibility issue

The vpn-demo.yaml template doesn't specify an Availability Zone for the subnets. CloudFormation can pick any AZ, and in us-east-1 it chose us-east-1e which doesn't support t3.micro — causing the stack to fail with:

Your requested instance type (t3.micro) is not supported in your requested Availability Zone (us-east-1e).

The fix is to either:

Add an AvailabilityZone property to both subnets (e.g. using !Select [0, !GetAZs ""] to pick the first AZ), or
Add an AvailabilityZone parameter so users can choose

This is a deployment blocker in regions where not all AZs support t3.micro.

bquintas · 2026-04-24T07:31:36Z

Follow-up on #6 (AZ issue): !Select [0, !GetAZs ""] would make it deterministic (always picks the first AZ alphabetically), but it's still not guaranteed to support t3.micro in every region/account.

An alternative would be adding an explicit parameter:

Parameters:
  AvailabilityZone:
    Type: AWS::EC2::AvailabilityZone::Name
    Description: AZ that supports t3.micro

That way the user picks a known-good AZ and CloudFormation validates it exists. Either approach is better than the current random selection — up to you which fits better.

bquintas · 2026-04-24T07:33:12Z

One more thought on the AZ issue: Users can check which AZs support t3.micro in their account with:

aws ec2 describe-instance-type-offerings \
  --location-type availability-zone \
  --filters Name=instance-type,Values=t3.micro \
  --region us-east-1 \
  --query "InstanceTypeOfferings[].Location" --output table

Could be worth adding as a helper script (e.g. scripts/check-az.sh) or baking it into deploy.sh to auto-select a valid AZ before creating the stack. That way users don't hit a random failure on first deploy.
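A rough sketch of the auto-select idea, assuming the input is the JSON output of the command above run with --output json and without the --query filter. The function names are hypothetical, and picking the alphabetically first zone is an arbitrary choice.

```python
import json
from typing import List


def supported_azs(offerings_json: str, instance_type: str = "t3.micro") -> List[str]:
    """Return the AZ names that offer the instance type, sorted by name."""
    response = json.loads(offerings_json)
    return sorted(
        offering["Location"]
        for offering in response.get("InstanceTypeOfferings", [])
        if offering.get("InstanceType") == instance_type
        and offering.get("LocationType") == "availability-zone"
    )


def pick_supported_az(offerings_json: str, instance_type: str = "t3.micro") -> str:
    """Pick one supported AZ, or fail before the stack is created."""
    zones = supported_azs(offerings_json, instance_type)
    if not zones:
        raise RuntimeError(f"no Availability Zone offers {instance_type}")
    return zones[0]
```

The selected name could then be passed to the stack as the AvailabilityZone parameter suggested earlier, so the deploy fails up front rather than midway through instance creation.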

bquintas · 2026-04-24T07:35:29Z

7. Cleanup section — missing S3 bucket deletion

The cleanup instructions don't mention deleting the S3 bucket created for the MCP server Lambda package (my-mcp-bucket-<account-id>). It's left behind after cleanup. Worth adding:

# Delete the MCP server Lambda package bucket
aws s3 rb s3://my-mcp-bucket-${AWS_ACCOUNT_ID} --force --region us-east-1

…Ops Agent
- Moved to observability/ pillar, renamed directory per steering conventions
- Converted to Python CDK with solution adoption tracking
- Added deploy-all.ps1 for cross-platform support
- Added ARCHITECTURE.md with component descriptions and data flow
- All review feedback addressed (GitHub aws-samples#1-7, Slack aws-samples#9-12)

rahulnarya · 2026-04-28T06:36:59Z

Hi @bquintas,

Ready for re-review. Here's what changed since your last review:

GitHub feedback addressed:

Split PR — rebased, only VPN demo files in this PR
Operator URL — constructed from agent space ID
Console vs Operator App — all config references fixed
x-api-key header — full MCP registration walkthrough in README
Clone URL — points to sample-aws-genai-ops-demos
AZ compatibility — baked into CDK with Fn.select(0, Fn.get_azs(""))
S3 cleanup — N/A, CDK handles packaging via Code.from_asset()

Slack feedback addressed:

list output — shows both CGW and laptop-side syntax
Silent rollback — state file tracking + system state verification
dpd-timeout rollback — explicit ipsec restart for instant recovery
On-demand prompts — time-scoped prompts for traffic-selector and bgp-route-withdraw

Major changes:

Converted from raw CloudFormation to Python CDK (2 stacks)
Moved from networking/ to observability/ per steering pillars
Renamed to aws-site-to-site-vpn-tunnel-investigation-devops-agent
Added deploy-all.ps1 and ARCHITECTURE.md
Solution adoption tracking added
Conformity audit
All 10 scenarios tested in eu-west-1 region.

Note on PR scope:

The diff still shows changes to EKS files, shared scripts, and docs. These are from @bllecoq's commits that were on main when I forked and are not part of this demo. Once those are merged into upstream main, I'll rebase and the diff will only show the VPN demo files.

Thank you!

bquintas · 2026-04-28T09:30:01Z

Round 2 Review — Updated PR

Thanks for addressing the first round of feedback. The CDK migration, PowerShell script, ARCHITECTURE.md, and pillar move all look good. Found a few more issues during testing:

8. `setup-devops-agent.sh` — no PowerShell equivalent

The README says Windows users should use deploy-all.ps1, but there's no PowerShell version of setup-devops-agent.sh. Windows users can't complete step 2 (DevOps Agent setup) without bash/WSL.

9. Operator App URL still blank in setup script output

The enable-operator-app API doesn't return a url field, so $OPERATOR_URL is empty and the summary shows a blank line. The URL can be constructed from the agent space ID:

OPERATOR_URL="https://${AGENT_SPACE_ID}.aidevops.global.app.aws/"

10. Trust policy not updated when IAM role already exists

The script uses create-role || get-role — if the role already exists from a previous run in a different region, the trust policy still references the old region's ARN (arn:aws:aidevops:eu-west-1:... instead of us-east-1). This causes AssociateService to fail with "Invalid STS role configuration." The script should always update the trust policy to match the current $REGION.
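One way to detect the stale policy, sketched under the assumption that the region only appears inside arn:aws:aidevops:<region>:... strings as described above. The function name is hypothetical, and the exact shape of the trust policy is not shown in this thread.

```python
import json
import re
from typing import Set

# Matches the region segment of ARNs such as arn:aws:aidevops:eu-west-1:...
AIDEVOPS_ARN_REGION = re.compile(r"arn:aws:aidevops:([a-z0-9-]+):")


def stale_regions(trust_policy: dict, current_region: str) -> Set[str]:
    """Return regions referenced by the policy that differ from current_region."""
    # Serialising the document lets one regex find ARNs wherever they sit
    # (Principal, Condition, ...) without assuming a fixed structure.
    policy_text = json.dumps(trust_policy)
    return {
        region
        for region in AIDEVOPS_ARN_REGION.findall(policy_text)
        if region != current_region
    }
```

A non-empty result means the existing role still points at an earlier region. The simpler option is to skip detection and always rewrite the policy for $REGION (for example with aws iam update-assume-role-policy) after the create-role || get-role step.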

11. `deploy-all.sh` — Python dependency installation fails silently (deployment blocker)

The script does:

pip3 install -q -r ... 2>/dev/null || pip3 install -q ... --break-system-packages 2>/dev/null

On macOS with pyenv/Homebrew Python, both attempts can fail silently, and CDK then crashes with ModuleNotFoundError: No module named 'aws_cdk'. Two issues:

Errors are suppressed — 2>/dev/null hides the actual failure. The user sees a confusing CDK synth error instead of "pip install failed."
--break-system-packages is not the right approach — it bypasses PEP 668 protections and can cause system Python conflicts.

Recommendation: Use a virtual environment. The script should create a venv, install deps there, and set cdk.json to use the venv's Python. This works reliably on macOS, Linux, and Windows, and avoids the --break-system-packages flag entirely. Example:

python3 -m venv "$CDK_DIR/.venv"
source "$CDK_DIR/.venv/bin/activate"
pip install -r "$CDK_DIR/requirements.txt"

And in cdk.json:

{ "app": ".venv/bin/python3 app.py" }

This is a deployment blocker — I couldn't complete step 3 without manual workarounds.
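The same idea expressed with Python's standard venv module, as a hedged sketch rather than the script's actual implementation. The function names are illustrative. Note that the interpreter path differs by platform, so a cdk.json entry of .venv/bin/python3 would only suit macOS and Linux.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(cdk_dir: str, os_name: str = os.name) -> Path:
    """Return the interpreter path inside <cdk_dir>/.venv for the platform."""
    venv_dir = Path(cdk_dir) / ".venv"
    if os_name == "nt":
        # Windows venvs place executables under Scripts\
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python3"


def install_requirements(cdk_dir: str) -> Path:
    """Create the venv if needed and install requirements.txt into it."""
    venv_dir = Path(cdk_dir) / ".venv"
    if not venv_dir.exists():
        venv.create(venv_dir, with_pip=True)
    python = venv_python(cdk_dir)
    requirements = Path(cdk_dir) / "requirements.txt"
    # check=True raises CalledProcessError on failure, and stderr is left
    # visible, so a pip problem is reported as a pip problem.
    subprocess.run(
        [str(python), "-m", "pip", "install", "-r", str(requirements)],
        check=True,
    )
    return python
```

Calling pip through the venv's interpreter avoids relying on shell activation, which behaves differently in bash and PowerShell.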

12. S3 bucket still missing from cleanup

The MCP server deployment creates an S3 bucket (my-mcp-bucket-<account-id>) that isn't mentioned in the cleanup section. (Same as previous round — the README step 4a creates it manually, but cleanup doesn't delete it.)

…MCP cleanup, E2E tested on Linux + Windows
- Created setup-devops-agent.ps1, inject-failure.ps1, cleanup.ps1, verify-cleanup.ps1
- Fixed: Operator URL constructed from agent space ID (aws-samples#9)
- Fixed: Trust policy updated on re-run across regions (aws-samples#10)
- Fixed: pip install uses venv (aws-samples#11)
- Fixed: CDK bootstrap cleanup with y/N prompt (aws-samples#12)
- Fixed: MCP registration field order and x-api-key header in README
- Fixed: PowerShell BOM, file:// paths, JSON quoting, ErrorActionPreference
- Added: verify-cleanup script, MCP deregister steps in cleanup
- E2E tested: Linux (ap-southeast-2), Windows PowerShell (ap-southeast-2)

bquintas · 2026-04-29T20:24:46Z

Round 3 Review — All 10 Scenarios Tested (Bash/macOS)

Successfully deployed and tested all 10 failure scenarios end-to-end on macOS. The venv fix, trust policy update, and Operator URL construction all work correctly now. Great improvements from round 2.

A few more items from testing:

13. SSH security group open to 0.0.0.0/0

Both VPC security groups allow SSH from 0.0.0.0/0. For a demo that's deployed and torn down quickly this is acceptable, but the deploy script should offer options:

Default: Auto-detect deployer's IP via curl -s https://checkip.amazonaws.com and restrict SSH to that IP
Manual: Accept an --ssh-cidr parameter for users behind NAT/VPN
Open: --ssh-open flag with a warning (not recommended)

The README should also note that in production, SSH should be locked to specific IPs and recommend SSM Session Manager as the preferred approach for CGW access.
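The three options could resolve to a single CIDR along these lines. This is a sketch with hypothetical parameter names mirroring the proposed flags; detected_ip stands for the text returned by https://checkip.amazonaws.com.

```python
import ipaddress
from typing import Optional


def resolve_ssh_cidr(
    detected_ip: Optional[str] = None,
    ssh_cidr: Optional[str] = None,
    ssh_open: bool = False,
) -> str:
    """Choose the CIDR to allow on port 22 from the three proposed options."""
    if ssh_open:
        # Explicit opt-in only; the caller should print a warning.
        return "0.0.0.0/0"
    if ssh_cidr:
        # strict=True rejects values with host bits set, e.g. 10.0.0.5/24.
        return str(ipaddress.ip_network(ssh_cidr, strict=True))
    if not detected_ip or not detected_ip.strip():
        raise ValueError("could not detect the deployer IP; pass --ssh-cidr")
    ip = ipaddress.ip_address(detected_ip.strip())
    # /32 for IPv4, /128 for IPv6: a single-host rule.
    return f"{ip}/{ip.max_prefixlen}"
```

Validating with ipaddress means a failed or unexpected lookup response raises an error instead of being passed into the security group rule.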

14. (Nice-to-have) `status` command could include CloudWatch alarm state

The inject/rollback scripts already show alarm state in their verification output, but the standalone status command doesn't. Adding it would help users confirm alarms have returned to OK before injecting the next scenario — especially since the README advises waiting between scenarios.

15. Throughput alarm delay not documented

The throughput alarm uses a 300-second (5-minute) evaluation period, unlike the per-tunnel and route-withdrawn alarms which use 60-second periods. After injecting throughput-degradation, I waited several minutes wondering if something was broken before the alarm fired. The README should mention this — something like "The throughput alarm uses a 5-minute evaluation period — expect up to 5 minutes before it fires, compared to ~1 minute for other scenarios." Alternatively, consider reducing the period to 60 seconds to match the other alarms.

Testing status

✅ All 10 scenarios tested on macOS/Bash — all working
⏳ Windows/PowerShell testing still pending — I have a Windows instance ready and will test deploy-all.ps1, setup-devops-agent.ps1, inject-failure.ps1, and cleanup.ps1 next

bquintas

PR Review: Windows Testing — Critical Issues Found

⚠️ These scripts were not tested on Windows

I deployed and tested this demo end-to-end on a Windows Server EC2 instance (PowerShell 5.1, Python 3.13, Node.js, AWS CLI v2). Every .ps1 script failed on first run. The bugs found are all Windows-specific issues that are invisible when testing with pwsh on macOS/Linux — which is almost certainly how these scripts were developed.

The repo steering rules are explicit: "Must work on Windows (cmd/PowerShell primary shell)" and "Test on Windows before considering complete." PowerShell scripts exist in this repo specifically for Windows users. Testing them on macOS with pwsh does not validate Windows compatibility. Please re-test all .ps1 scripts on an actual Windows machine before resubmitting.

Bug 1: `deploy-all.ps1` uses `python3` — command does not exist on Windows

File: deploy-all.ps1, lines 51 and 65

Impact: Script fails immediately at Step 2 — CDK deployment never starts.

Problem: The script checks for python3 first (line 51) with a fallback to python, but then unconditionally calls python3 -m venv (line 65). On Windows, the standard Python installer registers the command as python, not python3. The python3 command is a Linux/macOS convention. The fallback check is dead code.

This is inconsistent with every other .ps1 script in the repo:

shared/scripts/check-prerequisites.ps1 → python --version
shared/scripts/deploy-cdk.ps1 → pip install (no python3)
operations-automation/anycompany-it-demo-portal/deploy-all.ps1 → python scripts/seed-data.py

Fix: Lines 51-56, replace:

if (-not (Get-Command python3 -ErrorAction SilentlyContinue)) {
    if (-not (Get-Command python -ErrorAction SilentlyContinue)) {
        Write-Host "ERROR: Python 3 is required for CDK. Install from https://python.org" -ForegroundColor Red
        exit 1
    }
}

with:

if (-not (Get-Command python -ErrorAction SilentlyContinue)) {
    Write-Host "ERROR: Python 3 is required for CDK. Install from https://python.org" -ForegroundColor Red
    exit 1
}

Line 65, replace python3 -m venv with python -m venv.
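A more defensive variant, sketched here as an alternative to hard-coding either name: resolve whichever command actually runs and reuse that path for the venv step. The function names are illustrative, and the check assumes that a usable interpreter exits with status 0 for --version.

```python
import shutil
import subprocess
from typing import Callable, Optional, Sequence


def _runs(path: str) -> bool:
    """True if the executable at path answers --version successfully."""
    try:
        result = subprocess.run([path, "--version"], capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def find_python(
    candidates: Sequence[str] = ("python", "python3"),
    which: Callable[[str], Optional[str]] = shutil.which,
    runs: Callable[[str], bool] = _runs,
) -> Optional[str]:
    """Return the first candidate that is on PATH and actually executes."""
    for name in candidates:
        path = which(name)
        # Being on PATH is not enough: a placeholder entry can exist without
        # a working interpreter behind it, so the command is also executed.
        if path and runs(path):
            return path
    return None
```

Whatever path this returns would be used for both the prerequisite check and the -m venv call, so the two can no longer disagree.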

Bug 2: CRLF line endings break all remote bash execution (Steps 7 and 8)

File: deploy-all.ps1, lines 216 and 274

Impact: Libreswan and GoBGP configuration fails completely — IPsec tunnels do not establish, BGP does not start. Every command on the CGW fails with $'\r': command not found and Invalid unit name "ipsec" escaped as "ipsec\x0d".

Problem: PowerShell on Windows produces CRLF (\r\n) in here-strings. When piped to the CGW via SSH ($script | ssh ... "sudo bash -s"), bash receives \r at the end of every line and treats it as part of the command. This breaks systemctl, sysctl, chmod, sleep, ip, and every other command in the script.

This is invisible on macOS because PowerShell on macOS uses LF natively.

Fix: Strip \r before piping. However, note that -replace "`r", "" cleans the string in memory, but PowerShell may re-encode with CRLF when piping to external processes. The fix needs to be validated on Windows. Options include:

Strip on the receiving end: $script | ssh ... "sed 's/\r$//' | sudo bash -s"
Write to a temp file with explicit LF encoding and pipe the file
Use [System.IO.File]::WriteAllText() to a temp file, then Get-Content -Raw | ssh ...

The contributor must test whichever approach they choose on an actual Windows machine.
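For comparison, this is how the normalisation looks when the bytes sent to SSH are controlled explicitly. It is a sketch only: the PowerShell scripts would need an equivalent, and run_remote with its host argument is a hypothetical helper.

```python
import subprocess


def to_lf_bytes(script: str) -> bytes:
    """Normalise CRLF and bare CR line endings to LF and encode as UTF-8."""
    normalized = script.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.encode("utf-8")


def run_remote(host: str, script: str) -> None:
    """Pipe the script to a remote bash over ssh (hypothetical helper)."""
    # Passing bytes (not text) means no newline translation happens between
    # this process and ssh stdin, so bash never sees a trailing \r.
    subprocess.run(
        ["ssh", host, "sudo bash -s"],
        input=to_lf_bytes(script),
        check=True,
    )
```

The key property is that the line endings are fixed at the last step before the pipe, which is the same reason the receiving-end sed option is attractive.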

Bug 3: Step 5 silently continues when UserData fails

Impact: Misleading — the script prints "Packages installing... (30/30)" and continues to Steps 7-8 even when the CGW has no packages installed. This led to GoBGP being missing (/usr/local/bin/gobgpd: No such file or directory) with no clear error message.

Problem: The UserData script hit a transient yum cache error ([Errno 2] No such file or directory for a cached RPM). The USERDATA_COMPLETE marker was never written. The Step 5 loop timed out after 5 minutes and silently fell through. Steps 7-8 then tried to configure software that was never installed.

Suggested fixes:

Step 5 should fail with a clear error if the loop exhausts all 30 iterations without finding USERDATA_COMPLETE, not silently continue
The UserData script in the CDK stack should add yum clean packages before yum install for robustness, or retry on failure
Step 8 should verify gobgpd exists before trying to configure it
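The first suggestion amounts to a bounded poll that raises when it runs out of attempts. A sketch, with marker_present standing in for whatever check the script performs over SSH; the 10-second delay is inferred from 30 iterations over 5 minutes and is an assumption.

```python
import time
from typing import Callable


def wait_for_marker(
    marker_present: Callable[[], bool],
    attempts: int = 30,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the marker appears; return the attempt number that saw it."""
    for attempt in range(1, attempts + 1):
        if marker_present():
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)
    # Falling out of the loop is an error, not a reason to continue.
    raise TimeoutError(
        f"USERDATA_COMPLETE not found after {attempts} attempts; "
        "check the UserData log on the CGW before configuring IPsec/BGP"
    )
```

With this shape, Steps 7 and 8 are never reached when package installation did not finish, and the user sees the real cause.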

Bug 4 (cosmetic but misleading): `gobgp neighbor\r` in Step 8 output

Impact: Even after applying CRLF fixes, the gobgp neighbor status check at the end of Step 8 displays Error: unknown command "neighbor\r". This led us to spend hours debugging what appeared to be a broken BGP configuration, when in fact BGP was working correctly.

Root cause: PowerShell pipe to SSH stdin may re-encode with CRLF regardless of in-memory string cleaning. The actual GoBGP configuration and service are fine — this is only the final status check command that displays incorrectly.

Fix: Same as Bug 2 — the CRLF-to-SSH piping approach needs a solution that is validated on Windows.

ASH Security Scan Results

Ran ASH v3.4.0 in local mode against the demo:

| Scanner | Result | Findings |
| --- | --- | --- |
| bandit | ✅ PASSED | 0 |
| cdk-nag | ✅ PASSED | 0 |
| checkov | ✅ PASSED | 0 |
| semgrep | ✅ PASSED | 0 |
| npm-audit | ✅ PASSED | 0 |
| detect-secrets | ⚠️ 1 false positive | `api_key_name="vpn-devops-mcp-api-key"` in `mcp_server_stack.py:53` (parameter name flagged, not an actual secret) |

No real security issues.

Summary

The demo architecture and bash scripts work well. The CDK infrastructure deploys correctly, the MCP server integration is solid, and the failure scenarios are well-designed. However, the PowerShell scripts were clearly not tested on Windows — they appear to have been generated from the bash versions (likely with AI assistance) and only validated on macOS with pwsh. All four bugs are Windows-specific and would have been caught in the first run on a Windows machine.

Required before merge: Re-test all .ps1 scripts (deploy-all.ps1, setup-devops-agent.ps1, inject-failure.ps1, cleanup.ps1) on an actual Windows machine end-to-end.

Note: I will be out of office and will not be available for follow-up reviews. Once the Windows fixes are applied and tested, please request another reviewer to validate and approve.

… throughput delay note
- Bug 1: python3 → python in deploy-all.ps1 (Windows convention)
- Bug 2+4: sed CRLF strip on receiving end for SSH pipes in deploy-all.ps1
- Bug 3: Step 5 fails on timeout, yum clean packages in UserData, verify gobgpd before Step 8
- aws-samples#13: SSH CIDR auto-detect (default), --ssh-cidr, --ssh-open flags
- aws-samples#14: CloudWatch alarm state in inject-failure status command
- aws-samples#15: Throughput alarm 5-6 min delay documented in README

- cleanup.sh/ps1: handle versioned CDK bootstrap bucket (delete all object versions and delete markers before bucket removal)
- README: Node.js 18+ → 20+ (CDK CLI requirement)
- README: add git as prerequisite
- README: add 'verify alarms returned to OK' step between rollback and next scenario
- cgw-scripts/list: group bgp-route-withdraw with throughput-degradation under 'Dedicated-Alarm Scenarios' to match README
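For readers unfamiliar with why a versioned bucket needs this extra step, here is a sketch of the batching involved. It assumes the input is the parsed JSON from aws s3api list-object-versions and produces payloads in the shape accepted by aws s3api delete-objects --delete. The function name is hypothetical and this is not the code from cleanup.sh.

```python
from typing import Iterator

# S3 DeleteObjects accepts at most 1000 keys per request.
MAX_KEYS_PER_DELETE = 1000


def delete_batches(listing: dict) -> Iterator[dict]:
    """Yield delete payloads covering every version and delete marker."""
    entries = [
        {"Key": item["Key"], "VersionId": item["VersionId"]}
        # Both sections must be emptied before the bucket can be removed.
        # Either may be absent or null when there is nothing to list.
        for section in ("Versions", "DeleteMarkers")
        for item in listing.get(section) or []
    ]
    for start in range(0, len(entries), MAX_KEYS_PER_DELETE):
        yield {
            "Objects": entries[start:start + MAX_KEYS_PER_DELETE],
            "Quiet": True,
        }
```

A listing can be truncated for large buckets, so a real cleanup would repeat the list-and-delete cycle until the listing comes back empty.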

… usage
- Cleanup Step 2: make list-associations an explicit numbered step instead of a comment (both Bash and PowerShell sections)
- Verify cleanup: use $(aws configure get region) instead of <region> placeholder, matching cleanup.sh example

- deploy-all.ps1: strip CRLF from cgw-scripts after scp (shebang fix)
- deploy-all.ps1: skip venv creation if .venv already exists
- cleanup.ps1: fix versioned bucket deletion (raw JSON, PS 5.1 compat)
- cleanup.sh/ps1: fix 'Skipped' hint to show correct versioned bucket steps
- inject-failure.ps1: update list grouping to match Dedicated-Alarm Scenarios
- inject-failure.sh/ps1: add 5-minute throughput alarm note on inject
- README: add PowerShell instructions for Step 3a and 3b (MCP deploy)
- README: fix step numbering in cleanup (was duplicate aws-samples#3 and aws-samples#5)
- README: fix troubleshooting Lambda name (CDK auto-generates)

- README: Bold PowerShell labels with (Windows) indicator for consistency
- README: Replace redundant API key command with reference to step 3b
- cleanup.sh: Print runnable bash commands for manual CDK bootstrap deletion
- cleanup.ps1: Print runnable PowerShell commands for manual CDK bootstrap deletion
- Both cleanup scripts: Add --no-cli-pager to all printed commands

Tested E2E on AL2023 (ap-southeast-2): deploy, psk-mismatch inject/rollback, cleanup, verify-cleanup all passed.

- deploy-all.ps1: sudo sed redirect runs as ec2-user (permission denied on root-owned /opt/vpn-demo/). Use sudo tee to write as root.
- README: PowerShell 7+ → 5.1+ (tested on Windows Server 2025, PS 5.1 works for all scripts)

Tester finding: Steps 1, 3, 4 in 'Run the Demo' only showed bash commands. Added PowerShell equivalents for Windows users, matching the pattern used in the Quick Start section.

rahulnarya · 2026-05-05T18:58:55Z

Hi Team,

The VPN demo is ready for review.

Clarification: Earlier, I ran the PowerShell scripts on my Windows laptop. That machine had the Microsoft Python stub installed, which is why my Windows test worked locally.

However, I have now changed how I test the demo:

Testing method: I spun up EC2 instances (Linux and Windows Server 2025) in my AWS Account and ran the demo end-to-end following the TESTER-ONBOARDING-GUIDE — fresh machines, no prior setup.

I look forward to the review feedback and will address it accordingly.

Please let me know if you have any questions. Thank you!

bquintas self-assigned this Apr 23, 2026

Rahul Arya added 2 commits April 23, 2026 20:14

Add: Intelligent Site-to-Site VPN Tunnel Investigation with Amazon De…

9ca7647

…vOps Agent

Fix: remove duplicate Contributing/Security/License sections

58311d1

rahulnarya force-pushed the add-vpn-tunnel-investigation-demo branch from 9c3e8a2 to 759815a Compare April 23, 2026 20:32

rahulnarya changed the title ~~Intelligent Site-to-Site VPN Tunnel Investigation with Amazon DevOps Agent~~ Intelligent AWS Site-to-Site VPN Tunnel Investigation with Amazon DevOps Agent Apr 23, 2026

Rename folder, add AWS to title, fix webhook button text, add support…

a5e0f5d

…ed regions

rahulnarya force-pushed the add-vpn-tunnel-investigation-demo branch 5 times, most recently from 6b27bde to ba53d65 Compare April 29, 2026 04:11

Fix 4 tester findings: webhook required, MCP before VPN, PS cleanup, …

9aece60

…list-agent-spaces

rahulnarya force-pushed the add-vpn-tunnel-investigation-demo branch from ba53d65 to 9aece60 Compare April 29, 2026 04:13

bquintas requested changes Apr 30, 2026

View reviewed changes

bquintas removed their assignment Apr 30, 2026

Rahul Arya added 2 commits April 30, 2026 19:41

rahulnarya force-pushed the add-vpn-tunnel-investigation-demo branch from b2086bf to 39fe4c8 Compare May 4, 2026 16:48

Rahul Arya added 4 commits May 4, 2026 21:09

Add PowerShell examples to Run the Demo section

05cf8f1

Tester finding: Steps 1, 3, 4 in 'Run the Demo' only showed bash commands. Added PowerShell equivalents for Windows users, matching the pattern used in the Quick Start section.

Add PowerShell commands and fix key name consistency in README

66cba4f

rahulnarya force-pushed the add-vpn-tunnel-investigation-demo branch from cc1c3dd to 66cba4f Compare May 8, 2026 23:46

benlec assigned bquintas May 12, 2026

rahulnarya wants to merge 15 commits into aws-samples:main from rahulnarya:add-vpn-tunnel-investigation-demo
