Fix gVisor workflow: Add proper health checks for Squid and Envoy #5237

Merged
lpcox merged 24 commits into main from fix/gvisor-workflow-healthchecks on Jun 23, 2026

Conversation

@lpcox

@lpcox lpcox commented Jun 18, 2026

Collaborator

Problem

The gVisor firewall comparison workflow failed with spurious connection errors because Squid and Envoy containers weren't fully started before agent containers tried to use them.

Failed run: https://github.com/github/gh-aw-firewall/actions/runs/27737950523

Symptoms:

=== Test 1: Allowed domain via forward proxy (github.com) ===
❌ FAIL: github.com blocked

Even though HTTPS_PROXY was set correctly, curl couldn't connect because Squid wasn't listening yet.

Solution

Replace sleep 3 with proper health check loops:

  • Squid tests: Wait up to 30s for port 3128 to accept proxy requests
  • Envoy tests: Wait up to 30s for admin /ready endpoint to return 200

Each health check:

  • ✅ Uses lightweight curlimages/curl container for network checks
  • ✅ Retries for up to 30 seconds with 1-second intervals
  • ✅ Shows container logs on failure for debugging
  • ✅ Exits with error code if proxy fails to start
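
The retry behaviour described above can be sketched as a small shell helper. This is a minimal sketch, not the workflow's actual code: `wait_until`, the `probe_squid` probe, and the `squid-proxy` container name are hypothetical names chosen for illustration.

```shell
# wait_until ATTEMPTS INTERVAL CMD...
# Runs CMD until it succeeds, at most ATTEMPTS times, sleeping INTERVAL
# seconds between failed attempts. Returns 0 on success, 1 on timeout.
wait_until() {
  attempts=$1
  interval=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # No need to sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  return 1
}

# Hypothetical use inside a workflow step (names are assumptions):
#   if ! wait_until 30 1 probe_squid; then
#     docker logs squid-proxy   # show container logs for debugging
#     exit 1
#   fi
```

Keeping the probe as a separate command means the same loop can serve both the Squid port check and the Envoy /ready check.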

Changes

  • test-squid-runc: Added Squid health check (proxy port)
  • test-squid-gvisor: Added Squid health check (proxy port)
  • test-envoy-iptables-runc: Added Envoy health check (admin /ready)
  • test-envoy-gvisor: Added Envoy health check (admin /ready)
  • performance-comparison: Added health checks for both benchmarks

Testing

This PR branch will trigger the workflow to verify the fixes work.

lpcox and others added 2 commits June 17, 2026 21:42
- Compare Squid vs Envoy proxy approaches
- Test under both runc and gVisor runtimes
- Verify iptables DNAT/redirect compatibility with gVisor
- Benchmark performance (latency comparison)
- Generate summary report with recommendations

Tests answer key questions:
1. Does gVisor support iptables DNAT for traffic redirection?
2. Which proxy approach works better with gVisor?
3. Can AWF keep its current Squid architecture, or does it need Envoy?

Related to issue #3264 (gVisor compatibility investigation)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes all 19 review comments:

1. Remove ineffective setup job (ran on different runner)
2. Pin actions/checkout by SHA for supply-chain hardening
3. Add set -euo pipefail and EXIT traps to all test steps
4. Make test assertions fail with exit 1 instead of just logging
5. Add DNAT fallback tests with proxy env disabled
6. Fix benchmark outputs to write to $GITHUB_OUTPUT
7. Fix benchmark to use explicit proxy (-x flag)
8. Fix Envoy gVisor config to match runc (add dynamic_forward_proxy)
9. Clarify HTTPS expectations for Envoy (known limitation)
10. Add job outputs for performance comparison
11. Add header note explaining defense-in-depth test approach

Key changes:
- All test jobs now properly propagate failures
- DNAT verification tests actually check enforcement (not just rule acceptance)
- Performance benchmarks capture and output latency correctly
- Cleanup happens reliably via EXIT traps
- Envoy configs consistent between runc and gVisor tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
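
As a rough illustration of the strict-mode and EXIT-trap pattern mentioned in the commit above, here is a minimal sketch. The helper name `run_with_cleanup` is hypothetical, and the real workflow steps may structure this differently.

```shell
# run_with_cleanup CLEANUP_CMD CMD...
# Runs CMD in a subshell with strict mode on, and always runs CLEANUP_CMD
# when the subshell exits, whether CMD succeeded or failed.
run_with_cleanup() {
  (
    set -eu
    # pipefail is not available in every sh; enable it only where supported.
    if (set -o pipefail) 2>/dev/null; then set -o pipefail; fi
    cleanup_cmd=$1
    shift
    # The trap fires on any exit path, so cleanup cannot be skipped.
    trap "$cleanup_cmd" EXIT
    "$@"
  )
}

# Hypothetical use (container name is an assumption):
#   run_with_cleanup "docker rm -f squid-proxy" run_squid_tests
```

The subshell's exit status is that of the test command, so a failing assertion still fails the step after cleanup has run.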
Copilot AI review requested due to automatic review settings June 18, 2026 05:09
@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

Workflow triggered on PR branch: https://github.com/github/gh-aw-firewall/actions/runs/27738138136

This run will test the health check fixes. Expected improvements:

  1. Squid tests should wait for proxy port to be ready (no more premature connection attempts)
  2. Envoy tests should wait for admin /ready endpoint (proper startup validation)
  3. All tests will show container logs if startup fails (better debugging)

The workflow will provide empirical results for the gVisor compatibility question: Does gVisor's userspace network stack support iptables DNAT in a way compatible with AWF's architecture?

@github-actions

github-actions Bot commented Jun 18, 2026

Contributor

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 98.01% | 97.88% | 📉 -0.13% |
| Statements | 97.95% | 97.82% | 📉 -0.13% |
| Functions | 99.51% | 99.51% | ➡️ +0.00% |
| Branches | 93.68% | 93.55% | 📉 -0.13% |

📁 Per-file Coverage Changes (3 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/commands/validators/config-assembly.ts | 98.1% → 90.4% (-7.66%) | 98.1% → 90.4% (-7.66%) |
| src/compose-generator.ts | 98.4% → 98.6% (+0.21%) | 98.4% → 98.6% (+0.21%) |
| src/workdir-setup.ts | 92.7% → 94.5% (+1.82%) | 92.7% → 94.5% (+1.82%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot AI left a comment

Contributor


Pull request overview

Adds a new GitHub Actions workflow to exercise and compare two proxy/firewall approaches (Squid forward proxy + iptables DNAT vs Envoy transparent proxy + iptables redirect) under both standard runc and gVisor (runsc) runtimes, plus a simple latency benchmark and summary report.

Changes:

  • Introduces end-to-end test jobs for Squid (runc + gVisor) and Envoy (runc + gVisor) using Docker networks and in-container iptables rules.
  • Adds a performance comparison job that runs 100 HTTP requests through each proxy and exports average latency as job outputs.
  • Adds a summary job that prints consolidated results and recommendations.
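
For the latency benchmark, averaging per-request timings and exporting the result as a step output might look roughly like this. It is a sketch under assumptions: the function names and the `squid_avg_ms` output key are hypothetical, and timings are assumed to be seconds, one per line, as printed by curl's `%{time_total}` write-out variable.

```shell
# average_ms: reads one latency in seconds per line on stdin and prints
# the mean in milliseconds with one decimal place ("n/a" if there is no input).
average_ms() {
  awk '{ sum += $1; n++ }
       END { if (n > 0) printf "%.1f\n", sum / n * 1000; else print "n/a" }'
}

# write_output NAME VALUE: append NAME=VALUE to the file GitHub Actions
# exposes through $GITHUB_OUTPUT, making it available as a step output.
write_output() {
  printf '%s=%s\n' "$1" "$2" >> "${GITHUB_OUTPUT:?GITHUB_OUTPUT is not set}"
}

# Hypothetical use (the proxy address comes from this thread, the rest is assumed):
#   for n in $(seq 1 100); do
#     curl -s -o /dev/null -x http://172.30.0.10:3128 -w '%{time_total}\n' http://example.com
#   done | average_ms
```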
Summary per file:

| File | Description |
| --- | --- |
| .github/workflows/test-gvisor-firewall-comparison.yml | New workflow implementing Squid/Envoy comparison tests and a latency benchmark across runc and gVisor. |

Copilot's findings

  • Files reviewed: 1/1 changed files
  • Comments generated: 7

Comment on lines +60 to +62
ubuntu/squid:latest

sleep 3
Comment on lines +183 to +185
ubuntu/squid:latest

sleep 3
Comment on lines +319 to +321
-c /etc/envoy/envoy.yaml

sleep 3
Comment on lines +444 to +446
envoyproxy/envoy:v1.28-latest -c /etc/envoy/envoy.yaml

sleep 3
Comment on lines +508 to +510
ubuntu/squid:latest

sleep 3
Comment on lines +589 to +591
envoyproxy/envoy:v1.28-latest -c /etc/envoy/envoy.yaml

sleep 3

echo ""
echo "=== Testing HTTPS through Envoy (expected to fail) ==="
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --max-time 5 https://github.com 2>&1 || echo "000")
Replaces 'sleep 3' with proper health check loops that wait up to 30 seconds
for proxies to be ready before running tests.

Root cause: Squid/Envoy containers were not fully initialized before agent
containers tried to connect, causing spurious test failures.

Changes:
- Squid runc test: Wait for proxy port 3128 to respond
- Squid gVisor test: Wait for proxy port 3128 to respond
- Envoy runc test: Wait for admin /ready endpoint
- Envoy gVisor test: Wait for admin /ready endpoint
- Squid perf test: Wait for proxy port 3128 to respond
- Envoy perf test: Wait for admin /ready endpoint

Each health check:
- Retries for up to 30 seconds
- Uses lightweight curl container for network checks
- Shows container logs on failure
- Exits with error if proxy fails to start

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

The previous run used the wrong commit (the branch head from before the health check fixes were added).

New workflow run with health check fixes: https://github.com/github/gh-aw-firewall/actions/runs/27738319710

Commit 0454937 now includes proper Squid and Envoy health checks with 30-second timeout.

Squid v6.13 rejects configs with both '.github.com' and 'github.com' in the same ACL:
  ERROR: '.github.com' is a subdomain of 'github.com'
  FATAL: Bungled /etc/squid/squid.conf

Solution: Use only '.github.com' which matches both github.com and all subdomains.

This fixes the 30-second timeout where Squid failed to start due to config error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

Fixed Squid configuration error that caused startup failures.

Root cause: Squid v6.13 rejects ACLs with both '.github.com' and 'github.com':

ERROR: '.github.com' is a subdomain of 'github.com'
FATAL: Bungled /etc/squid/squid.conf

Solution: Use only '.github.com' which matches both github.com and all subdomains.

New workflow run: https://github.com/github/gh-aw-firewall/actions/runs/27763948134

This should fix the 30-second timeouts where Squid failed to start.
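
To keep an allowlist from reintroducing this conflict, the domain list could be normalized before it is written into the Squid ACL. The helper below is a hypothetical sketch, not part of the workflow; it only collapses the exact `github.com` / `.github.com` pair and does not detect deeper overlaps such as `api.github.com` alongside `.github.com`.

```shell
# normalize_dstdomain: reads domains (one per line) on stdin and emits each
# one once in leading-dot form, so "github.com" and ".github.com" both
# become ".github.com". Blank lines and surrounding whitespace are dropped.
normalize_dstdomain() {
  sed -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e '/^$/d' \
      -e 's/^\.*/./' | sort -u
}

# Hypothetical use when generating squid.conf (ACL name is an assumption):
#   echo "acl allowed_domains dstdomain $(normalize_dstdomain < domains.txt | tr '\n' ' ')"
```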

Problem: Health checks were timing out after 30 seconds because they tried
to test the full proxy path (curl -> Squid -> example.com), which is slow
and unreliable due to:
- DNS resolution delays
- External dependency (example.com)
- Network latency
- Curl container startup overhead

Each attempt could take 5-10 seconds, eating up the 30-second budget.

Solution: Replace with simple TCP port check using busybox that just verifies
Squid port 3128 is listening. This is:
- Fast (< 1 second per attempt)
- Reliable (no external dependencies)
- Accurate (tests exactly what we need: is Squid accepting connections)

Changed health checks for:
- Squid runc test
- Squid gVisor test
- Squid performance test

Envoy tests already use /ready endpoint which is fast and reliable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

Failure Analysis Summary

Root Cause: Health checks were using a complex end-to-end proxy test that was too slow and unreliable.

Previous Health Check (Flawed):

docker run curlimages/curl curl -x http://172.30.0.10:3128 http://example.com

Problems:

  1. External dependency on example.com (DNS, latency, availability)
  2. Each attempt took 5-10 seconds
  3. 30 attempts × 5-10 seconds each = 150-300 seconds, far beyond the intended 30-second budget
  4. Complex failure modes (DNS, proxy, target site)

New Health Check (Simple):

docker run busybox timeout 2 sh -c 'cat < /dev/null > /dev/tcp/172.30.0.10/3128'

Benefits:

  • Fast: <1 second per attempt
  • Reliable: No external dependencies
  • Accurate: Tests exactly what we need (port 3128 listening)

New Run:

https://github.com/github/gh-aw-firewall/actions/runs/27764269059

SECURITY RESEARCHER PERSPECTIVE: Test must verify gVisor+Envoy replicates
ALL security guarantees of runc+Squid. Any gaps = security vulnerability.

New tests added (10 security scenarios):
1. ✅ Allowed domain (github.com) - baseline functionality
2. ✅ Blocked domain (google.com) - core firewall feature
3. ✅ DNAT fallback - defense-in-depth when proxy env ignored
4. ✅ Port blocking (SSH 22) - prevent lateral movement
5. 🔒 IP address bypass - prevent ACL bypass via IPs (e.g., curl 8.8.8.8)
6. 🔒 Subdomain verification - verify .github.com includes api.github.com
7. 🔒 Similar domain blocked - githubstatus.com should not work
8. 🔒 Dangerous ports - comprehensive blocklist (SSH, DB, Redis, etc.)
9. 🔒 Local network isolation - RFC1918, container gateway blocked
10. 🔒 Protocol bypass - ICMP/UDP blocked

Enhanced iptables rules to match AWF production security model:
- DNS restricted to approved resolvers only (8.8.8.8, 8.8.4.4)
- All RFC1918 ranges (10/8, 172.16/12, 192.168/16) and link-local 169.254/16 blocked
- Comprehensive dangerous port blocklist (22,23,25,3306,5432,6379,27017,445,1433)
- ICMP completely blocked (no ping)
- UDP blocked except DNS
- Localhost allowed (for MCP stdio servers)

Applied to both:
- Squid + runc (baseline - current AWF)
- Squid + gVisor (validate gVisor doesn't break security)

TODO: Apply same tests to Envoy variants in follow-up.

Success criteria: All 4 configurations (Squid+runc, Squid+gVisor,
Envoy+runc, Envoy+gVisor) must produce IDENTICAL security results.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

🔒 Comprehensive Security Tests Added

Expanded the test to validate ALL security guarantees from a security researcher perspective. The goal: prove gVisor+Envoy can replicate the exact security behavior of runc+Squid.

New Security Tests (10 scenarios)

Core Functionality

  1. Allowed domain (github.com) - baseline
  2. Blocked domain (google.com) - firewall works
  3. DNAT fallback - defense-in-depth

Attack Prevention

  4. Port blocking - SSH blocked
  5. 🔒 IP address bypass - Direct IP access blocked (curl 8.8.8.8)
  6. 🔒 Subdomain verification - .github.com includes api.github.com
  7. 🔒 Similar domain blocked - githubstatus.com fails
  8. 🔒 Dangerous ports - Comprehensive blocklist (SSH, DBs, Redis, MongoDB, SMB, MSSQL)
  9. 🔒 Local network isolation - RFC1918 + container gateway blocked
  10. 🔒 Protocol bypass - ICMP/UDP blocked

Enhanced iptables Rules

Now matches AWF production security model:

  • ✅ DNS restricted to approved resolvers only (8.8.8.8, 8.8.4.4)
  • ✅ All RFC1918 ranges (10/8, 172.16/12, 192.168/16) and link-local 169.254/16 blocked
  • ✅ Dangerous ports: 22,23,25,3306,5432,6379,27017,445,1433
  • ✅ ICMP completely blocked (no ping)
  • ✅ UDP blocked except DNS
  • ✅ Localhost allowed (for MCP stdio servers)
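
For readers who want a concrete picture of a rule set along these lines, the sketch below prints (rather than applies) one possible ordering. It is an illustration under assumptions, not the workflow's actual rules: the function name, the ACCEPT/DROP/REJECT choices, the rule order, and the explicit allowance for the proxy at port 3128 are all assumed.

```shell
# emit_egress_rules PROXY_IP
# Prints iptables commands for the OUTPUT chain; nothing is applied.
# Order matters: ACCEPT rules come before the broader DROP/REJECT rules.
emit_egress_rules() {
  proxy_ip=$1
  echo "iptables -A OUTPUT -o lo -j ACCEPT"
  # The proxy may sit inside a private range, so allow it before blocking ranges.
  echo "iptables -A OUTPUT -d $proxy_ip -p tcp --dport 3128 -j ACCEPT"
  for resolver in 8.8.8.8 8.8.4.4; do
    echo "iptables -A OUTPUT -d $resolver -p udp --dport 53 -j ACCEPT"
  done
  echo "iptables -A OUTPUT -p udp -j DROP"
  echo "iptables -A OUTPUT -p icmp -j DROP"
  for net in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16; do
    echo "iptables -A OUTPUT -d $net -j REJECT"
  done
  echo "iptables -A OUTPUT -p tcp -m multiport --dports 22,23,25,3306,5432,6379,27017,445,1433 -j REJECT"
}

# Review the generated commands before running them as root:
#   emit_egress_rules 172.30.0.10
```

Printing the rules first makes it easy to diff the runc and gVisor variants before anything touches the container's network namespace.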

Success Criteria

All 4 configurations must produce identical security results:

  1. Squid + runc (baseline - current AWF)
  2. Squid + gVisor (validate gVisor doesn't break security)
  3. Envoy + runc (validate Envoy matches Squid behavior)
  4. Envoy + gVisor (target - proposed solution)

Any behavioral difference = security gap = FAIL.

Next Steps

  • ✅ Squid tests updated (runc + gVisor)
  • ⏳ TODO: Apply same tests to Envoy variants
  • ⏳ TODO: Run workflow and compare all 4 configurations

@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

🔄 Workflow re-triggered with comprehensive security tests

Run: https://github.com/github/gh-aw-firewall/actions/runs/27767923283

This run includes:

  • ✅ Fixed health checks (simple TCP port check)
  • ✅ 10 comprehensive security tests
  • ✅ Enhanced iptables matching AWF production security model

Will validate that both runc and gVisor configurations pass all security tests.

Root Cause: busybox uses ash shell, which doesn't support bash's /dev/tcp
pseudo-device. The health check command was silently failing.

Solution: Use 'nc -zv -w 2' (netcat) which IS available in busybox and
provides reliable TCP port checking.

Changes:
- Squid runc health check: /dev/tcp → nc -zv
- Squid gVisor health check: /dev/tcp → nc -zv
- Performance test health check: /dev/tcp → nc -zv

nc flags:
  -z: Zero-I/O mode (just check if port is open)
  -v: Verbose (output 'open' or 'succeeded')
  -w 2: 2-second timeout

grep -E for 'open|succeeded' (extended regex, so | means alternation) to detect success across different nc versions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

🐛 Root Cause Found: Busybox Shell Incompatibility

The Problem

Health checks were timing out because busybox uses ash shell, which doesn't support bash's /dev/tcp pseudo-device.

# This was SILENTLY FAILING in busybox:
sh -c 'cat < /dev/null > /dev/tcp/172.30.0.10/3128'

The Fix

Use nc (netcat) which is available in busybox:

# Now uses netcat for reliable TCP port checking:
nc -zv -w 2 172.30.0.10 3128 2>&1 | grep -Eq 'open|succeeded'

Flags:

  • -z: Zero-I/O mode (just check if port is open)
  • -v: Verbose (outputs 'open' or 'succeeded')
  • -w 2: 2-second timeout

New Run

https://github.com/github/gh-aw-firewall/actions/runs/27768282438

This should fix the health check timeouts for all 3 Squid tests (runc, gVisor, performance).
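
One detail worth checking when matching nc output against two alternatives: grep only treats `|` as alternation in extended mode (`-E`). In basic mode it is a literal character, so the pattern would match nothing useful. A minimal sketch of the matcher, with `nc_output_ok` as a hypothetical helper name:

```shell
# nc_output_ok: succeeds if the text on stdin contains "open" or "succeeded".
# -E enables extended regular expressions so that | means "or";
# -q suppresses output and reports the result through the exit status.
nc_output_ok() {
  grep -Eq 'open|succeeded'
}

# Hypothetical use inside the health check loop:
#   nc -zv -w 2 172.30.0.10 3128 2>&1 | nc_output_ok
```

Relying on the exit status of `nc -z` alone is another option, if the nc build in use reports failures that way.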

@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

⚠️ Previous Run Used Wrong Commit

Run 27768282438 was triggered before the netcat fix was pushed, so it ran with the old /dev/tcp code.

✅ New Run with Netcat Fix

https://github.com/github/gh-aw-firewall/actions/runs/27768454494

This run uses commit 4eeab32b which has the netcat fix:

nc -zv -w 2 172.30.0.10 3128 2>&1 | grep -Eq 'open|succeeded'

Root Cause: When curl fails, it outputs 000 AND the || echo "000" runs,
resulting in "000000" being captured. This broke the status code comparison.

Solution:
- Use `2>/dev/null || true` instead of `|| echo "000"`
- Check for empty string in addition to "000" and "403"
- Use ${HTTP_CODE:-error} in output to show "error" if empty

This fixes the test logic for:
- google.com (blocked domain)
- githubstatus.com (similar domain)
- 8.8.8.8 (IP address bypass)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

🐛 Fixed HTTP Status Code Bug

The Problem

When curl failed, it output "000" AND the fallback || echo "000" also ran, resulting in "000000" being captured. This broke status code comparison.

The Fix

# Old (broken):
HTTP_CODE=$(curl ... || echo "000")  # Results in "000000" on error

# New (fixed):
HTTP_CODE=$(curl ... 2>/dev/null || true)  # Results in "000" or empty
if [ "$HTTP_CODE" = "403" ] || [ "$HTTP_CODE" = "000" ] || [ -z "$HTTP_CODE" ]; then

New Run

https://github.com/github/gh-aw-firewall/actions/runs/27768657814

This should fix the test failures for blocked domain checks.
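
The doubled status code is easy to reproduce without any network access. In the sketch below, `fake_curl` is a stand-in (an assumption for illustration) that behaves like a failed `curl -s -o /dev/null -w '%{http_code}'`: it prints 000 without a newline and exits non-zero.

```shell
# fake_curl: imitates a failed curl call that still writes its -w output.
fake_curl() {
  printf '000'
  return 7
}

# Old pattern: the fallback echo appends a second "000" to what curl printed.
old_code=$(fake_curl || echo "000")

# New pattern: keep whatever curl printed and only swallow the exit status.
new_code=$(fake_curl 2>/dev/null || true)

# is_blocked CODE: a request counts as blocked on 403, 000, or empty output.
is_blocked() {
  [ "$1" = "403" ] || [ "$1" = "000" ] || [ -z "$1" ]
}

echo "old capture: $old_code"
echo "new capture: ${new_code:-error}"
```

With the old pattern the captured value matches none of the expected codes, which is why the blocked-domain assertions failed even though the firewall was doing its job.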

@lpcox

lpcox commented Jun 18, 2026

Collaborator Author

⚠️ Previous Run Used Wrong Commit (Again)

Run 27768657814 was triggered before the force push completed, so it ran with commit 4eeab32b (netcat fix) instead of caffabd6 (HTTP code fix).

✅ New Run with Both Fixes

https://github.com/github/gh-aw-firewall/actions/runs/27768781761

This run uses commit caffabd6 which has BOTH fixes:

  1. ✅ Netcat instead of /dev/tcp
  2. ✅ HTTP status code handling fixed

lpcox and others added 4 commits June 19, 2026 16:53
Adds a dedicated workflow_dispatch harness that proves Docker --internal
network topology (dual-homed Squid sidecar, no NET_ADMIN, no awf-managed
iptables) contains egress at least as strictly as the current iptables+Squid
baseline. Motivated by ARC/Kubernetes runners where NET_ADMIN is unavailable.

Holds Squid (the L7 filter) constant and changes only the enforcement
mechanism. Runs an identical adversarial battery (A1-A12 + L1-L3) at two
privilege levels (unprivileged user and root-without-NET_ADMIN) against both
architectures, testing at the application layer so the baseline's raw-TCP DNAT
to Squid is not misread as 'allowed'. A compare job gates on
Blocked_candidate superset Blocked_baseline and emits a markdown matrix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
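
The superset gate described in the commit above can be sketched with sorted files and `comm`. This is an illustration only: the function name and the one-probe-ID-per-line file format are assumptions, not the compare job's actual implementation.

```shell
# blocks_superset BASELINE_FILE CANDIDATE_FILE
# Each file lists the IDs of probes that were BLOCKED, one per line.
# Succeeds when every probe blocked by the baseline is also blocked
# by the candidate (candidate is a superset of baseline).
blocks_superset() {
  base_sorted=$(mktemp)
  cand_sorted=$(mktemp)
  sort -u "$1" > "$base_sorted"
  sort -u "$2" > "$cand_sorted"
  # comm -23 keeps lines found only in the first file:
  # probes the baseline blocked but the candidate did not.
  missing=$(comm -23 "$base_sorted" "$cand_sorted")
  rm -f "$base_sorted" "$cand_sorted"
  if [ -n "$missing" ]; then
    # Unquoted on purpose: one output line per missing probe ID.
    printf 'Not blocked by candidate: %s\n' $missing >&2
    return 1
  fi
  return 0
}
```

A candidate that blocks more than the baseline still passes, which matches the "at least as strictly" wording.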
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- baseline holder lost the script's exec bit through the artifact, so run it
  as 'bash /assets/baseline-iptables.sh' (no exec bit needed).
- A6 scored dig's 'network unreachable' diagnostic as an answer; now require a
  valid A-record so a blocked resolver reads as BLOCKED.
- L3 used agent-side getent, which is irrelevant in the proxy model (Squid does
  DNS); redefine it as a proxied fetch of a distinct approved subdomain.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
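
The A6 fix above amounts to accepting a DNS probe only when the resolver output contains a real address. A minimal sketch, with `has_a_record` as a hypothetical helper; it checks the dotted-quad shape only and does not validate that each octet is 255 or less.

```shell
# has_a_record: succeeds only if stdin has at least one line that is a
# bare IPv4 address. Diagnostics such as ";; network unreachable" or an
# empty answer do not count, so a blocked resolver reads as BLOCKED.
has_a_record() {
  grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Hypothetical use in a probe (resolver and domain are assumptions):
#   if dig +short A github.com @8.8.8.8 | has_a_record; then
#     echo "A6: ALLOWED"
#   else
#     echo "A6: BLOCKED"
#   fi
```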
The baseline DNATs port 80/443 to Squid, so the previous probes misclassified:
A7 read Squid's 403 HTML page as leaked IMDS, and A8's tcp/80 connect landed
on Squid rather than the private host. Require a real Azure IMDS doc (contains
"compute") for A7, and probe non-redirected ports (22/8080/3306/6379) for A8,
so both models read BLOCKED honestly instead of a measurement artifact.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox

lpcox commented Jun 20, 2026

Collaborator Author

@copilot merge main

Copilot AI commented Jun 20, 2026

Contributor

@copilot merge main

Done — merged main into this branch. There was an add/add conflict in test-gvisor-firewall-comparison.yml (both branches added the file); resolved by keeping the PR branch version with the comprehensive security tests and health checks.

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

Smoke Claude passed

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

🔌 Smoke Services — All services reachable! ✅

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

Smoke Gemini completed. All facets verified. 💎

Gemini smoke test completed successfully. Comment and label added to PR #5237.

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

🔑 Smoke Copilot PAT: PAT auth validated. All systems operational. ✅

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

Smoke Copilot BYOK AOAI (api-key) completed. Copilot AOAI BYOK (api-key) mode operational. 🔓

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

Chroot tests passed! Smoke Chroot - All security and functionality tests succeeded.

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

Smoke Copilot BYOK completed. Copilot BYOK mode operational. 🔓

@github-actions

github-actions Bot commented Jun 23, 2026

Contributor

Build Test Suite completed successfully!

The doc link check intermittently fails on
https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/
with a connection-level 'error sending request' from Actions runners, even
though the URL returns 200 interactively and lychee already retries 3x. Add
the otel specs path to the curated flaky-URL exclude list, matching the
existing treatment of other CI-flaky external hosts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

Contributor

Smoke Test: Claude Engine Validation

| Check | Status |
| --- | --- |
| API | ✅ PASS |
| gh CLI | ✅ PASS |
| File access | ✅ PASS |

Overall result: PASS

Generated by Smoke Claude for issue #5237 · 60.9 AIC · ⊞ 3.1K ·

@github-actions

Contributor

🔬 Smoke Test: Copilot PAT — PASS

PR: Fix gVisor workflow: Add proper health checks for Squid and Envoy
Auth mode: PAT (COPILOT_GITHUB_TOKEN) | cc @lpcox

Test Result
GitHub MCP connectivity
GitHub.com HTTP connectivity
File write/read

Overall: PASS

🔑 PAT report filed by Smoke Copilot PAT

@github-actions

Contributor

🔥 Smoke Test Results — PASS

Test Result
GitHub MCP connectivity
GitHub.com HTTP ✅ 200
File write/read

PR: Fix gVisor workflow: Add proper health checks for Squid and Envoy
Author: @lpcox

Overall: PASS

📰 BREAKING: Report filed by Smoke Copilot

@github-actions

Contributor

PR titles:

  • refactor: extract buildAgentSecurityConfig from buildAgentService
  • refactor(api-proxy): decompose handleUpstreamResponse into focused helpers

✅ GitHub PR read
gh PR read
✅ GitHub page title
✅ file write/read
✅ discussion query
npm ci && npm run build

Overall: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

@github-actions

Contributor

🔭 Smoke Test: API Proxy OTEL Tracing

| Scenario | Result | Notes |
| --- | --- | --- |
| 1. Module Loading | ✅ Pass | otel.js loads; exports startRequestSpan, setTokenAttributes, setBudgetAttributes, endSpan, endSpanError, shutdown, isEnabled + internal helpers |
| 2. Test Suite | ✅ Pass | 39/39 tests passed in otel.test.js (1.9s) |
| 3. Env Var Forwarding | ✅ Pass | api-proxy-service-config.ts forwards GH_AW_OTLP_ENDPOINTS, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID, and sets OTEL_SERVICE_NAME=awf-api-proxy |
| 4. Token Tracker Integration | ✅ Pass | onUsage callback exists in token-tracker-http.js (line 283) as the OTEL hook point; invoked after usage normalization with (normalized, model) signature |
| 5. OTEL Diagnostics | ✅ Pass | Module initializes with FileSpanExporter fallback when no OTLP endpoint set; isEnabled() returns true; graceful degradation confirmed |

All 5 scenarios pass. OTEL tracing integration is fully functional — spans are created per request with GenAI semconv attributes, token usage is wired through the onUsage callback, parent context propagation via GITHUB_AW_OTEL_TRACE_ID/GITHUB_AW_OTEL_PARENT_SPAN_ID is implemented, and exports route through the Squid proxy.

📡 OTel tracing validated by Smoke OTel Tracing

@github-actions

Contributor

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 98.01% | 98.08% | 📈 +0.07% |
| Statements | 97.95% | 98.01% | 📈 +0.06% |
| Functions | 99.51% | 99.51% | ➡️ +0.00% |
| Branches | 93.68% | 93.77% | 📈 +0.09% |

📁 Per-file Coverage Changes (5 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/option-parsers.ts | 99.0% → 99.0% (+0.01%) | 97.3% → 97.3% (+0.05%) |
| src/commands/main-action.ts | 94.8% → 94.9% (+0.07%) | 94.8% → 94.9% (+0.07%) |
| src/commands/validators/config-assembly.ts | 98.1% → 98.3% (+0.22%) | 98.1% → 98.3% (+0.22%) |
| src/compose-generator.ts | 98.4% → 98.7% (+0.23%) | 98.4% → 98.7% (+0.23%) |
| src/workdir-setup.ts | 92.7% → 94.5% (+1.82%) | 92.7% → 94.5% (+1.82%) |
✨ New Files (1 files)
  • src/topology.ts: 100.0% lines

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions

Contributor

MCP connectivity ✅
GitHub.com connectivity ✅
File I/O ✅
Running in direct BYOK mode (AWF_AUTH_TYPE=github-oidc + AWF_AUTH_AZURE_* + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw) authenticated via Microsoft Entra ✅
Overall: PASS
@lpcox

🪪 BYOK (AOAI Entra) report filed by Smoke Copilot BYOK AOAI (Entra)

@github-actions

Contributor

Smoke Test: Copilot BYOK (Direct) Results

PASS - Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY) via api-proxy → api.githubcopilot.com

Test Results:

  • ✅ GitHub MCP connectivity (merged PRs verified)
  • ✅ GitHub.com HTTP 200 (connectivity works)
  • ✅ File write/read (sandbox I/O verified)
  • ✅ BYOK inference path (agent → api-proxy → api.githubcopilot.com)

/cc @lpcox

🔑 BYOK report filed by Smoke Copilot BYOK

@github-actions

Contributor

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color passed ✅ PASS
Go env passed ✅ PASS
Go uuid passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx All passed ✅ PASS
Node.js execa All passed ✅ PASS
Node.js p-limit All passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Note: Java required -Dmaven.repo.local=/tmp/gh-aw/agent/m2repo to work around a permissions issue with ~/.m2/repository (owned by root in this runner environment).

Generated by Build Test Suite for issue #5237 · 39 AIC · ⊞ 7.7K ·

@github-actions

Contributor

Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
| --- | --- | --- | --- |
| Python | Python 3.12.13 | Python 3.12.3 | ❌ NO |
| Node.js | v24.16.0 | v22.22.3 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall: ❌ FAILED — Python and Node.js versions differ between host and chroot environments.

Tested by Smoke Chroot

@github-actions

Contributor

Smoke Test Results — FAIL

| Check | Result |
| --- | --- |
| Redis PING (host.docker.internal:6379) | ❌ No response (timeout) |
| PostgreSQL pg_isready (host.docker.internal:5432) | ❌ No response |
| PostgreSQL SELECT 1 | ❌ No response |

host.docker.internal resolves to 172.17.0.1 but ports 6379 and 5432 are unreachable. GitHub Actions service containers do not appear to be running or accessible in this environment.

🔌 Service connectivity validated by Smoke Services

@github-actions

Contributor

@lpcox
Smoke Test Results:
• Fix gVisor workflow: Add proper health checks for Squid and Envoy
• fix: correctly recover runner tool on PATH (after sudo w/ secure_path). remove incorrect reading from GITHUB_PATH
GitHub MCP testing: ✅
GitHub.com connectivity: ✅
File write/read: ✅
BYOK inference path: ✅
Running in direct BYOK mode (COPILOT_PROVIDER_API_KEY + COPILOT_PROVIDER_BASE_URL) via api-proxy → Azure OpenAI (Foundry, o4-mini-aw)
Overall: PASS

🔑 BYOK (AOAI api-key) report filed by Smoke Copilot BYOK AOAI (api-key)

@lpcox lpcox merged commit 67336f9 into main Jun 23, 2026
24 of 25 checks passed
@lpcox lpcox deleted the fix/gvisor-workflow-healthchecks branch June 23, 2026 15:46
@github-actions

Contributor

Gemini Smoke Test Results

  • GitHub MCP Testing: ✅
    • refactor: extract buildAgentSecurityConfig from buildAgentService
    • refactor(api-proxy): decompose handleUpstreamResponse into focused helpers
  • GitHub.com Connectivity: ✅ (200)
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅

Overall Status: PASS

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini


3 participants