W-1: Detect direct API bypass in integrity audit + add reusable MCP routing constraint (#3134)

lpcox · web-flow · commit 19ad536c71d3 · 2026-04-03T17:15:14.000-07:00
The 2026-04-03 integrity audit flagged the AI Moderator workflow for
making 3 direct network calls to `api.github.com`, `github.com`, and
`chatgpt.com`, bypassing the MCP Gateway entirely. That breaks DIFC
enforcement and makes data access unauditable.

## `integrity-filtering-audit.md`

- **Background**: adds "direct API bypass attempts" as an explicit
problem category (network firewall blocks in the job logs are the signal)
- **Step 3.6**: new detection step with targeted bash patterns to
surface bypass attempts from logs:
  ```bash
  grep -iE 'api\.github\.com|chatgpt\.com|openai\.com|curl.*https?://[^ ]*github|fetch.*https?://[^ ]*github' \
    "$TMPDIR"/*/mcp-logs/*.log 2>/dev/null | head -30
  ```
- **Step 4**: extends the W-1 Warning classification to cover direct API
bypass, listing likely causes (`difc-proxy: true` missing, weak prompt,
misconfigured `network.allowed`) and pointing to the fix
- **Issue template**: Runs Analyzed table gains "Agent Invoked" +
"Firewall Blocks" columns (matching the actual report format);
Recommendations section now requires bypass investigation checklist
- **Front matter**: imports `shared/mcp-api-routing.md` so the audit
agent itself is subject to the same constraint
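
The per-domain tally that the new detection step asks for could be sketched as a small helper. This is only an illustration: `count_bypass_domains` is a hypothetical name, the domain list is copied from the audit text, and the real firewall log format is not shown in this change, so the function simply scans whatever text arrives on stdin.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this commit): tally flagged domains in log
# text read from stdin. Output is one "count<TAB>domain" line per domain,
# highest count first.
count_bypass_domains() {
  # -o prints only the matched domain, -i ignores case, -E enables alternation.
  grep -oiE '(api\.github\.com|api\.openai\.com|chatgpt\.com|openai\.com|github\.com)' \
    | tr '[:upper:]' '[:lower:]' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1 "\t" $2 }'   # normalize the padding that uniq -c adds
}
```

Piping the audited files into it, for example `cat "$TMPDIR"/*/mcp-logs/*.log 2>/dev/null | count_bypass_domains`, would give the blocked-domain counts to record for a W-1 warning.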

## `shared/mcp-api-routing.md` (new)

Reusable prompt constraint importable by any workflow (`imports: -
shared/mcp-api-routing.md`). Covers:
- Hard prohibition on `curl`/`gh api`/`fetch` to `api.github.com` or
external AI services
- ✅/❌ usage examples
- Why bypassing breaks DIFC (no integrity labels → no scope enforcement)
- Pre-call checklist (MCP tool, allowed-repos, difc-proxy, no external
AI)
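
As a rough illustration of the prohibition above, a command string can be screened for the direct-call patterns before it is run. The function name `is_direct_api_call` and the exact regular expressions are assumptions made for this sketch; the constraint itself is prompt text, and the network firewall remains the actual enforcement point.

```bash
#!/usr/bin/env bash
# Hypothetical screen (not part of this commit): succeeds (exit 0) when the
# command string looks like a direct call that should go through MCP tools.
is_direct_api_call() {
  local cmd="$1"
  # Hosts named in the routing constraint.
  local hosts='(api\.github\.com|github\.com|chatgpt\.com|api\.openai\.com|openai\.com)'

  # An HTTP client pointed at one of the flagged hosts (or a subdomain).
  if printf '%s\n' "$cmd" | grep -qiE "(curl|wget|fetch).*https?://([a-z0-9-]+\.)*${hosts}"; then
    return 0
  fi
  # `gh api` talks to the GitHub API directly.
  if printf '%s\n' "$cmd" | grep -qE '(^|[[:space:];|&])gh[[:space:]]+api([[:space:]]|$)'; then
    return 0
  fi
  return 1
}
```

With this sketch, `is_direct_api_call 'gh api /repos/...'` succeeds, while an MCP tool invocation such as `Use github-mcp-server list_issues with owner=..., repo=...` does not match.
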
diff --git a/.github/workflows/integrity-filtering-audit.md b/.github/workflows/integrity-filtering-audit.md
@@ -39,6 +39,8 @@ safe-outputs:
 timeout-minutes: 20
 features:
   difc-proxy: true
+imports:
+  - shared/mcp-api-routing.md
 ---
 
 # Integrity Filtering Audit
@@ -60,6 +62,9 @@ Common problems to look for:
 - **Unscoped integrity tags** (e.g., `approved` instead of `approved:owner/repo`)
 - **Empty responses** where data was expected (over-filtering)
 - **Search result leaks** where out-of-scope items appear in filtered results
+- **Direct API bypass attempts** where an agent contacts `api.github.com`, `github.com`,
+  or external AI services (e.g., `chatgpt.com`, `openai.com`) without going through
+  the MCP Gateway — these show up as network firewall blocks in the job logs
 
 ## Procedure
 
@@ -110,6 +115,17 @@ For each downloaded artifact set, check:
 5. **Scope violations**: Check if any response contains data from repositories
    NOT in the workflow's `allowed-repos` policy.
 
+6. **Direct API bypass attempts**: Search job logs and stderr for network firewall
+   blocks that reveal the agent trying to reach external domains directly instead
+   of through the MCP Gateway. Key domains to flag:
+   - `api.github.com` — GitHub API (must go through MCP Gateway, not curl/fetch)
+   - `github.com` — GitHub web (should not be contacted directly)
+   - `chatgpt.com`, `openai.com`, `api.openai.com` — external AI services
+   - Any other non-allowlisted HTTP endpoint
+
+   For each block, record: the blocked domain, the number of block events, which
+   workflow run, and what step appears to have triggered it.
+
 ```bash
 # Example: Count DIFC events in JSONL
 grep -c 'difc_integrity' "$TMPDIR"/*/mcp-logs/rpc-messages.jsonl 2>/dev/null || echo "0"
@@ -119,6 +135,16 @@ grep -iE 'error|failed|blocked|unknown|wasm error:|WASM guard trap' "$TMPDIR"/*/
 
 # Example: Specifically search for WASM guard panics
 grep -iE 'wasm error:|WASM guard trap|unreachable' "$TMPDIR"/*/mcp-logs/mcp-gateway.log 2>/dev/null
+
+# Example: Detect direct API bypass attempts in job logs
+# The network firewall logs blocked connections; search agent stderr/stdout for clues
+grep -iE 'api\.github\.com|chatgpt\.com|openai\.com|curl.*https?://[^ ]*github|fetch.*https?://[^ ]*github' \
+  "$TMPDIR"/*/mcp-logs/*.log 2>/dev/null | head -30
+
+# Example: Summarize firewall blocks by domain from network-firewall logs (if present)
+grep -hiE 'BLOCK|DENY|firewall' "$TMPDIR"/*/mcp-logs/*.log 2>/dev/null \
+  | grep -oE '(api\.github\.com|github\.com|chatgpt\.com|openai\.com|[a-z0-9.-]+\.[a-z]{2,})' \
+  | sort | uniq -c | sort -rn | head -20
 ```
 
 ### Step 4: Classify Findings
@@ -127,9 +153,20 @@ Classify each finding by severity:
 - 🔴 **Critical**: Data leak (out-of-scope data returned), guard bypass, or
   labeling failure that could expose unauthorized data
 - 🟡 **Warning**: Over-filtering (legitimate data blocked), unscoped tags,
-  zero DIFC events in a run that should have filtering, or WASM guard trap
+  zero DIFC events in a run that should have filtering, WASM guard trap, or
+  **direct API bypass attempt** (agent contacted `api.github.com`, `github.com`,
+  or an external AI service such as `chatgpt.com` / `openai.com` directly instead
+  of routing through the MCP Gateway — visible as network firewall blocks)
 - 🟢 **Info**: Normal filtering behavior, expected blocks, or configuration notes
 
+When classifying a **direct API bypass** warning (W-1), record:
+- The blocked domain(s) and block count
+- The workflow name and run ID
+- The likely cause: misconfigured `network.allowed` list, agent prompt not
+  restricting tool use, or the workflow missing `features.difc-proxy: true`
+- Recommended fix: strengthen agent system prompt to use MCP Gateway tools
+  exclusively; see `shared/mcp-api-routing.md` for reusable constraint language
+
 ### Step 5: Create Summary Issue
 
 Create an issue with the audit results using the following structure:
@@ -159,7 +196,8 @@ Create an issue with the audit results using the following structure:
 <details>
 <summary><b>Warnings</b></summary>
 
-[Details of each warning]
+[Details of each warning — for direct API bypass (W-1) warnings include: blocked
+domain(s), block count, workflow name, likely cause, and recommended fix]
 
 </details>
 
@@ -172,13 +210,17 @@ Create an issue with the audit results using the following structure:
 
 ### Runs Analyzed
 
-| Run | Workflow | Branch | DIFC Events | Filtered | Status |
-|-----|----------|--------|-------------|----------|--------|
-| [§ID](run_url) | name | branch | N | N | ✅/⚠️/❌ |
+| Run | Workflow | Branch | Agent Invoked | DIFC Events | Firewall Blocks | Status |
+|-----|----------|--------|---------------|-------------|-----------------|--------|
+| [§ID](run_url) | name | branch | ✅/❌ early-exit | N | N/total | ✅/⚠️/❌ |
 
 ### Recommendations
 
-[Actionable suggestions based on findings]
+[Actionable suggestions based on findings. For direct API bypass (W-1) findings,
+always include: 1) which workflow to investigate, 2) whether it has
+`features.difc-proxy: true`, 3) whether the agent prompt restricts tool use to
+MCP Gateway tools, and 4) a pointer to `shared/mcp-api-routing.md` for reusable
+constraint language to add to the workflow prompt.]
 ```
 
 If there are no findings (all runs look healthy), still create the issue with
diff --git a/.github/workflows/shared/mcp-api-routing.md b/.github/workflows/shared/mcp-api-routing.md
@@ -0,0 +1,55 @@
+---
+# MCP Gateway API routing constraints — import this in any workflow that makes
+# GitHub API calls to ensure the agent is reminded to use MCP tools exclusively.
+---
+
+## ⚠️ IMPORTANT: GitHub API Routing Constraint
+
+**All GitHub API calls MUST be made exclusively through the MCP Gateway's GitHub
+MCP server tools.** Direct network access to `api.github.com`, `github.com`, or
+any external service is not permitted and will be blocked by the network firewall.
+
+### Correct Usage
+
+Use the provided MCP tools (e.g., `github-mcp-server` toolset) for all GitHub
+operations:
+
+```
+✅ Use github-mcp-server list_issues with owner=..., repo=...
+✅ Use github-mcp-server get_file_contents with owner=..., repo=..., path=...
+✅ Use github-mcp-server list_workflow_runs with owner=..., repo=...
+```
+
+### Incorrect Usage
+
+Do NOT use `curl`, `wget`, `fetch`, or any other HTTP client to contact GitHub's
+APIs directly. Do NOT attempt to contact external AI services:
+
+```
+❌ curl https://api.github.com/repos/...          (blocked — use MCP tools)
+❌ gh api /repos/...                              (blocked — use MCP tools)
+❌ fetch("https://api.github.com/...")            (blocked — use MCP tools)
+❌ curl https://chatgpt.com/...                   (blocked — external service)
+❌ curl https://api.openai.com/...                (blocked — external service)
+```
+
+### Why This Matters
+
+- The MCP Gateway applies **DIFC (Decentralized Information Flow Control)**
+  integrity and secrecy labels to all GitHub API responses, enforcing scope
+  restrictions and preventing data leaks.
+- Direct API calls bypass DIFC enforcement entirely, making it impossible to
+  audit what data the agent accessed or ensure scope compliance.
+- Direct calls to external AI services (e.g., ChatGPT) are out-of-scope and
+  constitute a security boundary violation; all reasoning must happen inside
+  the Copilot engine provided by the workflow runtime.
+- Network firewall blocks from bypass attempts are **audited** by the Integrity
+  Filtering Audit workflow and will be flagged as W-1 warnings.
+
+### Checklist
+
+Before making any API call, verify:
+1. ✅ Am I using a GitHub MCP server tool (not `curl`, `gh`, or HTTP fetch)?
+2. ✅ Is the target repository in the workflow's `allowed-repos` list?
+3. ✅ Is `features.difc-proxy: true` enabled in this workflow's configuration?
+4. ✅ Am I NOT trying to contact any external AI service API?
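
The configuration side of the checklist (item 3, plus the import of this shared file) lends itself to a quick static check of a workflow file. The sketch below is an assumption-laden illustration: `check_workflow_routing` is a hypothetical name, and it uses plain `grep` rather than a YAML parser, so it only recognizes the front-matter layout shown in this change.

```bash
#!/usr/bin/env bash
# Hypothetical check (not part of this commit): report which routing
# safeguards are missing from a workflow markdown file.
# Returns 0 when both are present, 1 otherwise.
check_workflow_routing() {
  local file="$1" status=0

  # Looks for a `difc-proxy: true` line (expected under `features:`).
  if ! grep -qE '^[[:space:]]*difc-proxy:[[:space:]]*true[[:space:]]*$' "$file"; then
    echo "missing: features.difc-proxy: true"
    status=1
  fi
  # Looks for the shared constraint anywhere in the file (expected under `imports:`).
  if ! grep -qF 'shared/mcp-api-routing.md' "$file"; then
    echo "missing: import of shared/mcp-api-routing.md"
    status=1
  fi
  return "$status"
}
```

Running `check_workflow_routing .github/workflows/integrity-filtering-audit.md` against the file as modified here should print nothing, since both lines are present in its front matter.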