Exempt the trusted-internal dispatch from edge middleware#5246
Merged
cafalchio merged 10 commits into fix/session-affinity-inprocess-dispatch on Jun 25, 2026
…one helper

Replace the three near-duplicate trust checks (main, token_scoping, and the HMAC helper) with a single generic gate in auth_context: is_trusted_internal_runtime_request plus a path-aware MCP/A2A wrapper. main and token_scoping delegate to it; token_scoping gains the 'affinity' marker (previously rust-only), closing the gap where the in-process session-affinity dispatch was still token-scoped.

- Path-aware auth context: required for every internal route except */authenticate.
- A2A feature guard retained for /_internal/a2a/*.
- HMAC + encoded auth-context are the trust boundary; loopback is defense-in-depth, documented as such under ProxyHeaders(trusted_hosts=*).
- Strip forwarded / client-IP headers from loopback passthrough so the in-process replay cannot carry a spoofable client address.
- Remove the now-dead duplicate statics, constants, and imports from token_scoping.

Tests cover allow/deny (incl. HMAC-as-boundary and XFF), affinity, A2A enable/disable, prefix allowlist, token_scoping affinity, and forwarded-header stripping.

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
…ware

Route the in-process and cross-worker internal dispatch (/_internal/mcp/* and /_internal/a2a/*) around edge middleware that already ran for the originating request. Each skip is gated on the shared trust gate (loopback + HMAC + runtime marker + auth context), not a path prefix:

- RateLimit: the internal hop is not counted a second time.
- TokenUsage: the internal hop does not log a duplicate usage row.
- HttpAuth: HTTP auth plugin hooks are not re-fired on the replay.
- CSRF: the skip is contingent on the full trust gate; drop the static /_internal/mcp/ entry from csrf_exempt_paths so no externally reachable path is left CSRF-free.

TokenScoping already delegates to the same gate. Each middleware gains an allow test (trusted hop is exempted) and a deny test (a forged HMAC fails the gate and the normal enforcement path runs).

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Draft to reuse when opening the stacked PR; not part of the runtime change. Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
The draft is moved into the pull request body; it does not belong in the tracked tree. Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Base automatically changed from fix/session-affinity-auth-context to fix/session-affinity-inprocess-dispatch June 16, 2026 14:47
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Keep the trusted-internal middleware exemptions scoped to the cases that produce wrong state on every deployment: rate-limit double counting, duplicate token-usage rows, token-scoping rejecting scoped tokens, plus the CSRF gate change.

HttpAuthMiddleware is left running on the internal hop; with no HTTP-auth plugin it is already a no-op there, so removing the skip is a no-op for the default configuration. Remove the is_trusted_internal_mcp_request import and skip block from HttpAuthMiddleware, and drop the two exemption tests and their helper.

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Drop the trusted-internal CSRF exemption added in 76acff1 so /_internal/mcp/* is no longer CSRF-exempt. Remove the trust-gate skip and its import from CSRFMiddleware, and drop the two trust-gate tests that asserted the exemption.

The /_internal/mcp/ entry stays out of csrf_exempt_paths, so the hop is genuinely enforced rather than falling back to a path-prefix skip. Bearer-authenticated internal hops still pass via the existing bearer short-circuit; non-bearer (OAuth/public-only) dispatch now requires CSRF.

The trust-gate exemption is preserved in 76acff1 and can be restored with:

git show 76acff1 -- mcpgateway/middleware/csrf_middleware.py

Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
Signed-off-by: Pratik Gandhi <gandhipratik203@gmail.com>
593913c into fix/session-affinity-inprocess-dispatch
1 check passed
Stacked on #4987 (fix/session-affinity-inprocess-dispatch); #5212 has merged into that branch, so this PR's diff shows only the work below.

Summary
The session-affinity and Rust runtime paths re-enter the gateway through an in-process / loopback dispatch to /_internal/mcp/rpc (and /_internal/a2a/*). Because that dispatch goes back through the ASGI stack, the edge middleware that already ran for the originating request runs a second time on the internal hop: token scoping, rate limiting, and token-usage logging.

This PR routes the trusted-internal hop around that middleware. Every skip is gated on a single shared trust check, so the exemption is contingent on the full trust boundary and never on a URL prefix alone.
It lands in two commits:

- consolidate the internal-runtime trust gate into one helper
- exempt trusted-internal dispatch from edge middleware

Architecture: before & after
The cross-worker forward and the trusted /_internal/mcp/rpc dispatch are unchanged (that landed in #5212). What changes here is what the edge middleware does when the trusted-internal hop re-enters the ASGI stack.

After #5212, the owning worker dispatches in-process to /_internal/mcp/rpc. Because that hop goes back through the full ASGI stack, every edge middleware that already ran for the originating request runs a second time on the internal replay: token scoping, rate limiting, and token-usage logging. That double-counts rate limits and writes duplicate usage rows. CSRF was sidestepped only by a static /_internal/mcp/ entry in csrf_exempt_paths, which left an externally reachable CSRF-free URL.

Before this PR (state after #5212):

After this PR:
Net change:
The trust gate is the single decision point. A genuine loopback forward (valid HMAC + encoded auth-context) skips the duplicated edge work, while anything that fails the gate (forged HMAC, external caller) flows through the normal middleware exactly as before. Route-level authorization is untouched and still runs on every internal request.
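As a rough illustration of a single decision point of this kind, the sketch below combines the four conditions named in the commit message (loopback, HMAC, runtime marker, auth context) into one predicate. It is not the gateway's implementation: the header names, the shared secret, the signed message layout, and the function names are all assumptions made for the example. Only the rust and affinity markers, the two internal prefixes, the */authenticate exception, and the A2A guard come from the PR text.

```python
import hashlib
import hmac
from typing import Mapping

# Hypothetical values: the real secret handling and header names are not shown in the PR.
SECRET = b"example-shared-secret"
TRUSTED_MARKERS = {"rust", "affinity"}
LOOPBACK_HOSTS = {"127.0.0.1", "::1"}
INTERNAL_PREFIXES = ("/_internal/mcp/", "/_internal/a2a/")


def sign(path: str, auth_context: str) -> str:
    """Hex HMAC-SHA256 over the path and encoded auth context (assumed layout)."""
    message = f"{path}\n{auth_context}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def is_trusted_internal(
    client_host: str,
    path: str,
    headers: Mapping[str, str],
    a2a_enabled: bool = True,
) -> bool:
    """Return True only when every trust condition holds."""
    if not path.startswith(INTERNAL_PREFIXES):
        return False
    if path.startswith("/_internal/a2a/") and not a2a_enabled:
        return False
    # Loopback is defense-in-depth; the HMAC check below is the real boundary.
    if client_host not in LOOPBACK_HOSTS:
        return False
    if headers.get("x-runtime-marker") not in TRUSTED_MARKERS:
        return False
    auth_context = headers.get("x-auth-context", "")
    # Every internal route except */authenticate must carry an auth context.
    if not auth_context and not path.endswith("/authenticate"):
        return False
    expected = sign(path, auth_context)
    return hmac.compare_digest(expected, headers.get("x-internal-hmac", ""))


rpc = "/_internal/mcp/rpc"
genuine = {
    "x-runtime-marker": "affinity",
    "x-auth-context": "encoded-context",
    "x-internal-hmac": sign(rpc, "encoded-context"),
}
forged = {**genuine, "x-internal-hmac": "0" * 64}

print(is_trusted_internal("127.0.0.1", rpc, genuine))  # genuine loopback forward
print(is_trusted_internal("127.0.0.1", rpc, forged))   # forged HMAC fails the gate
```

Comparing digests with hmac.compare_digest avoids leaking timing information, which matters when the HMAC rather than the client address is treated as the boundary.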
The trust gate
Three near-duplicate trust checks (in main, token_scoping, and the HMAC helper) are replaced by one gate in auth_context:

A request is trusted only when all of the following hold:

Notes:

- ProxyHeaders(trusted_hosts="*") makes request.client.host influenced by X-Forwarded-For for genuinely external requests.
- */authenticate creates the context, so it is the only internal route that does not require the header.
- /_internal/a2a/* is never trusted when A2A is disabled.
- token_scoping previously trusted only the rust marker. It now delegates to the shared gate and so also trusts the affinity marker, closing the gap where the in-process affinity dispatch was still token-scoped.

Middleware exemptions
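The general shape of a gate-based exemption can be sketched with a toy rate limiter. This is an illustration only, not the gateway's RateLimit middleware: the class, the request dictionary, and the gate_passed flag standing in for the shared trust check are all hypothetical.

```python
from collections import Counter
from typing import Callable


class RateLimiter:
    """Toy per-client limiter that skips requests accepted by a trust predicate."""

    def __init__(self, limit: int, is_trusted: Callable[[dict], bool]) -> None:
        self.limit = limit
        self.is_trusted = is_trusted
        self.counts: Counter = Counter()

    def allow(self, request: dict) -> bool:
        # Trusted internal hop: the originating request was already counted.
        if self.is_trusted(request):
            return True
        # Everything else, including a hop that fails the gate, is counted normally.
        self.counts[request["client"]] += 1
        return self.counts[request["client"]] <= self.limit


# Stand-in for the shared trust gate (assumption): a precomputed boolean.
limiter = RateLimiter(limit=2, is_trusted=lambda req: req.get("gate_passed", False))

originating = {"client": "alice", "path": "/servers/1/mcp"}
internal_hop = {"client": "alice", "path": "/_internal/mcp/rpc", "gate_passed": True}
forged_hop = {"client": "alice", "path": "/_internal/mcp/rpc", "gate_passed": False}

limiter.allow(originating)   # counted once
limiter.allow(internal_hop)  # skipped, so no double count
print(limiter.counts["alice"])
limiter.allow(forged_hop)    # fails the gate, so it is counted
print(limiter.counts["alice"])
```

The skip depends on the predicate's result and never on the path alone, so the forged request to the same internal path still goes through normal enforcement.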
CSRF change

/_internal/mcp/ is removed from the default csrf_exempt_paths. The skip is now contingent on the full trust gate rather than a static path prefix. This is strictly tighter: a path-only exemption left a CSRF-free URL reachable by any client, whereas the gate-based skip requires a valid HMAC from a loopback client. The legitimate loopback forward still skips CSRF; a forged external request to the same path now hits CSRF enforcement.

Loopback passthrough hardening

_LOOPBACK_SKIP_HEADERS is broadened to strip forwarded and client-IP headers (forwarded, x-forwarded-*, x-real-ip, cf-connecting-ip, true-client-ip) so the in-process replay cannot carry a spoofable client address into the internal hop.

What is intentionally NOT exempted

Route-level enforcement is unchanged and still runs on every internal request: the route-level trust check, the forwarded-user build, the internal-request authorization, the server-scope enforcement, method authorization, the HMAC verification, and the auth-context decode. The middleware exemptions remove duplicated edge work; they do not relax the route's own authorization.
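Returning to the loopback passthrough hardening described above, the header filtering might look roughly like the sketch below. The header list is taken from the PR description; the function name, the constant names, and the split into exact names and a prefix are assumptions, and the real _LOOPBACK_SKIP_HEADERS may be structured differently.

```python
from typing import Dict

# Header names from the PR description; the exact/prefix split is an assumption.
SKIP_EXACT = frozenset({"forwarded", "x-real-ip", "cf-connecting-ip", "true-client-ip"})
SKIP_PREFIXES = ("x-forwarded-",)


def strip_client_address_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Drop headers that could carry a spoofed client address into the replay."""
    kept = {}
    for name, value in headers.items():
        lowered = name.lower()  # HTTP header names are case-insensitive
        if lowered in SKIP_EXACT or lowered.startswith(SKIP_PREFIXES):
            continue
        kept[name] = value
    return kept


incoming = {
    "Authorization": "Bearer example",
    "X-Forwarded-For": "198.51.100.7",
    "X-Forwarded-Proto": "https",
    "Forwarded": "for=198.51.100.7",
    "True-Client-IP": "198.51.100.7",
    "Content-Type": "application/json",
}
print(sorted(strip_client_address_headers(incoming)))
```

Matching on the lowercased name keeps the filter effective regardless of how an upstream proxy capitalizes the header.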
Testing
Unit
test_internal_runtime_trust.py: allow and deny for the gate (including the HMAC-is-the-boundary and X-Forwarded-For cases), the affinity marker, A2A enable/disable, the prefix allowlist, token_scoping trusting affinity, and the forwarded-header stripping.

Live verification (3 replicas × 24 workers, affinity on)

Built an image from this branch and ran the multi-worker session-affinity stack: make docker (tags mcpgateway/mcpgateway:latest), then make testing-up (3 replicas × 24 gunicorn workers, USE_STATEFUL_SESSIONS=true, MCPGATEWAY_SESSION_AFFINITY_ENABLED=true, CACHE_TYPE=redis, behind nginx on :8080). Requests round-robin through nginx across the 72 workers with no sticky load balancing, so they land on non-owner workers and must be forwarded to the owner, which is exactly the in-process /_internal/mcp/rpc hop this PR exempts from duplicate edge middleware.

Setup: counter server, registration, and the session driver
Minimal per-session counter (streamable HTTP). State is keyed by the upstream session, so a scatter onto a fresh connection would show a reset counter, which is exactly what the test detects:

Mint an admin token, register the counter as a gateway (discovery runs against the live counter), and scope its tools into a virtual server:

Each test drives sessions against /servers/{id}/mcp:

T1: single session, 25 increments (PASS)

Steps: one session; increment ×25, then get_value.

Result:
Requests round-robin through nginx across the 3 × 24 replicas, so they hit non-owner workers and forward to the owner; the strictly monotonic counter proves the forward reused the single bound upstream session (no scatter). Exempting the edge middleware on the internal hop did not break per-session isolation.
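The reasoning behind the monotonic check can be shown with a small in-memory simulation. It does not reproduce the counter server or the session driver used in the test; the class and helper names are hypothetical and only model why a scatter onto fresh upstream connections would be visible as a reset counter.

```python
from typing import Callable, List


class CounterUpstream:
    """Stand-in for one upstream session holding a counter."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self) -> int:
        self.value += 1
        return self.value


def drive(n: int, pick_upstream: Callable[[int], CounterUpstream]) -> List[int]:
    """Send n increments, letting pick_upstream choose the upstream per request."""
    return [pick_upstream(i).increment() for i in range(n)]


def increments_by_one(values: List[int]) -> bool:
    """True when the sequence starts at 1 and each value follows the previous by 1."""
    return bool(values) and values[0] == 1 and all(
        later == earlier + 1 for earlier, later in zip(values, values[1:])
    )


bound = CounterUpstream()
bound_values = drive(25, lambda i: bound)                  # always the same session
scattered_values = drive(25, lambda i: CounterUpstream())  # fresh session every time

print(increments_by_one(bound_values), bound_values[-1])
print(increments_by_one(scattered_values), scattered_values[-1])
```

In the bound case the last value equals the number of increments; in the scattered case every response restarts at 1, which is the failure signature the test is designed to catch.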
T2: owner-worker kill / recovery (PASS)

Steps: bind a session and increment ×5; read its owner from Redis (mcpgw:pool_owner:<sid> = host:pid); kill -9 the owning gunicorn worker; hit the stale session id; then initialize a fresh session.

Result:

When the owning worker dies, the in-memory upstream is gone; a request on the stale id returns a clean structured 404 / JSON-RPC -32600 "Session not found" (not a hang or 5xx), and a fresh initialize binds and works. The middleware exemptions do not change this recovery contract.

T3: exemption is gated (PASS)
Steps: from outside the trust boundary, POST to the internal endpoints with missing or forged trust headers. Every case must be rejected; there is no blanket path exemption.

Result:

A genuine loopback forward (valid HMAC + auth context) and a forged external request take the same /_internal/mcp/rpc path with opposite outcomes (200 vs 403), decided only by the trust gate. The forged-no-bearer case being rejected by CSRF confirms the path is no longer blanket-exempt.

Throughput
make benchmark-mcp-tools (125 users, 60s) × 3 runs

Tools-only benchmark against the same stack, three back-to-back 60s runs:
About 90,700 forwarded requests with a single failure, well above the broken-affinity baseline (~15 RPS). The exemptions only remove work on the internal hop, so they do not regress throughput. Absolute RPS is host-load dependent; these are indicative runs on a local colima VM.
Compatibility