Reverse proxy in Go acting as an OAuth 2.1 Authorization Server compatible with the MCP auth spec, federating authentication to any OIDC Identity Provider (Keycloak, Microsoft Entra ID, Auth0, Okta, Google...) via auto-discovery. It lets MCP clients (claude.ai, Claude Code) access a private MCP server through a standard PKCE flow.
Companion docs: docs/conformance.md (RFC claim matrix + IdP evidence), docs/threat-model.md (STRIDE coverage with code + test + runbook links).
This proxy MUST conform to the following specifications:
| Spec | Usage |
|---|---|
| OAuth 2.1 (draft-ietf-oauth-v2-1-13) | Authorization code + PKCE flow, token handling, security requirements. Draft-13 is the version pinned by MCP Authorization 2025-06-18; the IETF OAuth WG has since published draft-15 — deltas tracked separately and not yet adopted here |
| RFC 8414 — OAuth 2.0 Authorization Server Metadata | GET /.well-known/oauth-authorization-server discovery endpoint |
| RFC 7591 — OAuth 2.0 Dynamic Client Registration | POST /register — automatic client registration for MCP clients |
| RFC 9728 — OAuth 2.0 Protected Resource Metadata | GET /.well-known/oauth-protected-resource — MCP clients discover the AS through this endpoint; WWW-Authenticate header on 401 responses MUST include resource_metadata URL |
| RFC 8707 — Resource Indicators for OAuth 2.0 | resource parameter accepted in /authorize and /token requests |
| RFC 7636 — PKCE | code_verifier must be 43-128 characters, code_challenge_method must be S256 |
| MCP Authorization Spec (2025-06-18) | End-to-end MCP auth flow combining the above RFCs |
- Claude's OAuth callback URL: `https://claude.ai/api/mcp/auth_callback` (may migrate to `https://claude.com/api/mcp/auth_callback`)
- Claude's OAuth client name: `Claude`
- Claude supports Dynamic Client Registration (DCR)
- Claude supports both SSE and Streamable HTTP transports (SSE may be deprecated)
- Claude supports token expiry and refresh
MCP Client (claude.ai / Claude Code)
│
│ 0. MCP request → 401 + WWW-Authenticate header
│ 1. GET /.well-known/oauth-protected-resource (RFC 9728)
│ 2. GET /.well-known/oauth-authorization-server (RFC 8414)
│ 3. POST /register (RFC 7591 Dynamic Client Registration)
│ 4. GET /authorize (PKCE + resource param)
│ 5. POST /token (+ resource param)
│ 6. MCP requests with Bearer token
▼
mcp-auth-proxy (this service)
│
│ Federates auth → OIDC IdP (Keycloak, Entra, Auth0...)
│ Validates incoming Bearer tokens
│ Forwards requests to upstream MCP server
▼
Upstream MCP Server (target, unmodified)
All transient OAuth state (client registrations, authorize sessions, authorization codes, refresh tokens) is AES-GCM encrypted into the tokens and URL parameters themselves. Any instance sharing the same TOKEN_SIGNING_SECRET can handle any request — no shared storage, no sticky sessions.
| Flow state | Encrypted into | Carries audience? |
|---|---|---|
| Client registration | client_id (encrypted blob, 7d default TTL, configurable via CLIENT_REGISTRATION_TTL) | yes |
| Authorize session | IdP state parameter (encrypted blob, 10min TTL) | yes |
| Authorization code | code parameter (encrypted blob, 60s TTL) | yes |
| Access token | Opaque token (encrypted claims, 1h TTL) | yes |
| Refresh token | Opaque token (encrypted claims + iat, 7d TTL) | yes |
Every sealed payload carries the proxy's PROXY_BASE_URL as an audience field, populated at creation time and verified on every open. Two deployments accidentally sharing the same TOKEN_SIGNING_SECRET (e.g. by copy-pasted Helm values, mirrored DR configs, or a shared Secret) cannot replay each other's tokens — the receiving instance rejects any payload whose audience does not match its own PROXY_BASE_URL.
The check is enforced in:
- `middleware/auth.go:Validate`: access token bearer check
- `handlers/authorize.go`: `sealedClient` open
- `handlers/callback.go`: `sealedSession` open (before the IdP exchange runs)
- `handlers/token.go:handleAuthorizationCode`: `sealedCode` and `sealedClient` open
- `handlers/token.go:handleRefreshToken`: `sealedRefresh` and `sealedClient` open
Within a single deployment this is invisible: the audience always matches and nothing changes for clients. The cost is one string comparison per check.
Trade-offs:
- No per-token revocation without a shared store. Mitigated by short access token TTL (1h), PKCE preventing code replay, and `REVOKE_BEFORE` for bulk revocation.
- Authorization codes are replayable within their 60-second TTL when no replay store is configured. Mitigated by PKCE (the attacker also needs the `code_verifier`) and the short window. Set `REDIS_URL` to make codes strictly single-use across replicas (RFC 6749 §4.1.2).
- Bulk revocation via `REVOKE_BEFORE`: set to the current timestamp and redeploy; all existing access tokens AND refresh tokens with `iat` before the cutoff are rejected. Refresh tokens carry their own `iat` so an attacker holding a leaked refresh cannot keep minting fresh access tokens past the cutoff. Incident response: rotate `REVOKE_BEFORE` and watch a `kubectl rollout status` complete before assuming the cutoff is enforced fleet-wide.
REDIS_REQUIRED defaults to true: the proxy fails startup with Fatal if REDIS_URL is unset. Stateless mode is an explicit opt-out (REDIS_REQUIRED=false) for dev or single-replica deployments that accept the trade-off.
Set REDIS_URL (e.g. redis://redis:6379/0, or rediss:// for TLS) to enable two layered protections backed by Redis SET NX / EXISTS. All Redis keys are namespaced with REDIS_KEY_PREFIX (default mcp-auth-proxy:) so multiple proxy deployments can safely share a single Redis DB without key collisions:
- Single-use authorization codes. Each code carries a unique `tid` (UUID); `/token` claims the key atomically, so a second exchange attempt is rejected with `invalid_grant` + `error_code: code_replay`. Claim TTL matches the remaining code lifetime.
- Refresh rotation with reuse detection (RFC 6749 §10.4 / OAuth 2.1 §6.1). Each refresh carries a unique `tid` and a `fam` (family ID). The `fam` is seeded at `/callback` (on the sealed code) and inherited by every refresh that descends from it, so a replayed authorization code (per RFC 6749 §4.1.2) and a replayed refresh both target the same family marker: the legitimate holder and the attacker are revoked together. On rotation the old `tid` is claimed; replaying an already-rotated token past the `REFRESH_RACE_GRACE_SEC` window is detected as reuse, revokes the whole family (`refresh_family_revoked:<fam>` marker, 7-day TTL), and any subsequent use of any sibling refresh is rejected with `error_code: refresh_family_revoked`. The compromised lineage stops minting tokens; both parties are forced back through `/authorize`. A racing second submit inside the grace window (parallel-tab refresh, slow-network double-submit) returns 429 `refresh_concurrent_submit` instead: the legitimate peer's rotation already succeeded, no family is killed, and the racing client retries with the new refresh once it lands.
On Redis failure the handler fails closed (503 server_error / error_code: replay_store_unavailable) rather than issuing tokens against an unknown replay state. When REDIS_URL is unset the proxy stays fully stateless: codes remain replayable within the 60s TTL (mitigated by PKCE), and refresh tokens rotate without reuse detection.
Three defaults were flipped to enforce the strict OAuth 2.1 / MCP posture by default. An operator pulling a new image without re-reading the config table will hit a hard Fatal at startup for the first two; the third changes the shape of the /authorize response.
- `UPSTREAM_MCP_URL` now requires an explicit path. Origin-only URLs (`http://backend`, `http://backend/`) used to be the only legal shape; they are now rejected. The path is the proxy's public mount AND the path forwarded upstream, so pick what your upstream actually serves (FastMCP default: `/mcp`). The path is also restricted to RFC 3986 unreserved characters plus `/`, so `:`, `*`, `{`, `}`, `@`, `+` etc. are rejected; they would otherwise silently register chi router patterns instead of literal segments.
- `PROD_MODE` now defaults to `true`. Was `false`. Strict mode rejects every relaxation flag (`PKCE_REQUIRED=false`, `COMPAT_ALLOW_STATELESS=true`, `REDIS_REQUIRED=false`, `REDIS_URL` empty, legacy `TRUST_PROXY_HEADERS=true` without `TRUSTED_PROXY_CIDRS`). Existing dev / single-replica deployments that depended on those flags must set `PROD_MODE=false` explicitly.
- `RENDER_CONSENT_PAGE` now defaults to `true`. Was `false`. `/authorize` now returns a 200 HTML consent page instead of a 302 to the IdP; the user clicks Approve or Deny on a `<form action="/consent" method="POST">`. Closes the open-DCR-plus-active-IdP-session silent-issuance phishing class. Browser-driven MCP clients (claude.ai, Claude Code, Cursor, MCP Inspector, ChatGPT) follow the form transparently. If you drive `/authorize` from a non-browser caller (CI rig, scrape test, headless agent that expected a 302), you have two options: set `RENDER_CONSENT_PAGE=false` to keep the legacy silent redirect, or update the caller to handle a 200 HTML response with a `consent_token`-bearing form (see `keycloak_e2e_test.go::approveConsent` for the reference walk-through).
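The `UPSTREAM_MCP_URL` rules can be sketched as a startup validator. This is an illustration, not the code in `config/config.go`: the function name `mountPath` is hypothetical, and treating a reserved route as a collision on exact match or as a path prefix is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// reservedRoutes lists the control-plane routes a mount must not collide with.
var reservedRoutes = []string{"/healthz", "/register", "/authorize", "/callback", "/token", "/.well-known"}

// isUnreservedOrSlash accepts RFC 3986 unreserved characters plus '/'.
func isUnreservedOrSlash(r rune) bool {
	switch {
	case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
		return true
	}
	return strings.ContainsRune("-._~/", r)
}

// mountPath validates the upstream URL and returns its path component.
func mountPath(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", errors.New("host is required")
	}
	if u.RawQuery != "" || u.Fragment != "" || u.User != nil {
		return "", errors.New("query, fragment and userinfo are not allowed")
	}
	if u.Path == "" || u.Path == "/" {
		return "", errors.New("an explicit path is required")
	}
	for _, r := range u.Path {
		if !isUnreservedOrSlash(r) {
			return "", fmt.Errorf("character %q is not allowed in the path", r)
		}
	}
	for _, p := range reservedRoutes {
		if u.Path == p || strings.HasPrefix(u.Path, p+"/") {
			return "", fmt.Errorf("path collides with reserved route %s", p)
		}
	}
	return u.Path, nil
}

func main() {
	for _, raw := range []string{"http://mcp-server:8080/mcp", "http://backend/", "http://backend/{id}"} {
		path, err := mountPath(raw)
		fmt.Printf("%-30s path=%q err=%v\n", raw, path, err)
	}
}
```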
All configuration is via environment variables.
| Variable | Description | Example |
|---|---|---|
| `OIDC_ISSUER_URL` | OIDC Issuer URL (auto-discovery via /.well-known/openid-configuration) | `https://keycloak.example.com/realms/myrealm` or `https://login.microsoftonline.com/{tenant}/v2.0` |
| `OIDC_CLIENT_ID` | OIDC client ID registered with the IdP | `xxxxxxxx-...` |
| `OIDC_CLIENT_SECRET` | OIDC client secret | `...` |
| `PROXY_BASE_URL` | Public URL of this proxy | `https://mcp-proxy.example.com` |
| `UPSTREAM_MCP_URL` | Upstream MCP URL. Path is mandatory and is used verbatim as both the proxy's public mount AND the path forwarded upstream. Query, fragment, userinfo, origin-only URLs (no path / lone /), and paths that collide with a reserved control-plane route (/healthz, /register, /authorize, /callback, /token, /.well-known) are rejected at startup. | `http://mcp-server:8080/mcp` |
| `LISTEN_ADDR` | Bind address | `:8080` |
| `METRICS_ADDR` | Prometheus metrics bind address. Default 127.0.0.1:9090: loopback only, so /metrics and /readyz are never exposed on the public interface. Override to :9090 or a specific interface when a Prometheus scraper must reach the pod over the network | `127.0.0.1:9090` (default) |
| `TOKEN_SIGNING_SECRET` | Secret for AES-GCM opaque tokens (min 32 bytes, shared across all instances) | `...` |
| `LOG_LEVEL` | Zap log level (debug / info / warn / error) | `info` (default) |
| `GROUPS_CLAIM` | Flat claim name in the OIDC id_token containing user groups | `groups` (default) |
| `ALLOWED_GROUPS` | Comma-separated group allowlist. Empty = allow all authenticated users | `admin,mcp-users` |
| `REVOKE_BEFORE` | RFC3339 timestamp. Both access tokens AND refresh tokens with iat before this are rejected (bulk revocation). Empty = disabled | `2026-03-28T12:00:00Z` |
| `PKCE_REQUIRED` | Require PKCE on /authorize (default true). Set false for Cursor, MCP Inspector, ChatGPT compat | `true` |
| `SHUTDOWN_TIMEOUT` | Graceful shutdown deadline. Raise above the longest expected SSE stream so rolling deploys do not cut MCP sessions mid-stream. Match terminationGracePeriodSeconds in K8s | `120s` (default) |
| `REDIS_URL` | Enables single-use authorization codes and refresh rotation with reuse detection (OAuth 2.1 §6.1) across replicas. rediss:// for TLS. On Redis failure the proxy fails closed (503) | `redis://redis:6379/0` |
| `REDIS_REQUIRED` | Fail startup (logger.Fatal) when REDIS_URL is unset. Default true: stateless mode leaves authorization codes / refresh tokens replayable within their TTL (findings C3/C4). Set false only for dev or single-replica deployments that accept the trade-off | `true` (default) |
| `REDIS_KEY_PREFIX` | Prefix applied to every Redis key (default mcp-auth-proxy:). Override when sharing a Redis DB between multiple proxy deployments to avoid key collisions. Set explicitly to empty (REDIS_KEY_PREFIX=) to opt out of namespacing | `prod-mcp:` |
| `RATE_LIMIT_ENABLED` | Per-IP rate limiting on pre-auth endpoints and on the authenticated MCP route (default true). Keyed on the stripped RemoteAddr by default; set TRUST_PROXY_HEADERS=true to honor X-Forwarded-For/X-Real-IP/True-Client-IP behind a trusted frontend | `true` |
| `TRUST_PROXY_HEADERS` | Legacy blanket trust of X-Forwarded-For/X-Real-IP/True-Client-IP when keying the rate limiter (default false). Prefer TRUSTED_PROXY_CIDRS; PROD_MODE=true rejects this flag unless CIDRs are configured, otherwise a direct client can trivially mint its own rate-limit key and bypass the limiter | `false` (default) |
| `MCP_PER_SUBJECT_CONCURRENCY` | Per-subject in-flight request cap on the authenticated MCP route (default 16). A runaway or compromised client identity cannot saturate the proxy / upstream pool at the expense of others. Entries for subjects with no in-flight work are reclaimed by a background pruner after ≥5 min idle so map memory stays proportional to active principals, not the lifetime set of ever-seen subjects. 0 disables the limit. Excess requests return 503 temporarily_unavailable with Retry-After: 1 and increment `mcp_auth_access_denied_total{reason="subject_concurrency_exceeded"}` | `16` (default) |
| `COMPAT_ALLOW_STATELESS` | When true, /authorize synthesizes a state server-side if the client omits it (legacy MCP Inspector / Cursor). Default false: strict mode refuses with 400 invalid_request because a silent server-synth hides client-side CSRF bugs. `mcp_auth_access_denied_total{reason="state_missing"}` is incremented either way so operators can see how many clients still rely on the compat path | `false` (default) |
| `MCP_LOG_BODY_MAX` | Max bytes buffered per authenticated request for JSON-RPC method extraction into access logs (default 65536). 0 disables buffering: no rpc_method/rpc_tool/rpc_id fields are emitted. Only triggered when Content-Type: application/json and Content-Length is set and within the limit; SSE / chunked uploads pass through untouched | `65536` (default) |
| `ACCESS_LOG_SKIP_RE` | Go RE2 regexp matched against r.URL.Path on the public listener only. Matching paths are dropped from the access log; handler response, Prometheus counters, and panic recovery are unaffected. Compiled once at startup; invalid pattern is fatal. RE2 is linear-time, so there is no ReDoS surface. Whitespace-only values are treated as unset. /readyz and /metrics live on METRICS_ADDR and never reach this middleware. Always anchor with `^…$`; unanchored substrings can match unrelated upstream paths and `.*` silences the entire access log | `^/healthz$` |
| `PROD_MODE` | Strict-posture gate. Default true: fails startup if any compatibility flag that weakens a security control is set (PKCE_REQUIRED=false, COMPAT_ALLOW_STATELESS=true, REDIS_REQUIRED=false, REDIS_URL empty, or legacy TRUST_PROXY_HEADERS=true without TRUSTED_PROXY_CIDRS). Set PROD_MODE=false explicitly for dev / single-replica work that needs one of the relaxation toggles | `true` (default) |
| `TRUSTED_PROXY_CIDRS` | Comma-separated CIDRs of peers whose X-Forwarded-For/X-Real-IP/True-Client-IP headers are honored for rate-limit keying. Other peers fall back to RemoteAddr. Preferred over TRUST_PROXY_HEADERS; takes precedence when both are set | `10.0.0.0/8,172.16.0.0/12,192.168.0.0/16` |
| `MCP_RESOURCE_NAME` | Optional human-readable display name advertised under resource_name in the RFC 9728 PRM. Used by MCP clients for consent / UI display. Field is omitted when unset | `ACME MCP` |
| `UPSTREAM_AUTHORIZATION_HEADER` | When non-empty, sent verbatim as the Authorization header on every request to the upstream MCP backend (full value incl. scheme, e.g. Bearer s3cr3t). Treat as a secret: mount from a Secret, not a ConfigMap | `Bearer xyz` |
| `TOKEN_SIGNING_SECRETS_PREVIOUS` | Whitespace-separated retired signing secrets accepted on Open during a rolling rotation. New seals always use TOKEN_SIGNING_SECRET (primary); Open tries primary first, then each previous entry. Each entry must be ≥32 bytes | `<old1> <old2>` |
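How `TRUSTED_PROXY_CIDRS` could drive rate-limit keying is sketched below. The helper names are hypothetical, only `X-Forwarded-For` is handled, and using the right-most entry of that header (the one appended by the trusted peer) is an assumption of this sketch rather than documented proxy behavior.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseCIDRs turns a comma-separated list into prefixes, skipping empty items.
func parseCIDRs(csv string) ([]netip.Prefix, error) {
	var out []netip.Prefix
	for _, part := range strings.Split(csv, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		p, err := netip.ParsePrefix(part)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", part, err)
		}
		out = append(out, p.Masked())
	}
	return out, nil
}

// rateLimitKey returns the forwarded client IP only when the direct peer is
// inside a trusted prefix; every other peer is keyed on its own address.
func rateLimitKey(remoteAddr, xff string, trusted []netip.Prefix) string {
	ap, err := netip.ParseAddrPort(remoteAddr)
	if err != nil {
		return remoteAddr
	}
	peer := ap.Addr().Unmap()
	for _, p := range trusted {
		if !p.Contains(peer) {
			continue
		}
		parts := strings.Split(xff, ",")
		last := strings.TrimSpace(parts[len(parts)-1])
		if client, err := netip.ParseAddr(last); err == nil {
			return client.String()
		}
		break // trusted peer, but no usable header value
	}
	return peer.String()
}

func main() {
	trusted, _ := parseCIDRs("10.0.0.0/8,172.16.0.0/12,192.168.0.0/16")
	fmt.Println(rateLimitKey("10.1.2.3:5555", "203.0.113.7", trusted))
	fmt.Println(rateLimitKey("198.51.100.9:4444", "1.2.3.4", trusted))
}
```

A direct client outside the trusted ranges cannot pick its own key, because its header is ignored.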
mcp-auth-proxy/
├── main.go
├── go.mod
├── go.sum
├── config/
│ └── config.go # env parsing, validation
├── handlers/
│ ├── helpers.go # OAuthError, sealed types, writeJSON, isLoopback
│ ├── resource_metadata.go # GET /.well-known/oauth-protected-resource (RFC 9728)
│ ├── discovery.go # GET /.well-known/oauth-authorization-server (RFC 8414)
│ ├── register.go # POST /register (RFC 7591 DCR)
│ ├── authorize.go # GET /authorize (+ resource param, RFC 8707)
│ ├── callback.go # GET /callback (OIDC IdP return)
│ └── token.go # POST /token (+ resource param, RFC 8707)
├── middleware/
│ └── auth.go # Bearer token validation on MCP routes
├── proxy/
│ └── proxy.go # reverse proxy to upstream MCP server
├── token/
│ └── token.go # AES-GCM seal/open, access token issue/validate
├── replay/
│ ├── replay.go # Store interface + ErrAlreadyClaimed
│ ├── redis.go # Redis-backed Store (SET NX, SET, EXISTS with prefix)
│ └── memory.go # In-process Store (tests / single replica)
├── metrics/
│ └── metrics.go # Prometheus counters for security events
└── Dockerfile
See go.mod — kept inline previously, now the source of truth lives next to the code.
Response 200 JSON:
{
"resource": "{PROXY_BASE_URL}/",
"authorization_servers": ["{PROXY_BASE_URL}"],
"bearer_methods_supported": ["header"],
"scopes_supported": [],
"resource_name": "{MCP_RESOURCE_NAME if set}"
}

MCP clients use this endpoint to discover which authorization server protects this resource. No authentication required.
Intentional deviation from RFC 9728 §3. The spec reads the resource value at the origin-root PRM as the identifier into which the well-known suffix was inserted — i.e. {PROXY_BASE_URL} without the trailing slash. We instead advertise {PROXY_BASE_URL}/ because Claude.ai canonicalizes RFC 8707 resource indicators with a trailing slash; stripping the slash here would cause resource-param equality checks on every /authorize and /token call from Claude.ai to fail. matchResource (handlers/helpers.go) is trailing-slash insensitive, so clients that send the spec-strict form without the slash still validate correctly. Strict-spec clients that want the canonical form should fetch the per-mount variant below, which advertises exactly {PROXY_BASE_URL}<mount> with no suffix.
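A trailing-slash-insensitive comparison can be as small as the sketch below. The body of `matchResource` is a guess at the behavior described here, not a copy of `handlers/helpers.go`, and `acceptedResource` is a hypothetical wrapper that also accepts the per-mount form.

```go
package main

import (
	"fmt"
	"strings"
)

// matchResource compares two resource identifiers, ignoring one trailing slash.
func matchResource(got, want string) bool {
	return strings.TrimSuffix(got, "/") == strings.TrimSuffix(want, "/")
}

// acceptedResource accepts the origin-root resource or the mount resource.
func acceptedResource(got, proxyBase, mount string) bool {
	return matchResource(got, proxyBase) || matchResource(got, proxyBase+mount)
}

func main() {
	base := "https://mcp-proxy.example.com"
	fmt.Println(acceptedResource(base+"/", base, "/mcp"))                // Claude.ai form
	fmt.Println(acceptedResource(base, base, "/mcp"))                    // spec-strict form
	fmt.Println(acceptedResource("https://evil.example", base, "/mcp")) // unrelated resource
}
```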
scopes_supported is emitted as an empty array: the proxy has no scope model (scopes are not parsed at /authorize, not sealed into access tokens, not checked by the RS middleware). Publishing [] is more informative than omitting — least-privilege-aware clients see a concrete "no scopes" signal rather than having to probe.
resource_name is advertised when MCP_RESOURCE_NAME is set; omitted otherwise. Clients use it for consent / UI display.
Where <mount> is the path component of UPSTREAM_MCP_URL (e.g. /mcp, /api/v1/mcp). Same shape, resource = {PROXY_BASE_URL}<mount>. This variant is spec-strict: the resource value matches the identifier into which the well-known suffix was inserted, no trailing slash. MCP clients that follow RFC 9728 §3.1 per-resource discovery fetch this path and get the canonical form.
Also served at /.well-known/oauth-authorization-server<mount> (non-spec compat for MCP clients that probe the per-resource suffix), where <mount> is the path component of UPSTREAM_MCP_URL (e.g. /mcp, /api/v1/mcp). Both paths return the same document.
Response 200 JSON:
{
"issuer": "{PROXY_BASE_URL}",
"authorization_endpoint": "{PROXY_BASE_URL}/authorize",
"token_endpoint": "{PROXY_BASE_URL}/token",
"registration_endpoint": "{PROXY_BASE_URL}/register",
"response_types_supported": ["code"],
"grant_types_supported": ["authorization_code", "refresh_token"],
"code_challenge_methods_supported": ["S256"],
"token_endpoint_auth_methods_supported": ["none"],
"scopes_supported": []
}

PKCE-only proxy: no client secrets are validated. scopes_supported is an explicit empty array because the proxy carries no scope model (see PRM note above). No authentication required on this endpoint.
Request body (JSON):
{
"redirect_uris": ["https://claude.ai/..."],
"client_name": "Claude",
"token_endpoint_auth_method": "none"
}

Behavior:
- Validate that `redirect_uris` is present and non-empty
- Reject `client_name` longer than 512 bytes so unauthenticated registrations cannot amplify into oversized logs or sealed `client_id` responses
- OAuth 2.1 §2.3.1: each `redirect_uri` must use HTTPS, or HTTP when pointing at a loopback host. Loopback is recognized via `net.ParseIP().IsLoopback()` (covers the full 127/8 range, `::1`, `::ffff:127.0.0.1`, `::0.0.0.1`) plus the literal `localhost` / `localhost.`. Non-http(s) schemes (e.g. `ftp://`, `ldap://`, `file://`, custom app schemes) are rejected unconditionally even when the host is loopback
- Generate an internal UUID for the client
- Encrypt the whole `{ id, redirect_uris, client_name, expires_at }` with AES-GCM → this is the returned `client_id`
- TTL embedded in the encrypted blob: 7d default, configurable via `CLIENT_REGISTRATION_TTL` (Go duration, capped at 90d). The 7d default matches `refreshTokenTTL` so a client holding a still-valid refresh token can always exchange it; a shorter TTL silently kills long-running MCP clients (which treat DCR as one-shot at startup) the moment their access token first expires.
- Request body limited to 1 MB (`MaxBytesReader`)
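The redirect URI rule can be sketched as below. The function names are illustrative (the layout listing mentions an `isLoopback` helper, but this body is not copied from it), and the sketch covers only the scheme, loopback, host, fragment, userinfo, and opaque checks, not the count and length limits.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// isLoopback recognizes loopback IP literals plus the localhost names.
func isLoopback(host string) bool {
	if host == "localhost" || host == "localhost." {
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

// validRedirectURI accepts https anywhere and http only on loopback hosts.
func validRedirectURI(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" || u.Opaque != "" || u.Fragment != "" || u.User != nil {
		return false
	}
	switch u.Scheme {
	case "https":
		return true
	case "http":
		return isLoopback(u.Hostname())
	}
	return false // ftp, ldap, file, custom app schemes
}

func main() {
	for _, raw := range []string{
		"https://claude.ai/api/mcp/auth_callback",
		"http://127.0.0.1:3000/cb",
		"http://example.com/cb",
		"ftp://127.0.0.1/cb",
	} {
		fmt.Printf("%-45s %v\n", raw, validRedirectURI(raw))
	}
}
```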
Response 201 JSON:
Headers: Cache-Control: no-store, Pragma: no-cache.
{
"client_id": "<encrypted blob>",
"client_id_issued_at": 1234567890,
"client_id_expires_at": 1234654290,
"redirect_uris": ["..."],
"client_name": "<echoed if submitted>",
"token_endpoint_auth_method": "none"
}

client_id_expires_at (RFC 7591 §3.2.1) is the UNIX timestamp at which the sealed client_id stops opening (default client_id_issued_at + 7d, configurable via CLIENT_REGISTRATION_TTL). Clients that cache the handle should re-register before this time to avoid a 400 on /authorize and on /token refresh-token rotations.
Error responses use RFC 7591 §3.2.2 codes: invalid_redirect_uri for any redirect_uri-shape defect (missing, over-count, over-length, malformed, opaque, hostless, fragment-bearing, userinfo-bearing, or non-https-non-loopback); invalid_client_metadata for unsupported token_endpoint_auth_method, over-length client_name, or client_name containing control bytes (NUL/CR/LF/TAB) or the X-User-Groups delimiter , (the field is sealed into the returned client_id and emitted to logs; raw control bytes would smuggle past zap's JSON-escaping when downstream code unsealed and parsed it); invalid_request for structural problems: 400 with "invalid JSON body" for a malformed body, 413 with "request body exceeds the 1 MB cap" when the body crosses MaxBytesReader.
Query params:
- `response_type=code` (required, reject otherwise)
- `client_id` (required, decrypt and validate not expired)
- `redirect_uri` (required, must match a registered URI, exact match)
- `code_challenge` (required if `PKCE_REQUIRED=true`, optional otherwise; 43-128 unreserved characters per RFC 7636)
- `code_challenge_method=S256` (required if `code_challenge` present)
- `state` (required by default; strict mode rejects `/authorize` with 400 `invalid_request` when absent. Set `COMPAT_ALLOW_STATELESS=true` to keep the legacy server-synth behavior for Cursor / MCP Inspector; `mcp_auth_access_denied_total{reason="state_missing"}` is incremented either way for visibility)
- `resource` (optional, RFC 8707; accepted when it matches either `{PROXY_BASE_URL}` / `{PROXY_BASE_URL}/` or the configured mount resource `{PROXY_BASE_URL}<mount>`)
Behavior:
1. Validate all params; reject repeated singleton params (`resource` may appear more than once per RFC 8707); every `resource` value must match an accepted resource URI
2. Decrypt the `client_id` → verify not expired, `redirect_uri` matched
3. Consent fork. The validated parameters are sealed into a `sealedConsent` blob (`PurposeConsent` AAD, audience-bound, 5-min TTL) and an HTML page is rendered showing the registered `client_name`, the parsed redirect host, and the resource. The user clicks Approve or Deny. The form POSTs to `/consent` (see below). Default behavior; setting `RENDER_CONSENT_PAGE=false` skips this step and runs the silent redirect at step 4 directly, which is only safe when every caller is non-interactive and known-trusted.
4. Encrypt the session with AES-GCM (10min TTL): `{ client_id (internal UUID), redirect_uri, code_challenge, original_state, expires_at }`
5. Use the encrypted blob as the `state` parameter sent to the IdP
6. Build the authorization URL from endpoints discovered via OIDC auto-discovery: `{discovered_authorization_endpoint}?client_id={OIDC_CLIENT_ID}&response_type=code&redirect_uri={PROXY_BASE_URL}/callback&scope=openid email profile&state={encrypted_session}&response_mode=query`
7. Redirect 302 to the IdP
Always registered. Driven by the consent form rendered at step 3 of /authorize (default behaviour; bypassed only when the operator sets RENDER_CONSENT_PAGE=false).
Form fields: consent_token (sealed blob from the GET-side render), action (approve or deny).
Guards (mirror /token):
- `r.URL.RawQuery != ""` → 400. The sealed token would otherwise leak into access logs / browser history / Referer.
- `Authorization` header present → 401 `invalid_client` with a `WWW-Authenticate` challenge. The endpoint advertises no client-auth scheme.
- Body capped at 1 MB; repeated `consent_token` / `action` fields rejected.
Behavior:
- Open `consent_token` (`PurposeConsent` AAD); reject on bad shape, wrong purpose, audience mismatch, or expired (5 min TTL).
- `action=deny`: increment `mcp_auth_consent_decisions_total{decision="denied"}`, log `consent_denied`, redirect 302 to the registered `redirect_uri` with `error=access_denied` per RFC 6749 §4.1.2.1.
- `action=approve`: mint upstream OIDC `nonce` (random 32 hex) + upstream PKCE verifier, regenerate the H6 server-side PKCE pair if the consent blob recorded one, seal a `sealedSession` (same shape as step 4 of `/authorize`), redirect 302 to the IdP. Increment `mcp_auth_consent_decisions_total{decision="approved"}`.
- Anything else: 400 `invalid_request`.
Single-use replay defense on consent_token: when a replay store is wired (REDIS_URL), the consent token's jti is claimed atomically before either branch runs — a captured consent_token can be redeemed at most once for either Approve or Deny. Each GET /authorize render mints a fresh jti, so the back-button case still works (a re-render gets a new claim slot, the prior one is dead once redeemed). On replay: 400 invalid_request with error_code=consent_replay, mcp_auth_replay_detected_total{kind="consent"} increments. With no replay store wired (configured opt-out) the handler falls back to the prior stateless behavior (token unique, audience-bound, TTL-bounded, but replayable within the 5-min window).
Query params: code, state (from the IdP)
Behavior:
- If the IdP returns an `error` (RFC 6749 §4.1.2.1), propagate it to the client
- Decrypt the `state` → retrieve the session, verify not expired
- Single-use claim on the session's `sid` when a replay store is wired: a replayed `/callback` is rejected with 400 `invalid_request` + `error_code=callback_state_replay` BEFORE the upstream IdP exchange runs (no fan-out, no audit-log noise). `mcp_auth_replay_detected_total{kind="callback_state"}` increments. Empty `sid` (legacy session in flight during rollout) falls through to stateless behavior. Store-error path fail-closes to 503 + `replay_store_unavailable`.
- Exchange the code with the IdP (POST token endpoint, 10s timeout) to obtain `id_token` + `access_token`
- Validate the `id_token` via go-oidc (JWKS signature auto-discovery, issuer, audience)
- Extract claims: `sub`, `email`, `email_verified`, `name`
- If `email_verified` is present and false, reject with 403 `access_denied` + `error_code: email_not_verified` (absent claim is accepted, since not all IdPs emit it)
- Extract groups from the configured claim (`GROUPS_CLAIM`, default `groups`)
- If `ALLOWED_GROUPS` is configured, verify the user belongs to at least one allowed group → 403 otherwise
- Encrypt an internal authorization code with AES-GCM (60s TTL):
{
token_id (UUID, used for single-use replay check),
client_id (internal UUID),
redirect_uri, code_challenge,
subject, email, name, groups,
expires_at
}
- Redirect 302 to `redirect_uri?code={encrypted_code}&state={original_state}&iss={PROXY_BASE_URL}`
  - Built via `url.Parse` + merged query params (safe even if redirect_uri already contains query params)
  - `iss` is emitted per RFC 9700 §2.1.4 (mix-up defense): a client that talks to multiple ASes can verify the response came from the AS it actually sent the request to. Value matches the `issuer` field in the RFC 8414 metadata document.
Request body (application/x-www-form-urlencoded, max 1 MB):
For grant_type=authorization_code:
grant_type=authorization_code
&code=<encrypted internal code>
&redirect_uri=<must match>
&client_id=<must match>
&code_verifier=<PKCE verifier, 43-128 chars per RFC 7636 §4.1>
For grant_type=refresh_token:
grant_type=refresh_token
&refresh_token=<encrypted token>
&client_id=<must match>
Behavior — authorization_code:
- Reject repeated singleton params (`resource` may appear more than once per RFC 8707); every `resource` value must match an accepted resource URI; validate `code_verifier` shape (43-128 unreserved characters, RFC 7636 §4.1)
- Decrypt the code, verify not expired
- Decrypt the `client_id`, verify not expired
- Verify `client_id` (internal UUID) and `redirect_uri` match the code
- Validate PKCE: base64url-encoded `SHA256(code_verifier)` == stored `code_challenge` (constant-time comparison)
- If `REDIS_URL` is configured, atomically claim the code's `token_id` via `SET NX`. A second attempt is rejected with `invalid_grant` + `error_code: code_replay`; Redis failures fail closed with 503
- Issue an opaque access token (AES-GCM, 1h TTL) and a refresh token (AES-GCM, 7d TTL)
Behavior — refresh_token:
1. Decrypt the refresh token, verify its `audience` matches `PROXY_BASE_URL`
2. If `REVOKE_BEFORE` is configured, reject if refresh `iat` < cutoff (bulk revocation applies to refresh tokens too, not only access tokens)
3. Verify the refresh is not expired
4. Decrypt the `client_id`, verify audience + not expired + UUID matches the refresh
5. If `REDIS_URL` is configured:
   a. Reject if `refresh_family_revoked:<fam>` is set (prior reuse killed the family)
   b. Atomically claim `refresh:<tid>` via `SET NX`, recording the claim's set time. If the key is already claimed AND the prior claim landed within `REFRESH_RACE_GRACE_SEC` (default 2s, max 10s, set 0 to disable): treat as a benign concurrent submit and return 429 `invalid_grant` with `error_code: refresh_concurrent_submit` + `Retry-After: 2`. The family is NOT revoked; the legitimate peer's rotation already succeeded, and the racing client retries with the new refresh once it lands. If the prior claim is past the grace window: reuse detected. Mark the family revoked for 7 days and reject with `error_code: refresh_reuse_detected`. Any subsequent sibling refresh also gets rejected in step 5a

   Note on the 429 status: RFC 6749 §5.2 defines `/token` errors as 400 / 401. Returning 429 here is a deliberate deviation: most OAuth client libraries treat 429 + `Retry-After` as "back off and retry", which is exactly the desired behavior for a racing peer (the legit rotation already succeeded; the racing client should retry with the new refresh from shared storage). The `error_code=refresh_concurrent_submit` field disambiguates from generic rate limiting for clients that look at the body. A 400 `invalid_grant` was considered but rejected: it would force every client library to add a custom retry path keyed off `error_code` instead of using its existing 429 handling.
6. Issue new access + refresh tokens. The new refresh inherits the original `fam` (so reuse detection spans the lineage) and gets a fresh `tid`. `iat` is set to `now` so it survives the next `REVOKE_BEFORE` application
Response 200 JSON:
{
"access_token": "<opaque token>",
"token_type": "Bearer",
"expires_in": 3600,
"refresh_token": "<opaque refresh token>"
}

Headers Cache-Control: no-store and Pragma: no-cache required (RFC 6749 §5.1).
Standard OAuth2 errors:
{ "error": "invalid_grant", "error_description": "...", "error_code": "..." }

error_code is an optional extension field for machine-readable internal error identifiers. Clients must treat it as advisory and rely on the standard error field for OAuth behavior.
Opaque format: JSON → AES-GCM encryption with TOKEN_SIGNING_SECRET → base64url.
Access token payload:
type Claims struct {
	TokenID   string    // UUID
	Audience  string    // PROXY_BASE_URL; bound at issuance, verified on every Validate
	Subject   string    // IdP sub
	Email     string
	Groups    []string  // from GROUPS_CLAIM in id_token
	ClientID  string
	IssuedAt  time.Time
	ExpiresAt time.Time
}

Refresh token payload (sealed identically; `TokenID` and `FamilyID` drive the optional Redis-backed reuse detection):
type sealedRefresh struct {
	TokenID   string    // UUID, unique per refresh (single-use key when Redis is wired)
	FamilyID  string    // UUID, constant across rotations; shared by every sibling in the lineage
	Subject   string
	Email     string
	Groups    []string
	ClientID  string
	Audience  string
	IssuedAt  time.Time // used by REVOKE_BEFORE bulk revocation
	ExpiresAt time.Time
}

No JWT is exposed: the MCP client only ever sees an opaque token. It is validated by AES-GCM decryption + audience check + expiry, so no store is required.
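The JSON → AES-GCM → base64url pipeline and the stateless validation described above can be sketched with the standard library alone. The key derivation (SHA-256 over the secret), the nonce-prefix layout, and the names `newAEAD`, `seal`, `open`, and `roundTrip` are assumptions made for this example; the source does not state how the proxy derives its key or frames the ciphertext.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// claims is a trimmed-down payload for the sketch.
type claims struct {
	TokenID   string
	Audience  string
	Subject   string
	ExpiresAt time.Time
}

// newAEAD hashes the secret into a 32-byte AES-256 key (assumed derivation).
func newAEAD(secret []byte) (cipher.AEAD, error) {
	key := sha256.Sum256(secret)
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal marshals the claims, encrypts them, and returns nonce||ciphertext as base64url.
func seal(aead cipher.AEAD, c claims) (string, error) {
	plain, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(aead.Seal(nonce, nonce, plain, nil)), nil
}

// open decrypts a token and enforces the audience and expiry checks.
func open(aead cipher.AEAD, token, audience string, now time.Time) (claims, error) {
	var c claims
	raw, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil || len(raw) < aead.NonceSize() {
		return c, errors.New("malformed token")
	}
	nonce, ct := raw[:aead.NonceSize()], raw[aead.NonceSize():]
	plain, err := aead.Open(nil, nonce, ct, nil)
	if err != nil {
		return c, errors.New("decryption failed")
	}
	if err := json.Unmarshal(plain, &c); err != nil {
		return c, err
	}
	if c.Audience != audience {
		return c, errors.New("audience mismatch")
	}
	if !now.Before(c.ExpiresAt) {
		return c, errors.New("token expired")
	}
	return c, nil
}

// roundTrip seals one token, then opens it normally, with a wrong audience, and after expiry.
func roundTrip() (subject string, wrongAudience, expired error) {
	aead, err := newAEAD([]byte("demo-only secret, never reuse outside this sketch"))
	if err != nil {
		panic(err)
	}
	now := time.Now()
	aud := "https://proxy.example.com"
	tok, err := seal(aead, claims{TokenID: "t-1", Audience: aud, Subject: "user-123", ExpiresAt: now.Add(time.Hour)})
	if err != nil {
		panic(err)
	}
	c, err := open(aead, tok, aud, now)
	if err != nil {
		panic(err)
	}
	_, wrongAudience = open(aead, tok, "https://other.example.com", now)
	_, expired = open(aead, tok, aud, now.Add(2*time.Hour))
	return c.Subject, wrongAudience, expired
}

func main() {
	fmt.Println(roundTrip())
}
```

Because GCM authenticates the ciphertext, any tampering with the token fails at `aead.Open` before the audience or expiry checks are reached.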
All MCP routes (`/*` except OAuth endpoints):
- Extract `Authorization: Bearer <token>`
- Decode and validate the opaque token (AES-GCM decryption, expiry check)
- Verify `claims.Audience == PROXY_BASE_URL`: rejects tokens minted by a sibling instance sharing the same secret but with a different baseURL
- If `REVOKE_BEFORE` is configured, reject if `iat` < cutoff (bulk revocation)
- Inject into context: `sub`, `email`, `groups`
- On failure, return `401` per RFC 6750 §3.1:
  - Missing or malformed `Authorization` header → `{ "error": "invalid_request", "error_description": "bearer credential is missing or malformed" }`
  - Token decrypt/expiry/audience/iat failures → `{ "error": "invalid_token", "error_description": "bearer token is invalid, expired, or not intended for this resource" }`

  Both responses include `WWW-Authenticate: Bearer error="<code>", error_description="<text>", resource_metadata="{PROXY_BASE_URL}/.well-known/oauth-protected-resource"` (RFC 9728 §5.1 + RFC 6750 §3). The `error_description` is a closed allowlist of fixed strings, so no caller-controlled data reaches the header.
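As an illustration of the closed-allowlist rule for the `WWW-Authenticate` header, the following sketch builds the challenge from a fixed map of descriptions. `bearerChallenge` and `descriptions` are hypothetical names, and `https://proxy.example.com` is a placeholder for `PROXY_BASE_URL`.

```go
package main

import (
	"fmt"
	"strings"
)

// descriptions is the closed set of error codes and their fixed texts.
var descriptions = map[string]string{
	"invalid_request": "bearer credential is missing or malformed",
	"invalid_token":   "bearer token is invalid, expired, or not intended for this resource",
}

// bearerChallenge returns the WWW-Authenticate value for a known error code.
// Unknown codes are refused, so caller-controlled text never reaches the header.
func bearerChallenge(baseURL, code string) (string, bool) {
	desc, ok := descriptions[code]
	if !ok {
		return "", false
	}
	metadata := strings.TrimRight(baseURL, "/") + "/.well-known/oauth-protected-resource"
	// %q yields plain double-quoted strings here because every value is printable ASCII.
	return fmt.Sprintf("Bearer error=%q, error_description=%q, resource_metadata=%q", code, desc, metadata), true
}

func main() {
	h, _ := bearerChallenge("https://proxy.example.com", "invalid_token")
	fmt.Println(h)
}
```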
After auth middleware passes:
// Forward to upstream MCP server
// Client request path forwarded verbatim; proxy mount == UPSTREAM_MCP_URL path
// Added headers:
r.Header.Set("X-User-Sub", claims.Subject)
r.Header.Set("X-User-Email", claims.Email)
r.Header.Set("X-User-Groups", "group1,group2") // comma-separated, omitted if empty
r.Header.Del("Authorization") // do not leak the internal token
// Support SSE (text/event-stream): no response buffering
// Support Streamable HTTP (chunked): immediate flush

Use `httputil.ReverseProxy` with `FlushInterval: -1` (immediate flush) to support SSE and streaming. The underlying `*http.Transport` sets `ResponseHeaderTimeout: 30s` so a wedged upstream fails fast during header negotiation; stream bodies themselves remain uncapped. The transport follows 307/308 redirects server-side (Python FastAPI/Starlette backends), same-host only, body replayed, max 10 hops. On exhaustion the proxy responds `502 Bad Gateway` with `{"error":"bad_gateway","error_description":"too many upstream redirects"}` rather than echoing the last 307/308 (which would leak a broken upstream `Location:` to the MCP client). Proxied request bodies are capped at 16 MiB via `http.MaxBytesReader` to bound the memory the redirect-follow buffer can hold.
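A minimal sketch of the forwarding step, assuming the identity is carried in the request context under a hypothetical `identityKey`. It uses the `Rewrite` hook of `httputil.ReverseProxy` (Go 1.20+) and sets only scheme and host so the client path passes through verbatim. The redirect following and the 16 MiB body cap are left out, and `newProxy`, `identity`, and `demoForward` are illustrative names rather than the proxy's own.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
	"time"
)

type identity struct {
	Subject string
	Email   string
	Groups  []string
}

type identityKey struct{}

// newProxy forwards requests to upstream without touching the path.
func newProxy(upstream *url.URL) *httputil.ReverseProxy {
	return &httputil.ReverseProxy{
		Rewrite: func(pr *httputil.ProxyRequest) {
			pr.Out.URL.Scheme = upstream.Scheme
			pr.Out.URL.Host = upstream.Host
			id, _ := pr.In.Context().Value(identityKey{}).(identity)
			pr.Out.Header.Set("X-User-Sub", id.Subject)
			pr.Out.Header.Set("X-User-Email", id.Email)
			if len(id.Groups) > 0 {
				pr.Out.Header.Set("X-User-Groups", strings.Join(id.Groups, ","))
			} else {
				pr.Out.Header.Del("X-User-Groups") // drop any client-supplied value
			}
			pr.Out.Header.Del("Authorization") // do not leak the internal token
		},
		FlushInterval: -1, // flush immediately for SSE and chunked streams
		Transport:     &http.Transport{ResponseHeaderTimeout: 30 * time.Second},
	}
}

// demoForward sends one request through the proxy and reports what the
// upstream saw: path, X-User-Sub, X-User-Groups, Authorization.
func demoForward() ([]string, error) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%s|%s|%s|%s", r.URL.Path, r.Header.Get("X-User-Sub"),
			r.Header.Get("X-User-Groups"), r.Header.Get("Authorization"))
	}))
	defer upstream.Close()
	target, err := url.Parse(upstream.URL)
	if err != nil {
		return nil, err
	}
	proxy := newProxy(target)
	front := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := identity{Subject: "user-123", Email: "a@example.com", Groups: []string{"dev", "ops"}}
		proxy.ServeHTTP(w, r.WithContext(context.WithValue(r.Context(), identityKey{}, id)))
	}))
	defer front.Close()
	req, err := http.NewRequest(http.MethodPost, front.URL+"/mcp", strings.NewReader("{}"))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer opaque-token")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return strings.SplitN(string(body), "|", 4), nil
}

func main() {
	fmt.Println(demoForward())
}
```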
UPSTREAM_MCP_URL must include a path component; that path is the proxy's public mount and the path forwarded upstream, verbatim both sides. No join, no strip, no rewrite. If the client hits {PROXY_BASE_URL}<path>, the upstream sees <path>.
Real-world MCP endpoints use many paths — set UPSTREAM_MCP_URL accordingly:
| Product | Path | Example value |
|---|---|---|
| FastMCP (Python) default | `/mcp` | `http://fastmcp:8000/mcp` |
| GitHub Copilot (api.githubcopilot.com) | `/mcp` | `https://api.githubcopilot.com/mcp` |
| Cloudflare (mcp.cloudflare.com) | `/mcp` | `https://mcp.cloudflare.com/mcp` |
| Atlassian Rovo | `/v1/mcp` | `https://rovo.atlassian.com/v1/mcp` |
| GitLab | `/api/v4/mcp` | `https://gitlab.com/api/v4/mcp` |
Origin-only URLs (no path / lone /), query, fragment, userinfo, and paths that collide with a reserved control-plane route (/healthz, /register, /authorize, /callback, /token, /.well-known) are rejected at startup.
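The startup rejection rules above could be expressed roughly as follows. This is a sketch under assumptions: `mountPath` and `reserved` are hypothetical names, the http(s)-only scheme check is added for the example, and "collides" is interpreted as "equals a reserved route or sits beneath it", which the source does not spell out.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// reserved lists the control-plane routes a mount path must not shadow.
var reserved = []string{"/healthz", "/register", "/authorize", "/callback", "/token", "/.well-known"}

// mountPath validates an UPSTREAM_MCP_URL value and returns its path,
// which doubles as the proxy's public mount.
func mountPath(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	switch {
	case (u.Scheme != "http" && u.Scheme != "https") || u.Host == "":
		return "", errors.New("must be an absolute http(s) URL")
	case u.User != nil:
		return "", errors.New("userinfo is not allowed")
	case u.RawQuery != "" || u.ForceQuery:
		return "", errors.New("query is not allowed")
	case u.Fragment != "":
		return "", errors.New("fragment is not allowed")
	case u.Path == "" || u.Path == "/":
		return "", errors.New("origin-only URL: a path component is required")
	}
	for _, r := range reserved {
		if u.Path == r || strings.HasPrefix(u.Path, r+"/") {
			return "", fmt.Errorf("path %q collides with reserved route %q", u.Path, r)
		}
	}
	return u.Path, nil
}

func main() {
	for _, raw := range []string{"http://fastmcp:8000/mcp", "http://fastmcp:8000/", "http://fastmcp:8000/token"} {
		p, err := mountPath(raw)
		fmt.Println(raw, p, err)
	}
}
```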
The router is built in main.go (func main) — see that file rather than a copy here, since this block historically rotted. High level:
- Global middlewares: in-flight WaitGroup → strip inbound `X-Request-Id` → `chimw.RequestID` → `zapMiddleware` → `chimw.Recoverer` → per-IP rate limiter.
- OAuth endpoints (`/register`, `/authorize`, `/callback`, `/token`) and the discovery surface (`/.well-known/oauth-authorization-server`, `/.well-known/oauth-protected-resource`, mount-suffixed variants, and the openid-configuration / under-mount 404 carve-outs) carry per-endpoint rate limiters when `RATE_LIMIT_ENABLED=true` (passthrough otherwise). Discovery is silent on rate-limit per RFC 8414 §3 / RFC 9728 §3.1; the 60/min/IP ceiling here only catches floods. The `X-RateLimit-Limit` / `-Remaining` / `-Reset` headers httprate sets internally are stripped from every response: production MCP servers (Cloudflare, GitHub Copilot, Atlassian, Notion, Sentry) all keep them silent, and the IETF rate-limit-headers draft warns that disclosing quota state on auth/error paths leaks operational capacity to attackers. `replayStore` is wired only when `REDIS_URL` is set.
- Liveness `/healthz` (always 200) on the public listener; readiness `/readyz` lives ONLY on the metrics listener (an unauthenticated `/readyz` on the public port is a Redis-DoS amplifier; see comment at `main.go:304`).
- MCP proxy mounts at `cfg.UpstreamMCPMountPath` (path from `UPSTREAM_MCP_URL`) under `authMW.Validate` → `RPCPeek` → per-subject concurrency limiter. Client path == upstream path, verbatim, no rewrite.
Always return errors conforming to RFC 6749:
type OAuthError struct {
	Error            string `json:"error"`
	ErrorDescription string `json:"error_description,omitempty"`
	ErrorCode        string `json:"error_code,omitempty"`
}

Standard `error` values (RFC 6749 §4.1.2.1 and §5.2): `invalid_request`, `invalid_client`, `invalid_grant`, `unauthorized_client`, `unsupported_grant_type`, `invalid_scope`, `server_error`, `access_denied`, `temporarily_unavailable`.
Extension error_code values (proxy-specific, advisory — clients MUST rely on error for OAuth behavior):
| `error_code` | When | Paired `error` |
|---|---|---|
| `code_replay` | Authorization code reused (requires Redis) | `invalid_grant` |
| `refresh_reuse_detected` | Refresh token replayed after rotation → family revoked (requires Redis) | `invalid_grant` |
| `refresh_family_revoked` | Refresh token whose family was previously revoked | `invalid_grant` |
| `email_not_verified` | id_token `email_verified` is `false` | `access_denied` |
| `subject_missing` | IdP returned a verified id_token without a `sub` claim (L5) | `access_denied` |
| `group_invalid` | IdP group name contains `,`, `\r`, `\n`, or `\x00` | `access_denied` |
| `replay_store_unavailable` | Redis unreachable; handler fails closed | `server_error` |
| `id_token_verification_failed` | go-oidc rejected the IdP id_token | `server_error` |
| `token_issue_failed` | AES-GCM seal error when minting an access token | `server_error` |
IdP-supplied error values on /callback are allowlisted against the
RFC 6749 §4.1.2.1 set (invalid_request, invalid_client,
unauthorized_client, access_denied, unsupported_response_type,
invalid_scope, server_error, temporarily_unavailable); anything
outside that set is rewritten to server_error before being echoed to
the MCP client. error_description is truncated at 200 bytes and
stripped of non-ASCII-printable bytes to defeat log / header injection.
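The allowlist-and-sanitize step for IdP-supplied errors might look like the sketch below. `sanitizeIdPError` and `allowedIdPErrors` are hypothetical names, and the order of operations (truncate to 200 bytes first, then drop bytes outside 0x20..0x7E) is an assumption that follows the wording of the paragraph above.

```go
package main

import "fmt"

// allowedIdPErrors mirrors the allowlist named in the text.
var allowedIdPErrors = map[string]bool{
	"invalid_request":           true,
	"invalid_client":            true,
	"unauthorized_client":       true,
	"access_denied":             true,
	"unsupported_response_type": true,
	"invalid_scope":             true,
	"server_error":              true,
	"temporarily_unavailable":   true,
}

// sanitizeIdPError rewrites unknown codes to server_error and reduces the
// description to at most 200 printable ASCII bytes.
func sanitizeIdPError(code, desc string) (string, string) {
	if !allowedIdPErrors[code] {
		code = "server_error"
	}
	if len(desc) > 200 {
		desc = desc[:200]
	}
	out := make([]byte, 0, len(desc))
	for i := 0; i < len(desc); i++ {
		if c := desc[i]; c >= 0x20 && c <= 0x7E {
			out = append(out, c) // CR, LF, NUL and non-ASCII bytes are dropped
		}
	}
	return code, string(out)
}

func main() {
	fmt.Println(sanitizeIdPError("totally_custom", "boom\r\nSet-Cookie: x=1"))
}
```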
See Dockerfile. Static Go binary on gcr.io/distroless/static-debian13:nonroot (UID 65532, no shell, no apt). Build args inject build-time metadata via -ldflags -X; OCI labels carry source/created/version/revision.
Operator-load-bearing invariants that are not surfaced by the env table or the endpoint reference. The "Configuration" table is the canonical knob list; the "Architecture" and "Endpoints" sections are the canonical flow descriptions.
- redirect_uri shape: exact match (with the RFC 8252 §7.3 loopback-port relaxation at `/authorize`); `http://` allowed only to a loopback host (full 127/8 range, `::1`, `::ffff:127.0.0.1`, `::0.0.0.1`, `localhost`, `localhost.`); non-http(s) schemes rejected even on loopback; fragments and userinfo rejected; length capped at 512 chars; at most 5 entries per client registration. The port relaxation applies only to the registered ↔ `/authorize` match: at `/token` the client MUST echo the same `redirect_uri` value it sent to `/authorize` (RFC 6749 §4.1.3 byte-equality), which native apps already do because they bind their loopback port once and reuse it for the whole flow.
- Structured logs: zap JSON, `request_id` on every line. Inbound `X-Request-Id` is stripped before chi mints one (defeats client-controlled log forgery). Authenticated requests carry `sub` and `email`; JSON-RPC requests additionally carry `rpc_method`, `rpc_tool` (capped at 128 chars), `rpc_id` (capped at 64), all passed through a narrow allowlist (ASCII alphanumerics plus `._:/-+`). `MCP_LOG_BODY_MAX=0` suppresses the `rpc_*` fields; `ACCESS_LOG_SKIP_RE` drops whole lines for matching paths.
- Business metrics (under `/metrics` on `METRICS_ADDR`):
  - `mcp_auth_tokens_issued_total{grant_type}`
  - `mcp_auth_access_denied_total{reason}`: see README for the enumerated reasons
  - `mcp_auth_replay_detected_total{kind}` (code/refresh)
  - `mcp_auth_rate_limited_total{endpoint}`
  - `mcp_auth_clients_registered_total`
  - `mcp_auth_groups_claim_shape_mismatch_total`: IdP-schema-drift signal; the user is admitted with empty groups, so it's NOT a denial
  - `mcp_auth_token_seals_total{purpose}`: cross-replica AES-GCM seal counter; alert on `sum(increase(metric[7d])) > 2**28` to drive `TOKEN_SIGNING_SECRET` rotation
  - `mcp_auth_rpc_calls_total{tool}` / `mcp_auth_rpc_calls_failed_total{tool}` / `mcp_auth_rpc_request_bytes_total{tool}` / `mcp_auth_rpc_response_bytes_total{tool}`: per-tool RPC traffic. Fire only on JSON-RPC `tools/call` (protocol-level methods like `initialize` / `tools/list` are excluded so `_unknown` reliably flags malformed `tools/call` payloads). Batches fan out into one `rpc_calls_total` increment per `tools/call` entry; byte counters stay scoped to single-call requests because per-call Content-Length / response bytes cannot be honestly attributed inside a batch. Disabled by default (cardinality + privacy trade); opt-in via `MCP_TOOL_METRICS=true`. Distinct labels are capped by `MCP_TOOL_METRICS_MAX_CARDINALITY` (default 256): overflow folds into `_overflow`, unparseable tool names into `_unknown`
  - `mcp_auth_rpc_batches_total` / `mcp_auth_rpc_batches_failed_total` / `mcp_auth_rpc_batch_bytes_total{direction}`: batch-shape counters, disjoint from the per-tool family. One increment per HTTP request that decoded as a JSON-RPC batch with at least one `tools/call` entry; carries the request's actual Content-Length / BytesWritten. No per-tool label, since batch contents do not have honest per-call attribution. Same opt-in toggle (`MCP_TOOL_METRICS=true`) as the per-tool family
- HTTP timeouts: `ReadTimeout: 30s`, `WriteTimeout: 0` (SSE), `IdleTimeout: 120s`. SSE streams MUST flush; do not buffer `text/event-stream` responses.
- Body size limit: POST endpoints capped at 1 MB via `MaxBytesReader`.
- 307/308 redirect following: proxy follows server-side for Python MCP backends that redirect `/mcp` → `/mcp/`. Same-host only, body replayed, max 10 hops, scheme downgrade rejected.
- PRM `resource` field: the root `/.well-known/oauth-protected-resource` carries the trailing-slash form for Claude.ai canonicalization (intentional deviation from RFC 9728 §3; see the Endpoints section). The per-mount variant is spec-strict.
- Email verification: `email_verified=false` in the id_token is rejected at `/callback` with 403 `access_denied` + `error_code: email_not_verified`. A missing claim is accepted, since not every IdP emits it.
- Security headers: every response on the public listener carries `Strict-Transport-Security: max-age=63072000; includeSubDomains` (RFC 6797), `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, `Referrer-Policy: no-referrer` (RFC 9700 §4.2.4 RECOMMENDED for OAuth ASes; defends against `code` leakage via Referer), and `Content-Security-Policy: default-src 'none'; frame-ancestors 'none'`. Headers are set before the handler runs so they apply to every status code, including upstream pass-through 5xx and rate-limiter 429s. Not applied to the metrics listener (Prometheus scrape is in-cluster only). The HSTS `includeSubDomains` directive assumes the operator's parent zone is all-HTTPS.
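The security-headers invariant (set before the handler runs, so every status code carries them) can be sketched as a plain `net/http` middleware. `securityHeaders` and `headersFor` are hypothetical names for this example; the header values are the ones listed above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// securityHeaders sets the fixed response headers before delegating, so they
// are present even when the wrapped handler answers with an error status.
func securityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Strict-Transport-Security", "max-age=63072000; includeSubDomains")
		h.Set("X-Content-Type-Options", "nosniff")
		h.Set("X-Frame-Options", "DENY")
		h.Set("Referrer-Policy", "no-referrer")
		h.Set("Content-Security-Policy", "default-src 'none'; frame-ancestors 'none'")
		next.ServeHTTP(w, r)
	})
}

// headersFor runs a handler that answers with the given status through the
// middleware and returns the recorded response headers.
func headersFor(status int) http.Header {
	rec := httptest.NewRecorder()
	inner := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	})
	securityHeaders(inner).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/mcp", nil))
	return rec.Result().Header
}

func main() {
	h := headersFor(http.StatusBadGateway)
	fmt.Println(h.Get("Strict-Transport-Security"))
	fmt.Println(h.Get("Content-Security-Policy"))
}
```

Wrapping only the public listener's router with this middleware, and not the metrics mux, would match the scoping described in the list.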
The stateless design supports horizontal scaling without sticky sessions. Required configuration symmetry across replicas:
- `TOKEN_SIGNING_SECRET`: must be byte-identical (mount from a `Secret`, not generated per-pod). The single most important invariant: a mismatch breaks every cross-pod token validation.
- `PROXY_BASE_URL`: must be the public DNS name reached by clients, not a per-pod hostname. Audience binding enforces this: a pod with a wrong `PROXY_BASE_URL` will reject every token minted by its siblings.
- `OIDC_*`: same registration on the same IdP.
- `UPSTREAM_MCP_URL`: same in-cluster service URL.
- `PKCE_REQUIRED`, `ALLOWED_GROUPS`, `GROUPS_CLAIM`, `REVOKE_BEFORE`, `REDIS_URL`, `REDIS_KEY_PREFIX`, `RATE_LIMIT_ENABLED`: any asymmetry produces "works on some pods, not others" bugs. `REDIS_URL` + `REDIS_KEY_PREFIX` in particular must match across all replicas, otherwise single-use and reuse-detection guarantees only hold within each pod's local set of clients.
Recommended Deployment shape:
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  strategy:
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  template:
    spec:
      terminationGracePeriodSeconds: 120 # match SHUTDOWN_TIMEOUT
      containers:
        - name: mcp-auth-proxy
          envFrom:
            - secretRef:
                name: mcp-auth-proxy-secret
            - configMapRef:
                name: mcp-auth-proxy-config
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
          # /readyz lives ONLY on the metrics port: an unauthenticated
          # readiness endpoint on the public listener is a Redis-DoS
          # amplifier (a sustained probe flood saturates the pool, flips
          # readiness fleet-wide, and drops every pod from the Service
          # simultaneously). Probe via the metrics port the kubelet can
          # reach in-cluster.
          readinessProbe:
            httpGet:
              path: /readyz
              port: metrics
          ports:
            - { name: http, containerPort: 8080 }
            - { name: metrics, containerPort: 9090 }
---
apiVersion: v1
kind: Service
spec:
  sessionAffinity: None # explicit; no stickiness needed
  selector: { app: mcp-auth-proxy }
  ports:
    - { name: http, port: 80, targetPort: 8080 }

Ship a PodDisruptionBudget with `minAvailable: 1` (or `2` for ≥3 replicas) so node drains do not take out the whole auth plane at once.
The manifests/ folder ships a turn-key demo: a Docker Compose stack (Keycloak + Redis + fake MCP upstream + proxy) for local exploration and a Kubernetes reference set (Deployment, Service, Ingress, PodDisruptionBudget, plus scripts/generate-signing-secret.sh).
REVOKE_BEFORE is read at startup. Updating it requires a rolling restart, and during the rollout window some pods enforce the new cutoff while others still use the old one. Wait for kubectl rollout status to converge before assuming the cutoff is fleet-wide enforced.
config.Load() fails closed on the following config mistakes (fatal at startup):
- `TOKEN_SIGNING_SECRET` shorter than 32 bytes.
- `TOKEN_SIGNING_SECRET` matching an obvious-weakness pattern when `PROD_MODE=true`. Three rejection classes: (1) all-same byte (`aaaa…`); (2) short repeating period (`abcabc…`, `0123456789abcdef0123456789abcdef`); (3) tiny alphabet: fewer than 8 distinct byte values, catching uneven-run-length shapes (`aaaa…b`) that defeat the period and all-same checks. The same gate applies to every entry in `TOKEN_SIGNING_SECRETS_PREVIOUS` so a rolling rotation cannot regress. Under `PROD_MODE=false` the same secret triggers a non-fatal weakness warning instead. Real random output of any encoding (raw, hex, base64) is non-periodic AND has well over 8 distinct values in 32+ bytes, so it passes; the canonical generator is `manifests/scripts/generate-signing-secret.sh`.
- `SHUTDOWN_TIMEOUT` non-positive or greater than 15 minutes (L2).
- `REDIS_KEY_PREFIX` containing `{`, `}`, `\r`, `\n`, or any byte outside the 0x20..0x7E ASCII-printable range (L3).
- `PROXY_BASE_URL` with a scheme other than `https://` (or `http://` to a loopback host), a non-empty userinfo, a fragment, or a path beyond `/` (L8).
- `MCP_LOG_BODY_MAX` / `MCP_PER_SUBJECT_CONCURRENCY` not parseable as non-negative integers.
- `SHUTDOWN_TIMEOUT` / `REVOKE_BEFORE` unparseable as duration / RFC3339.
- `ACCESS_LOG_SKIP_RE` not compilable as a Go RE2 regexp.
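The three weakness classes for `TOKEN_SIGNING_SECRET` can be illustrated with a short check. This is a sketch, not the logic of `config.Load()`: `weakness` and `checkSecret` are hypothetical names, and the period search is bounded at half the secret length, which is an assumption (the source only says "short repeating period").

```go
package main

import (
	"errors"
	"fmt"
)

// weakness returns a label for the first weakness class found, or "" if none.
func weakness(secret []byte) string {
	n := len(secret)
	var seen [256]bool
	distinct := 0
	for _, b := range secret {
		if !seen[b] {
			seen[b] = true
			distinct++
		}
	}
	if n > 0 && distinct == 1 {
		return "all-same byte"
	}
	// A period p means every byte equals the byte p positions earlier.
	for p := 1; p <= n/2; p++ {
		periodic := true
		for i := p; i < n; i++ {
			if secret[i] != secret[i-p] {
				periodic = false
				break
			}
		}
		if periodic {
			return "short repeating period"
		}
	}
	if distinct < 8 {
		return "tiny alphabet"
	}
	return ""
}

// checkSecret is fatal on short secrets, fatal on weak ones in prod mode,
// and downgrades weakness to a warning otherwise.
func checkSecret(secret []byte, prodMode bool) (warning string, err error) {
	if len(secret) < 32 {
		return "", errors.New("TOKEN_SIGNING_SECRET shorter than 32 bytes")
	}
	if w := weakness(secret); w != "" {
		if prodMode {
			return "", fmt.Errorf("TOKEN_SIGNING_SECRET is weak: %s", w)
		}
		return "token_signing_secret_weak: " + w, nil
	}
	return "", nil
}

func main() {
	fmt.Println(checkSecret([]byte("0123456789abcdef0123456789abcdef"), true))
	fmt.Println(checkSecret([]byte("0123456789abcdef0123456789abcdef"), false))
}
```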
Non-fatal startup warnings:
- `token_signing_secret_weak` fires when the secret matches an obvious-weakness pattern (all-same byte, repeating period < length, or fewer than 8 distinct byte values). It signals a human-typed / patterned secret whose effective entropy is well below its length.
- `token_seal_rotation_threshold` fires once after 2^28 successful seals per Manager, suggesting `TOKEN_SIGNING_SECRET` rotation before AES-GCM nonce-collision bounds matter (L6).