Skip to content

Return 503 for expired proxy tokens#4722

Open
gkatz2 wants to merge 2 commits intostacklok:mainfrom
gkatz2:fix/proxy-token-expired-503-4721
Open

Return 503 for expired proxy tokens#4722
gkatz2 wants to merge 2 commits intostacklok:mainfrom
gkatz2:fix/proxy-token-expired-503-4721

Conversation

@gkatz2
Copy link
Copy Markdown
Contributor

@gkatz2 gkatz2 commented Apr 10, 2026

Summary

  • The token injection middleware returns 401 when the OAuth token source fails, which MCP SDKs interpret as "begin OAuth authentication." Since the proxy manages OAuth internally and has no client-facing OAuth metadata, the SDK's discovery attempt fails and the client caches the server as unreachable.
  • Change the error response to 503 Service Unavailable with Retry-After: 10, which clients treat as a transient connection error.
  • Add unit tests for CreateTokenInjectionMiddleware (previously untested).

Fixes #4721

Type of change

  • Bug fix

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)
  • Manual testing: started a test HTTP server returning 503 on the token error path, confirmed MCP SDK (Claude Code) does not enter OAuth discovery. Compared against a 401 server where the SDK does enter OAuth discovery and caches "needs authentication."

Does this introduce a user-facing change?

MCP clients connecting through the proxy will see 503 instead of 401 when the proxy's OAuth tokens are expired. This is a behavioral change, but 503 is the correct status for this situation and prevents clients from entering a broken OAuth flow.

Generated with Claude Code

When the proxy's OAuth token source fails, the token injection
middleware returned 401 Unauthorized. MCP clients interpret 401
as a signal to begin OAuth authentication, but the proxy manages
OAuth internally and has no client-facing OAuth metadata. The
failed discovery is cached, blocking reconnection even after the
token refreshes.

Return 503 Service Unavailable with Retry-After instead, which
clients treat as a transient connection error.

Fixes stacklok#4721

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Greg Katz <gkatz@indeed.com>
Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions github-actions bot added the size/S Small PR: 100-299 lines changed label Apr 10, 2026
jhrozek
jhrozek previously approved these changes Apr 10, 2026
Copy link
Copy Markdown
Contributor

@jhrozek jhrozek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! One nit below.

// the workload as unauthenticated in its Token() method.
// Return 503 instead of 401 so MCP clients do not mistake this
// for a server that requires client-side OAuth authentication.
w.Header().Set("Retry-After", "10")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit (feel free to take or leave): consider extracting "10" into a named constant with a short comment explaining it matches the initial MonitoredTokenSource backoff interval. Makes it easier to find if the backoff config ever changes.

// retryAfterSecs tells MCP clients how long to wait before retrying.
// Matches the initial MonitoredTokenSource backoff interval.
const retryAfterSecs = "10"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Fixed.

Address review nit: extract the retry delay into a package-level
constant with a comment linking it to the MonitoredTokenSource
backoff interval.

Signed-off-by: Greg Katz <gkatz@indeed.com>
@github-actions github-actions bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Apr 10, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.76%. Comparing base (d851c69) to head (5c8e314).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4722      +/-   ##
==========================================
+ Coverage   68.71%   68.76%   +0.05%     
==========================================
  Files         517      516       -1     
  Lines       54817    54230     -587     
==========================================
- Hits        37666    37290     -376     
+ Misses      14252    14093     -159     
+ Partials     2899     2847      -52     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@gkatz2
Copy link
Copy Markdown
Contributor Author

gkatz2 commented Apr 10, 2026

@jhrozek, addressed the nit in the latest push. Ready for another look when someone has a chance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/S Small PR: 100-299 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Expired proxy tokens block MCP client reconnection

2 participants