Skip to content

Commit 1bc25d9

Browse files
Avi-Robustaclaude
andauthored
[ROB-3999] Fix oauth client secret post 200 (HolmesGPT#2106)
The two-attempt token exchange After the user clicks "Allow" in the browser, Slack redirects to Holmes with an authorization code. Holmes calls exchange_code_for_tokens to swap that code for an access_token. Attempt 1 — HTTP Basic Auth (unchanged): auth = httpx.BasicAuth(client_id, client_secret) resp = httpx.post(token_url, data=data, auth=auth, ...) Holmes sends client_id + client_secret in the Authorization header. For Slack, this returns: HTTP/2 200 {"ok": false, "error": "bad_client_secret"} That's the trap: Slack returns 200 OK with an error body because its token endpoint only accepts client_secret_post, not Basic Auth. Attempt 2 — POST-body fallback (the fix): Before: if client_secret and not resp.is_success: # retry with secret in body This never fired for Slack, because resp.is_success was True. After: def _has_access_token(r): try: return "access_token" in r.json() except Exception: return False if client_secret and (not resp.is_success or not _has_access_token(resp)): data["client_secret"] = client_secret resp = httpx.post(token_url, data=data, ...) # no auth=, secret in body The new clause not _has_access_token(resp) treats a 200 response without an access_token field as a soft failure. For Slack's 200-with-error pattern, this fires the body retry — Holmes re-POSTs with client_secret inside the form-encoded body. Slack accepts that and responds: {"ok": true, "access_token": "xoxp-…", "scope": "...", "team": {...}} The function then validates "access_token" in token_data and returns the dict. Why this is safe for IdPs that worked before IdPs that accept Basic Auth (e.g. Notion) still succeed on attempt 1 — the response contains access_token, _has_access_token returns True, no retry happens. IdPs that return 4xx on Basic Auth (e.g. Supabase) still trigger the retry via the original not resp.is_success branch. The change only adds a third trigger for the retry; no existing IdP loses any behavior. What you'd see end-to-end (verified live earlier in this session) CLI/cluster opens browser → user authorizes → callback delivers code. exchange_code_for_tokens POSTs with Basic Auth → Slack returns 200 + bad_client_secret. New gate detects no access_token → POSTs again with secret in body → Slack returns 200 + xoxp-… token. Holmes caches the token and uses it as Authorization: Bearer xoxp-… against https://mcp.slack.com/mcp, then calls tools/list → tools come back → Holmes can invoke slack_read_user_profile, slack_read_file, etc. The two regression tests in tests/test_mcp_oauth.py exercise both Slack-style outcomes (success on body retry; failure if body retry also returns no token), so the behavior is locked in. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * OAuth token exchange now properly handles successful HTTP responses that lack access tokens, automatically retrying with alternative credential delivery methods—improving compatibility with certain identity providers. * Enhanced error diagnostics for OAuth tool loading failures to better identify configuration mismatches or workspace/token inconsistencies. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Signed-off-by: avi@robusta.dev <avi@robusta.dev> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent ac5e944 commit 1bc25d9

3 files changed

Lines changed: 135 additions & 2 deletions

File tree

holmes/core/oauth_config.py

Lines changed: 11 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -76,8 +76,17 @@ def exchange_code_for_tokens(
7676
except httpx.HTTPError as e:
7777
raise OAuthTokenExchangeError(0, f"Token request to {token_url} failed: {e}") from e
7878

79-
# If Basic Auth failed, retry with client_secret in POST body
80-
if client_secret and not resp.is_success:
79+
# Retry with client_secret in POST body if Basic Auth failed, OR if the IdP
80+
# returned 200 with a non-token body (e.g. Slack responds HTTP 200 +
81+
# {"ok": false, "error": "bad_client_secret"} when it only accepts
82+
# client_secret_post).
83+
def _has_access_token(r: httpx.Response) -> bool:
84+
try:
85+
return "access_token" in r.json()
86+
except Exception:
87+
return False
88+
89+
if client_secret and (not resp.is_success or not _has_access_token(resp)):
8190
data["client_secret"] = client_secret
8291
try:
8392
resp = httpx.post(

holmes/core/tools_utils/oauth_tool_connector.py

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -131,6 +131,7 @@ def load_tools_for_user(
131131
"Failed to load OAuth tools for user %s on toolset %s: %s",
132132
user_id, toolset.name, self._extract_error_message(e),
133133
)
134+
self._log_token_config_mismatch(user_id, toolset)
134135
return []
135136

136137
def store_user_tools(self, user_id: str, toolset_name: str, tools: List[Tool]) -> None:
@@ -227,6 +228,60 @@ def _extract_error_message(exc: BaseException) -> str:
227228
current = current.exceptions[0]
228229
return str(current)
229230

231+
@staticmethod
232+
def _log_token_config_mismatch(user_id: str, toolset: Any) -> None:
233+
"""On MCP failure, log if the stored token was issued under a different
234+
client_id / token_url than the toolset's current OAuth config.
235+
236+
Catches config drift that would otherwise be invisible: rotated
237+
client_id, two toolsets sharing a provider URL with different client
238+
configs, or a stale cached token after credentials were changed.
239+
Also surfaces the workspace/team the token belongs to for servers
240+
that gate access per workspace (e.g. Slack), so users can spot
241+
wrong-workspace auth.
242+
"""
243+
try:
244+
oauth_cfg = getattr(getattr(toolset, "_mcp_config", None), "oauth", None)
245+
if oauth_cfg is None:
246+
return
247+
store = getattr(_get_token_manager(), "_store", None)
248+
if store is None:
249+
return
250+
provider = getattr(oauth_cfg, "authorization_url", None) or "unknown"
251+
try:
252+
tok = store.get_token(provider, user_id=user_id)
253+
except TypeError:
254+
tok = store.get_token(provider)
255+
if not isinstance(tok, dict):
256+
return
257+
258+
cfg_client_id = getattr(oauth_cfg, "client_id", None)
259+
cfg_token_url = getattr(oauth_cfg, "token_url", None)
260+
tok_client_id = tok.get("client_id")
261+
tok_token_url = tok.get("token_url")
262+
tok_team = tok.get("team")
263+
264+
mismatches = []
265+
if cfg_client_id and tok_client_id and cfg_client_id != tok_client_id:
266+
mismatches.append(f"client_id: stored={tok_client_id} config={cfg_client_id}")
267+
if cfg_token_url and tok_token_url and cfg_token_url != tok_token_url:
268+
mismatches.append(f"token_url: stored={tok_token_url} config={cfg_token_url}")
269+
270+
if mismatches:
271+
logger.warning(
272+
"MCP token/config mismatch for user %s on toolset %s — %s. "
273+
"The cached token was issued under a different OAuth config; re-authenticate.",
274+
user_id, toolset.name, "; ".join(mismatches),
275+
)
276+
elif tok_team:
277+
logger.warning(
278+
"MCP failure for user %s on toolset %s — token was issued in workspace %s. "
279+
"If the server gates access per workspace, the user may have authenticated in the wrong one.",
280+
user_id, toolset.name, tok_team,
281+
)
282+
except Exception:
283+
logger.debug("Failed to log token/config mismatch", exc_info=True)
284+
230285
@staticmethod
231286
def _evict_expired_token(user_id: str, toolset: Any) -> None:
232287
"""Remove an expired/revoked token from cache and disk (CLI only).

tests/test_mcp_oauth.py

Lines changed: 69 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,7 +32,9 @@
3232
OAuthDecisionCode,
3333
OAuthEndpoints,
3434
OAuthExchangeManager,
35+
OAuthTokenExchangeError,
3536
_get_exchange_manager,
37+
exchange_code_for_tokens,
3638
parse_oauth_decision,
3739
)
3840
from holmes.core.oauth_utils import (
@@ -370,6 +372,73 @@ def test_missing_exchange_does_not_crash(self):
370372
_get_exchange_manager().complete_exchange("nonexistent-id", oauth_code, None)
371373
# Should log error but not raise
372374

375+
def test_client_secret_post_falls_back_when_idp_returns_200_on_error(self):
376+
"""Regression: Slack (and other client_secret_post-only IdPs) return HTTP 200
377+
with {"ok": false, "error": ...} when the secret is sent via Basic Auth.
378+
379+
The exchange must detect a 200 response without an access_token and retry
380+
with client_secret in the POST body.
381+
"""
382+
basic_auth_resp = MagicMock()
383+
basic_auth_resp.status_code = 200
384+
basic_auth_resp.is_success = True
385+
basic_auth_resp.json.return_value = {"ok": False, "error": "bad_client_secret"}
386+
387+
body_auth_resp = MagicMock()
388+
body_auth_resp.status_code = 200
389+
body_auth_resp.is_success = True
390+
body_auth_resp.json.return_value = {
391+
"ok": True,
392+
"access_token": "xoxp-slack-token",
393+
"token_type": "Bearer",
394+
}
395+
396+
with patch(
397+
"holmes.core.oauth_config.httpx.post",
398+
side_effect=[basic_auth_resp, body_auth_resp],
399+
) as mock_post:
400+
token_data = exchange_code_for_tokens(
401+
token_url="https://slack.com/api/oauth.v2.user.access",
402+
code="slack-auth-code",
403+
redirect_uri="http://frontend/callback",
404+
client_id="slack-client-id",
405+
code_verifier="slack-pkce-verifier",
406+
client_secret="slack-client-secret",
407+
)
408+
409+
assert token_data["access_token"] == "xoxp-slack-token"
410+
assert mock_post.call_count == 2
411+
412+
first_call_kwargs = mock_post.call_args_list[0].kwargs
413+
assert isinstance(first_call_kwargs.get("auth"), httpx.BasicAuth)
414+
415+
second_call_kwargs = mock_post.call_args_list[1].kwargs
416+
assert second_call_kwargs.get("auth") is None
417+
assert second_call_kwargs["data"]["client_secret"] == "slack-client-secret"
418+
419+
def test_client_secret_post_raises_when_body_retry_also_fails(self):
420+
"""If even the POST-body retry returns 200 without access_token, the function
421+
must still raise OAuthTokenExchangeError (no silent success)."""
422+
bad_resp = MagicMock()
423+
bad_resp.status_code = 200
424+
bad_resp.is_success = True
425+
bad_resp.text = '{"ok": false, "error": "bad_client_secret"}'
426+
bad_resp.json.return_value = {"ok": False, "error": "bad_client_secret"}
427+
428+
with patch(
429+
"holmes.core.oauth_config.httpx.post",
430+
side_effect=[bad_resp, bad_resp],
431+
):
432+
with pytest.raises(OAuthTokenExchangeError) as excinfo:
433+
exchange_code_for_tokens(
434+
token_url="https://slack.com/api/oauth.v2.user.access",
435+
code="slack-auth-code",
436+
redirect_uri="http://frontend/callback",
437+
client_id="slack-client-id",
438+
client_secret="slack-client-secret",
439+
)
440+
assert "access_token" in str(excinfo.value.detail)
441+
373442

374443
class TestParseOAuthDecision:
375444
def test_valid_oauth_decision(self):

0 commit comments

Comments
 (0)