Skip to content

dynamic mcp: harden client catalog discovery#772

Open
nickmisasi wants to merge 4 commits into
dynamic-mcp-01-config-storefrom
dynamic-mcp-02-client-catalog
Open

dynamic mcp: harden client catalog discovery#772
nickmisasi wants to merge 4 commits into
dynamic-mcp-01-config-storefrom
dynamic-mcp-02-client-catalog

Conversation

@nickmisasi
Copy link
Copy Markdown
Collaborator

@nickmisasi nickmisasi commented May 23, 2026

Summary

Hardens MCP client catalog discovery with safer tool identity handling (prior to this, a tool registration with a name that was already registered would be dropped), plugin/embedded discovery refresh behavior, user-client synchronization, and retrieval override plumbing. All plumbing includes backwards compatibility support for existing tool policies that will not have the namespace plumbing setup by this branch.

Stack: 2/6

Ticket Link

N/A

Screenshots

N/A

Release Note

NONE

Made with Cursor

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced MCP tool dynamic loading with improved caching and conflict resolution across multiple tool sources.
    • Added support for plugin-based MCP server integration with per-user client management.
    • Implemented tool namespacing to better organize and deduplicate tools across different sources.
  • Improvements

    • Enhanced tool discovery logging and tracking for better diagnostics.
    • Optimized tool list caching to reduce redundant lookups.

Review Change Stack

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 23, 2026

🤖 LLM Evaluation Results

OpenAI

⚠️ Overall: 21/28 tests passed (75.0%)

Provider Total Passed Failed Pass Rate
⚠️ OPENAI 28 21 7 75.0%

❌ Failed Evaluations

Show 7 failures

OPENAI

1. TestReactEval/[openai]_react_cat_message

  • Score: 0.00
  • Rubric: The word/emoji is a cat emoji or a heart/love emoji
  • Reason: The output is the text token "smiley_cat", not an actual cat emoji (e.g., 🐱/😺) or a heart/love emoji (e.g., ❤️/💕).

2. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output does not provide an actual list of bugs; it states inability to access the bug tracker and offers a template and instructions for providing data. A template without filled bug entries is not a list of bugs.

3. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: includes a description of each bug
  • Reason: The output explicitly says it cannot enumerate the actual reported bugs without source data and provides only a blank template; it does not include descriptions of each bug.

4. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes each bug to a user
  • Reason: The output does not list any bugs or associate them with specific users; it only asks for source data and provides a blank template. Therefore, it does not attribute each bug to a user.

5. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes the bug about trying to save without a color and the save button not doing anything to @maria.nunez
  • Reason: The output does not mention the specific bug about saving without a color or the save button doing nothing, and it does not attribute any bug to @maria.nunez.

6. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: the bug about the end user being able to change channel banner is attributed to @maria.nunez
  • Reason: The output does not mention the specific bug about an end user being able to change the channel banner, nor does it attribute that bug to @maria.nunez. It only asks for source data and provides a template.

7. TestDirectMessageConversations/[openai]_bot_dm_tool_introspection

  • Score: 0.00
  • Rubric: mentions Github and refers to the documentation
  • Reason: The output refers to documentation (Mattermost documentation link) but does not mention GitHub anywhere, so it does not satisfy the rubric requirement to mention GitHub and refer to the documentation.

Anthropic

⚠️ Overall: 21/28 tests passed (75.0%)

Provider Total Passed Failed Pass Rate
⚠️ ANTHROPIC 28 21 7 75.0%

❌ Failed Evaluations

Show 7 failures

ANTHROPIC

1. TestReactEval/[anthropic]_react_cat_message

  • Score: 0.00
  • Rubric: The word/emoji is a cat emoji or a heart/love emoji
  • Reason: The output is the text string "heart_eyes_cat" rather than an actual cat emoji (e.g., 😺) or a heart/love emoji (e.g., ❤️).

2. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output explicitly states it cannot access bug trackers and does not provide any actual bugs; it only suggests where to look and offers to help format details if provided.

3. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: includes a description of each bug
  • Reason: The output does not describe any specific bugs; it only states an inability to access bug trackers and suggests where to look. Therefore it does not include a description of each bug.

4. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes each bug to a user
  • Reason: The output does not list any bugs at all, and therefore does not attribute each bug to a user. It only states lack of access and suggests where to look.

5. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes the bug about trying to save without a color and the save button not doing anything to @maria.nunez
  • Reason: The output does not mention @maria.nunez and does not attribute the specific bug (saving without a color and the save button doing nothing) to anyone. It only states lack of access to bug trackers and suggests where to look.

6. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: the bug about the end user being able to change channel banner is attributed to @maria.nunez
  • Reason: The output does not mention the specific bug about an end user changing the channel banner, and it does not attribute that bug to @maria.nunez.

7. TestDirectMessageConversations/[anthropic]_bot_dm_tool_introspection

  • Score: 0.00
  • Rubric: mentions Github and refers to the documentation
  • Reason: The output refers to documentation at docs.mattermost.com, satisfying the documentation reference, but it does not mention GitHub anywhere.

This comment was automatically generated by the eval CI pipeline.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dbdf599280

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread mcp/user_clients.go
@nickmisasi
Copy link
Copy Markdown
Collaborator Author

@CodeRabbit review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

📝 Walkthrough

Walkthrough

This PR introduces MCP dynamic tool loading with telemetry tracking, tool store enhancements with namespacing helpers, MCP client caching with concurrency-safe invalidation, and per-server slug-based tool naming. It propagates context throughout the MCP client lifecycle and adds tool retrieval override metadata.

Changes

MCP Dynamic Tool Loading and Telemetry

Layer / File(s) Summary
LLM context and MCP dynamic tool telemetry
llm/context.go, llm/context_test.go
Context gains request scoping (RequestContext, ConversationID), MCP dynamic tool fields (loading flag, telemetry interface, search/load tracking maps, server origins, tool callbacks), and four methods to safely observe/track events with fallback bot-name logic and nil-guards.

Tool Store Enhancements and MCP Tool Namespacing

Layer / File(s) Summary
ToolCall schema field and ToolStore tracing/unloading
llm/tools.go (portions)
ToolCall adds schema and MCP bare name fields; ToolStore extends with configurable tracing/logging interfaces, MCP unloading state, and updated constructors accepting optional configuration.
Tool resolution with telemetry and logging
llm/tools.go (portions)
ResolveTool emits telemetry errors and invokes tracing/logging helpers; new TraceUnknown, TraceResolved, LogUnknownToolWarning methods; nil-receiver guards on GetTool and MCP unloading methods.
MCP tool naming and filtering helpers
llm/tools.go (portions)
New MCPToolNameSeparator constant and helpers to namespace/bare-name MCP tools, test matches, check allowlist membership, normalize disabled origins, and filter by allowlist; RetainOnlyMCPTools uses new allowlist logic.
Tool store tests and test updates
llm/tools_test.go, llm/* (test updates)
Log-capture test helpers; comprehensive coverage for tool resolution tracing, MCP naming functions, allowlist filtering, unloaded tool lifecycle, and origin normalization; existing tests updated to use NewToolStore(nil, false).

MCP Client Caching, Invalidation, and Tool Discovery

Layer / File(s) Summary
Client types and cache field additions
mcp/client.go (portions)
EmbeddedServerClient adds toolsCache field; Client adds concurrency-aware fields (toolsMu, discovery/dirty state, generation, notification owner coordination) for thread-safe invalidation.
Constructor updates and shared tool listing
mcp/client.go (portions)
New NewEmbeddedServerClientWithCache constructor; shared listAllTools helper collects paginated tools; NewClient uses listAllTools, stores under lock, manages cache/toolsCache assignment.
Plugin client and embedded session setup
mcp/client.go (portions)
New NewPluginClient for per-user plugin-server clients with HTTP headers and in-memory invalidation; createSession wires tool invalidation into SDK ToolListChangedHandler.
Tool invalidation, rediscovery, and snapshot access
mcp/client.go (portions)
New invalidateDiscoveredTools for cache clearing; notificationOwner/setNotificationOwner for concurrent invalidation coordination; ensureDiscoveredTools for concurrency-safe rediscovery with shared-cache invalidation; Tools() returns defensive copy under lock.
Embedded reconnect and tool-call response handling
mcp/client.go (portions)
Embedded reconnect uses Tools() snapshot, updates under lock, clears dirty flag; tool-call response text extraction refactored to strings.Builder.
Embedded client OAuth and cache tests
mcp/client_embedded_oauth_test.go
Comprehensive test suite for embedded client tool discovery, cache/tool invalidation on list changes, defensive Tools() copy safety, OAuth metadata extraction, nil cache handling, shared-cache decision logic.
Client pagination, cache, and notification tests
mcp/client_test.go
Tests for listAllTools pagination, cache hit/miss, partial failure safety, empty catalogue errors, remote tool-list invalidation, rediscovery, concurrent invalidation during discovery, nil cache, static OAuth cache skip, notification-storm idempotence.

MCP Client Manager and User Client Integration

Layer / File(s) Summary
Client manager initialization and context propagation
mcp/client_manager.go (portions)
ReInit uses NewEmbeddedServerClientWithCache; createAndStoreUserClient accepts context, performs remote connections outside lock, populates errors via setInitialRemoteConnectErrors; getClientForUser accepts context and returns errors.
Tool retrieval overrides and plugin context
mcp/retrieval_overrides.go, mcp/client_manager.go (portions)
New mcp/retrieval_overrides.go with ToolRetrievalOverride struct and key builder; GetToolRetrievalOverrides builds override maps for remote/embedded/plugin tools with whitespace trimming and duplicate handling.
Tool filtering with normalized names
mcp/client_manager.go (portions), mcp/client_manager_filter_test.go
Tool policy lookups normalize names via llm.BareMCPToolName before evaluation; test cases validate namespaced tool denormalization for admin policies.
UserClients concurrency control and context
mcp/user_clients.go (portions)
New clientsMu for thread-safe cache access; ConnectToRemoteServers and ConnectToEmbeddedServerIfAvailable accept context with 10s embedded timeout; snapshot-based operations and nil-receiver guards.
Tool slug-based namespacing and request context
mcp/user_clients.go (portions)
Per-server slug generation with collision deconfliction; GetTools accepts context, uses snapshots, returns namespaced names with cross-server deduplication; createToolResolver uses request context when available; slug sanitization and hashing helpers.
Manager and UserClient integration tests
mcp/client_manager_test.go, mcp/user_clients_test.go
Tests for override generation and duplicate handling; tool deduplication/slugging, resolver bare-name usage, embedded slug conversion, deterministic collision resolution, rediscovery ordering, snapshot-based client inspection, context cancellation propagation.

Test Updates and Cache Infrastructure

Layer / File(s) Summary
Test infrastructure and tool store constructor updates
mcp/tools_cache_test.go, mcpserver/eval_helpers_test.go, telemetry/integration_test.go, search/search_test.go
Added mockLogService.Flush() method and cache invalidation test; updated tool store constructor calls across test files to use NewToolStore(nil, false); tightened async mock expectations in search test.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • mattermost/mattermost-plugin-agents#542: Both PRs add telemetry around tool resolution and MCP tool calls; main PR adds dynamic tool event tracking via LLM Context, while retrieved PR adds OpenTelemetry tracing for tool resolution.
  • mattermost/mattermost-plugin-agents#678: Main PR's MCP tool-name parsing helpers (BareMCPToolName, namespacing logic) are code-level dependencies for the retrieved PR's plugin-server tool filtering that normalizes tool names via those helpers.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 13.79% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: hardening MCP client catalog discovery, which is the central focus across context extensions, tool identity refactoring, and client synchronization.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dynamic-mcp-02-client-catalog

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
mcp/client.go (1)

163-164: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard against nil pluginAPI before session validation.

On Line 164, c.pluginAPI is dereferenced when sessionID != "". If this client is constructed without a plugin API, this path panics instead of returning a typed error.

Suggested fix
 func (c *EmbeddedServerClient) CreateClient(ctx context.Context, userID, sessionID string) (*Client, error) {
 	// Validate session exists before creating transport (unless empty for tool discovery)
 	if sessionID != "" {
+		if c.pluginAPI == nil {
+			return nil, fmt.Errorf("plugin API is required when sessionID is provided")
+		}
 		mmSession, err := c.pluginAPI.Session.Get(sessionID)
 		if err != nil {
 			return nil, fmt.Errorf("failed to get session: %w", err)
 		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@mcp/client.go` around lines 163 - 164, The code dereferences c.pluginAPI when
sessionID != "" (calling c.pluginAPI.Session.Get) which will panic if
c.pluginAPI is nil; add a nil check for c.pluginAPI before calling Session.Get
and return a typed error (e.g., ErrPluginAPIUnavailable or a wrapped error) when
it's nil, then only call c.pluginAPI.Session.Get when c.pluginAPI is non-nil and
proceed to validate mmSession as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/tools.go`:
- Around line 665-668: The function toolArgsForLog should guard the argsGetter
before calling it: check if argsGetter == nil and return a safe string like "no
args getter" (or similar) instead of invoking it, then proceed to call
argsGetter(&raw) and handle its error as before; update the toolArgsForLog
function to perform this nil check (referencing the argsGetter parameter and the
toolArgsForLog function) so logging paths that pass a nil getter do not panic.

In `@mcp/client_manager.go`:
- Around line 205-213: The code currently persists transient connect errors by
calling userClient.appendInitialRemoteConnectError(connectErr) after
ConnectToPluginServer fails; instead, make plugin connect errors request-scoped
or cleared on successful connect: remove or avoid appending to the cached client
on transient failure and either collect connectErr in a local slice for this
request (return it with the response) or ensure you clear per-plugin entries on
subsequent successful ConnectToPluginServer calls (e.g., call a
clearInitialRemoteConnectError-like operation after a successful connect).
Update the logic around userClient.ConnectToPluginServer,
userClient.appendInitialRemoteConnectError and
userClient.InitialRemoteConnectErrors so the cached UserClient no longer
permanently stores transient plugin connect failures.
- Around line 184-188: Change GetToolsForUser to accept ctx context.Context as
its first parameter (i.e., func (m *ClientManager) GetToolsForUser(ctx
context.Context, userID string) ...) and remove the context.Background() call;
propagate this ctx into the downstream call to m.getClientForUser(ctx, userID)
and any other internal calls within GetToolsForUser so caller
cancellation/deadlines flow through the LLM entry path.

In `@mcp/user_clients.go`:
- Around line 371-377: The code currently creates a fallback callCtx :=
context.Background() before invoking client.CallToolWithMetadata which allows
tool calls to outlive request cancellation; instead, require a request-scoped
context: remove the context.Background() fallback, validate that llmContext and
llmContext.RequestContext are non-nil (or ensure a context is passed into the
resolver closure) and return an error immediately if missing, then call
client.CallToolWithMetadata using the verified llmContext.RequestContext along
with toolName, args and metadata.

---

Outside diff comments:
In `@mcp/client.go`:
- Around line 163-164: The code dereferences c.pluginAPI when sessionID != ""
(calling c.pluginAPI.Session.Get) which will panic if c.pluginAPI is nil; add a
nil check for c.pluginAPI before calling Session.Get and return a typed error
(e.g., ErrPluginAPIUnavailable or a wrapped error) when it's nil, then only call
c.pluginAPI.Session.Get when c.pluginAPI is non-nil and proceed to validate
mmSession as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2d34a450-b143-49b9-abb0-256fdd5cd29c

📥 Commits

Reviewing files that changed from the base of the PR and between c65e350 and dbdf599.

📒 Files selected for processing (17)
  • llm/context.go
  • llm/context_test.go
  • llm/tools.go
  • llm/tools_test.go
  • mcp/client.go
  • mcp/client_embedded_oauth_test.go
  • mcp/client_manager.go
  • mcp/client_manager_filter_test.go
  • mcp/client_manager_test.go
  • mcp/client_test.go
  • mcp/retrieval_overrides.go
  • mcp/tools_cache_test.go
  • mcp/user_clients.go
  • mcp/user_clients_test.go
  • mcpserver/eval_helpers_test.go
  • search/search_test.go
  • telemetry/integration_test.go

Comment thread llm/tools.go
Comment thread mcp/client_manager.go Outdated
Comment thread mcp/client_manager.go Outdated
Comment thread mcp/user_clients.go Outdated
Co-authored-by: Cursor <cursoragent@cursor.com>
@nickmisasi nickmisasi force-pushed the dynamic-mcp-02-client-catalog branch from dbdf599 to eb5e85f Compare May 23, 2026 02:52
Co-authored-by: Cursor <cursoragent@cursor.com>
nickmisasi and others added 2 commits May 23, 2026 11:20
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Review: No Actionable Findings

Reviewed the full diff (3014 additions, 424 deletions across 38 files) for this MCP client catalog hardening PR. The changes are security-positive overall.

Key areas examined:

  • channelAnalysisToolBot tool expansion (api/api_channel.go): Overrides the bot's MCP tool allowlist for channel analysis. Initially appeared risky since channel analysis auto-approves all tool calls, but the downstream AnalyzeChannel function correctly scopes the tool set to only read_channel and get_channel_info with bound channel_id parameters before the LLM sees them. The expanded catalog is only used to locate these two tools.

  • Context propagation: context.Background() removed from production MCP tool paths. GetToolsForUser, createAndStoreUserClient, ConnectToRemoteServers, and ConnectToEmbeddedServerIfAvailable now accept caller-scoped context.Context. Tool resolvers require RequestContext and fail fast if missing.

  • Tool namespacing and bare-name fallback (llm/tools.go): lookupTool bare-name resolution correctly returns ambiguous results as not-found. mcpToolAllowed scopes allowlist checks by ServerOrigin, preventing cross-server tool confusion.

  • Concurrency safety: New mutex protection (toolsMu, clientsMu, discoveryMu, notifyOwnerMu) on Client and UserClients prevents race conditions during tool list invalidation and rediscovery.

  • Plugin MCP client (mcp/client.go): NewPluginClient correctly injects authenticated user ID via headerTransport from session-derived data, not user input. Tool list changed handlers properly invalidate caches.

  • Trace logging (llm/tools.go): TraceResolved logs full tool results when doTrace is true, but this flag is not enabled in any production code path.

No medium, high, or critical vulnerabilities identified.

Open in Web View Automation 

Sent by Cursor Automation: Find vulnerabilities

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant