dynamic mcp: enforce loaded tool execution#774
Conversation
🤖 LLM Evaluation ResultsOpenAI
❌ Failed EvaluationsShow 7 failuresOPENAI1. TestReactEval/[openai]_react_cat_message
2. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json
3. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json
4. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json
5. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json
6. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json
7. TestDirectMessageConversations/[openai]_bot_dm_tool_introspection
Anthropic
❌ Failed EvaluationsShow 7 failuresANTHROPIC1. TestReactEval/[anthropic]_react_cat_message
2. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json
3. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json
4. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json
5. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json
6. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json
7. TestDirectMessageConversations/[anthropic]_bot_dm_tool_introspection
This comment was automatically generated by the eval CI pipeline. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e882aa74f0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
📝 WalkthroughWalkthroughThis PR implements conversation-scoped MCP dynamic tool loading with strict registries and loaded-tool persistence. It threads request context through MCP tool retrieval, extends tool-use serialization with approval metadata (schema, description, bare name), detects and errors unloaded/unknown tools early in execution, exempts meta-tool failures from retry limits, and strips persisted assistant reasoning from tool rounds. ChangesMCP Tool Loading, Metadata Persistence, and Registry
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@conversation/convert.go`:
- Around line 254-259: In the json.RawMessage branch of the schema handling,
detect and treat a raw "null" value as absent: after confirming schema is
json.RawMessage, check if len(raw)==0 or the trimmed bytes equal []byte("null")
(e.g., bytes.TrimSpace(raw) == []byte("null")); if so return nil, otherwise
return the copied raw with append(json.RawMessage(nil), raw...).
In `@llmcontext/llm_context.go`:
- Around line 407-410: Comparisons use raw ServerOrigin strings which can differ
by trailing slashes/whitespace; call normalizeMCPServerOrigin(...) on both
spec.ServerOrigin and tool.ServerOrigin before comparing or using as map keys so
normalized values are matched consistently (update the if check around
spec.ServerOrigin/tool.ServerOrigin and any storage/lookup of origin keys), and
apply the same normalization in the other block referenced (the code around
lines handling 640-649) to ensure allowlist/preload logic uses normalized
origins when calling llm.MCPToolNameMatches(tool.Name, spec.ToolName) and when
storing/looking up origin-based keys.
- Around line 262-267: requestContextFromLLMContext currently drops caller
request cancellation/deadline because it always falls back to Background; change
its signature to accept a fallback stdcontext.Context and return
c.RequestContext when set otherwise return the provided fallback (so callers can
pass the real request context into GetToolsForUser instead of losing
cancellation), update the call site that invokes GetToolsForUser to pass the
incoming request context through that new param; also make origin comparisons
deterministic by using normalizeMCPServerOrigin whenever comparing or building
keys in preloadMCPTools and retainOnlyAllowedMCPTools (replace raw uses of
spec.ServerOrigin and t.ServerOrigin with
normalizeMCPServerOrigin(spec.ServerOrigin) /
normalizeMCPServerOrigin(t.ServerOrigin) before equality checks and allowlist
key construction).
In `@mcp/client_manager.go`:
- Around line 184-187: The current GetToolsForUser method silently substitutes
context.Background() when ctx is nil, which masks cancellation and deadlines;
change GetToolsForUser to validate ctx at the start and return a clear *Errors
(or standard error wrapped in *Errors) indicating "nil context passed" (or
similar) instead of creating a background context, update the function
signature/return path to produce this error when ctx == nil, and adjust callers
to pass a non-nil context (or handle the new error) so context propagation is
preserved; reference the GetToolsForUser method and the Errors return type when
making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 609b73c1-2fe3-442b-9f7c-59444be78519
📒 Files selected for processing (29)
api/api.goapi/api_mcp.goapi/api_no_tools_test.goapi/api_test.goconversation/content_block.goconversation/content_block_test.goconversation/convert.goconversation/convert_test.goconversation/helpers.goconversation/helpers_test.goconversation/service.goconversation/service_test.goconversations/agent_mention_reminder_test.goconversations/conversations_test.goconversations/dm_conversation_test.gollm/tool_retry.gollm/tool_retry_test.gollmcontext/llm_context.gollmcontext/llm_context_strict_test.gollmcontext/llm_context_test.gomcp/client_embedded_oauth_test.gomcp/client_integration_test.gomcp/client_manager.gomcp/client_manager_test.gomcp/client_test.goserver/main.gotoolrunner/toolrunner.gotoolrunner/toolrunner_extended_test.gotoolrunner/toolrunner_test.go
e6c9846 to
e5ef694
Compare
e882aa7 to
80e5ad8
Compare
e5ef694 to
0928d23
Compare
80e5ad8 to
c383e63
Compare
Co-authored-by: Cursor <cursoragent@cursor.com>
c383e63 to
1c509aa
Compare
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Security Review — No Findings
Reviewed the full diff (34 files, ~3.4k additions) for this PR which introduces strict loaded/unloaded tool visibility enforcement, MCP dynamic tool loading via search_tools/load_tool meta-tools, and hardened retry/approval behavior.
Areas analyzed
-
Meta-tool auto-approval (
conversations/conversations.go):search_toolsandload_toolbypass the normal policy-checker approval flow viaisMCPMetaToolCall. Verified that:- The check correctly rejects tool calls with a
ServerOriginset, preventing a malicious MCP server from registering a same-named tool that gets auto-approved. - When
llmContext.Toolsexists, it confirms the tool has no server origin in the store — a second defense against name collisions. - In channel mode,
applyBotChannelAutoEverywhereToolFilterremoves meta-tools (emptyServerOrigin), so they are not reachable from channel contexts.
- The check correctly rejects tool calls with a
-
Tool registry scoping (
llmcontext/llm_context.go): TheToolRegistryfed to meta-tools is built frommcpToolsthat have already been filtered by admin bot allowlist, user-disabled server origins, and channel predicates.load_toolcan only materialize tools from this pre-filtered registry — it cannot bypass access controls. -
Loaded-tool restoration (
restoreLoadedMCPTools,AttachConversationID): Previously loaded tools are restored only if they still exist in the current (filtered) registry. Stale entries are deleted. AfterAttachConversationID,handleDMViaConversationapplies a secondRemoveToolsByServerOriginpass, ensuring user-disabled server tools can't resurface through restoration. -
Unknown/unavailable tool handling (
toolrunner/toolrunner.go): The newcontainsUnavailableToolscheck runs BEFOREshouldExecute, preventing both execution and approval-prompt exposure for unknown tools. Mixed batches (some available, some not) correctly skip all tools and return batch-skipped errors. -
Tool retry exemptions (
llm/tool_retry.go):IsToolRetryExemptfor meta-tools prevents search/load failures from exhausting the 3-failure limit. This is bounded by LLM token limits and natural convergence behavior — not exploitable for resource exhaustion beyond normal tool-call patterns. -
Data redaction (
conversation/content_block.go,conversation/convert.go): The newInputSchema,MCPBareName, andToolDescriptionfields are properly cleared inFilterForNonRequester(for non-requesting users) and in the redact-unshared path ofBlocksToPost. -
allToolsAutoRunEverywherewith meta-tools: When a turn contains only meta-tool calls, the function returnstrue(shared=true). Meta-tool results contain only tool names/schemas (not user channel data), so sharing them is safe.
No medium, high, or critical vulnerabilities identified.
Sent by Cursor Automation: Find vulnerabilities
| if turn.Role == "assistant" { | ||
| // Anthropic signed thinking must be replayed byte-for-byte. Our stored | ||
| // content blocks intentionally normalize assistant output (merge text | ||
| // blocks, pair tool_use/result turns, redact private content), so the | ||
| // persisted thinking block is not safe to send back as provider history. | ||
| // Keep reasoning for UI/persistence, but omit it from rebuilt requests. | ||
| post.Reasoning = "" | ||
| post.ReasoningSignature = "" | ||
| } |
There was a problem hiding this comment.
This is needed until:


Summary
Enforces strict loaded/unloaded tool visibility in LLM context and toolrunner flows, restores loaded MCP tools from conversation state, and hardens retry/approval execution behavior.
Stack: 4/6
Ticket Link
N/A
Screenshots
N/A
Release Note
Made with Cursor
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Improvements