Skip to content

dynamic mcp: wire conversation workflows#775

Open
nickmisasi wants to merge 4 commits into
dynamic-mcp-04-context-toolrunnerfrom
dynamic-mcp-05-conversations-api
Open

dynamic mcp: wire conversation workflows#775
nickmisasi wants to merge 4 commits into
dynamic-mcp-04-context-toolrunnerfrom
dynamic-mcp-05-conversations-api

Conversation

@nickmisasi
Copy link
Copy Markdown
Collaborator

@nickmisasi nickmisasi commented May 23, 2026

Summary

Wires dynamic MCP loading into conversation, approval, channel analysis, regeneration, and API flows, including safer prompt guidance for underspecified side-effecting actions.

Stack: 5/6

Ticket Link

N/A

Screenshots

N/A

Release Note

Enabled dynamic MCP tool loading in conversation and channel analysis workflows.

Made with Cursor

Summary by CodeRabbit

  • New Features

    • Agent API adds optional MCP dynamic tool loading flag (defaults to true); prompts now include dynamic MCP tool workflow when enabled.
    • Channel analysis and tool contexts preload and bind required embedded MCP tools.
  • Bug Fixes

    • Preserve existing MCP setting on partial updates; improved tool availability validation and safer tool-approval handling.
    • Restored tool-state behaviour for regeneration and follow-ups.
  • Tests

    • Extensive tests added for agent config, channel analysis, dynamic MCP workflows, tool approval, and policy behaviours.

Review Change Stack

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 23, 2026

🤖 LLM Evaluation Results

OpenAI

⚠️ Overall: 21/28 tests passed (75.0%)

Provider Total Passed Failed Pass Rate
⚠️ OPENAI 28 21 7 75.0%

❌ Failed Evaluations

Show 7 failures

OPENAI

1. TestReactEval/[openai]_react_cat_message

  • Score: 0.00
  • Rubric: The word/emoji is a cat emoji or a heart/love emoji
  • Reason: The output is the text string "heart_eyes_cat", not an actual cat emoji (e.g., 😺) or a heart/love emoji (e.g., ❤️).

2. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output does not provide any actual bug entries. It states it cannot access a bug tracker and includes a blank table template and instructions for the user to supply bugs, so it is not itself a list of bugs.

3. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: includes a description of each bug
  • Reason: The output does not include descriptions of actual bugs; it only asks the user to provide bug entries and shows an empty table template. Therefore it does not include a description of each bug.

4. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes each bug to a user
  • Reason: The output asks the user to paste bug entries and provides a table template with a "Reported by" column, but it does not actually attribute any specific bug to any user because no bugs or reporters are listed.

5. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes the bug about trying to save without a color and the save button not doing anything to @maria.nunez
  • Reason: The output does not mention the specific bug about saving without a color or the save button doing nothing, nor does it attribute that bug to @maria.nunez. It only asks for bug entries to be provided.

6. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: the bug about the end user being able to change channel banner is attributed to @maria.nunez
  • Reason: The output does not mention any specific bug about an end user being able to change the channel banner, nor does it attribute such a bug to @maria.nunez. It only asks for bug entries to be provided.

7. TestDirectMessageConversations/[openai]_bot_dm_tool_introspection

  • Score: 0.00
  • Rubric: mentions Github and refers to the documentation
  • Reason: The output refers to documentation (docs.mattermost.com) but does not mention GitHub anywhere, so it does not satisfy the rubric requirement to mention GitHub and refer to the documentation.

Anthropic

⚠️ Overall: 20/28 tests passed (71.4%)

Provider Total Passed Failed Pass Rate
⚠️ ANTHROPIC 28 20 8 71.4%

❌ Failed Evaluations

Show 8 failures

ANTHROPIC

1. TestReactEval/[anthropic]_react_cat_message

  • Score: 0.00
  • Rubric: The word/emoji is a cat emoji or a heart/love emoji
  • Reason: The output is the literal text "heart_eyes_cat" rather than an actual cat emoji (e.g., 😺/🐱) or a heart/love emoji (e.g., ❤️/😍).

2. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output explains it cannot access a bug tracker and suggests ways to find bugs, but it does not actually provide a list of bugs.

3. TestChannelSummarization/[anthropic]_channel_summarization_developers_webapp_channel

  • Score: 0.00
  • Rubric: mentions @claudio.costa is working on adding code coverage tracking to the monorepo
  • Reason: The output says @claudio.costa is adding code coverage tracking to the webapp, not specifically to the monorepo. It does not mention the monorepo explicitly.

4. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: includes a description of each bug
  • Reason: The output does not list any bugs or provide descriptions of each bug; it only states it cannot access a tracker and suggests ways to find bug details.

5. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes each bug to a user
  • Reason: The output does not list any bugs at all, and therefore does not attribute each bug to a user. It instead explains it cannot access a tracker and suggests ways to find bugs.

6. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes the bug about trying to save without a color and the save button not doing anything to @maria.nunez
  • Reason: The output does not mention @maria.nunez or attribute any bug (including saving without a color or the save button doing nothing) to her. It only states inability to access trackers and suggests ways to find bugs.

7. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: the bug about the end user being able to change channel banner is attributed to @maria.nunez
  • Reason: The output does not mention the specific bug about an end user changing the channel banner, nor does it attribute that bug to @maria.nunez.

8. TestDirectMessageConversations/[anthropic]_bot_dm_tool_introspection

  • Score: 0.00
  • Rubric: mentions Github and refers to the documentation
  • Reason: The output refers to documentation (docs.mattermost.com) but does not mention GitHub anywhere, so it does not satisfy the rubric requirement to mention GitHub and refer to the documentation.

This comment was automatically generated by the eval CI pipeline.

@nickmisasi
Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

📝 Walkthrough

Walkthrough

This PR adds optional agent MCPDynamicToolLoading, preloads and validates embedded MCP tools (including bare-name aliasing and ServerOrigin checks), centralizes single-build LLM context construction with late conversation-ID binding, refactors tool-resolution and policy lookups, and expands tests and prompt text for dynamic MCP search/load workflows.

Changes

Dynamic MCP Tool Loading Feature

Layer / File(s) Summary
Agent MCPDynamicToolLoading Configuration
api/api_agents.go, api/api_agents_test.go
CreateAgentRequest and UpdateAgentRequest gain optional MCPDynamicToolLoading *bool fields that default to true on create and update only when provided. Tests verify defaulting, persistence, and JSON round-trips.
Tool Store Constructor Standardization
api/api_llm_bridge.go, channels/analysis_conversation_test.go, conversations/bot_channel_tool_filter_test.go, conversations/tool_policy_test.go, prompts/standard_personality_without_locale_test.go
Replaces bare llm.NewToolStore() uses with explicit llm.NewToolStore(nil, false) across runtime and tests for consistent initialization.
Channel Analysis Tool Preloading and Validation
api/api_channel.go, api/api_channel_analysis_test.go
Preloads required embedded MCP tools (read_channel, get_channel_info) into LLM context, computes available vs missing embedded tools (handles namespaced/bare aliases), logs diagnostics, and returns 500 on missing tools. Tests exercise preload and availability logic.
Tool Resolution with Metadata Validation
channels/channels.go, channels/analysis_conversation_test.go
Introduces helpers to resolve MCP tools by exact or bare names, enforces embedded ServerOrigin, binds channel_id args, and rebuilds scoped tool stores containing only the bound embedded tools.
Tool Policy and Auto-Execution Resolution
conversations/conversations.go, conversations/tool_policy_test.go
Resolves incoming tool-call names against the visible tool store (exact or unique bare-name), auto-allows MCP meta-tools, and looks up policy by ServerOrigin + bare name; updates auto-execution and all-tools-auto-run logic and adds tests for denormalization and disambiguation.
Conversation Context Building with Late Binding
conversations/handle_messages.go, conversations/regeneration.go, conversations/single_build_test.go
Introduces buildConversationContextWithTools and AttachConversationID for single-pass MCP tool registry building, injects per-user MCP preference context options, and preserves visible-store semantics via post-build pruning. Tests verify single GetToolsForUser invocation and late-binding restoration.
Channel Follow-up Tool Filtering
conversations/bot_channel_tool_filter.go, conversations/bot_channel_tool_filter_test.go
Adds channelFollowUpMCPToolFilterContextOptions and shouldConstrainChannelFollowUpToAutoEverywhere to conditionally apply auto-run-everywhere filtering for channel follow-ups; filter preserves meta-tools and records disabled non-meta tools info when checker is nil. Tests cover fail-closed registry lookups, DM exclusion, and activation gating.
Approved Tool Block Resolution and Loaded State
conversations/tool_approval.go, conversations/tool_approval_internal_test.go, conversations/loaded_state_flow_test.go
Adds resolveApprovedToolUseBlock to validate approved loaded-tool metadata (ServerOrigin, MCP bare name) before resolution, tightens Shared/DecidedAt semantics, and adds loaded-state tests for restore, revocation, and regeneration flows.
Dynamic MCP Workflow Tests
conversations/dynamic_mcp_workflow_test.go
Adds dynamicWorkflowLLM stub and integration tests that simulate search→load→business tool sequences, asserting persistence of loaded tools, schema materialization in context, and policy-mode differences (auto-run vs ask).
Prompt Template and MCP Tests
prompts/standard_personality_without_locale.tmpl, prompts/standard_personality_without_locale_test.go, api/api_mcp_test.go, api/api_config_test.go
Adds conditional dynamic MCP workflow instructions to the standard personality prompt when dynamic loading and meta-tools are present; updates tests for prompt rendering, bare-name exposure, retrieval override preservation, and admin-config round-trips.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

Setup Cloud Test Server

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.24% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'dynamic mcp: wire conversation workflows' directly summarizes the main change—integrating dynamic MCP loading into conversation and related workflows.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dynamic-mcp-05-conversations-api

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
api/api_channel_analysis_test.go (1)

158-194: 💤 Low value

Consider closing the response body to prevent resource leaks.

At line 186, resp := recorder.Result() returns an *http.Response whose body should be closed. While this rarely causes issues in tests, it's good hygiene to close it.

🧹 Suggested fix
 			recorder := httptest.NewRecorder()
 			e.api.ServeHTTP(&plugin.Context{}, recorder, request)
 			resp := recorder.Result()
+			defer resp.Body.Close()

 			require.Equal(t, http.StatusOK, resp.StatusCode)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/api_channel_analysis_test.go` around lines 158 - 194, In
TestHandleChannelAnalysisPreloadsRequiredMCPTools, the http.Response returned by
recorder.Result() (resp) is never closed; after calling resp :=
recorder.Result() add a resp.Body.Close() (or immediately defer
resp.Body.Close()) to avoid leaking the response body; locate the resp variable
in TestHandleChannelAnalysisPreloadsRequiredMCPTools and insert the close right
after obtaining resp.
channels/channels.go (1)

117-144: 💤 Low value

Re-aliasing tool name alters the returned tool's identity.

At line 142, tool.Name = name mutates the copy to replace the namespaced name (e.g., mattermost__read_channel) with the bare name (read_channel). This is intentional per the docstring in api/api_channel.go (lines 190-195) explaining that preload re-aliases tools. However, this implicit mutation could surprise callers who expect the original namespaced name to be preserved.

Consider adding a brief inline comment to clarify this intentional re-aliasing, e.g., // Return tool with bare name for scoped binding.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@channels/channels.go` around lines 117 - 144, In
requiredEmbeddedToolByExactOrBareName, callers may be surprised by the
intentional re-aliasing of the tool name; add a brief inline comment immediately
before the statement that sets tool.Name = name to explain this behavior (e.g.,
"// Return tool with bare name for scoped binding" or similar) so readers
understand the mutation is deliberate and matches the preload re-aliasing
behavior referenced in api/api_channel.go.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/api_agents_test.go`:
- Around line 246-320: Combine the three separate tests
TestCreateAgentMCPDynamicToolLoadingDefaultsTrueWhenOmitted,
TestCreateAgentMCPDynamicToolLoadingPersistsExplicitFalse, and
TestCreateAgentMCPDynamicToolLoadingPersistsExplicitTrue into a single
table-driven test that iterates cases for "omitted", "explicit true", and
"explicit false" and runs each as a subtest; keep the shared setup calls
(setupAgentTestEnvironment, mockLicensed, mocks for
HasPermissionTo/CreateBot/LogError) and for each case supply the request body
(nil with deletion for omitted, or map with "mcpDynamicToolLoading":
true/false), perform doRequest, decode into llm.BotConfig and assert both the
response value and stored value in
e.agentStore.agents[agent.ID].MCPDynamicToolLoading accordingly; also refactor
the similar tests in the later block referenced (lines ~895-1020) the same way
so all MCP dynamic-tool-loading variants are covered by a single table-driven
subtest loop.

In `@api/api_mcp_test.go`:
- Around line 98-212: Combine the three separate tests
TestHandleGetUserMCPToolsReturnsBareNamesForNamespacedRuntimeTools,
TestHandleGetUserMCPToolsBareAllowlistUICompatibility, and
TestHandleGetUserMCPToolsReturnsUpstreamDescriptionNotOverride into a single
table-driven test (e.g. TestHandleGetUserMCPTools_TableDriven) that iterates
over cases with fields for server config, mock tool (use
mockMCPClientManager.tools), expected tool.Name, Enabled, Policy and
Description; inside the loop call SetupTestEnvironment once per subtest, set
e.config.mcpConfig and e.api.mcpClientManager, call getUserMCPToolsResponse and
assert the expected values using t.Run for each case, keeping existing unique
symbols (getUserMCPToolsResponse, mockMCPClientManager, mcp.ServerConfig,
mcp.ToolConfig) so behavior from the three original tests is preserved as table
entries.

In `@conversations/bot_channel_tool_filter_test.go`:
- Around line 247-452: Combine the repeated "ChannelFollowUpStrict" tests into a
single table-driven subtest that iterates cases (e.g., failures on GetPost,
GetUser, DM-no-filter, ActivateAI true) instead of duplicating logic; for each
case provide inputs/fixtures (origin, config, builder tools, mmClient
expectations, rootPost/user, toolPolicyChecker) and assertions, then run the
same flow that calls channelFollowUpMCPToolFilterContextOptions,
buildChannelFollowUpStrictContext, and uses channelFollowUpSearchToolNames to
assert presence/absence of "jira__safe_tool" and "jira__ask_tool"; retain and
reuse helper symbols like newChannelFollowUpTestBuilder,
channelFollowUpTestMCPTool, buildChannelFollowUpStrictContext, and
channelFollowUpMCPToolFilterContextOptions to construct each case and move
per-case mmClient setups into the table entries' setup functions to keep tests
concise and table-driven.

In `@conversations/dynamic_mcp_workflow_test.go`:
- Around line 78-250: Combine TestDynamicMCPStrictSearchLoadCallHappyPath and
TestDynamicMCPLoadedToolStillRequiresApprovalWhenPolicyAsks into a single
table-driven test that iterates over cases differing by toolPolicyChecker (e.g.,
origin -> "get_issue" {policy: mcp.ToolPolicyAutoRunInDM, enabled:true} vs
mcp.ToolPolicyAsk) and expected outcomes (resolverCalls >0 vs 0,
foundPendingBusinessTool true/false, number of lm.requests, turns length,
loadedStore rows assertions). Reuse the shared harness setup
(loadedStateConversationStore(), loadedStateStore{}, jiraTool,
newChannelFollowUpTestBuilder, builder.SetLoadedMCPToolStore(loadedStore), lm :=
&dynamicWorkflowLLM{}, bot := loadedStateBot(lm), llmContext :=
builder.BuildLLMContextUserRequest(...), and c := &Conversations{...}) inside
the loop, and assert per-case expectations for resolverCalls, tool approval
events, lm.requests indices (check
lm.requests[2].Context.Tools.GetTool("jira__get_issue")), and persisted
loadedStore rows; leave TestDynamicMCPMetaToolsBypassApproval as a separate test
unchanged.

In `@prompts/standard_personality_without_locale_test.go`:
- Around line 184-252: The four tests
TestStandardPersonalityIncludesDynamicToolWorkflowWhenEnabled,
TestStandardPersonalityIncludesDynamicToolWorkflowWithDisabledToolsInfo,
TestStandardPersonalityOmitsDynamicToolWorkflowWhenFlagOff, and
TestStandardPersonalityOmitsDynamicToolWorkflowWithoutMetaTools should be
combined into a single table-driven test that iterates subtests via t.Run;
create a slice of cases with a name, an llm.Context (varying Tools:
dynamicMetaToolStore() vs llm.NewNoTools(), MCPDynamicToolLoading true/false,
and DisabledToolsInfo when needed), and expected positive/negative string
assertions, then for each case call renderStandardPersonalityWithoutLocale and
run assert.Contains/assert.NotContains according to the case expectations; keep
references to renderStandardPersonalityWithoutLocale, dynamicMetaToolStore,
llm.NewNoTools, DisabledToolsInfo, and MCPDynamicToolLoading when constructing
the cases so the logic is identical to the original tests.

---

Nitpick comments:
In `@api/api_channel_analysis_test.go`:
- Around line 158-194: In TestHandleChannelAnalysisPreloadsRequiredMCPTools, the
http.Response returned by recorder.Result() (resp) is never closed; after
calling resp := recorder.Result() add a resp.Body.Close() (or immediately defer
resp.Body.Close()) to avoid leaking the response body; locate the resp variable
in TestHandleChannelAnalysisPreloadsRequiredMCPTools and insert the close right
after obtaining resp.

In `@channels/channels.go`:
- Around line 117-144: In requiredEmbeddedToolByExactOrBareName, callers may be
surprised by the intentional re-aliasing of the tool name; add a brief inline
comment immediately before the statement that sets tool.Name = name to explain
this behavior (e.g., "// Return tool with bare name for scoped binding" or
similar) so readers understand the mutation is deliberate and matches the
preload re-aliasing behavior referenced in api/api_channel.go.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: f68c9585-15d4-4ad1-a5e3-c509b9947d0b

📥 Commits

Reviewing files that changed from the base of the PR and between e882aa7 and 95f8795.

📒 Files selected for processing (22)
  • api/api_agents.go
  • api/api_agents_test.go
  • api/api_channel.go
  • api/api_channel_analysis_test.go
  • api/api_config_test.go
  • api/api_llm_bridge.go
  • api/api_mcp_test.go
  • channels/analysis_conversation_test.go
  • channels/channels.go
  • conversations/bot_channel_tool_filter.go
  • conversations/bot_channel_tool_filter_test.go
  • conversations/conversations.go
  • conversations/dynamic_mcp_workflow_test.go
  • conversations/handle_messages.go
  • conversations/loaded_state_flow_test.go
  • conversations/regeneration.go
  • conversations/single_build_test.go
  • conversations/tool_approval.go
  • conversations/tool_approval_internal_test.go
  • conversations/tool_policy_test.go
  • prompts/standard_personality_without_locale.tmpl
  • prompts/standard_personality_without_locale_test.go

Comment thread api/api_agents_test.go Outdated
Comment thread api/api_mcp_test.go Outdated
Comment thread conversations/bot_channel_tool_filter_test.go
Comment thread conversations/dynamic_mcp_workflow_test.go Outdated
Comment thread prompts/standard_personality_without_locale_test.go Outdated
@nickmisasi nickmisasi force-pushed the dynamic-mcp-04-context-toolrunner branch from e882aa7 to 80e5ad8 Compare May 23, 2026 02:52
@nickmisasi nickmisasi force-pushed the dynamic-mcp-05-conversations-api branch from 95f8795 to b365666 Compare May 23, 2026 02:52
@nickmisasi nickmisasi force-pushed the dynamic-mcp-04-context-toolrunner branch from 80e5ad8 to c383e63 Compare May 23, 2026 16:48
@nickmisasi nickmisasi force-pushed the dynamic-mcp-05-conversations-api branch from b365666 to a8e402c Compare May 23, 2026 16:48
@nickmisasi nickmisasi force-pushed the dynamic-mcp-04-context-toolrunner branch from c383e63 to 1c509aa Compare May 23, 2026 17:23
@nickmisasi nickmisasi force-pushed the dynamic-mcp-05-conversations-api branch from a8e402c to 461b81d Compare May 23, 2026 17:23
Co-authored-by: Cursor <cursoragent@cursor.com>
@nickmisasi nickmisasi force-pushed the dynamic-mcp-05-conversations-api branch from 461b81d to df9713a Compare May 23, 2026 19:22
Co-authored-by: Cursor <cursoragent@cursor.com>
@nickmisasi
Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
conversations/handle_messages.go (1)

438-458: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Re-prune disabled MCP servers after AttachConversationID.

buildConversationContextWithTools strips DisabledMCPServerOrigins before the conversation exists, but AttachConversationID restores loaded tools afterwards. In a DM or group DM, that can put a previously loaded tool from a user-disabled server back into llmContext.Tools, so existing conversations ignore the user's MCP preference.

Proposed fix
 	c.contextBuilder.AttachConversationID(llmContext, bot, convResult.ConversationID)
+	removePreFilteredMCPServersFromVisibleStore(llmContext)
 
 	// Anchor this run's trace to the user turn ID. Link to the previous user
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@conversations/handle_messages.go` around lines 438 - 458, The context is
rebuilt with buildConversationContextWithTools which prunes
DisabledMCPServerOrigins too early and AttachConversationID can re-add tools;
after calling c.contextBuilder.AttachConversationID(llmContext, bot,
convResult.ConversationID) re-apply the pruning of DisabledMCPServerOrigins (or
call the existing helper that removes user-disabled MCP servers from
llmContext.Tools) so DM/group DM flows do not reintroduce tools the user
disabled; update the sequence around CreateOrGetDMConversation /
AttachConversationID to prune llmContext.Tools immediately after
AttachConversationID (ensuring ensureDMWebSearchTracking still runs on the final
pruned context).
🧹 Nitpick comments (2)
api/api_channel_analysis_test.go (1)

196-212: ⚡ Quick win

Convert this two-scenario test to table-driven subtests.

This function exercises two distinct cases (non-embedded origin missing, then embedded origin present) but keeps them in one linear test body. Please convert to a tests := []struct{...} + t.Run(...) pattern for consistency with repo test rules.

♻️ Suggested refactor
 func TestChannelAnalysisToolAvailabilityRequiresEmbeddedOrigin(t *testing.T) {
-	tools := llm.NewToolStore(nil, false)
-	tools.AddTools([]llm.Tool{
-		{Name: "read_channel", ServerOrigin: "https://remote.example.com"},
-		{Name: "get_channel_info", ServerOrigin: mcp.EmbeddedClientKey},
-	})
-
-	_, missing := channelAnalysisToolAvailability(tools)
-	require.ElementsMatch(t, []string{"read_channel"}, missing)
-
-	tools.AddTools([]llm.Tool{
-		{Name: "read_channel", ServerOrigin: mcp.EmbeddedClientKey},
-	})
-
-	_, missing = channelAnalysisToolAvailability(tools)
-	require.Empty(t, missing)
+	tests := []struct {
+		name        string
+		tools       []llm.Tool
+		wantMissing []string
+	}{
+		{
+			name: "marks non-embedded required tool as missing",
+			tools: []llm.Tool{
+				{Name: "read_channel", ServerOrigin: "https://remote.example.com"},
+				{Name: "get_channel_info", ServerOrigin: mcp.EmbeddedClientKey},
+			},
+			wantMissing: []string{"read_channel"},
+		},
+		{
+			name: "accepts embedded required tools",
+			tools: []llm.Tool{
+				{Name: "read_channel", ServerOrigin: mcp.EmbeddedClientKey},
+				{Name: "get_channel_info", ServerOrigin: mcp.EmbeddedClientKey},
+			},
+			wantMissing: nil,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			store := llm.NewToolStore(nil, false)
+			store.AddTools(tt.tools)
+			_, missing := channelAnalysisToolAvailability(store)
+			require.ElementsMatch(t, tt.wantMissing, missing)
+		})
+	}
 }

As per coding guidelines: **/*_test.go: Go tests must be table-driven when there is more than one test case.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/api_channel_analysis_test.go` around lines 196 - 212, Replace the linear
TestChannelAnalysisToolAvailabilityRequiresEmbeddedOrigin with a table-driven
test: create a tests := []struct{name string, initialTools []llm.Tool, addTools
[]llm.Tool, expectedMissing []string}{...} containing the two scenarios (first:
initialTools includes read_channel with ServerOrigin
"https://remote.example.com" and get_channel_info with mcp.EmbeddedClientKey
expecting missing ["read_channel"]; second: initialTools includes both tools
with EmbeddedClientKey expecting missing []), then iterate tests with
t.Run(tc.name, func(t *testing.T){ tools := llm.NewToolStore(nil, false);
tools.AddTools(tc.initialTools); if len(tc.addTools)>0 {
tools.AddTools(tc.addTools) }; _, missing :=
channelAnalysisToolAvailability(tools); use require.ElementsMatch/require.Empty
as appropriate }), making sure each subtest creates a fresh ToolStore to avoid
state leakage.
conversations/bot_channel_tool_filter_test.go (1)

206-208: Zero-arg llm.NewToolStore() is safe (equivalent to llm.NewToolStore(nil, false)).

llm.NewToolStore is func NewToolStore(options ...any) *ToolStore; when called with no arguments it defaults to log=nil and doTrace=false, so llm.NewToolStore() at lines 206-208 won’t cause a compile-time break. Standardizing to llm.NewToolStore(nil, false) is optional for consistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@conversations/bot_channel_tool_filter_test.go` around lines 206 - 208, The
call llm.NewToolStore() is valid but for consistency make its defaults explicit:
in the llm.Context initialization replace llm.NewToolStore() with
llm.NewToolStore(nil, false) so the intent (log=nil, doTrace=false) is clear;
update the llm.Context creation where Tools is set to reference the explicit
call to llm.NewToolStore(nil, false).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@conversations/loaded_state_flow_test.go`:
- Around line 213-216: The fake UpsertLoadedMCPTool currently appends every
tool, causing duplicates; change loadedStateStore.UpsertLoadedMCPTool to
implement upsert semantics by scanning s.rows for an existing
store.LoadedMCPTool with the same identity (match on the unique identifier
field(s) used by LoadedMCPTool, e.g., ID or Name), replace that entry in-place
if found, and return; only append if no match was found. Use the existing
function name UpsertLoadedMCPTool and the loadedStateStore.rows slice to locate
and update the matching element.

---

Outside diff comments:
In `@conversations/handle_messages.go`:
- Around line 438-458: The context is rebuilt with
buildConversationContextWithTools which prunes DisabledMCPServerOrigins too
early and AttachConversationID can re-add tools; after calling
c.contextBuilder.AttachConversationID(llmContext, bot,
convResult.ConversationID) re-apply the pruning of DisabledMCPServerOrigins (or
call the existing helper that removes user-disabled MCP servers from
llmContext.Tools) so DM/group DM flows do not reintroduce tools the user
disabled; update the sequence around CreateOrGetDMConversation /
AttachConversationID to prune llmContext.Tools immediately after
AttachConversationID (ensuring ensureDMWebSearchTracking still runs on the final
pruned context).

---

Nitpick comments:
In `@api/api_channel_analysis_test.go`:
- Around line 196-212: Replace the linear
TestChannelAnalysisToolAvailabilityRequiresEmbeddedOrigin with a table-driven
test: create a tests := []struct{name string, initialTools []llm.Tool, addTools
[]llm.Tool, expectedMissing []string}{...} containing the two scenarios (first:
initialTools includes read_channel with ServerOrigin
"https://remote.example.com" and get_channel_info with mcp.EmbeddedClientKey
expecting missing ["read_channel"]; second: initialTools includes both tools
with EmbeddedClientKey expecting missing []), then iterate tests with
t.Run(tc.name, func(t *testing.T){ tools := llm.NewToolStore(nil, false);
tools.AddTools(tc.initialTools); if len(tc.addTools)>0 {
tools.AddTools(tc.addTools) }; _, missing :=
channelAnalysisToolAvailability(tools); use require.ElementsMatch/require.Empty
as appropriate }), making sure each subtest creates a fresh ToolStore to avoid
state leakage.

In `@conversations/bot_channel_tool_filter_test.go`:
- Around line 206-208: The call llm.NewToolStore() is valid but for consistency
make its defaults explicit: in the llm.Context initialization replace
llm.NewToolStore() with llm.NewToolStore(nil, false) so the intent (log=nil,
doTrace=false) is clear; update the llm.Context creation where Tools is set to
reference the explicit call to llm.NewToolStore(nil, false).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: c75c0cbb-76dd-4424-ab48-abef4f97506a

📥 Commits

Reviewing files that changed from the base of the PR and between 95f8795 and 6ab2688.

📒 Files selected for processing (22)
  • api/api_agents.go
  • api/api_agents_test.go
  • api/api_channel.go
  • api/api_channel_analysis_test.go
  • api/api_config_test.go
  • api/api_llm_bridge.go
  • api/api_mcp_test.go
  • channels/analysis_conversation_test.go
  • channels/channels.go
  • conversations/bot_channel_tool_filter.go
  • conversations/bot_channel_tool_filter_test.go
  • conversations/conversations.go
  • conversations/dynamic_mcp_workflow_test.go
  • conversations/handle_messages.go
  • conversations/loaded_state_flow_test.go
  • conversations/regeneration.go
  • conversations/single_build_test.go
  • conversations/tool_approval.go
  • conversations/tool_approval_internal_test.go
  • conversations/tool_policy_test.go
  • prompts/standard_personality_without_locale.tmpl
  • prompts/standard_personality_without_locale_test.go

Comment thread conversations/loaded_state_flow_test.go
nickmisasi and others added 2 commits May 23, 2026 15:39
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment

Security Review — No Findings

I performed a comprehensive security analysis of the changes in this PR which wire dynamic MCP tool loading into conversation, approval, channel analysis, regeneration, and API flows. I examined:

Tool resolution and approval flow (resolveApprovedToolUseBlock): Validates ServerOrigin and MCPBareName metadata from stored conversation blocks before executing, preventing tool substitution after registry changes. The HandleToolCall path verifies conv.UserID == userID (requester-only gating).

Bot-activated channel constraint enforcement: The pre-build WithLLMContextMCPToolFilter restricts the MCP tool registry to only auto_run_everywhere-policy tools before the strict store is built. This means search_tools/load_tool meta-tools can only discover/load tools with the appropriate policy — not bypass the constraint. The post-build applyBotChannelAutoEverywhereToolFilter provides defense-in-depth.

DM disabled-server re-pruning: removePreFilteredMCPServersFromVisibleStore is correctly called after AttachConversationID (which may restore loaded tools), so user-disabled MCP servers remain pruned even after late-binding the conversation.

Fail-closed patterns: When toolPolicyChecker is nil, shouldAutoExecuteTool returns false, botChannelAutoEverywhereKeepTool returns false, and allToolsAutoRunEverywhere returns false — all fail-closed.

Shared flag scoping: The HandleToolResult change narrows Shared=true to only tool blocks belonging to the clicked post (via clickedPostToolUseIDs), reducing the blast radius of the share decision — a security improvement.

Meta-tool identity check: isMCPMetaToolCall requires BOTH name-match AND empty ServerOrigin, preventing an MCP server from registering a tool named search_tools to gain auto-approval.

No medium, high, or critical vulnerabilities identified.

Open in Web View Automation 

Sent by Cursor Automation: Find vulnerabilities

Copy link
Copy Markdown
Contributor

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Review — No Findings

I reviewed the production code changes in this PR with focus on the security-sensitive areas for a multi-user Mattermost plugin:

Areas Analyzed

  1. MCP meta-tools (search_tools, load_tool) bypassing approval — These are intentionally auto-approved as they are read-only discovery/loading operations. search_tools only returns tool names/summaries from the user-scoped registry (built via GetToolsForUser). load_tool only materializes a tool's schema. Business tools loaded via load_tool still go through full policy enforcement before execution.

  2. Channel tool filtering preserving meta-toolsapplyBotChannelAutoEverywhereToolFilter now preserves meta-tools even in fail-closed mode. This is safe because loaded business tools still require auto_run_everywhere policy to auto-execute in channels, and tools with ask policy surface for user approval.

  3. resolveApprovedToolUseBlock server origin validation — This is a security improvement. The new function validates ServerOrigin and MCPBareName against persisted approval metadata before executing, preventing tool identity spoofing during deferred approval execution.

  4. resolveToolStoreLookupName ambiguity handling — Fails closed when a bare tool name maps to multiple registered tools without a disambiguating ServerOrigin. Prevents unintended tool execution.

  5. Meta-tool spoofing resistanceisMCPMetaToolCall requires tc.ServerOrigin == "" and verifies the tool store doesn't associate the name with a non-empty origin, preventing an external MCP server from registering tools named search_tools/load_tool that bypass approval.

  6. Prompt injection via dynamic tool workflow — The system prompt instructs the LLM to ask clarification before side-effecting actions. While this is a defense-in-depth measure (LLM instructions are not a security boundary), the server-side policy enforcement (shouldAutoExecuteTool, allToolsAutoRunEverywhere) provides the actual gate — tools that need approval still require it regardless of what the LLM is instructed to do.

Conclusion

No medium, high, or critical vulnerabilities identified. The PR maintains existing permission/authorization patterns while adding server-origin validation on tool approval execution (a security hardening).

Open in Web View Automation 

Sent by Cursor Automation: Find vulnerabilities

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant