Activate selected capabilities at sampling boundaries#29946
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8ca277e6f7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let config = turn_context.config.as_ref(); | ||
| let mcp_config = self | ||
| .services | ||
| .mcp_manager | ||
| .runtime_config_for_thread(config, &candidate) |
There was a problem hiding this comment.
Preserve reload config when publishing selected capabilities
When config/mcpServer/reload is queued before the first turn that activates a selected executor (or before any later selected-capability generation advances), the turn path first refreshes MCP from the newly loaded config, then prepare_runtime_snapshot immediately rebuilds and swaps the manager using this stale per-turn turn_context.config. That drops any newly added/removed global MCP servers from the just-applied reload for the sampled request; use the refresh config snapshot (or update the session config before activation) so selected-capability publication does not undo config reloads.
AGENTS.md reference: AGENTS.md:L102-L110
Useful? React with 👍 / 👎.
| ) | ||
| .await; | ||
| record_inspected_inputs(&sess, &turn_context, inspected_input).await; | ||
| return Err(err); |
There was a problem hiding this comment.
Surface activation failures instead of completing the turn
When the first selected-capability activation fails here (for example, a selected plugin contributes a required MCP server that times out or exits during validate_required_servers), this returns Err before entering the sampling loop that normally converts CodexErr into an ErrorEvent. Session::on_task_finished treats unexpected task errors as a logged warning with no abort reason, so clients receive TurnComplete and no user-visible failure even though the turn never sampled; emit/send the error in this branch as the later sampling error path does.
Useful? React with 👍 / 👎.
| if !self.committed { | ||
| self.cancellation_token.cancel(); |
There was a problem hiding this comment.
Shut down cancelled candidate MCP managers
If a turn is interrupted or the session is shut down while a selected-capability candidate is validating, this drop path only cancels the startup token and clears the pending slot. Candidate servers that have already initialized are no longer observing that token, and AsyncManagedClient::shutdown is the path that actually sends shutdown to ready clients, so those MCP processes can be left running with no remaining manager handle to clean them up.
Useful? React with 👍 / 👎.
Summary
Why
A session can outlive several executor readiness transitions, while each model request still needs one coherent world view. This makes readiness dynamic but publication request-scoped: an executor that is pending does not freeze thread creation, and a root that becomes ready is visible only after its complete MCP and connector view can commit at the next sampling boundary.