Commit bbb1d5b

xuio and codex committed
Add intuitive Codex MCP front doors
Co-Authored-By: OpenAI Codex <noreply@openai.com>
1 parent 16af90b commit bbb1d5b

11 files changed

Lines changed: 957 additions & 67 deletions

README.md

Lines changed: 12 additions & 4 deletions
@@ -133,7 +133,7 @@ npm run test:claude-desktop
 
 `test:claude-real-session` is an opt-in live Claude Code test for daemonless persistent sessions. It loads the symlinked installed plugin, starts a real Codex session, sends a follow-up without `project_dir`, and verifies the session stays pinned to the original project directory.
 
-`test:claude-autodiscovery` is an opt-in live Claude Code test for automatic tool selection. It gives Claude a natural "ask Codex" request, loads the local plugin with the fake Codex binary, and verifies that Claude chooses the Codex MCP tool without being told the exact tool name.
+`test:claude-autodiscovery` is an opt-in live Claude Code test for automatic tool selection. It gives Claude a natural "ask Codex" request, loads the local plugin with the fake Codex binary, and verifies that Claude chooses the intuitive Codex MCP front door without being told the exact low-level tool name.
 
 Run Claude Code with the local plugin:
 
@@ -151,7 +151,15 @@ After startup, ask Claude to use Codex subagents, or invoke the plugin skill:
 
 `codex_usage_guide` returns the operating guide and example calls Claude can use when deciding how to delegate to Codex.
 
-`run_agent` launches one Codex `exec` process and waits for it. It uses the same bounded queue as async jobs.
+`codex_choose_tool` returns a concise decision guide for picking between one agent, parallel agents, persistent sessions, aggregation, and async jobs.
+
+`ask_codex` is the preferred front door for one Codex task. It launches one Codex `exec` process and waits for it.
+
+`ask_codex_parallel` is the preferred front door for multiple independent Codex tasks. It launches bounded parallel Codex `exec` processes and returns one structured result per task.
+
+`start_codex_session` and `continue_codex_session` are the preferred front doors for daemonless persistent Codex sessions.
+
+`run_agent` launches one Codex `exec` process and waits for it. It uses the same bounded queue as async jobs and remains available for lower-level/manual control.
 
 `run_agents` launches multiple Codex `exec` processes concurrently with a bounded `max_parallel` setting and the global queue.
 
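The front-door descriptions added in the hunk above can be read as a small decision procedure. The sketch below is a hypothetical illustration of that reading, not the plugin's implementation: `chooseCodexTool` and the `Workload` fields are invented here, the priority order between the branches is an assumption, and only the returned tool names come from the README. The real `codex_choose_tool` returns a guide rather than a single name.

```typescript
// Hypothetical description of a delegation request (not part of the plugin).
type Workload = {
  taskCount: number;        // number of independent Codex tasks
  multiTurn?: boolean;      // should Codex keep context across prompts?
  hasSession?: boolean;     // has a session already been started?
  needsConsensus?: boolean; // does the caller want one aggregated result?
  longRunning?: boolean;    // may the work outlast a blocking request?
};

// Maps a workload to one of the tool names listed in the README.
// The ordering of the checks is an assumption made for this sketch.
function chooseCodexTool(w: Workload): string {
  if (w.multiTurn) {
    return w.hasSession ? "continue_codex_session" : "start_codex_session";
  }
  if (w.longRunning) {
    return w.taskCount > 1 ? "start_agents_run" : "start_agent_run";
  }
  if (w.taskCount > 1) {
    return w.needsConsensus ? "run_agents_aggregate" : "ask_codex_parallel";
  }
  return "ask_codex";
}
```

Under these assumptions, a single blocking task resolves to `ask_codex`, and the lower-level `run_agent` and `run_agents` names are never chosen automatically, which matches their description as manual-control tools.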
@@ -163,7 +171,7 @@ After startup, ask Claude to use Codex subagents, or invoke the plugin skill:
 
 `get_agent_run`, `wait_agent_run`, and `cancel_agent_run` inspect, wait for, or cancel async jobs.
 
-`start_session`, `send_session_prompt`, `get_session`, `list_sessions`, and `cancel_session` manage daemonless persistent Codex sessions using Codex's own resumable thread ids.
+`start_session`, `send_session_prompt`, `get_session`, `list_sessions`, and `cancel_session` manage daemonless persistent Codex sessions using Codex's own resumable thread ids. They are compatibility aliases behind `start_codex_session` and `continue_codex_session`.
 
 `codex_status` reports the resolved Codex binary, server working directory, Claude project directory, default model, default reasoning effort, feature sets, and version probe.
 
@@ -175,7 +183,7 @@ Prefer `start_agent_run` or `start_agents_run` for work that may run longer than
 
 Async job snapshots expose partial stdout/stderr and parsed event summaries through `get_agent_run` while work is still running.
 
-When a client supports MCP progress tokens, `run_agent`, `run_agents`, `run_agents_aggregate`, `start_session`, `send_session_prompt`, `start_agent_run`, `start_agents_run`, `get_agent_run`, `wait_agent_run`, and `cancel_agent_run` send progress notifications. SDK clients should pass an `onprogress` handler and enable timeout reset on progress for long waits.
+When a client supports MCP progress tokens, `ask_codex`, `ask_codex_parallel`, `start_codex_session`, `continue_codex_session`, `run_agent`, `run_agents`, `run_agents_aggregate`, `start_session`, `send_session_prompt`, `start_agent_run`, `start_agents_run`, `get_agent_run`, `wait_agent_run`, and `cancel_agent_run` send progress notifications. SDK clients should pass an `onprogress` handler and enable timeout reset on progress for long waits.
 
 ## License
 
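The last README change tells SDK clients to pass an `onprogress` handler and enable timeout reset on progress. A minimal sketch of what such request options might look like follows. It deliberately avoids importing any SDK: `ProgressUpdate`, `LongWaitOptions`, `formatProgress`, and `longWaitOptions` are hypothetical local names, and the `resetTimeoutOnProgress` and `timeout` field names are assumptions about the client SDK's request options that should be checked against the installed version.

```typescript
// Local stand-in for a progress notification payload (assumed shape).
type ProgressUpdate = { progress: number; total?: number; message?: string };

// Local stand-in for the request options an SDK client might accept.
type LongWaitOptions = {
  onprogress: (update: ProgressUpdate) => void;
  resetTimeoutOnProgress: boolean; // assumed field name
  timeout: number;                 // milliseconds, assumed field name
};

// Renders one progress update as a short log line.
function formatProgress(update: ProgressUpdate): string {
  const base =
    update.total === undefined
      ? `${update.progress}`
      : `${update.progress}/${update.total}`;
  return update.message ? `${base} ${update.message}` : base;
}

// Builds options that log every update and keep the request alive
// for as long as progress keeps arriving.
function longWaitOptions(
  timeoutMs: number,
  log: (line: string) => void,
): LongWaitOptions {
  return {
    onprogress: (update: ProgressUpdate) => log(formatProgress(update)),
    resetTimeoutOnProgress: true,
    timeout: timeoutMs,
  };
}
```

An object like this would be handed to the client as the request options of a tool call such as `ask_codex` or `wait_agent_run`, so that a slow Codex run fails only when progress actually stops.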
dist/index.js

Lines changed: 324 additions & 13 deletions
Large diffs are not rendered by default.

skills/codex-subagents/SKILL.md

Lines changed: 14 additions & 11 deletions
@@ -16,24 +16,27 @@ Default behavior:
 - Lets the caller set model, reasoning effort, project directory, timeout, and parallelism per agent.
 - Supports `model_preset: "spark"` for Codex Spark (`gpt-5.3-codex-spark`) without requiring Claude to remember the exact model string.
 - Supports nested Codex subagents by passing `codex_subagents`, `subagent_tasks`, and `subagent_runtime`; custom agents are sent as Codex `agents.<name>...` config overrides for the child run.
-- Supports persistent Codex sessions with `start_session` and `send_session_prompt`; use these when the same Codex subagent should keep context across multiple prompts.
+- Supports persistent Codex sessions with `start_codex_session` and `continue_codex_session`; use these when the same Codex subagent should keep context across multiple prompts.
 - Supports structured results with `output_contract` or `output_schema`; use these when Claude must merge, compare, or aggregate Codex outputs.
 - Redacts secret-looking output by default and does not forward secret-looking environment variables unless `forward_sensitive_env` is explicitly true.
 - Writes very verbose JSONL logs to stderr by default, including raw MCP JSON-RPC frames, tool arguments/results, prompt outputs, progress notifications, queue/job/session lifecycle, and Codex stdin/stdout/stderr traffic.
 - Compacts large tool responses before returning them to Claude; when `mcpResponse.compacted` is true, use the returned summary first and inspect server logs only if the omitted raw tail is necessary.
 
-For one delegated task, call `run_agent`. Make the prompt self-contained: include the scope, the expected read-only behavior, and the output shape Claude needs. For code review and exploration, ask for concise findings with file paths and line references.
+Prefer the intuitive front-door tools for normal use:
 
-For independent tasks that can run concurrently, call `run_agents` with one agent object per task. Split by ownership such as API flow, tests, security, performance, UI, docs, or migration risk. Keep prompts concrete and bounded, and set `max_parallel` to the smaller of the useful agent count and `4` unless the user asks for more.
+- For one delegated task, call `ask_codex`. Make `task` self-contained: include the scope, expected read-only behavior, and output shape Claude needs. For code review and exploration, ask for concise findings with file paths and line references.
+- For independent tasks that can run concurrently, call `ask_codex_parallel` with one task object per workstream. Split by ownership such as API flow, tests, security, performance, UI, docs, or migration risk. Keep tasks concrete and bounded, and set `max_parallel` to the smaller of the useful agent count and `4` unless the user asks for more.
+- For multi-turn Codex work, call `start_codex_session` for the initial task and `continue_codex_session` for follow-ups. Session tools use Codex's recorded thread id and remain daemonless; the MCP server keeps only metadata and the last result.
+- If unsure which path fits, call `codex_choose_tool` before delegating.
 
-When Claude needs a concise consensus object from several agents, call `run_agents_aggregate` instead of `run_agents`. Prefer `output_contract: "review_findings"` for review-style aggregation.
+Use the lower-level compatibility tools only when they fit better: `run_agent`, `run_agents`, `start_session`, and `send_session_prompt` expose the same execution paths with more literal naming. When Claude needs a concise consensus object from several agents, call `run_agents_aggregate`. Prefer `output_contract: "review_findings"` for review-style aggregation.
 
 For slow, broad, or potentially flaky Codex work, prefer `start_agent_run` or `start_agents_run` instead of the blocking tools. Poll with `get_agent_run`, wait with `wait_agent_run`, and cancel with `cancel_agent_run` when the work is no longer needed. The async tools keep the MCP request responsive and use the same global Codex process queue.
 
-For multi-turn Codex work, call `start_session` for the initial prompt and `send_session_prompt` for follow-ups. Session tools use Codex's recorded thread id and remain daemonless; the MCP server keeps only metadata and the last result.
-
 When Claude wants Codex to work in the same repository or folder as the active Claude Code session, pass that folder as `project_dir`. Use `cwd` only as a compatibility alias.
 
+Do not use Bash, Read, or filesystem inspection to locate Codex. The MCP server resolves Codex automatically and prefers the Codex desktop app binary when it is installed.
+
 When the user explicitly asks Codex to edit files, write to git, use DNS/network, install packages, or otherwise run with normal non-sandbox Codex capabilities, set `dangerously_bypass_approvals_and_sandbox: true`. Keep it off for routine review or exploration.
 
 Prefer `reasoning_effort: "medium"` for exploration and `high` or `xhigh` only when the task is complex enough to justify the extra latency and token usage. Do not use `minimal`; the plugin rejects it because Codex currently auto-attaches `web_search`, which the API does not allow with minimal reasoning.
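The `max_parallel` guidance in the hunk above ("the smaller of the useful agent count and `4` unless the user asks for more") can be sketched as a small argument builder. This is an illustration under stated assumptions rather than the plugin's code: `buildParallelArgs`, `Workstream`, and `DEFAULT_PARALLEL_CAP` are hypothetical names, and placing `max_parallel` at the top level of the `ask_codex_parallel` arguments is an assumption. The `tasks`, `name`, `task`, and `project_dir` keys follow the parallel example in this diff.

```typescript
// One independent workstream, e.g. { name: "api", task: "Review ... read-only." }.
type Workstream = { name: string; task: string };

type ParallelTask = { name: string; task: string; project_dir: string };

// Default cap taken from the skill text; everything else here is hypothetical.
const DEFAULT_PARALLEL_CAP = 4;

function buildParallelArgs(
  workstreams: Workstream[],
  projectDir: string,
  userCap?: number, // pass only when the user explicitly asks for more
): { tasks: ParallelTask[]; max_parallel: number } {
  const tasks = workstreams.map((w) => ({
    name: w.name,
    task: w.task,
    project_dir: projectDir,
  }));
  const cap = userCap ?? DEFAULT_PARALLEL_CAP;
  // Smaller of the task count and the cap, never below 1.
  return { tasks, max_parallel: Math.max(1, Math.min(tasks.length, cap)) };
}
```

With two workstreams this yields `max_parallel: 2`; with six it yields `4` unless a larger `userCap` is supplied.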
@@ -48,13 +51,13 @@ Set `isolated_codex_home: true` when unrelated Codex MCP servers from the user's
 
 Use `mcp_config_policy: "explicit"` with `codex_mcp_servers` when the user intentionally wants to share MCP servers with Codex. Use `mcp_config_policy: "inherit_claude_project"` only when `project_dir` has a Claude project MCP config that should be imported.
 
-Use `codex_doctor` or `codex_status` only when diagnosing installation, binary resolution, defaults, or after a failed Codex tool call. Normal delegation should start with `run_agent`, `run_agents`, `run_agents_aggregate`, or a session tool.
+Use `codex_doctor` or `codex_status` only when diagnosing installation, binary resolution, defaults, or after a failed Codex tool call. Normal delegation should start with `ask_codex`, `ask_codex_parallel`, `run_agents_aggregate`, or a session tool.
 
 Example single-agent call:
 
 ```json
 {
-  "prompt": "Review the MCP server implementation read-only. Return the top risks with file paths and line references, then a brief summary.",
+  "task": "Review the MCP server implementation read-only. Return the top risks with file paths and line references, then a brief summary.",
   "project_dir": "/path/to/project",
   "model_preset": "spark",
   "reasoning_effort": "medium"
@@ -65,15 +68,15 @@ Example parallel call:
 
 ```json
 {
-  "agents": [
+  "tasks": [
     {
       "name": "api",
-      "prompt": "Review the MCP tool schemas and runtime options read-only. Return concrete risks with paths.",
+      "task": "Review the MCP tool schemas and runtime options read-only. Return concrete risks with paths.",
       "project_dir": "/path/to/project"
     },
     {
       "name": "tests",
-      "prompt": "Review the test coverage read-only. Identify missing scenarios with paths.",
+      "task": "Review the test coverage read-only. Identify missing scenarios with paths.",
       "project_dir": "/path/to/project"
     }
   ],
