Commit 41ddc49

xuiocodex
andcommitted
Refactor Claude-facing Codex tool surface
Expose native-style task and session tools by default, hide legacy compatibility tools behind CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS, and update docs/tests for the new Claude discovery path. Co-Authored-By: OpenAI Codex <noreply@openai.com>
1 parent 8490cfa commit 41ddc49

26 files changed

Lines changed: 2012 additions & 673 deletions

README.md

Lines changed: 14 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -14,7 +14,7 @@ full-access Codex work when the user asks for it.
1414

1515
## Why Use It?
1616

17-
- **Native Claude Code workflow:** Claude gets MCP tools and a plugin skill, so it can decide when to ask Codex without shell glue.
17+
- **Native Claude Code workflow:** Claude gets a small Task-like MCP surface: `codex_task`, `codex_task_group`, and `codex_session_*`.
1818
- **Read-only by default:** Codex starts with `--sandbox read-only` and non-interactive approvals.
1919
- **No daemon:** Claude launches the MCP server over stdio for the active session.
2020
- **Fast parallel review:** Claude can launch several independent Codex agents with bounded concurrency.
@@ -62,15 +62,15 @@ after `dist/index.js` is rebuilt.
6262
Ask Codex for a second opinion on the session recovery code. Keep it read-only and return concrete findings with file paths.
6363
```
6464

65-
Claude should use the `ask_codex` front door.
65+
Claude should use `codex_task`.
6666

6767
### Run Parallel Codex Agents
6868

6969
```text
7070
Launch three Codex subagents in parallel: one for API behavior, one for tests, and one for security. Keep all of them read-only.
7171
```
7272

73-
Claude should use `ask_codex_parallel` and split the work into independent tasks.
73+
Claude should use `codex_task_group` and split the work into independent tasks.
7474

7575
### Use Spark
7676

@@ -86,8 +86,8 @@ Claude can pass `model_preset: "spark"` instead of remembering the exact Spark m
8686
Start a long-running Codex session on this repo, then let me send follow-up prompts into the same context.
8787
```
8888

89-
Claude should use `start_codex_session_async`, `send_codex_session_prompt`,
90-
`steer_codex_session`, `get_codex_session`, and `wait_codex_session`.
89+
Claude should use `codex_session_start`, `codex_session_prompt`,
90+
`codex_session_steer`, `codex_session_status`, and `codex_session_wait`.
9191

9292
## Safety Model
9393

@@ -125,16 +125,17 @@ use DNS/network, install packages, or behave like a normal unrestricted Codex ru
125125

126126
| Use case | Preferred tools |
127127
| --- | --- |
128-
| One read-only Codex task | `ask_codex` |
129-
| Several independent tasks | `ask_codex_parallel` |
130-
| Aggregated parallel review | `run_agents_aggregate` |
131-
| Persistent context | `start_codex_session`, `continue_codex_session` |
132-
| Long-running sessions | `start_codex_session_async`, `send_codex_session_prompt`, `steer_codex_session`, `wait_codex_session` |
133-
| Async one-shot jobs | `start_agent_run`, `get_agent_run`, `wait_agent_run`, `cancel_agent_run` |
128+
| One read-only Codex task | `codex_task` |
129+
| Several independent tasks | `codex_task_group` |
130+
| Persistent context | `codex_session_start`, `codex_session_prompt` |
131+
| Long-running sessions | `codex_session_start`, `codex_session_status`, `codex_session_wait`, `codex_session_steer` |
132+
| Session recovery | `codex_sessions`, `codex_session_recover`, `codex_session_cancel` |
134133
| Diagnostics | `codex_status`, `codex_doctor`, `codex_export_debug_bundle` |
135134

136-
Compatibility tools such as `run_agent`, `run_agents`, `start_session`, and
137-
`send_session_prompt` remain available for lower-level control.
135+
Legacy tools such as `ask_codex`, `run_agent`, `run_agents`, `start_session`, and
136+
`send_session_prompt` are hidden by default. Set
137+
`CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS=1` only for older clients that still call the
138+
pre-refactor names.
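
The gating described in the added README lines can be pictured with a minimal TypeScript sketch. It is an assumption about shape, not the code in `dist/index.js`: the `visibleTools` helper, the two arrays, and the rule that only the literal value `1` enables the legacy surface are hypothetical. Only the tool names and the `CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS` variable come from the README.

```typescript
// Hypothetical sketch of env-gated tool discovery; not the real server code.
type Env = Record<string, string | undefined>;

// Tool names are taken from the README; grouping them in arrays is illustrative.
const NATIVE_TOOLS: readonly string[] = [
  "codex_task",
  "codex_task_group",
  "codex_session_start",
  "codex_session_prompt",
];
const LEGACY_TOOLS: readonly string[] = [
  "ask_codex",
  "run_agent",
  "run_agents",
  "start_session",
  "send_session_prompt",
];

// Assumption: only the exact value "1" turns the legacy surface on.
function visibleTools(env: Env): string[] {
  const legacyEnabled = env["CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS"] === "1";
  return legacyEnabled ? NATIVE_TOOLS.concat(LEGACY_TOOLS) : NATIVE_TOOLS.slice();
}
```

Taking the environment as a parameter rather than reading it globally keeps the helper easy to test with both settings.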
138139

139140
## Development
140141

dist/index.js

Lines changed: 743 additions & 124 deletions
Large diffs are not rendered by default.

docs/ARCHITECTURE.md

Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ metadata needed to reattach to a Codex thread; prompt text and environment value
6262
are not persisted.
6363

6464
After an MCP runtime shutdown, app-server sessions with a Codex thread id are
65-
preserved as recoverable. `recover_codex_session` reattaches with `thread/resume`
65+
preserved as recoverable. `codex_session_recover` reattaches with `thread/resume`
6666
and treats `thread/read` as an optional capability.
6767

6868
Async one-shot jobs are process-local and do not survive MCP restarts. Their tool

docs/KNOWN_LIMITATIONS.md

Lines changed: 7 additions & 8 deletions
@@ -22,18 +22,18 @@ Disable logging entirely:
2222
export CODEX_SUBAGENTS_LOG_LEVEL=silent
2323
```
2424

25-
## Async Jobs Are Not Durable
25+
## Legacy Async Jobs Are Not Durable
2626

27-
`start_agent_run` and `start_agents_run` are process-local async jobs. They keep
28-
Claude responsive during long one-shot work, but they do not survive MCP process
29-
restart.
27+
Legacy async one-shot jobs are process-local. They keep Claude responsive during
28+
long one-shot work, but they do not survive MCP process restart. These legacy
29+
tools are hidden unless `CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS=1` is set.
3030

31-
Use `start_codex_session_async` for long-running work that should be recoverable
32-
after Claude Code or the MCP server restarts.
31+
Use `codex_session_start` for long-running work that should be recoverable after
32+
Claude Code or the MCP server restarts.
3333

3434
## Real Steering Requires App-Server
3535

36-
`steer_codex_session` delivers live steering only when the session is running
36+
`codex_session_steer` delivers live steering only when the session is running
3737
through Codex app-server and reports `supportsRealSteering: true`.
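
A caller can branch on the reported capability before relying on live steering. The sketch below is hypothetical: the `SessionInfo` shape and the `steeringMode` helper are not defined in this diff. Only the `supportsRealSteering` field, and the documented fallback in which steering becomes a high-priority queued turn, come from the docs.

```typescript
// Hypothetical sketch: classify how a steering prompt would be delivered.
interface SessionInfo {
  // Field name taken from the docs; the surrounding shape is assumed.
  supportsRealSteering: boolean;
}

type SteeringMode = "live" | "queued";

// "live" only when the session reports real steering support; otherwise the
// docs describe steering as a high-priority queued turn.
function steeringMode(session: SessionInfo): SteeringMode {
  return session.supportsRealSteering ? "live" : "queued";
}
```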
3838

3939
If app-server is unavailable and the session falls back to `codex exec`, steering
@@ -69,4 +69,3 @@ still make better choices when the user names the intended shape:
6969
- "run three Codex agents in parallel"
7070
- "start a long-running Codex session"
7171
- "steer the running Codex session"
72-

docs/TROUBLESHOOTING.md

Lines changed: 5 additions & 7 deletions
@@ -56,19 +56,17 @@ the low-level tool names.
5656

5757
Prefer session or async tools for long work:
5858

59-
- `start_codex_session_async`
60-
- `get_codex_session`
61-
- `wait_codex_session`
62-
- `start_agent_run`
63-
- `get_agent_run`
64-
- `wait_agent_run`
59+
- `codex_session_start`
60+
- `codex_session_status`
61+
- `codex_session_wait`
62+
- `codex_session_steer`
6563

6664
Persistent sessions are the better choice when the work must survive an MCP
6765
restart. Async one-shot jobs are process-local and do not survive restarts.
6866

6967
## Session Recovery Fails
7068

71-
Use `get_codex_session`, then `recover_codex_session`.
69+
Use `codex_session_status`, then `codex_session_recover`.
7270

7371
Check:
7472

docs/USAGE.md

Lines changed: 34 additions & 45 deletions
@@ -24,30 +24,17 @@ subdirectory that Claude is working in. If omitted, the server uses
2424

2525
Prefer these tools in normal Claude usage:
2626

27-
- `ask_codex` - one blocking Codex task.
28-
- `ask_codex_parallel` - several independent blocking Codex tasks.
29-
- `run_agents_aggregate` - parallel tasks plus deterministic aggregation.
30-
- `start_codex_session` - create a persistent session and wait for the first turn.
31-
- `continue_codex_session` - send another prompt into an existing session.
32-
- `start_codex_session_async` - start a persistent session and return immediately.
33-
- `send_codex_session_prompt` - queue a normal follow-up prompt.
34-
- `steer_codex_session` - steer the active app-server turn when supported.
35-
- `get_codex_session` and `wait_codex_session` - inspect or wait on sessions.
36-
37-
Lower-level compatibility tools remain available:
38-
39-
- `run_agent`
40-
- `run_agents`
41-
- `start_agent_run`
42-
- `start_agents_run`
43-
- `get_agent_run`
44-
- `wait_agent_run`
45-
- `cancel_agent_run`
46-
- `start_session`
47-
- `send_session_prompt`
48-
- `get_session`
49-
- `list_sessions`
50-
- `cancel_session`
27+
- `codex_task` - one Task-like Codex subagent with an answer-first result.
28+
- `codex_task_group` - several independent Task-like Codex subagents in parallel.
29+
- `codex_session_start` - start a persistent session and return a session id.
30+
- `codex_session_prompt` - send another prompt into an existing session.
31+
- `codex_session_steer` - steer the active app-server turn when supported.
32+
- `codex_session_status` and `codex_session_wait` - inspect or wait on sessions.
33+
- `codex_sessions`, `codex_session_recover`, and `codex_session_cancel` - manage session lifecycle.
34+
35+
Legacy compatibility tools are hidden by default. Set
36+
`CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS=1` only for older clients that still call
37+
pre-refactor names such as `ask_codex`, `run_agent`, or `start_session`.
5138

5239
Diagnostics tools:
5340

@@ -63,15 +50,13 @@ Use this decision path when writing prompts or debugging Claude tool choice:
6350

6451
| User intent | Best tool |
6552
| --- | --- |
66-
| One normal read-only second opinion | `ask_codex` |
67-
| Two or more independent workstreams | `ask_codex_parallel` |
68-
| Several agents plus a merged summary | `run_agents_aggregate` |
69-
| Same Codex agent should keep context | `start_codex_session`, then `continue_codex_session` |
70-
| Long first turn, user wants to keep working | `start_codex_session_async` |
71-
| Add a normal follow-up to a running session | `send_codex_session_prompt` |
72-
| Redirect the active app-server turn | `steer_codex_session` |
73-
| Recover a session after Claude/MCP restart | `recover_codex_session` |
74-
| Slow one-shot job that need not be durable | `start_agent_run` |
53+
| One normal read-only second opinion | `codex_task` |
54+
| Two or more independent workstreams | `codex_task_group` |
55+
| Same Codex agent should keep context | `codex_session_start`, then `codex_session_prompt` |
56+
| Long first turn, user wants to keep working | `codex_session_start` |
57+
| Add a normal follow-up to a running session | `codex_session_prompt` |
58+
| Redirect the active app-server turn | `codex_session_steer` |
59+
| Recover a session after Claude/MCP restart | `codex_session_recover` |
7560

7661
When in doubt, ask Claude to call `codex_choose_tool` before delegating.
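
For debugging tool choice, the decision table above can be restated as a small lookup. This TypeScript sketch is only a restatement of the table: the `Intent` labels and the `bestTools` helper are made up for illustration and say nothing about how `codex_choose_tool` actually decides.

```typescript
// Hypothetical intent labels; only the tool names come from the table.
type Intent =
  | "second_opinion"
  | "parallel_workstreams"
  | "keep_context"
  | "long_first_turn"
  | "follow_up"
  | "redirect_active_turn"
  | "recover_after_restart";

// One entry per table row, in call order where a row lists two tools.
const BEST_TOOLS: Record<Intent, string[]> = {
  second_opinion: ["codex_task"],
  parallel_workstreams: ["codex_task_group"],
  keep_context: ["codex_session_start", "codex_session_prompt"],
  long_first_turn: ["codex_session_start"],
  follow_up: ["codex_session_prompt"],
  redirect_active_turn: ["codex_session_steer"],
  recover_after_restart: ["codex_session_recover"],
};

function bestTools(intent: Intent): string[] {
  return BEST_TOOLS[intent];
}
```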
7762

@@ -87,9 +72,9 @@ Representative tool arguments:
8772

8873
```json
8974
{
90-
"task": "Review the MCP server read-only. Return the top reliability risks with file paths and line references.",
75+
"description": "Review MCP server reliability",
76+
"prompt": "Review the MCP server read-only. Return the top reliability risks with file paths and line references.",
9177
"project_dir": "/path/to/project",
92-
"model_preset": "spark",
9378
"reasoning_effort": "medium"
9479
}
9580
```
@@ -109,22 +94,24 @@ Representative tool arguments:
10994
"tasks": [
11095
{
11196
"name": "api",
112-
"task": "Review MCP tool schemas and runtime behavior read-only. Return concrete risks with paths.",
97+
"description": "Review API behavior",
98+
"prompt": "Review MCP tool schemas and runtime behavior read-only. Return concrete risks with paths.",
11399
"project_dir": "/path/to/project"
114100
},
115101
{
116102
"name": "tests",
117-
"task": "Review test coverage read-only. Identify missing scenarios with paths.",
103+
"description": "Review tests",
104+
"prompt": "Review test coverage read-only. Identify missing scenarios with paths.",
118105
"project_dir": "/path/to/project"
119106
},
120107
{
121108
"name": "security",
122-
"task": "Review sandboxing, env forwarding, and logging read-only. Return concrete risks with paths.",
109+
"description": "Review security posture",
110+
"prompt": "Review sandboxing, env forwarding, and logging read-only. Return concrete risks with paths.",
123111
"project_dir": "/path/to/project"
124112
}
125113
],
126114
"max_parallel": 3,
127-
"model_preset": "spark",
128115
"reasoning_effort": "medium"
129116
}
130117
```
@@ -135,14 +122,14 @@ Use a persistent session when Codex should keep context across prompts.
135122

136123
```json
137124
{
138-
"task": "Investigate the session manager read-only. Keep a compact working map of the code.",
125+
"description": "Investigate session manager",
126+
"prompt": "Investigate the session manager read-only. Keep a compact working map of the code.",
139127
"project_dir": "/path/to/project",
140-
"model_preset": "spark",
141128
"reasoning_effort": "medium"
142129
}
143130
```
144131

145-
For a long-running first turn, use `start_codex_session_async`. Then:
132+
`codex_session_start` returns a session id immediately by default. Then:
146133

147134
```json
148135
{
@@ -156,7 +143,7 @@ To steer an active app-server turn:
156143
```json
157144
{
158145
"session_id": "session-...",
159-
"steering_prompt": "Prioritize app-server recovery and ignore UI/documentation polish."
146+
"prompt": "Prioritize app-server recovery and ignore UI/documentation polish."
160147
}
161148
```
162149

@@ -165,8 +152,10 @@ protocol and steering becomes a high-priority queued turn.
165152

166153
## Spark And Reasoning
167154

168-
Use `model_preset: "spark"` for fast, focused Codex work. Exact `model` still
169-
wins when both `model` and `model_preset` are provided.
155+
Do not use `model_preset: "spark"` by default. Use Spark only when the user asks
156+
for Spark or when a quick focused sidecar check is clearly more appropriate than
157+
the default Codex model. Exact `model` still wins when both `model` and
158+
`model_preset` are provided.
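
The precedence rule can be sketched as a small resolver. This is an illustration under assumptions, not the server's code: `resolveModel` and the `PRESETS` table are hypothetical, and `<spark-model-id>` is a placeholder because the exact Spark model name does not appear in this diff.

```typescript
// Hypothetical sketch of model selection precedence.
interface ModelArgs {
  model?: string;
  model_preset?: string;
}

// Placeholder mapping: the real Spark model id is not given in this diff.
const PRESETS: Record<string, string> = { spark: "<spark-model-id>" };

function resolveModel(args: ModelArgs): string | undefined {
  // An exact model always wins over a preset.
  if (args.model !== undefined) return args.model;
  // A preset is used only when the caller asked for one explicitly.
  if (args.model_preset !== undefined) return PRESETS[args.model_preset];
  // Nothing requested: undefined stands for "use the default Codex model".
  return undefined;
}
```

Returning `undefined` when neither field is set mirrors the guidance that Spark is opt-in rather than the default.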
170159

171160
Recommended reasoning:
172161

docs/assets/demo.svg

Lines changed: 1 addition & 1 deletion

docs/wiki/Known-Limitations.md

Lines changed: 2 additions & 3 deletions
@@ -11,8 +11,8 @@ normal work, or `CODEX_SUBAGENTS_LOG_LEVEL=silent` to disable logging.
1111

1212
## Async Jobs Are Not Durable
1313

14-
`start_agent_run` and `start_agents_run` are process-local. Use
15-
`start_codex_session_async` when the work should be recoverable after restart.
14+
Legacy async one-shot jobs are process-local and hidden by default. Use
15+
`codex_session_start` when the work should be recoverable after restart.
1616

1717
## Steering Requires App-Server
1818

@@ -23,4 +23,3 @@ exec protocol, steering becomes a high-priority queued turn.
2323

2424
Full local access requires `dangerously_bypass_approvals_and_sandbox: true`. It
2525
can write files, mutate git state, use network/DNS, and install packages.
26-

docs/wiki/Tool-Guide.md

Lines changed: 5 additions & 6 deletions
@@ -4,12 +4,11 @@ Use the intuitive front-door tools first.
44

55
| Task | Tool |
66
| --- | --- |
7-
| One Codex task | `ask_codex` |
8-
| Several independent tasks | `ask_codex_parallel` |
9-
| Parallel review with merged output | `run_agents_aggregate` |
10-
| Persistent session | `start_codex_session`, `continue_codex_session` |
11-
| Long-running session | `start_codex_session_async`, `send_codex_session_prompt`, `steer_codex_session`, `wait_codex_session` |
12-
| Async one-shot job | `start_agent_run`, `get_agent_run`, `wait_agent_run`, `cancel_agent_run` |
7+
| One Codex task | `codex_task` |
8+
| Several independent tasks | `codex_task_group` |
9+
| Persistent session | `codex_session_start`, `codex_session_prompt` |
10+
| Long-running session | `codex_session_start`, `codex_session_status`, `codex_session_wait`, `codex_session_steer` |
11+
| Session lifecycle | `codex_sessions`, `codex_session_recover`, `codex_session_cancel` |
1312
| Diagnostics | `codex_status`, `codex_doctor`, `codex_export_debug_bundle` |
1413

1514
## One Agent

docs/wiki/Troubleshooting.md

Lines changed: 4 additions & 6 deletions
@@ -36,12 +36,10 @@ Ask Codex to review this repository read-only.
3636

3737
Use persistent or async tools instead of one blocking request:
3838

39-
- `start_codex_session_async`
40-
- `get_codex_session`
41-
- `wait_codex_session`
42-
- `start_agent_run`
43-
- `get_agent_run`
44-
- `wait_agent_run`
39+
- `codex_session_start`
40+
- `codex_session_status`
41+
- `codex_session_wait`
42+
- `codex_session_steer`
4543

4644
Persistent sessions are the right path for work that should be recoverable after
4745
an MCP restart.
