Commit 308682a

Add Codex session and hardening tools
1 parent 77ae58a commit 308682a

15 files changed

Lines changed: 2062 additions & 78 deletions

README.md

Lines changed: 32 additions & 2 deletions
@@ -17,6 +17,8 @@ The plugin lets Claude Code launch one Codex agent or several Codex agents in pa
 - Codex home: uses the user's Codex home by default; pass `isolated_codex_home: true` to use a temporary Codex home with auth but without inherited `config.toml` MCP servers.
 - Concurrency: Codex processes run through a global queue. Defaults are `CODEX_SUBAGENTS_MAX_GLOBAL_PROCESSES=4` and `CODEX_SUBAGENTS_MAX_PROJECT_PROCESSES=2`.
 - Progress: long-running tools emit MCP `notifications/progress` events when the client supplies a progress token.
+- Security: secret-looking output is redacted before it is returned to Claude, and secret-looking environment variables are not forwarded to Codex unless `forward_sensitive_env` is explicitly true.
+- Sessions: `start_session` and `send_session_prompt` use Codex's recorded thread id so a Codex subagent can keep context across multiple prompts without a background daemon.
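
The environment-forwarding rule in the Security bullet can be pictured as a simple name filter. This is a minimal sketch, not the plugin's implementation: the `SENSITIVE_NAME` pattern and the `filterEnv` helper are assumptions made for illustration, since the diff does not show how the plugin decides that a variable is "secret-looking".

```typescript
// Hypothetical pattern: the plugin's real "secret-looking" matcher is not shown in this diff.
const SENSITIVE_NAME = /(SECRET|TOKEN|PASSWORD|API_?KEY|CREDENTIAL)/i;

type Env = Record<string, string | undefined>;

// Build the environment handed to a Codex child process.
// Sensitive-looking names are dropped unless the caller opts in explicitly.
function filterEnv(env: Env, forwardSensitiveEnv = false): Env {
  if (forwardSensitiveEnv) {
    return { ...env };
  }
  const safe: Env = {};
  for (const [name, value] of Object.entries(env)) {
    if (!SENSITIVE_NAME.test(name)) {
      safe[name] = value;
    }
  }
  return safe;
}
```

Defaulting the opt-in flag to `false` mirrors the README wording that forwarding only happens when `forward_sensitive_env` is explicitly true.
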

 Optional environment overrides:

@@ -43,6 +45,24 @@ To let a Codex agent spawn its own Codex subagents, pass:

 Custom subagents are passed to Codex as `agents.<name>...` config overrides and also materialized in a temporary Codex home for the duration of one run. The target project is not modified, and the default sandbox remains `read-only`.

+## Structured Output And MCP Config
+
+Pass `output_contract` when Claude needs machine-readable Codex results:
+
+- `review_findings`
+- `plan`
+- `risk_matrix`
+- `patch_suggestions`
+
+You can also pass `output_schema` with a custom JSON Schema. The plugin passes schemas to Codex through `--output-schema`, parses the final JSON message, and returns it as `structuredOutput`.
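
As an illustration of the `output_schema` option, the sketch below builds a small custom JSON Schema and a type guard for the parsed result. The schema fields (`summary`, `risks`), the `prompt` argument name, and the `isSummary` helper are assumptions for the example; only `output_schema` and `structuredOutput` come from the README text.

```typescript
// A custom JSON Schema describing the final JSON message Codex should produce.
const outputSchema = {
  type: "object",
  properties: {
    summary: { type: "string" },
    risks: { type: "array", items: { type: "string" } },
  },
  required: ["summary", "risks"],
  additionalProperties: false,
} as const;

// Hypothetical tool arguments: `prompt` is an assumed field name.
const toolArguments = {
  prompt: "Summarize the riskiest changes in this repository.",
  output_schema: outputSchema,
};

interface Summary {
  summary: string;
  risks: string[];
}

// Narrow an unknown `structuredOutput` value to the shape the schema describes.
function isSummary(value: unknown): value is Summary {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.summary === "string" &&
    Array.isArray(v.risks) &&
    v.risks.every((r) => typeof r === "string")
  );
}
```

Checking the returned value on the caller's side is a defensive choice: it keeps downstream code type-safe even if the final message does not match the schema.
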
+
+MCP sharing is explicit:
+
+- `mcp_config_policy: "inherit_codex"` uses the user's normal Codex config.
+- `mcp_config_policy: "isolated"` uses a temporary Codex home without inherited MCP servers.
+- `mcp_config_policy: "explicit"` uses only `codex_mcp_servers`.
+- `mcp_config_policy: "inherit_claude_project"` imports `.mcp.json` or `.claude/mcp.json` from `project_dir` when present.
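
The four policies listed above map naturally onto a string union. The sketch below is one possible way to model them on the caller's side; the type name, both helper functions, and the lookup order of the two candidate files are assumptions, not the plugin's code.

```typescript
// The four policy values named in the README.
type McpConfigPolicy =
  | "inherit_codex"
  | "isolated"
  | "explicit"
  | "inherit_claude_project";

// Summarize each policy; the exhaustive switch makes the compiler flag a missing case.
function describePolicy(policy: McpConfigPolicy): string {
  switch (policy) {
    case "inherit_codex":
      return "use the user's normal Codex config";
    case "isolated":
      return "use a temporary Codex home without inherited MCP servers";
    case "explicit":
      return "use only codex_mcp_servers";
    case "inherit_claude_project":
      return "import .mcp.json or .claude/mcp.json from project_dir";
  }
}

// Candidate config files for "inherit_claude_project".
// Assumption: the order in which the plugin checks these files is not stated.
function claudeProjectMcpCandidates(projectDir: string): string[] {
  const base = projectDir.replace(/\/+$/, ""); // drop trailing slashes
  return [`${base}/.mcp.json`, `${base}/.claude/mcp.json`];
}
```
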
+
 ## Installation

 ```sh
@@ -68,12 +88,14 @@ npm run test:claude-desktop

 `test:ci` is the GitHub-safe suite. It uses the fake Codex binary and does not require Claude Code, the Codex desktop app, or live model credentials.

-`test:comprehensive` runs the TypeScript build, unit tests, stdio MCP smoke test, reliability matrix, MCP stress test, MCP progress notification test, Codex desktop runtime probe, Claude plugin validation, and desktop-shipped Claude Code CLI plugin/auth checks. The runtime probe validates local Codex capabilities without invoking a model.
+`test:comprehensive` runs the TypeScript build, unit tests, stdio MCP smoke test, reliability matrix, MCP stress test, MCP progress notification test, advanced MCP behavior test, Codex desktop runtime probe, Claude plugin validation, and desktop-shipped Claude Code CLI plugin/auth checks. The runtime probe validates local Codex capabilities without invoking a model.

 `test:stress` uses the fake Codex binary to exercise queued async jobs, noisy output, malformed JSONL, and truncation behavior.

 `test:progress` verifies that SDK clients receive monotonically increasing MCP progress notifications from blocking, async start, parallel, and wait-style tool calls.
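
The property that `test:progress` checks can be stated in a few lines. The sketch below is not the project's test code; `ProgressEvent` and `isStrictlyIncreasing` are illustrative names, and the event shape follows the `progress` / `total` fields of MCP progress notifications.

```typescript
// Minimal shape of the fields carried by an MCP progress notification.
interface ProgressEvent {
  progress: number;
  total?: number;
}

// True when every event reports a larger `progress` value than the one before it.
function isStrictlyIncreasing(events: ProgressEvent[]): boolean {
  return events.every(
    (event, i) => i === 0 || event.progress > events[i - 1].progress,
  );
}
```

A test harness could collect the events passed to its progress handler during one tool call and assert this predicate once the call resolves.
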

+`test:advanced` verifies structured output, secret redaction, safe env forwarding, partial job snapshots, persistent sessions, result aggregation, doctor diagnostics, and explicit MCP config materialization through the stdio MCP server.
+
 `test:claude-orchestration` is an opt-in live Claude Code test. It loads the plugin inside Claude Code, lets Claude call the plugin MCP tools, and uses the fake Codex binary so no Codex model tokens are spent. It is kept out of `test:comprehensive` because it does spend Claude tokens.

 `test:claude-real-codex` is the full opt-in live path: Claude Code loads the plugin and calls real Codex through the desktop app binary, including one single agent, one parallel run, and one nested Spark subagent run. It spends both Claude and Codex tokens, so it is intentionally not part of the default suite.
@@ -100,18 +122,26 @@ After startup, ask Claude to use Codex subagents, or invoke the plugin skill:

 `run_agents` launches multiple Codex `exec` processes concurrently with a bounded `max_parallel` setting and the global queue.
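
Bounded parallelism of the kind `max_parallel` describes is commonly implemented as a fixed number of worker lanes pulling from a shared list. The sketch below shows that general technique under stated assumptions: `runBounded` is an illustrative helper, not the plugin's scheduler, and it does not model the separate global queue.

```typescript
// Run `worker` over `items` with at most `maxParallel` calls in flight.
// Results are returned in input order regardless of completion order.
async function runBounded<T, R>(
  items: readonly T[],
  maxParallel: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each lane claims one item at a time until none remain.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(maxParallel, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Because JavaScript runs each lane on a single thread, claiming an index with `next++` needs no lock. If a worker rejects, `Promise.all` rejects as well, so a real scheduler would likely catch per-item errors instead.
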

+`run_agents_aggregate` launches multiple agents and returns both raw agent results and a deterministic aggregation object.
+
 `start_agent_run` starts one queued Codex run and returns a `job.id` immediately.

 `start_agents_run` starts a queued parallel Codex run and returns a `job.id` immediately.

 `get_agent_run`, `wait_agent_run`, and `cancel_agent_run` inspect, wait for, or cancel async jobs.

-`codex_status` reports the resolved Codex binary, server working directory, Claude project directory, default model, default reasoning effort, and version probe.
+`start_session`, `send_session_prompt`, `get_session`, `list_sessions`, and `cancel_session` manage daemonless persistent Codex sessions using Codex's own resumable thread ids.
+
+`codex_status` reports the resolved Codex binary, server working directory, Claude project directory, default model, default reasoning effort, feature sets, and version probe.
+
+`codex_doctor` runs installation and safety diagnostics without invoking a model.

 Each agent accepts model, reasoning effort, sandbox, project directory, timeout, isolated Codex home, and output-size controls. Pass `project_dir` when Claude Code wants Codex to inspect the same repository or subdirectory Claude is currently working in. If `project_dir` is omitted, the server uses `CLAUDE_PROJECT_DIR` when Claude Code provides it. Omit model to use Codex's configured default or the plugin's optional configured default model.

 Prefer `start_agent_run` or `start_agents_run` for work that may run longer than a normal MCP request. The async job API keeps Claude responsive, supports cancellation, and avoids request failures caused by long-running Codex subprocesses.

+Async job snapshots expose partial stdout/stderr and parsed event summaries through `get_agent_run` while work is still running.
+
 When a client supports MCP progress tokens, `run_agent`, `run_agents`, `start_agent_run`, `start_agents_run`, `get_agent_run`, `wait_agent_run`, and `cancel_agent_run` send progress notifications. SDK clients should pass an `onprogress` handler and enable timeout reset on progress for long waits.
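
A client-side sketch of the advice above, passing an `onprogress` handler and resetting the timeout on progress. The local `ToolClient` and `RequestOptions` interfaces are stand-ins that mirror the option names used by the MCP TypeScript SDK, declared here so the example is self-contained. The `job_id` argument name, the `waitForJob` helper, and the 60-second timeout are assumptions; the diff does not show the tool's input schema.

```typescript
// Local stand-ins for the SDK types (assumed shapes, declared for self-containment).
interface Progress {
  progress: number;
  total?: number;
}
interface RequestOptions {
  onprogress?: (progress: Progress) => void;
  timeout?: number; // milliseconds
  resetTimeoutOnProgress?: boolean;
}
interface ToolClient {
  callTool(
    params: { name: string; arguments?: Record<string, unknown> },
    resultSchema?: undefined,
    options?: RequestOptions,
  ): Promise<unknown>;
}

// Wait for an async job, logging progress and keeping the request alive while it reports.
async function waitForJob(
  client: ToolClient,
  jobId: string,
  log: (line: string) => void,
): Promise<unknown> {
  return client.callTool(
    // `job_id` is a hypothetical argument name.
    { name: "wait_agent_run", arguments: { job_id: jobId } },
    undefined,
    {
      timeout: 60000,
      resetTimeoutOnProgress: true,
      onprogress: (p) => {
        const total = p.total !== undefined ? `/${p.total}` : "";
        log(`progress ${p.progress}${total}`);
      },
    },
  );
}
```

With timeout reset enabled, the timeout bounds the gap between progress events rather than the whole wait, which suits long-running Codex subprocesses.
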

 ## License
