xuio
diff --git a/‎README.md‎
Lines changed: 39 additions & 8 deletions b/‎README.md‎
Lines changed: 39 additions & 8 deletions
diff --git a/‎dist/index.js‎
Lines changed: 30 additions & 12 deletions b/‎dist/index.js‎
Lines changed: 30 additions & 12 deletions
diff --git a/‎docs/USAGE.md‎
Lines changed: 43 additions & 3 deletions b/‎docs/USAGE.md‎
Lines changed: 43 additions & 3 deletions
@@ -6,15 +6,19 @@
 
 Run OpenAI Codex agents from Claude Code through a daemonless MCP plugin.
 
-`claude-code-codex-subagents` lets Claude Code ask Codex for read-only reviews,
-parallel investigations, Spark checks, native follow-ups, live steering, and
-explicit full-access Codex work when the user asks for it.
+`claude-code-codex-subagents` lets Claude Code ask Codex, an independent
+frontier model like Claude, for read-only reviews, parallel investigations, Spark
+checks, native follow-ups, live steering, and explicit full-access Codex work
+when the user asks for it. It is especially useful when Claude needs a more
+technical subagent for deep codebase work, server/deployment tasks, difficult
+debugging, or adversarial review.
 
 ![Terminal demo of Claude launching parallel Codex reviewers](docs/assets/demo.svg)
 
 ## Why Use It?
 
 - **Native Claude Code workflow:** Claude gets a small Task-like MCP surface: `codex_task`, `codex_task_group`, `codex_followup`, and `codex_wait_any`.
+- **Frontier-model second opinion:** Codex gives Claude an independent technical reviewer for adversarial checks and hard engineering work.
 - **Read-only by default:** Codex starts with `--sandbox read-only` and non-interactive approvals.
 - **No daemon:** Claude launches the MCP server over stdio for the active session.
 - **Fast parallel review:** Claude can launch several independent Codex agents with bounded concurrency.
@@ -30,17 +34,38 @@ Requirements:
 - Claude Code
 - Codex CLI, preferably the Codex desktop app
 
+Install the plugin once:
+
 ```sh
 git clone https://github.com/xuio/claude-code-codex-subagents.git
 cd claude-code-codex-subagents
 npm run install:local
-claude --plugin-dir .
 ```
 
-Then ask Claude something like:
+Then, in any project where Claude Code should use Codex subagents, add the MCP
+server:
+
+```sh
+mkdir -p .claude
+cat > .claude/mcp.json <<'JSON'
+{
+  "mcpServers": {
+    "codex-subagents": { "command": "codex-subagents-mcp" }
+  }
+}
+JSON
+```
+
+Then ask Claude:
 
 ```text
-Use Codex to review this repository read-only. Focus on reliability risks and missing tests.
+Use Codex to review this PR for correctness, security, and deployment risks.
+```
+
+For plugin development, run Claude directly against this checkout:
+
+```sh
+claude --plugin-dir .
 ```
 
 For local development against Claude's installed plugin cache:
@@ -59,15 +84,15 @@ after `dist/index.js` is rebuilt.
 ### Ask One Codex Agent
 
 ```text
-Ask Codex for a second opinion on the session recovery code. Keep it read-only and return concrete findings with file paths.
+Ask Codex for an adversarial second opinion on the session recovery code. Keep it read-only and return concrete findings with file paths.
 ```
 
 Claude should use `codex_task`.
 
 ### Run Parallel Codex Agents
 
 ```text
-Launch three Codex subagents in parallel: one for API behavior, one for tests, and one for security. Keep all of them read-only.
+Launch three Codex subagents in parallel: one for API behavior, one for tests, and one adversarial security reviewer. Keep all of them read-only.
 ```
 
 Claude can call `codex_task` three times in parallel, or use `codex_task_group`
@@ -146,6 +171,12 @@ writes and DNS/network remain disabled unless `full_access: true` is set.
 | Session progress | MCP resource `codex://sessions/{session_id}` |
 | Diagnostics | MCP resources `codex://status`, `codex://doctor`, `codex://usage` |
 
+Prefer Codex over native Task when the user wants an independent frontier-model
+second opinion, deep technical review, complex codebase exploration, server or
+deployment investigation, or adversarial validation. Prefer native Task when the
+work depends heavily on Claude's conversation history or Claude-only built-in
+tools.
+
 Legacy tools such as `ask_codex`, `run_agent`, `run_agents`, `start_session`, and
 `send_session_prompt` are hidden by default. Set
 `CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS=1` only for older clients that still call the
 
@@ -26350,18 +26350,28 @@ var sessionManager = new CodexSessionManager({ persist: true });
 var usageGuide = [
   "Claude Code integration guide for codex-subagents:",
   "",
-  "Use Codex subagents like Claude's native Task tool when the user needs an independent OpenAI Codex worker. Use this MCP server whenever the user asks Claude to use Codex, OpenAI Codex, Codex subagents, Codex Spark, a Codex second opinion, parallel Codex review, or independent Codex codebase analysis. You do not need the user to name an MCP tool.",
+  "Use Codex subagents like Claude's native Task tool when the user needs an independent OpenAI Codex worker. Codex is a frontier model, like Claude, and is especially useful as a more technical, reliability-focused subagent for deep codebase work, server/deployment tasks, complex debugging, and adversarial review. Use this MCP server whenever the user asks Claude to use Codex, OpenAI Codex, Codex subagents, Codex Spark, a Codex second opinion, parallel Codex review, independent Codex codebase analysis, deep technical review, or adversarial validation. You do not need the user to name an MCP tool.",
   "",
   "Tool choice:",
-  "- Use codex_task for one delegated Codex task. It is the native Claude-like front door: description plus prompt, read-only by default, and answer-first result. Call codex_task multiple times in one assistant turn when independent investigations can run in parallel.",
-  "- Use codex_task_group when the work can be split into independent concurrent tasks and Claude wants one combined response with rolled-up per-task findings.",
-  "- Use codex_followup when Claude already has a session_id from codex_task or codex_task_group and wants to continue, steer, wait on, or cancel that same Codex context.",
+  "- Use codex_task when you want one independent Codex subagent: a frontier-model second opinion, deep technical implementation/review, complex codebase exploration, server/deployment work, or adversarial analysis. It is the Codex equivalent of native Task: description plus prompt, read-only by default, and answer-first result.",
+  "- Use multiple codex_task calls in parallel when investigations are independent and Claude can synthesize the answers itself.",
+  "- Use codex_task_group when the work can be split into independent concurrent Codex tasks and Claude wants one combined response with rolled-up per-task findings.",
+  "- Use codex_followup when Claude already has a session_id from codex_task or codex_task_group and wants to continue, steer, wait on, or cancel that same Codex context. This is a Codex-specific multi-turn extension; wait/cancel correspond to native TaskOutput/TaskStop-style operations.",
   "- Set codex_task background true for long-running work so Claude gets a session_id immediately, then use codex_wait_any for several sessions or codex_followup mode wait, steer, or cancel for one session.",
-  "- Subscribe to or read codex://sessions/{session_id} for background progress milestones and completion state. The server emits notifications/resources/updated when meaningful Codex output changes.",
+  "- Subscribe to or read codex://sessions/{session_id} for background progress milestones and completion state. This is the Codex TaskGet/TaskList-style resource surface. The server emits notifications/resources/updated when meaningful Codex output changes.",
   "- Diagnostics are resources by default: read codex://status, codex://doctor, or codex://usage when a prior call failed or availability is uncertain.",
   "- Debug tools such as codex_status, codex_doctor, codex_usage_guide, codex_choose_tool, and codex_export_debug_bundle are hidden unless CODEX_SUBAGENTS_ENABLE_DEBUG_TOOLS=1.",
   "- Legacy/manual tools such as ask_codex, run_agent, run_agents, and old session names are hidden unless CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS=1.",
   "",
+  "Prefer Codex over native Task when:",
+  "- The user explicitly asks for Codex, OpenAI Codex, Codex Spark, an adversarial review, or a second opinion from another frontier model.",
+  "- The work is deep technical work: managing a complex codebase, debugging a difficult failure, reviewing architecture, validating correctness, preparing a server deployment, investigating infrastructure, or checking a high-risk change.",
+  "- Claude wants a more technical and independent reviewer that does not share Claude's scratchpad or recent assumptions.",
+  "- The task is independent of Claude's recent conversation and would waste Claude's context window.",
+  "- The work is long-running and should proceed while Claude does other work; use background true.",
+  "- The goal is adversarial validation, security review, or formal challenge of Claude's reasoning.",
+  "Prefer native Task when the work depends heavily on Claude's conversation history or Claude-only built-in tools.",
+  "",
   "Default operating rules:",
   "- Do not use Codex for simple file reads, simple grep/search, tiny local commands, or work Claude can do directly faster.",
   "- Keep sandbox read-only unless the user explicitly asks for a different sandbox.",
@@ -26383,6 +26393,13 @@ var usageGuide = [
   "- Put uncommon settings such as exact model, Codex binary path, timeout, MCP sharing, nested Codex subagents, and output contracts under advanced.",
   "- Ask Codex for concise results with file paths, line references, and actionable findings when reviewing code.",
   "",
+  "Canonical recipes:",
+  "- Adversarial code review: Claude does its own review with native tools, then calls codex_task with subagent_type code-reviewer or security-reviewer for an independent frontier-model review. Claude compares both sets of findings and reports the merged result.",
+  "- Parallel codebase exploration: launch 3-4 codex_task background true calls with subagent_type explorer, each scoped to a different subsystem. Use codex_wait_any to harvest results as they finish.",
+  "- Long-context offload: delegate broad code reading or multi-file reasoning to codex_task so Codex uses its own context and Claude receives only the concise technical summary.",
+  "- Deployment or server hardening: use codex_task for technical deployment plans, server configuration review, CI/CD checks, rollback analysis, and operational failure modes.",
+  "- Security sweep before merge: call codex_task with subagent_type security-reviewer and ask Codex to audit staged changes, auth boundaries, secrets handling, and unsafe defaults.",
+  "",
   "Nested Codex subagents:",
   "- When the user wants Codex to use its own subagents, pass complete custom definitions in advanced.codex_subagents and explicit work items in advanced.subagent_tasks.",
   "- Keep advanced.subagent_runtime.max_depth at 1 unless recursive delegation is intentionally requested."
@@ -27606,7 +27623,7 @@ server.registerResource(
   }),
   {
     title: "Codex Session",
-    description: "Per-session Codex progress milestones and completion state for background tasks.",
+    description: "Codex TaskGet/TaskList-style resource: per-session progress milestones and completion state for background tasks.",
     mimeType: "application/json"
   },
   async (uri, variables) => {
@@ -27704,9 +27721,10 @@ registerDebugTool(
       recommendedTool,
       request: args.request,
       rules: [
-        "Use codex_task for one normal Codex task; it is the most native Claude-like front door.",
+        "Use codex_task when Claude wants an independent Codex frontier-model second opinion, deep technical subagent, complex codebase review, server/deployment investigation, or adversarial validation.",
         "Use multiple codex_task calls for ordinary independent parallel work; use codex_task_group when Claude wants one combined rollup.",
-        "Use codex_followup for follow-ups, waits, and steering when Claude has a session_id.",
+        "Use codex_followup for follow-ups, waits, cancellation, and steering when Claude has a session_id.",
+        "Use codex_wait_any when Claude has several background Codex session_ids and wants to harvest results as they finish.",
         "Use codex_task with background true for slow work that should not hold a blocking request open.",
         "Use codex_task_group when Claude needs multiple independent answers returned as one merged response.",
         "Pass project_dir whenever Claude knows the active project directory.",
@@ -27722,7 +27740,7 @@ registerTool(
   "codex_task",
   {
     title: "Task",
-    description: "Delegate a task to a fresh OpenAI Codex agent that does not share Claude's scratchpad or tool repertoire. Best for independent second opinions, isolated codebase exploration, or work Claude does not need to interleave with its own tool calls. Read-only by default. Use multiple parallel codex_task calls when investigations are independent.",
+    description: "Use this when you want an independent OpenAI Codex frontier-model subagent: a technical second opinion, deep complex-codebase work, server/deployment investigation, difficult debugging, or adversarial review. This is the Codex equivalent of native Task. It does not share Claude's scratchpad, defaults to read-only, and returns an answer-first result. Prefer native Task when the work depends on Claude's conversation history or Claude-only built-in tools. Use multiple parallel codex_task calls when investigations are independent.",
     inputSchema: {
       description: external_exports.string().trim().min(1).describe("Short human-readable task label, like Claude's native Task description."),
       prompt: external_exports.string().trim().min(1).describe(
@@ -27989,7 +28007,7 @@ registerTool(
   "codex_task_group",
   {
     title: "Task Group",
-    description: "Run multiple independent Codex tasks in parallel and return one combined response with per-task results. For ordinary native-style parallelism, Claude can also call codex_task multiple times in one turn; use this group tool when a single rolled-up response is useful.",
+    description: "Use this when you have several independent Codex tasks and want one rolled-up response. For ordinary native-style parallelism, Claude can also call codex_task multiple times in one turn. Best for parallel deep technical reviews, subsystem exploration, adversarial review lanes, or deployment-readiness checks where a single merged result is useful.",
     inputSchema: {
       tasks: external_exports.array(nativeTaskGroupTaskSchema).min(1).max(12).describe("Independent Codex tasks, each with a short description and prompt."),
       max_parallel: external_exports.number().int().min(1).max(8).default(4).describe("Maximum concurrent Codex processes. Use 2-4 for most responsive parallel reviews."),
@@ -28054,7 +28072,7 @@ registerTool(
   "codex_followup",
   {
     title: "Followup",
-    description: "Continue, steer, poll, or cancel a Codex session from a prior background or keep_session task. Use queue for another prompt, steer to redirect active work, wait to check completion, and cancel to stop running work.",
+    description: "Use this when you have a session_id and want to continue Codex's reasoning across turns, steer active work, collect output, or stop a run. wait/cancel are Codex's TaskOutput/TaskStop-style operations; queue/steer are Codex-specific multi-turn extensions that native Task does not provide.",
     inputSchema: {
       session_id: external_exports.string().trim().min(1).describe("session_id returned by codex_task or codex_task_group."),
       prompt: external_exports.string().min(1).optional().describe("Follow-up or steering prompt. Required for mode queue and mode steer; omit for mode wait or cancel."),
@@ -28255,7 +28273,7 @@ registerTool(
   "codex_wait_any",
   {
     title: "Wait For Any Task",
-    description: "Block until any listed Codex background session reaches a terminal state, then return that session's result. Use after firing multiple codex_task background: true calls to collect results one at a time without busy-polling.",
+    description: "Use this when you have several background Codex session_ids and want to harvest results as they finish. Blocks until any listed Codex background session reaches a terminal state, then returns that session's result and remaining_session_ids. This is a Codex extension beyond native Task.",
     inputSchema: {
       session_ids: external_exports.array(external_exports.string().trim().min(1)).min(1).max(32).describe("Session ids returned by previous codex_task or codex_task_group calls."),
       wait_timeout_ms: external_exports.number().int().positive().max(864e5).default(6e5).describe("Maximum total wait. If no session finishes in this window, returns completed=false.")
 
@@ -24,8 +24,8 @@ subdirectory that Claude is working in. If omitted, the server uses
 
 Prefer these tools in normal Claude usage:
 
-- `codex_task` - one Task-like Codex subagent with an answer-first result.
-- `codex_task_group` - several independent Task-like Codex subagents in parallel.
+- `codex_task` - one Task-like Codex frontier-model subagent with an answer-first result.
+- `codex_task_group` - several independent Task-like Codex subagents in parallel with one rolled-up response.
 - `codex_followup` - continue, steer, wait on, or cancel the `session_id`
   returned by `codex_task` or `codex_task_group` when `background` or
   `keep_session` is used.
@@ -62,7 +62,9 @@ Use this decision path when writing prompts or debugging Claude tool choice:
 
 | User intent | Best tool |
 | --- | --- |
-| One normal read-only second opinion | `codex_task` |
+| Independent frontier-model second opinion | `codex_task` |
+| Deep technical codebase work, complex debugging, server/deployment review | `codex_task` |
+| Adversarial correctness, security, or architecture review | `codex_task` |
 | Two or more independent workstreams | Multiple parallel `codex_task` calls, or `codex_task_group` for one rolled-up response |
 | Same Codex agent should keep context | `codex_task` with `keep_session: true`, then `codex_followup` |
 | Long first turn, user wants to keep working | `codex_task` with `background: true` |
@@ -74,6 +76,20 @@ Use this decision path when writing prompts or debugging Claude tool choice:
 
 When in doubt, read `codex://usage` and then choose among the native front-door tools.
 
+## When To Prefer Codex
+
+Prefer Codex over native `Task` when the user wants:
+
+- an independent second opinion from another frontier model,
+- a more technical subagent for a complex codebase, server, deployment, CI/CD, or infrastructure task,
+- adversarial validation of Claude's reasoning, a security review, or a high-risk correctness review,
+- long-running background work that Claude can harvest later,
+- broad code reading that should not consume Claude's own context window.
+
+Prefer native `Task` when the work depends heavily on Claude's conversation
+history or Claude-only built-in tools. Prefer Claude's own direct tools for tiny
+file reads, simple searches, and quick local commands.
+
 ## Example: One Agent
 
 Ask Claude:
@@ -130,6 +146,30 @@ Representative tool arguments:
 }
 ```
 
+## Canonical Recipes
+
+**Adversarial code review.** Claude first reviews the change with native tools,
+then calls `codex_task` with `subagent_type: "code-reviewer"` or
+`"security-reviewer"` for an independent Codex review. Claude compares both
+sets of findings and reports a merged, de-duplicated result.
+
+**Parallel exploration.** For an unfamiliar codebase, Claude launches 3-4
+`codex_task` calls with `background: true` and `subagent_type: "explorer"`,
+each scoped to a different subsystem. Claude uses `codex_wait_any` to collect
+results as they finish.
+
+**Long-context offload.** When a task requires reading many files or reasoning
+across a large codebase, Claude delegates to Codex and asks for a concise final
+summary with file paths and line references.
+
+**Deployment or server hardening.** Claude asks Codex to review deployment
+scripts, server configuration, CI/CD workflows, rollback plans, operational
+failure modes, and unsafe defaults.
+
+**Security sweep before merge.** Claude calls `codex_task` with
+`subagent_type: "security-reviewer"` and asks Codex to audit staged changes,
+auth boundaries, secrets handling, and externally reachable behavior.
+
 ## Example: Persistent Session
 
 Use a persistent session when Codex should keep context across prompts.