docs(orchestration): address PR review and senior-writer pass

hongyi-chen · oz-agent · hongyi-chen · commit 358ee8d9a687 · 2026-05-16T14:04:37.000-07:00
- Replace broken CLI fan-out example (no `--parent-run-id` flag on `oz agent run-cloud`) with a curl-based scripted fan-out that uses the public `POST /agent/runs` endpoint with `parent_run_id`. Add a note pointing script-driven callers at the API.
- Replace the fleet-cancellation example: use `oz run list --ancestor-run` (supported) instead of the nonexistent `--parent-run-id` flag, and curl the public `POST /agent/runs/{runId}/cancel` endpoint instead of the nonexistent `oz run cancel` CLI command.
- Remove the SSE and `/agent/messages` / `/agent/events` API sections (still internal-only / dogfood-gated) and reframe lifecycle events and messaging conceptually.
- Add a new "Retrieving conversations and artifacts" section covering `GET /agent/runs?ancestor_run_id=`, `GET /agent/runs/{runId}`, `GET /agent/runs/{runId}/conversation`, and the `artifacts` field on RunItem for both parent and child runs.
- Fix `parent_run_id` query param to `ancestor_run_id` everywhere.
- Replace "service account" with "agent identity" in deployment-patterns.
- Senior-writer pass: tighten frontmatter descriptions, standardize em-dash usage in bold-term lists, ALL_CAPS placeholders, drop ambiguous modal verbs.

Addresses PR #84 review comments r3253425053 and r3253425056.

Co-Authored-By: Oz <oz-agent@warp.dev>
diff --git a/src/content/docs/agent-platform/capabilities/agent-notifications.mdx b/src/content/docs/agent-platform/capabilities/agent-notifications.mdx
@@ -103,18 +103,18 @@ If auto-install doesn't work or you're running an agent over SSH, Warp displays
 
 In a [multi-agent orchestration](/agent-platform/cloud-agents/orchestration/), each child agent's lifecycle and messaging activity is surfaced to whoever is watching the parent.
 
-* **Lifecycle events from children** - when a child transitions to `succeeded`, `failed`, `errored`, `blocked`, or `cancelled`, the parent receives the event and the parent's session generates an in-app notification. If you are looking at a different tab, the toast notification identifies which child changed state.
-* **Messages from children** - when a child sends a message to the parent through the orchestration messaging API, the parent generates a **Request** notification so you can review the message and decide whether the parent should respond.
-* **Blocked children** - a child that blocks on user input (for example, a command approval request) raises a notification on the child's own session **and** propagates a `run_blocked` event to the parent. The parent can decide whether to wait, send a follow-up, or cancel the child.
-* **Mailbox grouping** - notifications from children are tagged with the parent's run so the mailbox can group an orchestration's notifications together.
+* **Lifecycle events from children** — when a child transitions to `succeeded`, `failed`, `errored`, `blocked`, or `cancelled`, the parent's session generates an in-app notification. If you're looking at a different tab, the toast notification identifies which child changed state.
+* **Messages from children** — when a child sends a message to the parent, the parent generates a **Request** notification so you can review the message and decide whether the parent should respond.
+* **Blocked children** — a child that blocks on user input (for example, a command approval request) raises a notification on the child's own session **and** propagates a `run_blocked` event to the parent. The parent can wait, send a follow-up, or cancel the child.
+* **Mailbox grouping** — notifications from children are tagged with the parent's run so the mailbox can group an orchestration's notifications together.
 
-Notifications from cloud children are delivered to the user account that owns the orchestration. They do **not** fan out to every member of a team unless the parent run is configured for team-wide notification visibility.
+Notifications from cloud children are delivered to the user account that owns the orchestration. They don't fan out to every member of a team unless the parent run is configured for team-wide notification visibility.
 
 ## Related pages
 
 * [Desktop Notifications](/terminal/more-features/notifications/) - configure system-level notification permissions and troubleshoot delivery
 * [Managing Agents](/agent-platform/cloud-agents/managing-cloud-agents/) - monitor all agent conversations, filter by status, and inspect sessions
-* [Multi-agent orchestration](/agent-platform/cloud-agents/orchestration/) - parent/child model, lifecycle events, and the messaging API
+* [Multi-agent orchestration](/agent-platform/cloud-agents/orchestration/) - parent/child model, lifecycle events, and messaging between agents
 * [Third-Party CLI Agents](/agent-platform/cli-agents/overview/) - overview of supported CLI agents and Warp features
 * [Claude Code](/agent-platform/cli-agents/claude-code/) - setup and notification plugin installation
 * [Codex](/agent-platform/cli-agents/codex/) - setup and notification configuration
diff --git a/src/content/docs/agent-platform/cloud-agents/deployment-patterns.mdx b/src/content/docs/agent-platform/cloud-agents/deployment-patterns.mdx
@@ -53,7 +53,7 @@ Use this when you already have a system that schedules work (CI, dev boxes, inte
 #### Minimal setup checklist
 
 * A Warp team
-* A service account (recommended for automation)
+* An agent identity for non-interactive auth (use an [API key](/reference/cli/api-keys/) for automation)
 * The Oz CLI installed on the runner / box
 * Any needed credentials (often via secrets + environment variables)
 
diff --git a/src/content/docs/agent-platform/cloud-agents/managing-cloud-agents.mdx b/src/content/docs/agent-platform/cloud-agents/managing-cloud-agents.mdx
@@ -106,8 +106,8 @@ This is the fastest way to isolate "everything that failed today," "runs from Sl
 
 When a parent agent spawns one or more child agents through [multi-agent orchestration](/agent-platform/cloud-agents/orchestration/), the management view renders the hierarchy:
 
-* **Parent rows** include a child count chip showing how many children the parent has spawned and how many are still running.
-* **Child rows** appear nested under their parent and inherit a reference to the parent's run for quick navigation.
-* **Filtering** by parent: open any parent row to drill into a view scoped to its descendants.
+* **Parent rows** — include a child count chip showing how many children the parent has spawned and how many are still running.
+* **Child rows** — appear nested under their parent and link back to the parent's run.
+* **Drill-in** — open any parent row to scope the view to its descendants.
 
 Each child is a full Oz run, so credit usage, status, transcript, and artifacts are all available from the same row click as any other run. The parent row's status reflects only the parent's own work — a parent can finish successfully while a child is still running or has failed, so check the child rows separately when verifying that an orchestration completed.
diff --git a/src/content/docs/agent-platform/cloud-agents/orchestration/index.mdx b/src/content/docs/agent-platform/cloud-agents/orchestration/index.mdx
@@ -1,65 +1,62 @@
 ---
 title: Multi-agent orchestration
-description: >-
-  Coordinate parent and child agents across local and cloud runs. Spawn,
-  message, and subscribe to lifecycle events to build supervisor/worker,
-  fan-out, critic, DAG, and swarm patterns.
+description: Coordinate parent and child agents across local and cloud runs to build supervisor/worker, fan-out, critic, DAG, and swarm workflows on the Oz Platform.
 sidebar:
   label: "Orchestration"
 ---
 
-Multi-agent orchestration lets one agent spawn and coordinate other agents to parallelize work, delegate specialized tasks, or check each other's output. Every primitive — spawning a child, sending a message, subscribing to lifecycle events — is exposed as a public REST endpoint on the [Oz Platform](/agent-platform/cloud-agents/platform/), so the same patterns work from the Warp app, the [Oz CLI](/reference/cli/), and any system that can call the [Oz API](/reference/api-and-sdk/).
+Multi-agent orchestration lets one agent spawn and coordinate other agents to parallelize work, delegate specialized tasks, or check each other's output. The same parent/child model works from the Warp app, the [Oz CLI](/reference/cli/), and any system that can call the [Oz API](/reference/api-and-sdk/) — orchestration is built on the same run, conversation, and artifact primitives as any other [Oz Platform](/agent-platform/cloud-agents/platform/) run.
 
 This page covers the orchestration model and the patterns it supports. For how to start an orchestrated run, see [Running orchestrated agents](/agent-platform/cloud-agents/orchestration/multi-agent-runs/).
 
 ## The parent/child model
 
 An orchestrated workflow always has one **parent agent** and one or more **child agents**.
 
-* **Parent agent** - the agent that decides what work needs to be done, spawns child agents, and (optionally) merges their results. The parent is just an Oz agent; any Oz agent can become a parent the first time it spawns a child.
-* **Child agent** - an Oz agent spawned by a parent with its own prompt, environment, and (optionally) a different model or harness. A child can spawn its own children, so orchestrations are arbitrarily deep.
+* **Parent agent** — the agent that decides what work needs to be done, spawns child agents, and (optionally) merges their results. Any Oz agent can become a parent the first time it spawns a child.
+* **Child agent** — an Oz agent spawned by a parent with its own prompt, environment, and (optionally) a different model or harness. A child can spawn its own children, so orchestrations are arbitrarily deep.
 
-The parent and child each have an independent **run** with its own lifecycle, transcript, and credit usage. The child run records its parent in `parent_run_id` so the management view, API, and web app can render the hierarchy.
+The parent and child each have an independent **run** with its own lifecycle, transcript, conversation, and credit usage. Each child run records its parent's run ID so the management view, API, and web app can render the hierarchy.
 
 ### Where each side can run
 
 The parent and child don't have to run in the same place. Orchestration supports four combinations:
 
-* **Local → local** - A [Warp Agent](/agent-platform/local-agents/overview/) conversation in the desktop app spawns child Warp Agent conversations on the same machine. Useful for dogfooding orchestration patterns without spinning up cloud infrastructure.
-* **Local → cloud** - A local parent spawns one or more cloud children that run in [environments](/agent-platform/cloud-agents/environments/) on Warp-hosted or self-hosted infrastructure. The parent can keep working while children execute in parallel.
-* **Cloud → cloud** - A cloud parent spawns cloud children. This is the canonical pattern for review swarms, large fan-outs, and any orchestration triggered from Slack, Linear, a schedule, or the API.
-* **Cloud → local** - Less common, but supported when a cloud parent needs to hand off a task to a local Warp Agent for human-in-the-loop review.
+* **Local → local** — a [Warp Agent](/agent-platform/local-agents/overview/) conversation in the desktop app spawns child Warp Agent conversations on the same machine. Useful for trying orchestration patterns without spinning up cloud infrastructure.
+* **Local → cloud** — a local parent spawns one or more cloud children that run in [environments](/agent-platform/cloud-agents/environments/) on Warp-hosted or self-hosted infrastructure. The parent keeps working while children execute in parallel.
+* **Cloud → cloud** — a cloud parent spawns cloud children. This is the canonical pattern for review swarms, large fan-outs, and any orchestration triggered from Slack, Linear, a schedule, or the API.
+* **Cloud → local** — less common, but supported when a cloud parent needs to hand off a task to a local Warp Agent for human-in-the-loop review.
 
-Children can also use a different [harness](/agent-platform/cli-agents/overview/) than the parent. A parent Warp Agent can spawn Claude Code or Codex children, and vice versa.
+Children can also use a different harness than the parent. A parent running with the default Oz harness can spawn children that run with [Claude Code](/agent-platform/cli-agents/claude-code/) or [Codex](/agent-platform/cli-agents/codex/), and vice versa.
 
 ## Lifecycle events
 
-Each run emits lifecycle events that the parent (or any subscriber) can react to. The server records every event in an append-only log; the parent receives events in real time over SSE and can also poll the event log for catch-up.
+Each run emits lifecycle events as it progresses. The parent observes these events to decide what to do next — keep waiting, send a follow-up, spawn a replacement, or finish.
 
 The supported event types are:
 
-* **`run_in_progress`** - the run started executing (or restarted after being paused or blocked).
-* **`run_succeeded`** - the run completed successfully.
-* **`run_failed`** - the run hit a terminal failure.
-* **`run_errored`** - the run encountered an error during startup or execution. Includes a `stage` (`startup` or `runtime`) and an error reason on the event payload.
-* **`run_blocked`** - the run is waiting on a user action (for example, command approval, permission request, or `ask_user_question`).
-* **`run_cancelled`** - the run was cancelled before reaching a terminal state.
+* **`run_in_progress`** — the run started executing (or restarted after being paused or blocked).
+* **`run_succeeded`** — the run completed successfully.
+* **`run_failed`** — the run hit a terminal failure.
+* **`run_errored`** — the run encountered an error during startup or execution.
+* **`run_blocked`** — the run is waiting on a user action (for example, command approval or a permission request).
+* **`run_cancelled`** — the run was cancelled before reaching a terminal state.
 
-The parent's harness sees these events as inputs and decides what to do next: keep waiting, send a follow-up message, spawn a replacement, or finish.
+Observe lifecycle events in any of these surfaces:
 
-:::note
-Lifecycle subscription is opt-in per child. When the parent calls the spawn tool, it passes an optional `lifecycle_subscription` list that filters which event types the parent wants to receive. Omit it to subscribe to all event types.
-:::
+* **The parent's transcript** — the parent agent receives child events as it runs and reflects them in its own conversation.
+* **The management view and the Oz web app** — child rows show their current state and update as events arrive.
+* **The Oz API** — `GET /agent/runs/{runId}` returns the latest state of any run, and `GET /agent/runs?ancestor_run_id=PARENT_RUN_ID` lists every descendant in one call.
 
 ## Messaging between agents
 
-The parent and children communicate through a per-run inbox. Messages have a sender run, one or more recipient runs, a subject, and a body.
+The parent and children can exchange short, durable messages — handoffs, status updates, and final results — without piping full transcripts around. Each run has its own message inbox.
 
-* The sender writes to `POST /agent/messages` with `to`, `subject`, `body`, and `sender_run_id`.
-* The recipient sees the message in its inbox; the parent receives a `MessagesReceivedFromAgents` input the next time it runs.
-* Messages are durable. A child that wakes up after the parent sent a message still receives the inbox contents.
+* The sender names one or more recipient runs, a subject, and a body.
+* The recipient sees the message the next time it runs. Messages persist, so a child that wakes up after the parent sent a message still receives the inbox contents.
+* The sender must have edit access to both its own run and every recipient run.
 
-Messages are intentionally lightweight — they are not streamed transcripts. Use them for handoffs, status updates, and final results, not for piping a full conversation back to the parent.
+Messages are intentionally lightweight. Use them for coordination signals; use the parent's prompt to the child and the child's final output for the work itself.
 
 ## Common patterns
 
@@ -87,26 +84,26 @@ The parent encodes a directed acyclic graph of subtasks where some nodes depend
 
 ### Swarm
 
-A flat group of peer agents discover each other through the messaging API and coordinate without a strict hierarchy. The parent acts more like a coordinator than a supervisor. Use sparingly — swarms are powerful but harder to debug than hierarchical patterns.
+A flat group of peer agents discover each other through messaging and coordinate without a strict hierarchy. The parent acts more like a coordinator than a supervisor. Use sparingly — swarms are powerful but harder to debug than hierarchical patterns.
 
 ## Approval mode
 
-When an agent runs in `orchestrate` mode (set by the `/orchestrate` slash command or the API's `mode: orchestrate` field), the agent **proposes** an orchestration plan and waits for approval before spawning any children. This gives you a chance to review the breakdown — number of children, prompts, environments, parallelism — before paying for it. Once you approve, the parent starts spawning children.
+When an agent runs in `orchestrate` mode (set by the `/orchestrate` slash command or the API's `mode: orchestrate` field), the agent **proposes** an orchestration plan and waits for approval before spawning any children. Use this to review the breakdown — number of children, prompts, environments, parallelism — before paying for it. Once you approve, the parent starts spawning children.
 
-`orchestrate` mode is distinct from `plan` mode. A plan-mode agent produces a written plan for *itself* to execute. An orchestrate-mode agent produces a plan for itself plus a fleet of children.
+`orchestrate` mode is distinct from `plan` mode. A `plan`-mode agent produces a written plan for *itself* to execute. An `orchestrate`-mode agent produces a plan for itself plus a fleet of children.
 
 ## Observability
 
-Because every parent and child is a normal Oz run, all existing observability surfaces work without changes:
+Because every parent and child is a normal Oz run, the existing observability surfaces work without changes:
 
-* **[Managing cloud agents](/agent-platform/cloud-agents/managing-cloud-agents/)** - parent and child rows appear in the management view with the parent rendered above its children.
-* **[Oz web app](/agent-platform/cloud-agents/oz-web-app/)** - the Runs page renders the hierarchy and lets you open any run's transcript.
-* **[Oz API](/reference/api-and-sdk/)** - `GET /agent/runs?parent_run_id=...` lists a parent's children; `GET /agent/runs/{runId}` returns `parent_run_id` on each run.
-* **[Agent notifications](/agent-platform/capabilities/agent-notifications/)** - lifecycle and message events from children show up in the parent's notification mailbox.
+* **[Managing cloud agents](/agent-platform/cloud-agents/managing-cloud-agents/)** — parent and child rows appear in the management view with the parent rendered above its children.
+* **[Oz web app](/agent-platform/cloud-agents/oz-web-app/)** — the Runs page renders the hierarchy and lets you open any run's transcript.
+* **[Oz API](/reference/api-and-sdk/)** — `GET /agent/runs?ancestor_run_id=PARENT_RUN_ID` lists every descendant of a parent; `GET /agent/runs/{runId}` returns the run's state, `parent_run_id`, `conversation_id`, and produced artifacts.
+* **[Agent notifications](/agent-platform/capabilities/agent-notifications/)** — lifecycle events and messages from children show up in the parent's notification mailbox.
 
 ## Related pages
 
-* [Running orchestrated agents](/agent-platform/cloud-agents/orchestration/multi-agent-runs/) - how to start an orchestrated run from the CLI, slash command, web app, or API
-* [Oz API and SDK](/reference/api-and-sdk/) - REST endpoints for messages, events, and runs
-* [Cloud agents overview](/agent-platform/cloud-agents/overview/) - what a cloud agent run is and how it fits into the Oz Platform
-* [Deployment patterns](/agent-platform/cloud-agents/deployment-patterns/) - higher-level deployment models that orchestration composes with
+* [Running orchestrated agents](/agent-platform/cloud-agents/orchestration/multi-agent-runs/) — how to start an orchestrated run from the CLI, slash command, web app, or API.
+* [Oz API and SDK](/reference/api-and-sdk/) — REST endpoints for runs, conversations, and artifacts.
+* [Cloud agents overview](/agent-platform/cloud-agents/overview/) — what a cloud agent run is and how it fits into the Oz Platform.
+* [Deployment patterns](/agent-platform/cloud-agents/deployment-patterns/) — higher-level deployment models that orchestration composes with.
diff --git a/src/content/docs/agent-platform/cloud-agents/orchestration/multi-agent-runs.mdx b/src/content/docs/agent-platform/cloud-agents/orchestration/multi-agent-runs.mdx
diff --git a/src/content/docs/agent-platform/cloud-agents/oz-web-app.mdx b/src/content/docs/agent-platform/cloud-agents/oz-web-app.mdx