Swarm Command implements a 5-layer hierarchical multi-agent architecture derived from the SwarmSpeed 250 protocol. This document explains the system at two levels: a fast mental model first, then the full layer-by-layer breakdown.
If you only remember four things, remember these:
- Nexus decomposes the mission into domain-level work.
- Commanders own domains and turn them into smaller shards.
- Workers stay atomic β leaf nodes never spawn more agents.
- Review + Shadow Score decide quality before Nexus emits a final bundle.
Mission
β
Nexus
β
Commanders
β
Squad Leads (SS-250 only)
β
Workers
β
Reviewers + Shadow Score
β
Final synthesis
Note: At SS-50 and SS-100, the Squad Lead layer is skipped β Commanders spawn Workers directly (depth 2). The full 4-tier spawn chain (Nexus β Commander β Squad Lead β Worker) only applies at SS-250.
Read this doc when: you want the system model. Jump to: architecture diagrams for visuals, consensus for merge mechanics, and scaling for deployment choices.
βββββββββββββββββββ
L0 β NEXUS (1) β Model: claude-opus-4.6
β 128K ctx budget β Type: general-purpose
ββββββββββ¬βββββββββ
β
ββββββββββββββββββββββΌβββββββββββββββββββββ
β β β
βββββββ΄ββββββ βββββββ΄ββββββ βββββββ΄ββββββ
L1 β CMD-A (1) β β CMD-B (1) β ... β CMD-E (1) β Γ 5 Commanders
β 64K ctx β β 64K ctx β β 64K ctx β Type: general-purpose
βββββββ¬βββββββ βββββββ¬βββββββ βββββββ¬βββββββ Model: mixed
β β β
ββββββββββΌβββββββββ β β
β β β β β
ββββ΄βββ ββββ΄βββ ββββ΄βββ
L2 βSQ-1β βSQ-2β ... βSQ-10β Γ 10 per Commander = 50 Squad Leads
β32K β β32K β β32K β Type: general-purpose (can_launch=true)
ββββ¬βββ ββββ¬βββ ββββ¬βββ Model: claude-haiku-4.5
β β β
ββββ΄βββ ββββ΄βββ ββββ΄βββ
L3 βWΓ5 β βWΓ5 β βWΓ5 β Γ 5 per Squad Lead = 250 Workers
β 8K β β 8K β β 8K β Type: explore | task (LEAF β no spawning)
βββββββ βββββββ βββββββ Model: claude-haiku-4.5 | gpt-5.4-mini
ββββββββββββββββ
L4 β REVIEWERSΓ10 β Cross-review mesh
β 16K ctx β Type: general-purpose (can_launch=false)
ββββββββββββββββ Model: mixed (cross-family pairs)
+ SHADOW SCORING (Nexus-internal, sealed acceptance criteria, Shadow Score Spec L2)
Total agents for SS-250: ~316
Agent counts include all deployed agents across all layers: Nexus + Commanders + Squad Leads + Workers + Reviewers.
| Design choice | What it buys you |
|---|---|
| Single Nexus at the top | One decomposition authority and one final synthesis point |
| Domain-owning Commanders | Parallel workstreams without losing task ownership |
| Squad Leads between Commanders and Workers | Controlled fan-out and better task compression |
| Leaf-node Workers | Strict depth control and predictable cost |
| Independent Reviewers | Scoring from outside the execution path |
| Nexus-internal Shadow Score | Hidden validation without revealing the acceptance rubric |
| Property | Value |
|---|---|
| Agent type | general-purpose |
| Model | claude-opus-4.6 |
| Context budget | 128K tokens |
can_launch |
true |
| Responsibilities | Task decomposition, commander assignment, reviewer dispatch, sealed criteria generation (Phase 1.5), shadow score validation (Phase 6), final synthesis, circuit breaker authority |
| Spawns | 5 Commanders + 10 Reviewers |
The Nexus is the brain of the swarm. It receives the user's task, decomposes it into domains, creates Context Capsules for each Commander, monitors the swarm, and synthesizes the final output from bundles plus review scores.
| Property | Value |
|---|---|
| Agent type | general-purpose |
| Model | Commander pool (9): claude-opus-4.6, claude-opus-4.5, claude-opus-4.6-1m, claude-sonnet-4.6, claude-sonnet-4.5, claude-sonnet-4, gpt-5.4, gpt-5.2, gpt-5.1 |
| Context budget | 64K tokens |
can_launch |
true |
| Max children | 10 Squad Leads each |
| Responsibilities | Domain decomposition, squad lead dispatch, canary verification, result merging within domain |
Domain assignments:
| Commander | Domain | Focus |
|---|---|---|
| CMD-ARCH | Architecture & Structure | Patterns, interfaces, module boundaries |
| CMD-IMPL | Implementation & Logic | Core logic, algorithms, data flow |
| CMD-TEST | Testing & Validation | Test cases, edge cases, validation |
| CMD-DOCS | Documentation & Examples | Docs, comments, examples, guides |
| CMD-INTG | Integration & Review | Cross-cutting concerns, glue code, API contracts |
| Property | Value |
|---|---|
| Agent type | general-purpose |
| Model | claude-haiku-4.5 or gpt-5.4-mini (alternating) |
| Context budget | 32K tokens |
can_launch |
true |
| Max children | 5 Workers each |
| Responsibilities | Micro-task decomposition, canary deployment, worker dispatch, atom collection, local consensus |
| Property | Value |
|---|---|
| Agent type | explore or task |
| Model | Worker pool (6): claude-haiku-4.5, gpt-5.4-mini, gpt-5-mini, gpt-4.1, gpt-5.3-codex, gpt-5.2-codex |
| Context budget | 8K tokens |
can_launch |
false β structurally enforced |
| Responsibilities | Execute one atomic task, emit structured JSON atom |
Pod composition per Squad Lead:
| Role | Count | Agent Type | Purpose |
|---|---|---|---|
| Canary | 1 | explore |
Pre-flight check before full pod |
| Scout | 3 | explore |
Research, search, read files |
| Executor | 1 | task |
Run commands, build, test |
| Property | Value |
|---|---|
| Agent type | general-purpose |
| Model | Mixed cross-family pairs |
| Context budget | 16K tokens |
can_launch |
false |
| Responsibilities | Cross-domain scoring, conflict detection, consensus voting |
Shadow Scoring (Shadow Score Spec L2)
Shadow scoring is Nexus-internal β no separate validator agents are spawned. The Nexus generates sealed acceptance criteria in Phase 1.5 and validates commander outputs against them in Phase 6.
| Property | Value |
|---|---|
| Implementation | Nexus-internal sealed-envelope protocol |
| Criteria | 10 binary pass/fail acceptance criteria |
| Formula | Shadow Score = (failures / total) Γ 100 |
| Hardening | 1 cycle if score > 15% |
| Conformance | Shadow Score Spec L2 |
T+0s T+2s T+5s T+12s T+45s T+65s T+80s T+90s
β β β β β β β β
βΌ βΌ βΌ βΌ βΌ βΌ βΌ βΌ
ββββββ ββββββββ βββββββββββ ββββββββββββ ββββββββββ βββββββββ ββββββ ββββββ
βNEXUSββ βCMDs ββ βSQUAD ββ βWORKERS β βREVIEW β βMERGE β βVOTEβ βEMITβ
βBOOT β βSPAWN β βLEADS β βEXECUTE β βMESH β βRESULTSβ β β β β
β β β β β+ CANARY β β(parallel)β β(overlapβ β β β β β β
β β β β βVERIFY β β β βstart) β β β β β β β
ββββββ ββββββββ βββββββββββ ββββββββββββ ββββββββββ βββββββββ ββββββ ββββββ
2s 3s 7s 33s 20s 15s 10s 5s
βββββ LAUNCH PHASE βββββΊβββ EXECUTION βββΊβββββ CONVERGENCE PHASE βββββββββΊ
(~12s) (~33s) (~45s)
Key insight: pipeline overlap. Reviewers start before every worker is done. The review mesh begins as soon as the first commander pair completes, which removes review time from the critical path.
CONTEXT DOWN (shrinking) RESULTS UP (compressing)
======================== ========================
L0 Full Task Brief βββ 4K tokens ββββΊ Final Report βββ 4K tokens
β β²
L1 Context Capsule βββ 2K tokens ββββΊ Bundle βββ 1K tokens
β β²
L2 Shard βββ 512 tokens βββΊ Atom Set βββ 512 tokens
β β²
L3 Micro-Brief βββ 128 tokens βββΊ Atom βββ 256 tokens
β β²
L4 Review Capsule βββ 1K tokens ββββΊ Score Card βββ 512 tokens
Compression ratio: 1024:1 β from 128K tokens at Nexus down to 128 tokens at Worker level.
- Strip rationale at each layer β children need the task, not the history
- File scope narrows monotonically β a child scope is always a subset of its parent
- Constraints tighten monotonically β timeouts and token caps can only decrease
- Parent context stays short β at most ~50 tokens of βwhy this mattersβ
- Conflicts bubble up β disagreements survive until a higher layer resolves them
- Confidence is geometric mean β
(cβ Γ cβ Γ ... Γ cβ)^(1/n) - Failed atoms are replaced β If a worker fails, the Squad Lead may re-launch ONE replacement (using its own retry budget of 1). Workers have retry budget = 0.
- Deduplication is content-hash based β identical atoms merge and confidence rises
For maximum insight diversity, models from different families are paired within the same pod:
| Pod Role | Primary Model | Alternate Model | Why alternate |
|---|---|---|---|
| Commander | claude-opus-4.6 | gpt-5.4, gpt-5.2, gpt-5.1 | Reduce same-family blind spots |
| Squad Lead | claude-haiku-4.5 | gpt-5.4-mini | Keep fan-out cheap while mixing reasoning styles |
| Scout Worker | claude-haiku-4.5 | gpt-5.4-mini, gpt-5-mini, gpt-4.1 | Increase search and interpretation diversity |
| Executor Worker | gpt-5.3-codex | gpt-5.2-codex | Prefer code execution specialists for build/test |
| Reviewer | 7 cross-family pairs | β | Final scoring should not be self-referential |
- Parent-controlled spawning β children never decide whether they can launch descendants
- Signal compression at every layer β context shrinks going down, results compress going up
- Canary-before-swarm β deploy one canary worker before the whole pod
- Fail parsably β structured outputs, structured failures, no silent collapse
- Pipeline overlap β review starts before total execution finishes
The architecture is designed for concurrent execution at scale with wave deployment to respect platform rate limits. Wall-clock time grows slower than agent count because the expensive work runs in parallel, but launches are staggered in waves (Canary β Probe β Remainder) to avoid concentrated bursts:
Agents Wall-Clock Ratio vs SS-50
50 ~30s 1.0Γ
100 ~45s 1.5Γ
250 ~70s 2.3Γ
These are design targets, not measured benchmarks. Actual performance depends on task decomposability and platform concurrency limits. Wave deployment adds ~4-6s per layer but prevents rate-limit failures that would cost more time in recovery.
The main serial bottlenecks are Nexus decomposition (~2s), canary verification (~3s), wave gate checks (~2s per gate), and final synthesis (~10s). Everything else overlaps via hierarchical fan-out.