
Commit 6515e6f

Gregg Cochran and Copilot committed
fix: comprehensive swarm self-audit — 15 files, 30+ fixes across all scales
SS-100 swarm audit (36 agents, 5 domains) + SS-50 verification pass identified and fixed 30+ issues across the entire repo.

SKILL.md:
- Consensus formula clamped to [0.0, 1.0]
- Sealed criteria count scale-aware (SS-50=6, SS-100=8, SS-250=10)
- Scale-conditional Squad Lead logic in Phase 3
- Depth Guard updated for SS-50/100 flat hierarchy (depth 2)
- Circuit breaker: phase-specific behavior, scale-adjusted thresholds
- Recovery levels L1-L5 fully defined
- Hardening regression policy added
- JSON schema validation recovery path
- Phase sequencing rule corrected
- Phase 7/8 templates scale-aware
- SS-50 agent count ~36-52, SS-100 all 5 domains
- Context capsule timeout/max_depth scale-aware
- Model names canonicalized, Phase 8 banner consistent

Cross-repo (14 files synced with SKILL.md):
config.yml, README.md, CONTRIBUTING.md, agents/, docs/scaling.md, docs/shadow-scoring.md, docs/use-cases.md, docs/architecture-diagrams.md, docs/architecture.md, protocols/circuit-breaker.md, protocols/context-capsule.md, protocols/depth-guard.md, templates/commander.md, site/src/app/page.tsx, .github/skills/

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 2a6203e commit 6515e6f

15 files changed

Lines changed: 261 additions & 206 deletions


.github/skills/swarm-command/SKILL.md

Lines changed: 108 additions & 83 deletions
Large diffs are not rendered by default.

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -30,8 +30,8 @@ Before opening a PR, ensure:
3030
- [ ] Both SKILL.md copies are identical:
3131
- `skills/swarm-command/SKILL.md`
3232
- `.github/skills/swarm-command/SKILL.md`
33-
- [ ] Agent counts match across all files (SS-50: ~52, SS-100: ~89, SS-250: ~316)
34-
- [ ] Agent counts verified via `grep -r "~52\|~89\|~316" .`
33+
- [ ] Agent counts match across all files (SS-50: ~36-52, SS-100: ~89, SS-250: ~316)
34+
- [ ] Agent counts verified via `grep -r "~36-52\|~89\|~316" .`
3535
- [ ] Shadow scoring references use Shadow Score Spec format (no separate shadow validator agents)
3636
- [ ] docs/example-output.md reflects current output format (if output format changed)
3737
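The `grep` step in the checklist above lists every line that mentions a current count, but it does not flag lines that still carry the old SS-50 figure. A minimal Python sketch of a stricter check follows. The helper name `find_stale_counts` and the regular expression are illustrative assumptions, not part of the repo.

```python
import re

# Matches a bare "~52" but not the "52" inside "~36-52", because the tilde
# must be immediately followed by "52". (Illustrative pattern, not from the repo.)
STALE_SS50 = re.compile(r"~52\b")


def find_stale_counts(text: str) -> list[int]:
    """Return 1-based line numbers that still mention the old SS-50 count."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if STALE_SS50.search(line)
    ]


sample = "\n".join([
    "| **SS-50** | ~52 | $2.50 |",     # stale count
    "| **SS-50** | ~36-52 | $2.50 |",  # current range
    "| **SS-100** | ~89 | $5.50 |",    # unaffected
])
print(find_stale_counts(sample))
```

Running the helper over each tracked file and failing on a non-empty result would turn the manual checklist item into an automated one.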

README.md

Lines changed: 5 additions & 5 deletions
@@ -136,7 +136,7 @@ If the canary fails, the full pod never deploys. One cheap test prevents many ex
136136

137137
| Scale | Agents | Typical Cost | Hard Cap | Wall-Clock |
138138
|---|---|---|---|---|
139-
| **SS-50** | ~52 | $2.50 | $5 | ~30s |
139+
| **SS-50** | ~36-52 | $2.50 | $5 | ~30s |
140140
| **SS-100** | ~89 | $5.50 | $10 | ~45s |
141141
| **SS-250** | ~316 | $10 | $20 | ~65–90s |
142142

@@ -249,7 +249,7 @@ T+0s T+2s T+5s T+12s T+45s T+65s T+80s T+90s
249249

250250
| Scale | Agents | Commanders | Workers | Reviewers | Best For | Wall-Clock |
251251
|---|---|---|---|---|---|---|
252-
| **SS-50** | ~52 | 3 | 45 | 3 | Fast bounded tasks | ~30s |
252+
| **SS-50** | ~36-52 | 2-3 | 30-45 | 3 | Fast bounded tasks | ~30s |
253253
| **SS-100** | ~89 | 5 | 75 | 8 | Multi-file features and reviews | ~45s |
254254
| **SS-250** | ~316 | 5 | 250 | 10 | Repo-wide or high-stakes work | ~65–90s |
255255
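The agent totals in the tables above can be reproduced from the per-role counts. The sketch below assumes each total includes the single Nexus, which is what makes the published numbers add up; the function name `total_agents` is hypothetical.

```python
def total_agents(commanders: int, workers_per_commander: int,
                 reviewers: int, squad_leads: int = 0) -> int:
    """Count one Nexus plus every commander, squad lead, worker, and reviewer."""
    nexus = 1  # assumption: the published totals count the Nexus itself
    workers = commanders * workers_per_commander
    return nexus + commanders + squad_leads + workers + reviewers


# SS-50 runs 2 or 3 commanders with 15 workers each and 3 reviewers.
print(total_agents(2, 15, 3), total_agents(3, 15, 3))
# SS-100: 5 commanders, 15 workers each, 8 reviewers.
print(total_agents(5, 15, 8))
# SS-250: 5 commanders, 50 squad leads, 50 workers per commander, 10 reviewers.
print(total_agents(5, 50, 10, squad_leads=50))
```

Under that assumption the SS-50 range of ~36-52 corresponds exactly to the 2-commander and 3-commander cases.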

@@ -536,9 +536,9 @@ shadow_scoring:
536536
enabled: true
537537
spec_version: "1.0.0"
538538
conformance_level: "L2"
539-
sealed_criteria_count: 10
539+
sealed_criteria_count: 10 # max; per-scale: SS-50=6, SS-100=8, SS-250=10
540540
hardening:
541-
enabled: true
541+
enabled: true # SS-50 overrides to disabled
542542
threshold: 15
543543
```
544544
@@ -552,7 +552,7 @@ See [docs/scaling.md](docs/scaling.md) for full scaling configuration and cost e
552552
|---|---|
553553
| **Nexus** | claude-opus-4.6 |
554554
| **Commanders** (pool: 9) | claude-opus-4.6, claude-opus-4.5, claude-opus-4.6-1m, claude-sonnet-4.6, claude-sonnet-4.5, claude-sonnet-4, gpt-5.4, gpt-5.2, gpt-5.1 |
555-
| **Squad Leads** | claude-haiku-4.5, gpt-5.4-mini |
555+
| **Squad Leads** (SS-250 only) | claude-haiku-4.5, gpt-5.4-mini |
556556
| **Workers** (pool: 6) | claude-haiku-4.5, gpt-5.4-mini, gpt-5-mini, gpt-4.1, gpt-5.3-codex, gpt-5.2-codex |
557557
| **Reviewers** (7 pairs) | claude-opus-4.6↔gpt-5.4, claude-opus-4.5↔gpt-5.2, claude-opus-4.6-1m↔gpt-5.1, claude-sonnet-4.6↔gpt-5.3-codex, claude-sonnet-4.5↔gpt-5.2-codex, claude-sonnet-4↔gpt-5.4-mini, claude-haiku-4.5↔gpt-5-mini |
558558
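The Reviewers row above pairs each Claude model with a GPT model. A small sanity check for that cross-family property might look like the sketch below. Deriving the family from the model-name prefix is an assumption made for illustration, and `family` and `is_cross_family` are hypothetical helpers.

```python
# Reviewer pairs copied from the README table above.
REVIEWER_PAIRS = [
    ("claude-opus-4.6", "gpt-5.4"),
    ("claude-opus-4.5", "gpt-5.2"),
    ("claude-opus-4.6-1m", "gpt-5.1"),
    ("claude-sonnet-4.6", "gpt-5.3-codex"),
    ("claude-sonnet-4.5", "gpt-5.2-codex"),
    ("claude-sonnet-4", "gpt-5.4-mini"),
    ("claude-haiku-4.5", "gpt-5-mini"),
]


def family(model: str) -> str:
    """Treat the text before the first hyphen as the model family (assumption)."""
    return model.split("-", 1)[0]


def is_cross_family(pair: tuple[str, str]) -> bool:
    """True when the two reviewers in a pair come from different families."""
    left, right = pair
    return family(left) != family(right)


same_family = [pair for pair in REVIEWER_PAIRS if not is_cross_family(pair)]
print(len(REVIEWER_PAIRS), same_family)
```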

agents/swarm-command.agent.md

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ You are **Swarm Command** 🐝 — a multi-model consensus swarm orchestrator ru
1414

1515
**Personality:** Calm, authoritative swarm commander. Military precision meets collective intelligence. Efficient status updates, clear phase transitions, structured output. You are the Nexus — the brain of the hive.
1616

17-
**⚠️ MANDATORY: Execute ALL phases in sequence. NEVER skip phases.**
17+
**⚠️ MANDATORY: Execute ALL phases 0-8 in sequence. Phase 5 may overlap with Phase 4. If the circuit breaker trips, proceed to Phase 6 with available bundles, then Phase 7 for partial synthesis.**
1818

1919
**🎭 OUTPUT RULE:** Your visible output is the MISSION BRIEFING and RESULTS. Show phase banners, progress tables, and the final synthesized report. Do not narrate your internal process.
2020
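The revised sequencing rule above can be read as a small state function. The sketch below is one interpretation, not the repo's implementation: it assumes a tripped breaker jumps straight to Phase 6 and that the normal sequence, including Phase 8, resumes from there. It does not model the permitted overlap between Phases 4 and 5. All names are hypothetical.

```python
from typing import Optional

FIRST_PHASE, LAST_PHASE = 0, 8
BREAKER_RESUME_PHASE = 6  # a tripped breaker proceeds here with available bundles


def next_phase(current: int, breaker_tripped: bool = False) -> Optional[int]:
    """Return the phase to run after `current`, or None when the run is done."""
    if not FIRST_PHASE <= current <= LAST_PHASE:
        raise ValueError(f"unknown phase: {current}")
    # A tripped breaker skips ahead to Phase 6 instead of the next phase.
    if breaker_tripped and current < BREAKER_RESUME_PHASE:
        return BREAKER_RESUME_PHASE
    # Otherwise phases run strictly in order and none is skipped.
    return current + 1 if current < LAST_PHASE else None


print(next_phase(3))                        # normal run
print(next_phase(3, breaker_tripped=True))  # breaker tripped during Phase 3
```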

config.yml

Lines changed: 8 additions & 5 deletions
@@ -2,14 +2,14 @@ swarm_command:
22
default_scale: "ss-100"
33
scales:
44
ss-50:
5-
commanders: 3
5+
commanders: 3 # max; runtime selects 2-3 most relevant domains
66
workers_per_commander: 15 # No squad leads — commanders spawn workers directly
77
max_depth: 2 # Nexus[0] → Commander[1] → Worker[2]
88
reviewers: 3
99
sealed_criteria_count: 6
1010
cost_ceiling_usd: 5.00
1111
timeout_s: 60
12-
total: ~52
12+
total: "~36-52" # 2 commanders = 36, 3 commanders = 52
1313
ss-100:
1414
commanders: 5
1515
workers_per_commander: 15 # No squad leads — commanders spawn workers directly
@@ -41,7 +41,7 @@ swarm_command:
4141
conflict_penalty_cap: 0.30
4242

4343
depth_guard:
44-
max_spawn_depth: 3
44+
max_spawn_depth: 3 # SS-250 max; SS-50/SS-100 use max_depth=2 (no Squad Leads)
4545
max_workers_per_squad_lead: 5
4646
worker_agent_types: ["explore", "task"]
4747
commander_agent_type: "general-purpose"
@@ -121,11 +121,14 @@ swarm_command:
121121
enabled: true
122122
spec_version: "1.0.0"
123123
conformance_level: "L2"
124-
sealed_criteria_count: 10 # criteria per task
124+
sealed_criteria_count: 10 # max criteria; per-scale overrides in scales section (6/8/10)
125125
hardening:
126-
enabled: true
126+
enabled: true # global default; SS-50 overrides to disabled
127127
max_cycles: 1
128128
threshold: 15 # hardening triggers if score > 15%
129+
scale_overrides:
130+
ss-50:
131+
enabled: false # SS-50: score computed but no fix cycle
129132
categories:
130133
- happy_path
131134
- edge_case
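The config hunks above introduce per-scale values that override global defaults. How the runtime resolves them is not shown in this diff, so the following is a sketch of one plausible lookup order (scale-specific value first, global default second). The dictionary mirrors the YAML with the `swarm_command` nesting flattened, takes the SS-100 and SS-250 criteria counts from the `(6/8/10)` comment, and uses hypothetical helper names.

```python
CONFIG = {
    "scales": {
        "ss-50": {"sealed_criteria_count": 6},
        "ss-100": {"sealed_criteria_count": 8},
        "ss-250": {"sealed_criteria_count": 10},
    },
    "shadow_scoring": {
        "sealed_criteria_count": 10,  # documented as the maximum
        "hardening": {
            "enabled": True,          # global default
            "max_cycles": 1,
            "threshold": 15,
            "scale_overrides": {"ss-50": {"enabled": False}},
        },
    },
}


def sealed_criteria_count(config: dict, scale: str) -> int:
    """Prefer the per-scale count, falling back to the global maximum."""
    default = config["shadow_scoring"]["sealed_criteria_count"]
    return config["scales"].get(scale, {}).get("sealed_criteria_count", default)


def hardening_enabled(config: dict, scale: str) -> bool:
    """Apply a scale override when present, otherwise use the global default."""
    hardening = config["shadow_scoring"]["hardening"]
    override = hardening.get("scale_overrides", {}).get(scale, {})
    return override.get("enabled", hardening["enabled"])


for scale in ("ss-50", "ss-100", "ss-250"):
    print(scale, sealed_criteria_count(CONFIG, scale), hardening_enabled(CONFIG, scale))
```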

docs/architecture-diagrams.md

Lines changed: 1 addition & 1 deletion
@@ -238,7 +238,7 @@ flowchart TD
238238
```mermaid
239239
graph LR
240240
subgraph SS50["🐝 SS-50"]
241-
A50["3 Commanders<br/>15 Workers each<br/>3 Reviewers<br/>~52 agents"]
241+
A50["2-3 Commanders<br/>15 Workers each<br/>3 Reviewers<br/>~36-52 agents"]
242242
end
243243
244244
subgraph SS100["🐝🐝 SS-100"]

docs/scaling.md

Lines changed: 11 additions & 11 deletions
@@ -39,7 +39,7 @@ Is the task bounded to 1–2 files or one very narrow question?
3939

4040
| Scale | Total Agents | Commanders | Squad Leads | Workers | Reviewers | Best For | Wall-Clock |
4141
|---|---|---|---|---|---|---|---|
42-
| **SS-50** | ~52 | 3 || 45 | 3 | Fast bounded tasks | ~30s |
42+
| **SS-50** | ~36-52 | 2-3 || 30-45 | 3 | Fast bounded tasks | ~30s |
4343
| **SS-100** | ~89 | 5 || 75 | 8 | Default for real software work | ~45s |
4444
| **SS-250** | ~316 | 5 | 50 | 250 | 10 | Repo-wide or maximum-confidence work | ~65–90s |
4545

@@ -55,11 +55,11 @@ Default: **SS-100**. Use `swarm command ss-250` for full deployment or `swarm co
5555

5656
```text
5757
L0: 1 Nexus (claude-opus-4.6)
58-
L1: 3 Commanders (commander pool — 9 models)
59-
L2: 45 Workers (worker pool — 6 models) — 15 per commander, spawned directly
58+
L1: 2-3 Commanders (commander pool — 10 models)
59+
L2: 30-45 Workers (worker pool — 6 models) — 15 per commander, spawned directly
6060
3 Reviewers (cross-family pairs, spawned by Nexus)
6161
──────────────────────────
62-
Total: ~52 agents
62+
Total: ~36-52 agents
6363
Cost: $1.50 – $3.50
6464
Time: ~30s wall-clock
6565
```
@@ -68,7 +68,7 @@ Time: ~30s wall-clock
6868

6969
| Parameter | Value |
7070
|---|---|
71-
| Commanders | 3 |
71+
| Commanders | 2-3 |
7272
| Domains covered | 2–3 of 5 (auto-selected by task type) |
7373
| Squad Leads per Commander ||
7474
| Workers per Commander | 15 |
@@ -93,7 +93,7 @@ Time: ~30s wall-clock
9393

9494
```text
9595
L0: 1 Nexus (claude-opus-4.6)
96-
L1: 5 Commanders (commander pool — 9 models)
96+
L1: 5 Commanders (commander pool — 10 models)
9797
L2: 75 Workers (worker pool — 6 models) — 15 per commander, spawned directly
9898
8 Reviewers (cross-family pairs, spawned by Nexus)
9999
Shadow Scoring (Nexus-internal, sealed criteria)
@@ -108,10 +108,10 @@ Time: ~45s wall-clock
108108
| Parameter | Value |
109109
|---|---|
110110
| Commanders | 5 |
111-
| Domains covered | 3 of 5 (auto-selected by task type) |
111+
| Domains covered | All 5 |
112112
| Squad Leads per Commander ||
113113
| Workers per Commander | 15 |
114-
| Reviewers | 8 reviewers (7 cross-family pairs) |
114+
| Reviewers | 8 reviewers (3-4 cross-family review pairs) |
115115
| Shadow scoring | 8 sealed criteria, hardening at >15% |
116116
| Cost ceiling | $10.00 |
117117
| Timeout cascade | 75/50/35/25s |
@@ -132,10 +132,10 @@ Time: ~45s wall-clock
132132

133133
```text
134134
L0: 1 Nexus (claude-opus-4.6)
135-
L1: 5 Commanders (commander pool — 9 models)
135+
L1: 5 Commanders (commander pool — 10 models)
136136
L2: 50 Squad Leads (claude-haiku-4.5 | gpt-5.4-mini) — 10 per commander
137137
L3: 250 Workers (worker pool — 6 models) — 5 per squad lead
138-
L4: 10 Reviewers (7 cross-family pairs)
138+
L4: 10 Reviewers (7 cross-family pairs, cycled to fill 10 slots)
139139
Shadow Scoring (Nexus-internal, sealed criteria)
140140
──────────────────────────
141141
Total: ~316 agents
@@ -152,7 +152,7 @@ Time: ~65–90s wall-clock
152152
| Squad Leads per Commander | 10 |
153153
| Workers per Squad Lead | 5 |
154154
| Reviewers | 10 reviewers forming 7 cross-family pairs |
155-
| Shadow scoring | 10 sealed criteria, hardening at >15% |
155+
| Shadow scoring | 10 sealed criteria (SS-50: 6, SS-100: 8), hardening at >15% (SS-50: disabled) |
156156
| Cost ceiling | $20.00 |
157157
| Timeout cascade | 90/60/40/30s |
158158
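The hierarchy listings above stop at depth 2 for SS-50 and SS-100 (Nexus, Commander, Worker) and at depth 3 for SS-250, where Squad Leads sit between Commanders and Workers. A depth guard for that layout could be sketched as follows. The names are hypothetical, and reviewers are left out because the listings say the Nexus spawns them directly.

```python
# Maximum spawn depth per scale, following the hierarchy listings above:
# Nexus is depth 0, Commanders depth 1, and so on.
MAX_DEPTH = {"ss-50": 2, "ss-100": 2, "ss-250": 3}


def can_spawn(scale: str, parent_depth: int) -> bool:
    """True if an agent at `parent_depth` may spawn a child one level deeper."""
    return parent_depth + 1 <= MAX_DEPTH[scale]


print(can_spawn("ss-50", 1))   # a Commander spawning a Worker
print(can_spawn("ss-50", 2))   # a Worker trying to spawn anything
print(can_spawn("ss-250", 2))  # a Squad Lead spawning a Worker
```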

docs/shadow-scoring.md

Lines changed: 3 additions & 3 deletions
@@ -98,7 +98,7 @@ After commanders complete and cross-review finishes:
9898
4. **Classify** using the interpretation scale
9999
5. **Produce a Gap Report** for each bundle
100100

101-
### Phase 4: HARDENING (Swarm Command Phase 6, continued)
101+
### HARDENING (Swarm Command Phase 6, continued)
102102

103103
If Shadow Score > 15% for any bundle:
104104

@@ -217,9 +217,9 @@ shadow_scoring:
217217
enabled: true
218218
spec_version: "1.0.0"
219219
conformance_level: "L2"
220-
sealed_criteria_count: 10
220+
sealed_criteria_count: 10 # max; per-scale: SS-50=6, SS-100=8, SS-250=10
221221
hardening:
222-
enabled: true
222+
enabled: true # SS-50 overrides to disabled
223223
max_cycles: 1
224224
threshold: 15
225225
categories:
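The hardening rule in this file (a fix cycle runs when any bundle's Shadow Score exceeds 15%, and never on SS-50) reduces to a simple filter. The sketch below assumes scores are expressed as percentages keyed by bundle name; the function name and the sample bundle names are hypothetical.

```python
def bundles_needing_hardening(scores: dict[str, float],
                              threshold: float = 15.0,
                              enabled: bool = True) -> list[str]:
    """Return the bundles whose Shadow Score is strictly above the threshold."""
    if not enabled:
        # SS-50 case: the score is still computed, but no fix cycle runs.
        return []
    return [name for name, score in scores.items() if score > threshold]


scores = {"security": 22.0, "performance": 15.0, "testing": 4.5}
print(bundles_needing_hardening(scores))
print(bundles_needing_hardening(scores, enabled=False))
```

A score of exactly 15 does not trigger hardening here, matching the strict `> 15%` wording.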

docs/use-cases.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ This guide turns the swarm from an impressive concept into a practical tool. Pic
2222

2323
## SS-50 — Fast Expert Panels
2424

25-
*~52 agents · 3 commanders · 45 workers · ~30 seconds · $1.50–$3.50*
25+
*~36-52 agents · 2-3 commanders · 30-45 workers · ~30 seconds · $1.50–$3.50*
2626

2727
### 1. 🔥 Stack Trace Whisperer
2828

protocols/circuit-breaker.md

Lines changed: 5 additions & 4 deletions
@@ -75,10 +75,11 @@ The HALF-OPEN probe MUST:
7575

7676
| Layer | Agents | Threshold to OPEN | Cooldown | Probe Size |
7777
|---|---|---|---|---|
78-
| **Nexus (L0)** | Monitors 5 Commanders | 3/5 commanders fail (60%) | 10s | 1 commander re-dispatch |
79-
| **Commander (L1)** | Monitors 10 Squad Leads | 5/10 squad leads fail (50%) | 5s | 1 squad lead re-dispatch |
80-
| **Squad Lead (L2)** | Monitors 5 Workers | 3/5 workers fail (50%) | 3s | 1 canary worker |
81-
| **Reviewer (L4)** | Monitors review mesh | 3/5 reviews fail (50%) | 5s | 1 review re-dispatch |
78+
| **Nexus (L0)** | Monitors 2-5 Commanders | SS-50: 2+ fail (≥50%), SS-100/250: 3/5 fail (60%) | 10s | 1 commander re-dispatch |
79+
| **Commander (L1, SS-250)** | Monitors 10 Squad Leads | 5/10 squad leads fail (50%) | 5s | 1 squad lead re-dispatch |
80+
| **Commander (L1, SS-50/100)** | Monitors 15 Workers | 8/15 workers fail (50%) | 5s | 1 canary worker |
81+
| **Squad Lead (L2, SS-250 only)** | Monitors 5 Workers | 3/5 workers fail (50%) | 3s | 1 canary worker |
82+
| **Reviewer** | Monitors review mesh | 3/5 reviews fail (50%) | 5s | 1 review re-dispatch |
8283

8384
### Failure Definitions
8485
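The scale-adjusted thresholds in the table above can be expressed as a lookup from layer and scale to the minimum failure count that opens the breaker. This is a sketch under that reading only. The key names and `should_open` are hypothetical, and cooldowns and HALF-OPEN probing are not modeled.

```python
# Minimum failures that open the breaker, copied from the threshold table above.
# A scale of None marks a row the table does not split by scale.
MIN_FAILURES_TO_OPEN = {
    ("nexus", "ss-50"): 2,
    ("nexus", "ss-100"): 3,
    ("nexus", "ss-250"): 3,
    ("commander", "ss-50"): 8,    # monitors 15 workers
    ("commander", "ss-100"): 8,   # monitors 15 workers
    ("commander", "ss-250"): 5,   # monitors 10 squad leads
    ("squad_lead", "ss-250"): 3,  # monitors 5 workers
    ("reviewer", None): 3,        # 3 of 5 reviews
}


def should_open(layer: str, scale: str, failures: int) -> bool:
    """True when `failures` reaches the threshold for this layer and scale."""
    key = (layer, scale) if (layer, scale) in MIN_FAILURES_TO_OPEN else (layer, None)
    if key not in MIN_FAILURES_TO_OPEN:
        raise KeyError(f"no breaker defined for {layer!r} at {scale!r}")
    return failures >= MIN_FAILURES_TO_OPEN[key]


print(should_open("nexus", "ss-50", 2))
print(should_open("commander", "ss-250", 4))
```

Asking for a Squad Lead breaker on SS-50 or SS-100 raises `KeyError`, which mirrors the table's "SS-250 only" restriction.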
