Commit d966a73

Gregg Cochran and Copilot committed
fix: scale test failures — goldeneye in commander pool, 8 reviewer pairs, SS-100 domains
Scale verification tests (SS-50, SS-100, SS-250) found remaining issues:

SKILL.md:
- Commander pool: 9 → 10 models (add goldeneye)
- Reviewer pairs: 7 → 8 (add goldeneye↔gpt-4.1)
- SS-250 pool reference: 'pool of 9' → 'pool of 10'

README.md:
- Commander pool: 9 → 10, add goldeneye
- Reviewer pairs: 7 → 8, add goldeneye↔gpt-4.1
- Squad Leads marked (SS-250 only)

docs/scaling.md:
- SS-100 domains: '3 of 5' → 'All 5'
- SS-100 reviewers: '7 cross-family pairs' → '3-4 review pairs'
- SS-250 reviewers: '7 pairs' → '8 pairs' (3 locations)

docs/architecture.md:
- Reviewer pairs: 7 → 8

config.yml:
- SS-50 commanders comment: clarify 2-3 range

.github/skills/: synced with skills/ copy

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 440b1ec commit d966a73

6 files changed

Lines changed: 128 additions & 103 deletions

.github/skills/swarm-command/SKILL.md

Lines changed: 113 additions & 88 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 3 additions & 3 deletions
@@ -551,10 +551,10 @@ See [docs/scaling.md](docs/scaling.md) for full scaling configuration and cost e
 | Role | Models |
 |---|---|
 | **Nexus** | claude-opus-4.6 |
-| **Commanders** (pool: 9) | claude-opus-4.6, claude-opus-4.5, claude-opus-4.6-1m, claude-sonnet-4.6, claude-sonnet-4.5, claude-sonnet-4, gpt-5.4, gpt-5.2, gpt-5.1 |
-| **Squad Leads** | claude-haiku-4.5, gpt-5.4-mini |
+| **Commanders** (pool: 10) | claude-opus-4.6, claude-opus-4.5, claude-opus-4.6-1m, claude-sonnet-4.6, claude-sonnet-4.5, claude-sonnet-4, gpt-5.4, gpt-5.2, gpt-5.1, goldeneye |
+| **Squad Leads** (SS-250 only) | claude-haiku-4.5, gpt-5.4-mini |
 | **Workers** (pool: 6) | claude-haiku-4.5, gpt-5.4-mini, gpt-5-mini, gpt-4.1, gpt-5.3-codex, gpt-5.2-codex |
-| **Reviewers** (7 pairs) | claude-opus-4.6↔gpt-5.4, claude-opus-4.5↔gpt-5.2, claude-opus-4.6-1m↔gpt-5.1, claude-sonnet-4.6↔gpt-5.3-codex, claude-sonnet-4.5↔gpt-5.2-codex, claude-sonnet-4↔gpt-5.4-mini, claude-haiku-4.5↔gpt-5-mini |
+| **Reviewers** (8 pairs) | claude-opus-4.6↔gpt-5.4, claude-opus-4.5↔gpt-5.2, claude-opus-4.6-1m↔gpt-5.1, claude-sonnet-4.6↔gpt-5.3-codex, claude-sonnet-4.5↔gpt-5.2-codex, claude-sonnet-4↔gpt-5.4-mini, claude-haiku-4.5↔gpt-5-mini, goldeneye↔gpt-4.1 |

 ---
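
The counts in the role labels have to stay in step with the model lists next to them, which is the kind of drift this hunk corrects. A minimal sketch of a mechanical check in Python, assuming the README table keeps the `(pool: N)` and `(N pairs)` label formats shown above. The names `ROWS`, `COUNT_RE`, and `check_row` are hypothetical and not part of the repository:

```python
import re

# Rows copied from the updated README table in this commit.
ROWS = [
    "| **Commanders** (pool: 10) | claude-opus-4.6, claude-opus-4.5, "
    "claude-opus-4.6-1m, claude-sonnet-4.6, claude-sonnet-4.5, claude-sonnet-4, "
    "gpt-5.4, gpt-5.2, gpt-5.1, goldeneye |",
    "| **Workers** (pool: 6) | claude-haiku-4.5, gpt-5.4-mini, gpt-5-mini, "
    "gpt-4.1, gpt-5.3-codex, gpt-5.2-codex |",
    "| **Reviewers** (8 pairs) | claude-opus-4.6↔gpt-5.4, claude-opus-4.5↔gpt-5.2, "
    "claude-opus-4.6-1m↔gpt-5.1, claude-sonnet-4.6↔gpt-5.3-codex, "
    "claude-sonnet-4.5↔gpt-5.2-codex, claude-sonnet-4↔gpt-5.4-mini, "
    "claude-haiku-4.5↔gpt-5-mini, goldeneye↔gpt-4.1 |",
]

# Matches "(pool: N)" or "(N pairs)" inside the role cell.
COUNT_RE = re.compile(r"\((?:pool: (\d+)|(\d+) pairs)\)")


def check_row(row: str) -> tuple[int, int]:
    """Return (declared_count, listed_count) for one two-column table row."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    role, models = cells[0], cells[1]
    match = COUNT_RE.search(role)
    if match is None:
        raise ValueError(f"no declared count in role cell: {role!r}")
    declared = int(match.group(1) or match.group(2))
    listed = len([item for item in models.split(",") if item.strip()])
    return declared, listed


if __name__ == "__main__":
    for row in ROWS:
        declared, listed = check_row(row)
        status = "ok" if declared == listed else "MISMATCH"
        print(f"{status}: declared={declared} listed={listed}")
```

A row whose label still said `(pool: 9)` while listing ten models would be reported as a mismatch.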

config.yml

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ swarm_command:
   default_scale: "ss-100"
   scales:
     ss-50:
-      commanders: 3
+      commanders: 3 # max; runtime selects 2-3 most relevant domains
       workers_per_commander: 15 # No squad leads — commanders spawn workers directly
       max_depth: 2 # Nexus[0] → Commander[1] → Worker[2]
       reviewers: 3

docs/architecture.md

Lines changed: 1 addition & 1 deletion
@@ -246,7 +246,7 @@ For maximum insight diversity, models from different families are paired within
 | Squad Lead | claude-haiku-4.5 | gpt-5.4-mini | Keep fan-out cheap while mixing reasoning styles |
 | Scout Worker | claude-haiku-4.5 | gpt-5.4-mini, gpt-5-mini, gpt-4.1 | Increase search and interpretation diversity |
 | Executor Worker | gpt-5.3-codex | gpt-5.2-codex | Prefer code execution specialists for build/test |
-| Reviewer | 7 cross-family pairs || Final scoring should not be self-referential |
+| Reviewer | 8 cross-family pairs || Final scoring should not be self-referential |

 ---

docs/scaling.md

Lines changed: 4 additions & 4 deletions
@@ -108,10 +108,10 @@ Time: ~45s wall-clock
 | Parameter | Value |
 |---|---|
 | Commanders | 5 |
-| Domains covered | 3 of 5 (auto-selected by task type) |
+| Domains covered | All 5 |
 | Squad Leads per Commander ||
 | Workers per Commander | 15 |
-| Reviewers | 8 reviewers (7 cross-family pairs) |
+| Reviewers | 8 reviewers (3-4 cross-family review pairs) |
 | Shadow scoring | 8 sealed criteria, hardening at >15% |
 | Cost ceiling | $10.00 |
 | Timeout cascade | 75/50/35/25s |
@@ -135,7 +135,7 @@ L0: 1 Nexus (claude-opus-4.6)
 L1: 5 Commanders (commander pool — 10 models)
 L2: 50 Squad Leads (claude-haiku-4.5 | gpt-5.4-mini) — 10 per commander
 L3: 250 Workers (worker pool — 6 models) — 5 per squad lead
-L4: 10 Reviewers (7 cross-family pairs)
+L4: 10 Reviewers (8 cross-family pairs, cycled to fill 10 slots)
 Shadow Scoring (Nexus-internal, sealed criteria)
 ──────────────────────────
 Total: ~316 agents
@@ -173,7 +173,7 @@ Time: ~65–90s wall-clock
 | Commanders (L1) | 5 | commander pool | 30K × 5 | 4K × 5 | $0.75 |
 | Squad Leads (L2) | 50 | haiku / gpt-5.4-mini | 8K × 50 | 2K × 50 | $0.72 |
 | Workers (L3) | 250 | worker pool | 2K × 250 | 0.5K × 250 | $0.90 |
-| Reviewers (L4) | 10 | 7 cross-family pairs | 10K × 10 | 2K × 10 | $0.60 |
+| Reviewers (L4) | 10 | 8 cross-family pairs | 10K × 10 | 2K × 10 | $0.60 |
 | **Total** | **316** | | | | **$4.32** (optimistic) |

 ---
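
The SS-250 figures in this file can be cross-checked with a little arithmetic. The sketch below, in Python, re-derives the agent total and the fan-out from the tier listing and sums the cost rows visible in the hunk. All names (`TIERS`, `total_agents`, and so on) are hypothetical, and the cost table rows outside the hunk are not reproduced here, so the last function only reports what the visible rows leave unaccounted for:

```python
# Tier sizes for SS-250, copied from the hierarchy listing in docs/scaling.md.
TIERS = {
    "L0 Nexus": 1,
    "L1 Commanders": 5,
    "L2 Squad Leads": 50,
    "L3 Workers": 250,
    "L4 Reviewers": 10,
}

# Fan-out factors stated in the same listing.
SQUAD_LEADS_PER_COMMANDER = 10
WORKERS_PER_SQUAD_LEAD = 5

# Cost rows visible in this hunk, in cents to avoid float rounding.
VISIBLE_COST_CENTS = {
    "Commanders (L1)": 75,
    "Squad Leads (L2)": 72,
    "Workers (L3)": 90,
    "Reviewers (L4)": 60,
}
TOTAL_COST_CENTS = 432  # "$4.32 (optimistic)" from the Total row


def total_agents(tiers: dict[str, int]) -> int:
    """Sum the agent counts across all tiers."""
    return sum(tiers.values())


def fan_out_is_consistent(tiers: dict[str, int]) -> bool:
    """Check that L2 and L3 sizes follow from the per-parent fan-out."""
    leads = tiers["L1 Commanders"] * SQUAD_LEADS_PER_COMMANDER
    workers = leads * WORKERS_PER_SQUAD_LEAD
    return leads == tiers["L2 Squad Leads"] and workers == tiers["L3 Workers"]


def unlisted_cost_cents() -> int:
    """Cost in cents attributed to rows that are not visible in this hunk."""
    return TOTAL_COST_CENTS - sum(VISIBLE_COST_CENTS.values())
```

With these values the tiers sum to 316, matching the Total row, and the four visible cost rows sum to $2.97, which leaves $1.35 of the $4.32 total to rows outside the hunk.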

skills/swarm-command/SKILL.md

Lines changed: 6 additions & 6 deletions
@@ -245,10 +245,10 @@ Launch Commanders in PARALLEL using the `task` tool:

 ### Scale-Specific Deployment

-**Commander pool (9 models — draw in order, alternate Claude↔GPT for diversity):**
+**Commander pool (10 models — draw in order, alternate Claude↔GPT for diversity):**
 ```
 claude-opus-4.6, claude-opus-4.5, claude-opus-4.6-1m, claude-sonnet-4.6, claude-sonnet-4.5,
-claude-sonnet-4, gpt-5.4, gpt-5.2, gpt-5.1
+claude-sonnet-4, gpt-5.4, gpt-5.2, gpt-5.1, goldeneye
 ```

 **SS-50 (2-3 Commanders):**
@@ -267,7 +267,7 @@ Commander 4: agent_type="general-purpose", model="gpt-5.2"
 Commander 5: agent_type="general-purpose", model="claude-sonnet-4"
 ```

-**SS-250 (5 Commanders — drawn from commander pool of 9):**
+**SS-250 (5 Commanders — drawn from commander pool of 10):**
 ```
 Commander 1 (ARCH): agent_type="general-purpose", model="claude-opus-4.6"
 Commander 2 (IMPL): agent_type="general-purpose", model="gpt-5.4"
@@ -421,7 +421,7 @@ Pair bundles from different domains for cross-review:
 | 6 | CMD-ARCH | CMD-TEST | claude-sonnet-4 ↔ gpt-5.4-mini |
 | 7 | CMD-IMPL | CMD-DOCS | claude-haiku-4.5 ↔ gpt-5-mini |

-For SS-50/SS-100: Use 3-4 review pairs based on available bundles. For SS-250: Use all 7 cross-family pairs (10 reviewer slots filled by cycling through pairs).
+For SS-50/SS-100: Use 3-4 review pairs based on available bundles. For SS-250: Use all 8 cross-family pairs (10 reviewer slots filled by cycling through pairs).

 ### Reviewer Prompt

@@ -1012,10 +1012,10 @@ Apply these 7 critical optimizations:
 | Role | Model Pool | Rule |
 |---|---|---|
 | Nexus (you) | `claude-opus-4.6` | Always opus — top reasoning model |
-| Commander (pool: 9) | `claude-opus-4.6`, `claude-opus-4.5`, `claude-opus-4.6-1m`, `claude-sonnet-4.6`, `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.4`, `gpt-5.2`, `gpt-5.1` | Draw in order; alternate Claude↔GPT for diversity |
+| Commander (pool: 10) | `claude-opus-4.6`, `claude-opus-4.5`, `claude-opus-4.6-1m`, `claude-sonnet-4.6`, `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.4`, `gpt-5.2`, `gpt-5.1`, `goldeneye` | Draw in order; alternate Claude↔GPT for diversity |
 | Squad Lead (SS-250 only) | `claude-haiku-4.5`, `gpt-5.4-mini` | Alternate within commander for cross-family diversity |
 | Worker (pool: 6) | `claude-haiku-4.5`, `gpt-5.4-mini`, `gpt-5-mini`, `gpt-4.1`, `gpt-5.3-codex`, `gpt-5.2-codex` | Mix within pod; Codex variants for build/test tasks |
-| Reviewer (7 pairs) | `claude-opus-4.6`↔`gpt-5.4`, `claude-opus-4.5`↔`gpt-5.2`, `claude-opus-4.6-1m`↔`gpt-5.1`, `claude-sonnet-4.6`↔`gpt-5.3-codex`, `claude-sonnet-4.5`↔`gpt-5.2-codex`, `claude-sonnet-4`↔`gpt-5.4-mini`, `claude-haiku-4.5`↔`gpt-5-mini` | Always cross-family pairs |
+| Reviewer (8 pairs) | `claude-opus-4.6`↔`gpt-5.4`, `claude-opus-4.5`↔`gpt-5.2`, `claude-opus-4.6-1m`↔`gpt-5.1`, `claude-sonnet-4.6`↔`gpt-5.3-codex`, `claude-sonnet-4.5`↔`gpt-5.2-codex`, `claude-sonnet-4`↔`gpt-5.4-mini`, `claude-haiku-4.5`↔`gpt-5-mini`, `goldeneye`↔`gpt-4.1` | Always cross-family pairs |
 | Shadow Scoring | Nexus-internal | Nexus validates against sealed criteria (Shadow Score Spec L2) |

 ---
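
The SS-250 rule in this file says the 10 reviewer slots are filled by cycling through the 8 cross-family pairs. One way to read that rule is sketched below in Python. It assumes each slot receives one whole pair and that assignment wraps around in list order, which the source does not spell out. `REVIEWER_PAIRS` and `fill_reviewer_slots` are hypothetical names:

```python
from itertools import cycle, islice

# The 8 cross-family reviewer pairs listed in the updated Model Pool table.
REVIEWER_PAIRS = [
    ("claude-opus-4.6", "gpt-5.4"),
    ("claude-opus-4.5", "gpt-5.2"),
    ("claude-opus-4.6-1m", "gpt-5.1"),
    ("claude-sonnet-4.6", "gpt-5.3-codex"),
    ("claude-sonnet-4.5", "gpt-5.2-codex"),
    ("claude-sonnet-4", "gpt-5.4-mini"),
    ("claude-haiku-4.5", "gpt-5-mini"),
    ("goldeneye", "gpt-4.1"),
]


def fill_reviewer_slots(
    pairs: list[tuple[str, str]], slots: int
) -> list[tuple[str, str]]:
    """Assign one pair per slot, wrapping back to the first pair as needed."""
    if not pairs:
        raise ValueError("at least one reviewer pair is required")
    if slots < 0:
        raise ValueError("slots must be non-negative")
    return list(islice(cycle(pairs), slots))


if __name__ == "__main__":
    for slot, (left, right) in enumerate(fill_reviewer_slots(REVIEWER_PAIRS, 10), 1):
        print(f"slot {slot}: {left} <-> {right}")
```

Under this reading, slots 9 and 10 reuse the first two pairs. If reuse should be spread differently (for example, at random), the cycling step is the only part that would change.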
