feat: parallelize arm execution via SDK subagents (one subagent per arm × seed)

## TL;DR
After DESIGN produces `experiment_plan.yaml`, fan out arms to subagents in parallel, each in its own git worktree. Reduce wall-clock 2–4× on multi-arm iterations and eliminate the 90-minute "single mega-session" failure mode.

## Why this matters
`mech-design-enforcement` iter-2 ran 8 conditions × 3 seeds = 24 simulations sequentially in one Sonnet session. That 2.5-hour session was the proximate cause of the connection drops the user complained about on 5/18 — and when one of those connection drops triggered `--max-cli-retries 10`, a *second* worktree spawned while the first was still alive, leading to two executors racing on the same `iter-2/results/` directory.

Subagents share the parent's CLAUDE.md / skills / auto-memory, so per-arm context is small. Each arm is independent (different seed / different policy / same patched binary). Worktree-per-arm gives clean isolation.

## What's already shipped
- `worktree.py` already manages per-iteration worktrees.
- #71 + #111 handle transient failures gracefully — but per-arm subagents fail in smaller, more recoverable units.

## Proposed approach
1. Parse `experiment_plan.yaml` into a list of `(arm, seed, command)` units.
2. For each unit, spawn a subagent via the SDK with:
   - Read access to `iter-N/inputs/` and the patched binary.
   - Write access to `iter-N/results/<arm>/<seed>/`.
   - Worktree isolation (see #13).
3. Cap concurrency by a campaign-level `max_parallel_arms:` setting (default = min(CPU, 4)).
4. Parent agent awaits all subagents, then runs the existing deterministic merge into `findings.json` and `principle_updates.json`.
5. On partial failure, retry only the failed units (not the whole iteration).

## Acceptance criteria
- [ ] `examples/campaign-best-of-field.yaml` runs end-to-end with `max_parallel_arms: 4` in significantly less wall-clock than the serial baseline.
- [ ] Failure of one arm does not abort the iteration; it surfaces in the retry log and merges as `arm_status: "failed"` in `findings.json`.
- [ ] No two subagents ever write to the same `results/` subpath.

## Out of scope
- Cross-iteration parallelism (still one iteration at a time).
- Background-agents / multi-campaign supervision (see #14).

---
*Part of #120.*

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: parallelize arm execution via SDK subagents (one subagent per arm × seed) #123

TL;DR

Why this matters

What's already shipped

Proposed approach

Acceptance criteria

Out of scope

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

feat: parallelize arm execution via SDK subagents (one subagent per arm × seed) #123

Description

TL;DR

Why this matters

What's already shipped

Proposed approach

Acceptance criteria

Out of scope

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions