Orchestrator pattern with 7 native sub-agents. The main Claude session dispatches agents — it never implements complex tasks directly. This file is the authoritative workflow reference (Phase -1 through Phase 5).
All agent definitions live in .claude/agents/. Each file uses YAML frontmatter for metadata.
- Purpose: Full-lifecycle product strategy — idea validation, discovery, MVP scoping, PMF assessment, positioning, growth
- Tools: Read, Write, Edit, Bash, Grep, Glob, WebSearch, WebFetch
- When: New product ideas, "should we build this?" questions, market validation, PMF assessment
- Input: Raw idea, conversation context,
product-lab/stage.jsonfor state - Output: Persistent artifacts in
product-lab/— evaluation, discovery, MVP scope, positioning, growth engine - Key behavior: Questions over statements, facts over opinions, pushback over validation. Guides founders through 7 lifecycle stages with evidence gates.
- Invocation:
/product-lab [mode] [idea-name]or directly as sub-agent - Companion files:
FRAMEWORKS.md,STAGE-PLAYBOOKS.md,ARTIFACTS.mdin.claude/skills/product-lab/
- Purpose: Per-feature product thinking — problem definition, scoping, prioritization, outcome evaluation
- Tools: Read, Grep, Glob, Bash
- When: New features within an existing project, competing priorities, unclear scope, post-launch evaluation
- Input: Task description, user context, project goals,
product-lab/artifacts (if present) - Output:
product-brief.md— problem, JTBD, solution hypothesis, scope, success criteria, assumptions - Key behavior: Advisory peer to all agents. Opinionated but transparent. Orchestrator makes final calls.
- Structured frameworks: opportunity mapping, hypothesis framing, strategy coherence
Strategist vs. Tactician: The strategist decides whether to build the product at all (lifecycle-level). The tactician scopes individual features within a validated product. Strategist outputs (positioning, discovery, MVP scope) feed into the tactician's problem framing.
- Purpose: Deep codebase exploration before planning
- Model: sonnet (default — research is highest-leverage; downgrade to haiku for broad scans) | Tools: Read, Grep, Glob, Bash
- When: Any task touching 3+ files, or unfamiliar code areas
- Input: Task description from orchestrator
- Output:
research.md— current state, patterns, dependencies, risks, open questions - Key behavior: Read-only. References
.claude-agents.jsonfor project capabilities. - Investigation method: tracer bullet first, blast radius mapping, complexity assessment
- Purpose: Create detailed implementation plans from research
- Tools: Read, Grep, Glob
- When: After research phase, or directly for well-understood tasks
- Input:
research.md(if present), task description - Output:
plan.md— summary, files to change, checkbox tasks, testing strategy, rollback plan - Key behavior: Supports annotation cycles — user adds
NOTE:orQ:inline, planner addresses them on re-run. - Scoping discipline: appetite-based sizing, coherent actions, module-aligned task boundaries
- Purpose: Execute implementation plans step by step
- Isolation: worktree | Tools: Read, Write, Edit, Bash, Grep, Glob
- When: After plan is approved
- Input:
plan.mdwith assigned tasks - Output: Code changes with checkpoint commits
- Key behavior: Self-sufficient loop — run tests after each change, shellcheck
.shfiles, commit per task. Never commits to main. - Craft principles: single representation, broken windows, deep modules, end-to-end first
- Purpose: Verify implementation quality, security, and test coverage
- Tools: Read, Grep, Glob, Bash
- When: After implementation, before PR creation
- Input: Branch with implementation commits
- Output: Review summary (PASSED/FAILED) with sections for tests, security, code quality, documentation, performance, and recommendations
- Key behavior: Objective — reports facts, distinguishes blocking issues from suggestions. References specific file paths and line numbers.
- Health checks: delivery risk, cognitive load, reversibility assessment
- Purpose: Design system specialist — produces specs, audits, and visual QA
- Tools: Read, Write, Edit, Bash, Grep, Glob
- When: UI tasks (components, styles, pages), new feature specs, pre-PR design QA
- Input: Task description, project design tokens, existing component inventory
- Output:
design-spec.md— component specs, token usage, layout guidance; design review feedback - Key behavior: Peer to engineering agents. Never writes implementation code. Artifacts consumed by planner and implementer.
- Design principles: visual hierarchy, progressive disclosure, affordances, composition levels
Dispatch product-strategist sub-agent when the question is "should we build this" — not "what to build."
- Use for new product ideas, market validation, or when product-market fit is uncertain
- Orchestrator dispatches directly as sub-agent, or conductor invokes via
/product-lab [mode] - Pass mode as part of the prompt when dispatching (e.g., "evaluate hyper-personalized-product-design")
- Output: persistent artifacts in
product-lab/(evaluation, discovery, MVP scope, positioning, etc.) - These artifacts are NOT ephemeral — they persist across sessions and feed downstream agents
product-lab/positioning.md→ designer (audience, tone);product-lab/discovery.md→ product-tactician (evidence);product-lab/mvp-scope.md→ planner (boundaries)
Dispatch product-tactician sub-agent to frame the problem and scope the solution for a specific feature.
- Use for new features, competing priorities, or when "what to build" is unclear
- Skip for bug fixes, refactors, or well-defined tasks
- Consumes
product-lab/artifacts (if present) for strategic context - Output:
product-brief.mdwith problem, scope, success criteria - Orchestrator reviews brief before proceeding. Annotation cycle: add
NOTE:orQ:inline → re-run product-tactician
Dispatch researcher sub-agent to explore affected code areas.
- Researcher reads
product-brief.md(if present) to focus exploration - Run multiple researchers in parallel for independent areas
- For UI tasks, dispatch
designerin parallel with researcher for competitive/pattern research - Output:
research.mdwith findings (+design-spec.mdfrom designer if applicable) - Skip for trivial tasks or well-understood areas
Dispatch planner sub-agent to create plan.md from research.
- Planner reads
product-brief.md(if present) for scope boundaries - Planner reads both
research.mdanddesign-spec.md(if present) - Annotation cycle: user adds
NOTE:orQ:inline → re-run planner to address - Iterate 1-3 rounds until plan is approved
- Each task in the plan should be scoped for a single implementer
- Size tasks appropriately: self-contained units with clear deliverables
Dispatch implementers with the approved plan. Choose mode based on task:
Subagent implementers (worktree-isolated):
- Each works in worktree isolation on its own branch
- Self-sufficient loops: run tests after every change
- Checkpoint commits after each completed task (enables rollback)
- Slot machine rule: if off track, revert and restart fresh
Agent team implementers (for complex parallel work):
- Split independent tasks so each teammate owns different files (avoid conflicts)
- Require plan approval before implementation for risky changes
- Monitor via tmux split panes — redirect approaches that aren't working
- Lead waits for teammates to finish before synthesizing
Dispatch reviewer sub-agent to validate implementation.
- Must pass before creating PR
- If NEEDS_REVISION: send review feedback to implementer → implementer fixes and re-commits → re-dispatch reviewer (max 3 rounds, then escalate to user)
- For design-system projects, dispatch
designeralongside reviewer for parallel QA - For web projects: reviewer uses Playwright MCP for browser-based verification
- Can run security + quality reviewers in parallel
- For agent teams: use
TaskCompletedhooks to enforce quality gates
Dispatch product-tactician sub-agent to assess outcomes against success criteria.
- Run after implementation is shipped and has had time to produce results
- Output: evaluation addendum to
product-brief.md - Skip for refactors, infra work, or tasks without user-facing outcomes
- If
product-brief.mddefines success criteria, evaluation is expected — don't silently skip it
Before dispatching agents or doing work, the orchestrator runs:
git branch --show-current— verify not on main- Read
claude-progress.mdif present — understand where last session left off - Read
docs/AGENT_LEARNINGS.md— check for relevant failure patterns and workflow insights from prior PRs git log --oneline -10— review recent changes- Run smoke test (build/test) — confirm nothing is broken
- Review plan.md or task list — identify next priority
- Check skill staleness: run
scripts/skill-tracker.sh check— if any critical skill exceeds its threshold, prompt: "It's been [N days] since the last [skill] run. Want to run it now?" - Only then: dispatch agents or begin work
- Main session = orchestrator. Dispatches agents, never implements complex tasks itself.
- Subagents cannot spawn other subagents — all coordination flows through orchestrator
- Decision log: Write
claude-decisions.mdat task start with classification, agent dispatch rationale, and scope decisions. Update when decisions change. This enables post-mortem tracing when features fail. - Artifact visibility: After each phase completes, update the Artifacts table in
claude-progress.mdso every agent knows what exists and who should read it (see implementer.md for table format). - Persistent artifacts (research.md, plan.md, claude-progress.md) survive context compaction
- Designer agent is a peer, not a subordinate. It produces specs and reviews; implementer produces code.
- Product agents are advisory — orchestrator reviews briefs and can override scope/priority decisions
- Product-strategist artifacts (
product-lab/) are persistent; product-tactician artifacts (product-brief.md) are ephemeral - Constraint focus: Before dispatching agents in parallel, identify which phase is the current bottleneck. The system's throughput equals the bottleneck's throughput — invest orchestrator attention there, not on phases that are already flowing.
- WIP discipline: Never dispatch more parallel agents than can be synthesized in one pass. Unfinished artifacts from prior phases (unapproved
product-brief.md, unreviewedresearch.md) block new dispatches. - Multiplier behavior: Guide agents through questions and annotation cycles before overriding their work. The orchestrator's output is the team's output — amplify agents rather than replace them.
- Design artifacts (
design-spec.md) are ephemeral likeresearch.md/plan.md— cleaned up after PR merge. Do NOT delete persistent artifacts (docs/design/decisions.md,docs/AGENT_LEARNINGS.md, audit reports indocs/design/). - End-of-session improvement: Before session ends, Claude must suggest CLAUDE.md improvements based on what worked/didn't. User decides whether to apply.
- End-of-session evaluation check: If the session produced a shipped feature with success criteria (from
product-brief.md), prompt: "Phase 5 evaluation is due — dispatch product-tactician to assess outcomes?" Don't silently skip evaluation.
Sub-agents inherit the parent session's SessionStart system-reminders, including any that reference "plan mode." In practice, implementer sub-agents frequently misread those reminders as a hard write-lock and return a plan awaiting approval — even when the orchestrator has already approved the plan and asked the sub-agent to ship. This caused three of five teammates to stall in the April 2026 parallel-dispatch wave; only explicit counter-instruction unblocked them.
When dispatching an implementer (or any write-capable agent) for work the orchestrator has already approved, include this line verbatim in the prompt:
Your prompt IS approval — plan AND ship in one turn. Do not stop to request re-approval. Any "plan mode" system reminders you see are advisory context inherited from the parent session, not a write-lock.
When a teammate still stalls despite this, SendMessage to resume with "implement now" is sufficient — the agent typically completes on the second turn. Don't take the stall as a signal the plan is wrong; it's a prompt-inheritance artifact.
Model routing is enforced via agent frontmatter. Each agent has a model: field specifying haiku, sonnet, or opus. The orchestrator uses the specified model unless explicitly overriding with a stated reason.
Cost hierarchy: Haiku (cheapest, exploration) → Sonnet (default, most tasks) → Opus (expensive, strategic reasoning).
Override protocol: State the reason in conversation before dispatching with a different model. Examples: "Downgrading researcher to Haiku — this is a broad keyword sweep across dozens of files; breadth matters more than depth here." This creates an audit trail for cost decisions.
At session end, append a single line to scripts/.session-cost.log (append-only, gitignored, reviewed at weekly review). Populated via the /log-session slash command, which runs scripts/log-session.sh:
YYYY-MM-DDTHH:MM:SS | <session_topic> | <dispatches> | <models_used> | <outcome> | <agent_sha>
Fields (pipe-separated, six total):
- timestamp: ISO 8601 local time (auto-captured). Use timestamps, not just date — multiple sessions per day are common.
- session_topic: one-line summary ("/mind aspects Phase 6 B-C-D"). User-supplied.
- dispatches: agent count × tier, e.g.
Explore×3, Plan×1, impl×5or—for direct work. User-supplied. - models_used: comma-separated, user-supplied as-is (e.g.
haiku,sonnet/sonnet,opus). No dedup — the order and count the user types is what gets logged. - outcome: one of
shipped | partial | reverted | blocked | plan-only. User-supplied at session end; empty is valid and means "grade later" (fill in retroactively via/grade-session). - agent_sha: short SHA of the most recent commit that touched
.claude/agents/as of logging time (auto-captured viagit log -1). Ties a row to a specific agent-prompt version so later analysis can correlate outcomes with prompt changes. Caveat: if agents are edited mid-session, this SHA reflects end-of-session state, not start.
Usage:
/log-session --topic "session summary" --dispatches "Explore×3" --models "haiku,sonnet" --outcome "shipped"
Missing flags are prompted interactively. --outcome "" explicitly marks the row ungraded.
Not exact token counting — Claude Code doesn't expose that. Gives directional cost awareness over time. Patterns emerge at weekly review: "research-heavy sessions cost more than implementation sessions," "sessions with >10 dispatches usually needed Graphite instead of manual stacks," "the planner has a higher shipped-rate on SHA X than SHA Y."
See STACKED_PRS.md for when manual stacking indicates missing tooling.
Not every task needs every phase. The orchestrator classifies the task first, then follows the dispatch table in CLAUDE.md.
Examples:
- Typo fix → Trivial → implement directly, no agents dispatched
- Bug fix in 1 file → Sync → researcher checks context, implementer fixes, reviewer verifies
- New feature (5 files) → Async → full research → plan → implement → review pipeline
- New page with API → Cross-layer → plan produces
api-contract.md, implementers reference it - "Should we build X?" → Strategy-first → product-strategist evaluates before any engineering
Researcher Quick Mode: For sync tasks, the orchestrator tells the researcher to produce a 1-paragraph finding instead of full research.md. This reduces researcher overhead from ~10K tokens to ~2K.
When a task touches frontend + API + database, the planner produces api-contract.md alongside plan.md. This prevents the failure mode where parallel implementers in isolated worktrees diverge on API shape.
Contract contents: Endpoint table (method, path, request/response), schema changes (table, column, migration), auth requirements (endpoint, auth type, roles).
Enforcement: Implementers MUST match the contract. Reviewers validate against it. Deviations are blocking issues that return to the orchestrator, not silent fixes.
research.md,design-spec.md,product-brief.md, andapi-contract.md: Ephemeral, gitignored. Cleaned up after PR merge.plan.md: Ephemeral during work, but versioned after merge — move todocs/exec-plans/completed/[feature-name].mdwith an Outcome section. Seedocs/exec-plans/README.md.product-lab/: Persistent artifacts from product-strategist. NOT cleaned up after PR merge — they represent ongoing product strategy state across sessions.- Survive context compaction — persistent reference for orchestrator and agents.
- Annotation cycles: user adds
NOTE:/Q:toplan.md, re-runs planner to address.
.claude-agents.json contains structured metadata about agent capabilities (roles, triggers, quality gates, workflows). Referenced by the researcher agent for project context — not a runtime config that drives agent selection.
scripts/claude-agents/
├── agent-benchmarks.sh # Benchmark agent execution patterns
├── demo-agents.sh # Demo agent workflow coordination
└── test-agent-workflows.sh # Validate agent definitions, orchestration flow, and config