What This Document Covers:
- Specialized agents for heavy work with concise summaries (context preservation)
- Opus token conservation when using Claude Code in plan mode (80-100% savings, depending on the agent)
- Core agents: code-analyzer, file-analyzer, test-runner, parallel-worker, commit-manager
- Agent patterns and philosophy (context firewalls, heavy lifting, concise returns)
- Mandatory usage protocols for the Opus model in plan mode
- Anti-patterns and best practices for agent design
Sections in This Document:
- Critical: Opus Token Conservation
- Core Philosophy
- Available Agents
- Why Agents?
- How Agents Preserve Context
- Plan Mode Protocol
- Example Usage
- Creating New Agents
- Anti-Patterns to Avoid
- Integration with PM System
Related Documentation:
- → ../../README.md - Project overview
- → ./COMMANDS.md - Commands that invoke agents
- → ./.claude/agents/ - Agent implementations
- → ./.claude/CLAUDE.md - Sub-agent usage guidelines
Context Tags: #agents #subagents #token-optimization #context-preservation #automation
Specialized agents that do heavy work and return concise summaries to preserve context.
When using the Opus model in plan mode, these agents are MANDATORY for token conservation. All agents run on Sonnet-4 underneath, which cuts cost substantially and reserves Opus tokens for high-level reasoning.
Token Savings Examples:
- 📄 Reading 10 files directly: 15,000 Opus tokens → file-analyzer: 1,500 tokens (90% savings)
- 🔍 Searching 20 files directly: 20,000 Opus tokens → code-analyzer: 1,000 tokens (95% savings)
- 🔄 Manual git commits: 3,000 Opus tokens → commit-manager: 0 tokens (100% savings)
- 🧪 Test execution + analysis: 5,000 Opus tokens → test-runner: 300 tokens (94% savings)
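The percentages above follow from comparing what the main thread would spend doing the work directly with what it spends receiving an agent's summary. A minimal Python sketch of that arithmetic, using only the figures listed above (the function name `savings_percent` is illustrative and not part of any tool):

```python
def savings_percent(direct_tokens: int, agent_tokens: int) -> float:
    """Percentage of main-thread tokens saved by delegating to an agent."""
    if direct_tokens <= 0:
        raise ValueError("direct_tokens must be positive")
    return 100.0 * (direct_tokens - agent_tokens) / direct_tokens


# Figures taken from the examples above: (direct Opus tokens, tokens via agent)
EXAMPLES = {
    "file-analyzer": (15_000, 1_500),
    "code-analyzer": (20_000, 1_000),
    "commit-manager": (3_000, 0),
    "test-runner": (5_000, 300),
}

for agent, (direct, via_agent) in EXAMPLES.items():
    print(f"{agent}: {savings_percent(direct, via_agent):.0f}% saved")
```

The same function can be applied to your own measurements to check whether a delegation is paying off.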
"Don't anthropomorphize subagents. Use them to organize your prompts and elide context. Subagents are best when they can do lots of work but then provide small amounts of information back to the main conversation thread."
– Adam Wolff, Anthropic
code-analyzer:
- Purpose: Hunt bugs across multiple files without polluting main context
- Pattern: Search many files → Analyze code → Return bug report
- Usage: When you need to trace logic flows, find bugs, or validate changes
- Opus Token Savings: 90-95% reduction for multi-file analysis
- Plan Mode: MANDATORY for any code search/analysis when using Opus
- Sonnet-4 Backend: Handles heavy searching while preserving Opus context
- Returns: Concise bug report with critical findings only
file-analyzer:
- Purpose: Read and summarize verbose files (logs, outputs, configs)
- Pattern: Read files → Extract insights → Return summary
- Usage: When you need to understand log files or analyze verbose output
- Opus Token Savings: 80-90% reduction for file reading operations
- Plan Mode: MANDATORY for reading 2+ files or files >100 lines when using Opus
- Sonnet-4 Backend: Processes verbose content while preserving Opus context
- Returns: Key findings and actionable insights (80-90% size reduction)
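The plan-mode rule above (delegate when reading 2+ files, or any file over 100 lines) can be written as a small predicate. This is only a sketch of the rule as stated; the function name and the input shape (a mapping from path to line count) are assumptions made for illustration.

```python
MAX_DIRECT_FILES = 1    # reading 2+ files should be delegated
MAX_DIRECT_LINES = 100  # files longer than this should be delegated


def should_delegate(line_counts: dict[str, int]) -> bool:
    """Return True when the read should go through file-analyzer.

    line_counts maps each file path to its number of lines.
    """
    if len(line_counts) > MAX_DIRECT_FILES:
        return True
    return any(count > MAX_DIRECT_LINES for count in line_counts.values())
```

For example, `should_delegate({"app.log": 4200})` is true because of the line count, while a single 50-line config file can be read directly.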
test-runner:
- Purpose: Execute tests without dumping output to main thread
- Pattern: Run tests → Capture to log → Analyze results → Return summary
- Usage: When you need to run tests and understand failures
- Opus Token Savings: 90-95% reduction by avoiding verbose test output in main context
- Plan Mode: MANDATORY for all test execution (regardless of model)
- Sonnet-4 Backend: Captures full output and analyzes failures efficiently
- Returns: Test results summary with failure analysis
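The run → capture → analyze → summarize pattern can be approximated in a few lines of Python. This is a hedged sketch, not the agent's actual implementation: `run_and_summarize` and its parameters are hypothetical, and the "analysis" here is reduced to keeping the last few lines of a failing run.

```python
import subprocess
from pathlib import Path


def run_and_summarize(command: list[str], log_path: Path, tail_lines: int = 5) -> str:
    """Run a command, write its full output to log_path, return a short summary."""
    result = subprocess.run(command, capture_output=True, text=True)
    output = result.stdout + result.stderr
    log_path.write_text(output)  # the verbose output stays on disk

    if result.returncode == 0:
        status = "PASSED"
    else:
        status = f"FAILED (exit code {result.returncode})"
    line_count = len(output.splitlines())
    summary = [f"{status}; full output ({line_count} lines) saved to {log_path}"]
    if result.returncode != 0:
        # Only the last few lines travel back to the main thread.
        summary.extend(output.splitlines()[-tail_lines:])
    return "\n".join(summary)
```

A call such as `run_and_summarize(["python", "-m", "pytest", "-q"], Path("test-run.log"))` would return one status line plus a short tail on failure, while the full log remains available for follow-up questions.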
parallel-worker:
- Purpose: Coordinate multiple parallel work streams for an issue
- Pattern: Read analysis → Spawn sub-agents → Consolidate results → Return summary
- Usage: When executing parallel work streams in a worktree
- Returns: Consolidated status of all parallel work
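The spawn → consolidate → summarize shape of a coordinator can be illustrated with Python's standard thread pool. This is an analogy under stated assumptions: plain callables stand in for sub-agents, each returning the number of files it modified, and `run_streams` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_streams(streams: dict[str, Callable[[], int]]) -> str:
    """Run each work stream concurrently and return a one-line status.

    Each stream is a callable that returns the number of files it modified
    and raises an exception on failure.
    """
    completed, files_modified, failures = 0, 0, []
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(work) for name, work in streams.items()}
    # Leaving the with-block waits for every stream to finish.
    for name, future in futures.items():
        error = future.exception()
        if error is None:
            completed += 1
            files_modified += future.result()
        else:
            failures.append(f"{name}: {error}")
    status = f"Completed {completed}/{len(streams)} streams, {files_modified} files modified"
    if failures:
        status += "; failed: " + ", ".join(failures)
    return status
```

The caller sees a single consolidated line; whatever each stream did internally never reaches it.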
commit-manager:
- Purpose: Create intelligent, conventional git commits for session changes
- Pattern: Analyze git status → Group related changes → Create atomic commits
- Usage: When you need to commit changes with proper conventional format
- Opus Token Savings: 100% savings - uses zero Opus tokens (pure Sonnet-4)
- Plan Mode: MANDATORY for all git operations (regardless of model)
- Sonnet-4 Backend: Handles entire git workflow without Opus involvement
- Returns: Commit plan and execution results
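The "group related changes → create atomic commits" step can be sketched as a grouping function over changed paths. The heuristic below is purely illustrative and is an assumption, not how commit-manager actually decides: it guesses a Conventional Commits type from the path alone, whereas a real agent would also inspect the diff.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def commit_type(path: str) -> str:
    """Guess a Conventional Commits type from a path (illustrative heuristic)."""
    p = PurePosixPath(path)
    top = p.parts[0] if p.parts else ""
    if top in ("tests", "test"):
        return "test"
    if p.suffix == ".md" or top == "docs":
        return "docs"
    return "chore"


def plan_commits(changed_paths: list[str]) -> dict[str, list[str]]:
    """Group changed paths so that each group becomes one atomic commit."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(changed_paths):
        groups[commit_type(path)].append(path)
    return dict(groups)
```

Each key in the returned plan corresponds to one commit, so documentation edits and test changes do not end up bundled with unrelated work.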
Agents are context firewalls that protect the main conversation from information overload:
Without Agent:
Main thread reads 10 files → Context explodes → Loses coherence
With Agent:
Agent reads 10 files → Main thread gets 1 summary → Context preserved
- Heavy Lifting - Agents do the messy work (reading files, running tests, implementing features)
- Context Isolation - Implementation details stay in the agent, not the main thread
- Concise Returns - Only essential information returns to main conversation
- Parallel Execution - Multiple agents can work simultaneously without context collision
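As a rough illustration of the firewall idea, the sketch below reads every line of every file but hands back only a header and a capped list of matches. The function name, the keyword-matching approach, and the `max_hits` cap are all assumptions chosen to keep the example small.

```python
from pathlib import Path


def summarize_files(paths: list[Path], keyword: str, max_hits: int = 5) -> str:
    """Read every file, but return only a short list of matching lines."""
    hits = []
    total_lines = 0
    for path in paths:
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            total_lines += 1
            if keyword in line:
                hits.append(f"{path.name}:{number}: {line.strip()}")
    header = (
        f"Scanned {len(paths)} files ({total_lines} lines), "
        f"{len(hits)} matches for {keyword!r}"
    )
    # The cap bounds what reaches the main thread regardless of input size.
    return "\n".join([header, *hits[:max_hits]])
```

However large the inputs grow, the return value is at most `max_hits + 1` lines, which is what keeps the main conversation coherent.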
When using the Opus model in plan mode, follow these mandatory patterns:
❌ WRONG (wastes Opus tokens):
Read file1.py, file2.py, file3.py directly
✅ CORRECT (preserves Opus tokens):
Task agent: file-analyzer
Prompt: "Analyze file1.py, file2.py, file3.py for X pattern"
Returns: Concise summary (90% token reduction)
❌ WRONG (wastes Opus tokens):
Search codebase for bug patterns directly
✅ CORRECT (preserves Opus tokens):
Task agent: code-analyzer
Prompt: "Search for memory leaks across codebase"
Returns: Focused bug report (95% token reduction)
❌ WRONG (wastes Opus tokens):
git status + git diff + manual commit
✅ CORRECT (preserves Opus tokens):
Task: /commit
Agent: commit-manager
Returns: Professional commits (100% token reduction)
❌ WRONG (wastes Opus tokens):
Run pytest directly, see full output
✅ CORRECT (preserves Opus tokens):
Task agent: test-runner
Prompt: "Run test suite X and analyze failures"
Returns: Failure analysis (94% token reduction)
# Analyzing code for bugs
Task: "Search for memory leaks in the codebase"
Agent: code-analyzer
Returns: "Found 3 potential leaks: [concise list]"
Main thread never sees: The hundreds of files examined
# Running tests
Task: "Run authentication tests"
Agent: test-runner
Returns: "2/10 tests failed: [failure summary]"
Main thread never sees: Verbose test output and logs
# Parallel implementation
Task: "Implement issue #1234 with parallel streams"
Agent: parallel-worker
Returns: "Completed 4/4 streams, 15 files modified"
Main thread never sees: Individual implementation details
New agents should follow these principles:
- Single Purpose - Each agent has one clear job
- Context Reduction - Return 10-20% of what you process
- No Roleplay - Agents aren't "experts", they're task executors
- Clear Pattern - Define input → processing → output pattern
- Error Handling - Gracefully handle failures and report clearly
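Two of these principles lend themselves to a quick check: the input → processing → output pattern and the 10-20% return target. The sketch below encodes both. `AgentSpec` and `within_return_budget` are hypothetical names, and measuring size in characters is an assumption, since the document does not say how "what you process" is measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical description of an agent: one purpose, one clear pattern."""
    name: str
    purpose: str
    pattern: tuple[str, str, str]  # (input, processing, output)


def within_return_budget(processed_chars: int, returned_chars: int,
                         max_ratio: float = 0.20) -> bool:
    """True when the summary is at most max_ratio of what the agent processed."""
    if processed_chars <= 0:
        return returned_chars == 0
    return returned_chars / processed_chars <= max_ratio


file_analyzer = AgentSpec(
    name="file-analyzer",
    purpose="Read and summarize verbose files",
    pattern=("Read files", "Extract insights", "Return summary"),
)
```

An agent that processes 10,000 characters and returns 1,500 passes the check; one that returns 3,000 is sending back too much.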
❌ Creating "specialist" agents (database-expert, api-expert): Agents don't have different knowledge - they're all the same model
❌ Returning verbose output: Defeats the purpose of context preservation
❌ Making agents communicate with each other: Use a coordinator agent instead (like parallel-worker)
❌ Using agents for simple tasks: Only use agents when context reduction is valuable
Agents integrate seamlessly with the PM command system:
- /pm:issue-analyze → Identifies work streams
- /pm:issue-start → Spawns parallel-worker agent
- parallel-worker → Spawns multiple sub-agents
- Sub-agents → Work in parallel in the worktree
- Results → Consolidated back to main thread
This creates a hierarchy that maximizes parallelism while preserving context at every level.
Last Updated: January 2025 - LaunchAgencyBot v2.0