iterate V3 - Complete Implementation Plan

Based on Research of Top 20 AI Coding Agents

PART 1: TOP 20 AI CODING AGENTS - COMPREHENSIVE LIST

Category A: CLI/TUI Terminal Agents

#	Agent	Stars	Developer	Key Feature	Price
1	Claude Code	98K ⭐	Anthropic	Best autonomous agent, MCP support	$20/mo
2	Gemini CLI	99K ⭐	Google	1M token context, free	Free
3	OpenCode	103K ⭐	OpenCode AI	Provider-agnostic, LSP support	Free (BYOK)
4	Aider	42K ⭐	Community	Git-native, unified diffs	Free (BYOK)
5	Goose	33K ⭐	Block	MCP extensions, recipes	Free
6	Codex CLI	45K ⭐	OpenAI	Rust-based, permission levels	$20/mo
7	Amp	-	Sourcegraph	Deep reasoning, Oracle agent	Free tier
8	Warp AI	-	Warp Inc	Terminal replacement, Oz agents	$20/mo
9	NovaKit CLI	-	NovaKit	Gemini-powered terminal	Free
10	AI Magicx CLI	-	AI Magicx	Multi-provider	Free

Category B: IDE-Based Agents

#	Agent	Rating	Developer	Key Feature	Price
11	Cursor	96/100	Anysphere	Best IDE integration	$20/mo
12	Windsurf	91/100	Codeium	Cascade agent	Free tier
13	GitHub Copilot	-	Microsoft	IDE integration	$10/mo
14	Cline	59K ⭐	Cline	Autonomous IDE coding	Free
15	Kiro	-	Kiro	AI-first IDE	Free

Category C: Autonomous/Cloud Agents

#	Agent	SWE-bench	Developer	Key Feature	Price
16	Devin	13.86%	Cognition	Fully autonomous, sandboxed	$20/mo
17	Claude Devin	-	Anthropic	Claude-powered Devin	$20/mo
18	DeepWiki	-	Cognition	Codebase documentation	Free
19	Devin Search	-	Cognition	Q&A about codebases	Free

Category D: Specialized Agents

#	Agent	Focus	Developer	Key Feature	Price
20	CodeRabbit	Code review	CodeRabbit	PR review automation	Free tier

PART 2: KEY TECHNICAL INSIGHTS FROM RESEARCH

Aider (42K stars) - The Git-Native Pioneer

Unified diff format - 3X better than custom SEARCH/REPLACE
Flexible patching - Disabling it caused 9X more edit failures
Model-agnostic - Works with any LLM
Auto git commits - Descriptive commit messages
Lesson: Use familiar formats (git diffs), be flexible with errors

Claude Code (98K stars) - The Autonomous Leader

MCP (Model Context Protocol) - Standardized tool integration
Repo mapping - Maps entire codebase before changes
Context compaction - Handles long sessions without crashing
Multi-step execution - Plans, executes, verifies
Subagents - Specialized agents for different tasks
Lesson: Build robust scaffolding, use MCP for extensibility

OpenCode (103K stars) - The Provider-Agnostic

Client/server architecture - Bun runtime + Go TUI
AI SDK - Provider-agnostic LLM access
Plan/Build agents - Separate planning from execution
LSP integration - Real-time diagnostics
Snapshot/restore - Git-based state management
Lesson: Separate concerns, build for flexibility

Gemini CLI (99K stars) - The Free Powerhouse

1M token context - Entire codebase in one shot
Free tier - 60 req/min, 1000 req/day
ReAct loop - Reason + Act pattern
Lesson: Leverage context window, minimize cost

Devin (Cognition) - The Autonomous Pioneer

Sandboxed execution - Isolated VM per session
Interactive planning - Human reviews before execution
Desktop use - Can interact with GUI apps
DeepWiki - Auto-generates documentation
Lesson: Sandbox execution for safety, keep a human in the loop for plan review

Goose (33K stars) - The Extensible

MCP-first - Everything via MCP
Recipes - Reusable agent patterns
Headless mode - CI/CD integration
Sandboxed execution - Secure by default
Lesson: Build extension points, think headless

Codex CLI (OpenAI) - The Lightweight

Rust-based - Fast, small binary
3-tier permissions - Read-only, Auto, Full
ChatGPT integration - Usable with a ChatGPT subscription users already pay for
Lesson: Fast execution, trust levels

Amp (Sourcegraph) - The Deep Reasoning

Deep mode - Extended reasoning for complex problems
Oracle agent - Codebase analysis
Librarian agent - Documentation Q&A
Sub-agents - Team of specialized agents
Lesson: Multi-agent architecture

PART 3: COMMON SUCCESS PATTERNS

1. Edit Format Matters (Aider Research)

Format	Success Rate	Notes
Custom JSON	20%	LLMs unfamiliar
SEARCH/REPLACE	20%	Our current issue
Unified Diff	61%	3X better - familiar to LLMs
Tool Calling (JSON)	Varies	Model-dependent

Key insight: Use formats LLMs have seen millions of times in training (git diffs)

2. Error Recovery is Critical (Aider Research)

Without flexible patching: 9X more failures
Strategies that work:
1. Normalize whitespace
2. Try fuzzy matching
3. Split into smaller hunks
4. Expand context window

3. Context Management

Repo mapping (Claude Code)
Incremental context
Session summarization
Snapshot/restore (OpenCode)

4. Safety Mechanisms

Sandboxed execution (Devin, Goose)
Permission levels (Codex)
Pre-commit verification
Post-change testing

5. Multi-Agent Architecture

Plan vs Build separation (OpenCode)
Sub-agents for specialized tasks (Amp)
Task agents (Claude Code)
Review agents (CodeRabbit)

PART 4: ITERATE V3 IMPLEMENTATION PLAN

Phase 1: Core Architecture (Week 1)

1.1 Unified Diff Format ✅ ALREADY IMPLEMENTED

Replace SEARCH/REPLACE with unified diffs
Implement 3 fallback strategies
Update prompts

1.2 Multi-Agent Architecture

Implementation:

// Plan Agent - analyzes but doesn't edit
type PlanAgent struct {
    Name string
    Tools []string // read, grep, glob only
    Prompt string
}

// Build Agent - can edit files
type BuildAgent struct {
    Name string
    Tools []string // all tools
    Prompt string
}

// Review Agent - verifies changes
type ReviewAgent struct {
    Name string
    Tools []string
    Prompt string
}
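
The three agent structs above share the same fields, so one way to wire them together is a single type plus a small pipeline runner. This is a minimal sketch under stated assumptions: the merged `Agent` type, the `Run` hook, the `stub` helper, and the `edit` tool name are all hypothetical and not part of the plan.

```go
package main

import "fmt"

// Agent is a hypothetical merged form of the PlanAgent, BuildAgent and
// ReviewAgent structs. Run is an assumed hook, not part of the plan.
type Agent struct {
	Name  string
	Tools []string
	Run   func(input string) (string, error)
}

// runPipeline feeds each agent's output to the next agent and stops at
// the first error, naming the agent that failed.
func runPipeline(task string, agents ...Agent) (string, error) {
	out := task
	for _, a := range agents {
		next, err := a.Run(out)
		if err != nil {
			return "", fmt.Errorf("%s agent: %w", a.Name, err)
		}
		out = next
	}
	return out, nil
}

// stub returns a Run function that only tags its input (demo only).
func stub(tag string) func(string) (string, error) {
	return func(in string) (string, error) { return in + " -> " + tag, nil }
}

func main() {
	plan := Agent{Name: "plan", Tools: []string{"read", "grep", "glob"}, Run: stub("planned")}
	build := Agent{Name: "build", Tools: []string{"read", "grep", "glob", "edit"}, Run: stub("built")}
	review := Agent{Name: "review", Tools: []string{"read", "grep"}, Run: stub("reviewed")}
	fmt.Println(runPipeline("fix bug", plan, build, review))
}
```

Keeping the tool list on the agent makes it possible to enforce the plan agent's read-only restriction at dispatch time rather than in the prompt alone.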

1.3 Flexible Error Recovery

Implementation:

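// ApplyDiffWithRetry applies diff to content, falling back to
// progressively more lenient strategies. Stub: requires the "errors" import.
func ApplyDiffWithRetry(content, diff string) (string, error) {
    // Strategy 1: Direct patch
    // Strategy 2: Normalize whitespace
    // Strategy 3: Fuzzy matching
    // Strategy 4: Split into smaller hunks
    // Strategy 5: Expand context window
    return "", errors.New("ApplyDiffWithRetry: not implemented")
}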
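
The fallback order can be sketched with the first two strategies only. This is an illustration, not the planned implementation: it works on an old/new text pair rather than a parsed unified diff, and the names `replaceFunc`, `direct`, `normalized`, `squash`, and `applyWithFallback` are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// replaceFunc tries to replace the block old with repl inside content.
type replaceFunc func(content, old, repl string) (string, bool)

// direct requires a byte-for-byte match of old.
func direct(content, old, repl string) (string, bool) {
	if !strings.Contains(content, old) {
		return "", false
	}
	return strings.Replace(content, old, repl, 1), true
}

// squash collapses whitespace runs so indentation drift is ignored.
func squash(s string) string { return strings.Join(strings.Fields(s), " ") }

// normalized matches line by line after whitespace normalization.
func normalized(content, old, repl string) (string, bool) {
	lines := strings.Split(content, "\n")
	want := strings.Split(old, "\n")
	for i := 0; i+len(want) <= len(lines); i++ {
		match := true
		for j := range want {
			if squash(lines[i+j]) != squash(want[j]) {
				match = false
				break
			}
		}
		if match {
			out := append([]string{}, lines[:i]...)
			out = append(out, repl)
			out = append(out, lines[i+len(want):]...)
			return strings.Join(out, "\n"), true
		}
	}
	return "", false
}

// applyWithFallback tries the strict strategy first, then the lenient one.
func applyWithFallback(content, old, repl string) (string, error) {
	for _, try := range []replaceFunc{direct, normalized} {
		if out, ok := try(content, old, repl); ok {
			return out, nil
		}
	}
	return "", errors.New("no strategy could locate the hunk")
}

func main() {
	src := "func add(a, b int) int {\n\treturn a + b\n}"
	// The model emitted spaces where the file has a tab, so direct fails.
	out, err := applyWithFallback(src, "    return a + b", "\treturn a - b")
	fmt.Println(out, err)
}
```

Ordering strategies from strict to lenient means a lenient match is only used when nothing exact is available, which limits the risk of patching the wrong location.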

Phase 2: Context & Safety (Week 2)

2.1 Repo Mapping

Implementation:

type RepoMap struct {
    Files map[string]FileInfo
    Functions map[string][]Function
    Imports map[string][]string
}

func BuildRepoMap(root string) *RepoMap {
    // Use AST parsing
    // Extract functions, classes, imports
    // Build dependency graph
}
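
For Go sources, the AST step can be done with the standard library's `go/parser`. The sketch below summarizes one file. The `fileSummary` type and `summarize` function are hypothetical, reduced stand-ins for `FileInfo` and `BuildRepoMap`, and it records method names without their receivers.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// fileSummary is a hypothetical, reduced stand-in for one RepoMap entry.
type fileSummary struct {
	Functions []string
	Imports   []string
}

// summarize parses Go source and collects top-level function names and
// import paths. It does not resolve types or build a dependency graph.
func summarize(filename, src string) (fileSummary, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.SkipObjectResolution)
	if err != nil {
		return fileSummary{}, err
	}
	var s fileSummary
	for _, imp := range f.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return fileSummary{}, err
		}
		s.Imports = append(s.Imports, path)
	}
	for _, decl := range f.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			s.Functions = append(s.Functions, fn.Name.Name)
		}
	}
	return s, nil
}

func main() {
	src := "package demo\n\nimport \"strings\"\n\nfunc Upper(s string) string { return strings.ToUpper(s) }\n"
	s, err := summarize("demo.go", src)
	fmt.Println(s.Functions, s.Imports, err)
}
```

A full repo map would walk the tree with `filepath.WalkDir` and call a function like this per file. Non-Go files would need a different parser.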

2.2 Sandboxed Execution

Implementation:

type SandboxConfig struct {
    AllowedCmds []string
    BlockedPatterns []string
    Timeout time.Duration
    MemoryLimit int64
}

func (e *Engine) ExecuteInSandbox(cmd string) error {
    // Whitelist allowed commands
    // Block destructive patterns
    // Set timeout
    // Monitor resources
}
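
The allowlist, blocked-pattern, and timeout checks can be sketched as below. Note the limits of this approach: it filters commands but does not isolate them, and it omits `MemoryLimit`, which would need OS-level support such as cgroups or a container. The `validate` and `run` method names are assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// SandboxConfig mirrors the struct above, without MemoryLimit.
type SandboxConfig struct {
	AllowedCmds     []string
	BlockedPatterns []string
	Timeout         time.Duration
}

// validate rejects commands that are off the allowlist or that contain
// a blocked substring anywhere in their arguments.
func (c SandboxConfig) validate(argv []string) error {
	if len(argv) == 0 {
		return errors.New("empty command")
	}
	allowed := false
	for _, a := range c.AllowedCmds {
		if argv[0] == a {
			allowed = true
			break
		}
	}
	if !allowed {
		return fmt.Errorf("command %q is not on the allowlist", argv[0])
	}
	joined := strings.Join(argv, " ")
	for _, p := range c.BlockedPatterns {
		if strings.Contains(joined, p) {
			return fmt.Errorf("command matches blocked pattern %q", p)
		}
	}
	return nil
}

// run validates argv, then executes it without a shell; the process is
// killed when the timeout expires.
func (c SandboxConfig) run(ctx context.Context, argv []string) ([]byte, error) {
	if err := c.validate(argv); err != nil {
		return nil, err
	}
	ctx, cancel := context.WithTimeout(ctx, c.Timeout)
	defer cancel()
	return exec.CommandContext(ctx, argv[0], argv[1:]...).CombinedOutput()
}

func main() {
	cfg := SandboxConfig{
		AllowedCmds:     []string{"go", "git"},
		BlockedPatterns: []string{"--force"},
		Timeout:         30 * time.Second,
	}
	fmt.Println(cfg.validate([]string{"go", "test", "./..."}))
	fmt.Println(cfg.validate([]string{"rm", "-rf", "/"}))
}
```

Passing the command as an argument vector instead of a shell string avoids shell injection through quoting tricks, which substring filters alone cannot prevent.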

2.3 Snapshot/Restore

Implementation:

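func (e *Engine) CreateSnapshot() (string, error) {
    // Git add + write-tree; the tree hash identifies the snapshot
    var hash string // TODO: set from the output of git write-tree
    return hash, nil
}

func (e *Engine) RestoreSnapshot(hash string) error {
    // Git read-tree + checkout-index
    return nil // TODO: return the first git error instead
}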
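
A minimal sketch of the git plumbing sequence, with the git invocation injected so the logic can be exercised without a repository. The `gitFunc` type and the `realGit`, `createSnapshot`, and `restoreSnapshot` names are assumptions. Two caveats: `git add -A` changes the user's index, and `checkout-index` does not delete files created after the snapshot.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// gitFunc runs one git subcommand and returns its trimmed stdout.
type gitFunc func(args ...string) (string, error)

// realGit returns a gitFunc that shells out to git inside dir.
func realGit(dir string) gitFunc {
	return func(args ...string) (string, error) {
		cmd := exec.Command("git", args...)
		cmd.Dir = dir
		out, err := cmd.Output()
		return strings.TrimSpace(string(out)), err
	}
}

// createSnapshot stages the working tree and writes it as a tree object;
// the returned tree hash identifies the snapshot.
func createSnapshot(git gitFunc) (string, error) {
	if _, err := git("add", "-A"); err != nil {
		return "", err
	}
	return git("write-tree")
}

// restoreSnapshot loads the tree into the index, then overwrites
// working-tree files from the index.
func restoreSnapshot(git gitFunc, hash string) error {
	if _, err := git("read-tree", hash); err != nil {
		return err
	}
	_, err := git("checkout-index", "-a", "-f")
	return err
}

func main() {
	// A logging fake keeps the demo runnable outside a git repository.
	fake := func(args ...string) (string, error) {
		fmt.Println("git", strings.Join(args, " "))
		return "0000000", nil
	}
	hash, _ := createSnapshot(fake)
	_ = restoreSnapshot(fake, hash)
	_ = realGit // use realGit(".") in a real repository
}
```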

Phase 3: Provider & Tools (Week 2-3)

3.1 Multi-Provider Support

Implementation:

type ProviderPool struct {
    providers map[string]Provider
    current Provider
    rateLimitCount int
}

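// CallWithFallback returns the first successful response. Response is an
// assumed type name for whatever provider.Call returns.
func (p *ProviderPool) CallWithFallback(req Request) (Response, error) {
    // Map iteration order is random in Go; keep an ordered slice of
    // providers if fallback priority matters.
    for _, provider := range p.providers {
        resp, err := provider.Call(req)
        if err == nil {
            return resp, nil
        }
        if isRateLimit(err) {
            continue // Try next
        }
        return Response{}, err // Real error
    }
    return Response{}, ErrAllProvidersFailed
}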
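
A self-contained sketch of the same fallback loop, using an ordered slice so provider priority is deterministic. The `Provider` interface shape, `ErrRateLimited`, and `fakeProvider` are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrRateLimited        = errors.New("rate limited")
	ErrAllProvidersFailed = errors.New("all providers failed")
)

// Provider is an assumed minimal interface for an LLM backend.
type Provider interface {
	Name() string
	Call(prompt string) (string, error)
}

// fakeProvider either fails with err or echoes the prompt (demo only).
type fakeProvider struct {
	name string
	err  error
}

func (f fakeProvider) Name() string { return f.name }

func (f fakeProvider) Call(prompt string) (string, error) {
	if f.err != nil {
		return "", f.err
	}
	return f.name + ": " + prompt, nil
}

// callWithFallback tries providers in slice order. Rate limits move on to
// the next provider; any other error is returned immediately.
func callWithFallback(providers []Provider, prompt string) (string, error) {
	for _, p := range providers {
		resp, err := p.Call(prompt)
		if err == nil {
			return resp, nil
		}
		if errors.Is(err, ErrRateLimited) {
			continue
		}
		return "", fmt.Errorf("%s: %w", p.Name(), err)
	}
	return "", ErrAllProvidersFailed
}

func main() {
	providers := []Provider{
		fakeProvider{name: "primary", err: ErrRateLimited},
		fakeProvider{name: "backup"},
	}
	fmt.Println(callWithFallback(providers, "hello"))
}
```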

3.2 MCP Integration

Implementation:

type MCPClient struct {
    serverPath string
    transport string // stdio, http
}

func (m *MCPClient) ListTools() ([]Tool, error)
func (m *MCPClient) CallTool(name string, args map[string]interface{}) (string, error)
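
MCP messages are JSON-RPC 2.0, and tool discovery and invocation use the `tools/list` and `tools/call` methods. The sketch below only builds the request payloads. The struct and function names are assumptions, `read_file` is a hypothetical tool, and a real client would first perform the `initialize` handshake and handle responses over the chosen transport.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

// callToolParams carries the tool name and its arguments for tools/call.
type callToolParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments,omitempty"`
}

// listToolsRequest encodes a tools/list request.
func listToolsRequest(id int) ([]byte, error) {
	return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: "tools/list"})
}

// callToolRequest encodes a tools/call request for the named tool.
func callToolRequest(id int, name string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callToolParams{Name: name, Arguments: args},
	})
}

func main() {
	list, _ := listToolsRequest(1)
	fmt.Println(string(list))
	// "read_file" is a hypothetical tool name used for illustration.
	call, _ := callToolRequest(2, "read_file", map[string]any{"path": "main.go"})
	fmt.Println(string(call))
}
```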

3.3 Tool Permission Levels

Implementation:

// Permission is the trust level granted to a session.
type Permission string

const (
    PermissionReadOnly Permission = "read" // glob, grep, read
    PermissionAuto     Permission = "auto" // can edit, needs approval
    PermissionFull     Permission = "full" // can do anything
)
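
One way to turn the three levels into a gate is a function that maps a level and a tool name to a decision. The `Decision` type, the `decide` function, and the `readOnlyTools` set are assumptions that follow the comments on the constants.

```go
package main

import "fmt"

type Permission string

const (
	PermissionReadOnly Permission = "read"
	PermissionAuto     Permission = "auto"
	PermissionFull     Permission = "full"
)

// Decision is the outcome of a permission check (hypothetical type).
type Decision int

const (
	Deny Decision = iota
	AskUser
	Allow
)

// readOnlyTools never modify the workspace, so every level may use them.
var readOnlyTools = map[string]bool{"glob": true, "grep": true, "read": true}

// decide gates a tool call: read-only tools always pass, the auto level
// requires approval for everything else, and unknown levels deny.
func decide(level Permission, tool string) Decision {
	if readOnlyTools[tool] {
		return Allow
	}
	switch level {
	case PermissionFull:
		return Allow
	case PermissionAuto:
		return AskUser
	default:
		return Deny
	}
}

func main() {
	fmt.Println(decide(PermissionReadOnly, "edit") == Deny)
	fmt.Println(decide(PermissionAuto, "edit") == AskUser)
	fmt.Println(decide(PermissionFull, "edit") == Allow)
}
```

Defaulting to `Deny` for unrecognized levels keeps a typo in configuration from silently granting write access.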

Phase 4: Verification & Testing (Week 3)

4.1 Pre-Change Verification

Implementation:

func PreChangeCheck(changes []string) error {
    // 1. Check protected files
    // 2. Check for destructive commands
    // 3. Check test coverage impact
}

4.2 Post-Change Verification

Implementation:

func PostChangeCheck() error {
    // 1. Run go build
    // 2. Run go vet
    // 3. Run go test
    // 4. Check for regressions
}
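
The build, vet, and test steps can be run in order with early exit. In this sketch the command runner is injected so the gate logic can be tested without invoking the toolchain. The `runner` type, `execRunner`, `postChangeCheck`, and `errFake` are hypothetical names, and regression detection (step 4) is not covered.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runner executes one command and returns its combined output.
type runner func(name string, args ...string) ([]byte, error)

// execRunner is the production runner; it uses the current directory.
func execRunner(name string, args ...string) ([]byte, error) {
	return exec.Command(name, args...).CombinedOutput()
}

// errFake is used by the demo runner in main.
var errFake = errors.New("simulated failure")

// postChangeCheck runs build, vet and test in order and stops at the
// first failing step, including that step's output in the error.
func postChangeCheck(run runner) error {
	steps := [][]string{
		{"go", "build", "./..."},
		{"go", "vet", "./..."},
		{"go", "test", "./..."},
	}
	for _, s := range steps {
		if out, err := run(s[0], s[1:]...); err != nil {
			return fmt.Errorf("%v failed: %w\n%s", s, err, out)
		}
	}
	return nil
}

func main() {
	// Demo runner: pretend that go vet reports a problem.
	demo := func(name string, args ...string) ([]byte, error) {
		if args[0] == "vet" {
			return []byte("suspicious construct"), errFake
		}
		return nil, nil
	}
	fmt.Println(postChangeCheck(demo))
	_ = execRunner // pass execRunner to run the real toolchain
}
```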

4.3 TDD Enforcement

Implementation:

func EnforceTDD(task Task) error {
    // 1. Require test file in diff
    // 2. Run test - should FAIL initially
    // 3. Make code change
    // 4. Run test - should PASS
}
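
Step 1 can be checked directly against the unified diff by reading its `+++ b/<path>` file headers. This is a sketch under simplifying assumptions: it trusts git-style headers, and an added source line that itself begins with `++ b/` would be misread as a header. The names `changedFiles`, `requireTestInDiff`, and `ErrNoTestInDiff` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var ErrNoTestInDiff = errors.New("diff does not touch any _test.go file")

// changedFiles extracts target paths from "+++ b/<path>" header lines.
func changedFiles(diff string) []string {
	var files []string
	for _, line := range strings.Split(diff, "\n") {
		if strings.HasPrefix(line, "+++ b/") {
			files = append(files, strings.TrimSpace(strings.TrimPrefix(line, "+++ b/")))
		}
	}
	return files
}

// requireTestInDiff succeeds only if at least one changed file is a Go test.
func requireTestInDiff(diff string) error {
	for _, f := range changedFiles(diff) {
		if strings.HasSuffix(f, "_test.go") {
			return nil
		}
	}
	return ErrNoTestInDiff
}

func main() {
	diff := "--- a/add.go\n+++ b/add.go\n@@ -1 +1 @@\n-old\n+new\n"
	fmt.Println(changedFiles(diff), requireTestInDiff(diff))
}
```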

Phase 5: Learning & Adaptation (Week 3-4)

5.1 Failure Pattern Memory

Implementation:

type FailurePattern struct {
    ErrorType string
    Model string
    Frequency int
    FixSuggestion string
}

func (e *Engine) LearnFromFailure(err error) {
    pattern := categorizeError(err)
    saveToJSONL("memory/failures.jsonl", pattern)
}
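
The `saveToJSONL` step amounts to appending one JSON object per line. A minimal sketch follows. The JSON field names and the `appendJSONL` and `countLines` helpers are assumptions, and it does not merge repeated patterns or update `Frequency`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// FailurePattern mirrors the struct above; the JSON tags are assumed.
type FailurePattern struct {
	ErrorType     string `json:"error_type"`
	Model         string `json:"model"`
	Frequency     int    `json:"frequency"`
	FixSuggestion string `json:"fix_suggestion"`
}

// appendJSONL appends p as a single line, creating the parent
// directory and the file if they do not exist yet.
func appendJSONL(path string, p FailurePattern) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return err
	}
	line, err := json.Marshal(p)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	if _, err := f.Write(append(line, '\n')); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// countLines reports how many records the file currently holds.
func countLines(path string) (int, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strings.Count(string(data), "\n"), nil
}

func main() {
	path := filepath.Join(os.TempDir(), "iterate-demo", "failures.jsonl")
	err := appendJSONL(path, FailurePattern{ErrorType: "patch_failed", Model: "example-model", Frequency: 1})
	n, _ := countLines(path)
	fmt.Println(n, err)
}
```

Append-only JSONL keeps writes cheap and crash-tolerant. Aggregating frequencies can then be done when the file is read rather than on every failure.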

5.2 Model-Specific Prompts

Implementation:

func GetModelSpecificPrompt(model string) string {
    switch {
    case strings.Contains(model, "claude"):
        return claudePrompt // Use JSON tool calls
    case strings.Contains(model, "gpt"):
        return gptPrompt // Use unified diffs
    case strings.Contains(model, "gemini"):
        return geminiPrompt // Use step-by-step
    default:
        return defaultPrompt // hypothetical generic fallback for unknown models
    }
}

PART 5: DETAILED FILE CHANGES

Files to Create:

File	Purpose
`internal/evolution/agents.go`	Multi-agent architecture
`internal/evolution/repo_map.go`	Repository mapping
`internal/evolution/sandbox.go`	Sandboxed execution
`internal/evolution/snapshot.go`	Git-based snapshots
`internal/evolution/provider_pool.go`	Multi-provider fallback
`internal/evolution/mcp.go`	MCP client support
`internal/evolution/permissions.go`	Tool permission levels
`internal/evolution/tdd.go`	TDD enforcement

Files to Modify:

File	Changes
`internal/evolution/prompts_aider.go`	Already updated with unified diffs
`internal/evolution/phases.go`	Already updated for unified diffs
`internal/evolution/engine.go`	Add agent routing
`scripts/evolution/evolve.sh`	Already has API rotation

PART 6: SUCCESS METRICS

Current vs Target

Metric	Current	Target
Code changes/evolution	~3	10+
Test inclusion	50%	100%
Build pass rate	70%	95%
Unified diff success	0%	70%+
Rate limit recovery	Manual	Auto
Context relevance	40%	80%

Key KPIs to Track

Task Success Rate - % of tasks completed successfully
Test Coverage - Coverage maintained/increased
Error Recovery - % of errors recovered with retries
Code Quality - Lint/vet pass rate
API Cost - Cost per successful task
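
Two of these KPIs reduce to simple ratios over per-task records. The sketch below assumes a hypothetical `TaskResult` record and counts the spend of failed tasks toward the cost per successful task, since that spend is still paid.

```go
package main

import "fmt"

// TaskResult is a hypothetical per-task record.
type TaskResult struct {
	Succeeded bool
	CostUSD   float64
}

// successRate is successful tasks divided by all tasks (0 for no tasks).
func successRate(rs []TaskResult) float64 {
	if len(rs) == 0 {
		return 0
	}
	ok := 0
	for _, r := range rs {
		if r.Succeeded {
			ok++
		}
	}
	return float64(ok) / float64(len(rs))
}

// costPerSuccess is total spend divided by successful tasks. The second
// result is false when there were no successes.
func costPerSuccess(rs []TaskResult) (float64, bool) {
	total, ok := 0.0, 0
	for _, r := range rs {
		total += r.CostUSD
		if r.Succeeded {
			ok++
		}
	}
	if ok == 0 {
		return 0, false
	}
	return total / float64(ok), true
}

func main() {
	rs := []TaskResult{{true, 0.5}, {true, 0.25}, {false, 0.25}, {true, 0.5}}
	cost, ok := costPerSuccess(rs)
	fmt.Println(successRate(rs), cost, ok)
}
```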

PART 7: IMPLEMENTATION PRIORITY

P0 - Must Fix This Week:

✅ Unified diff format (done)
⏳ Flexible diff application (in progress)
⏳ Multi-agent architecture (plan/build separation)
⏳ Better error messages

P1 - Should Fix Next Week:

Repo mapping for better context
Sandboxed command execution
Multi-provider fallback
Pre/post verification gates

P2 - Week 3:

MCP integration
Failure pattern learning
Model-specific prompts
Snapshot/restore

KEY TAKEAWAYS FROM TOP 20 AGENTS

What Works:

Familiar formats - Unified diffs > custom formats (3X better)
Flexible error handling - 9X fewer edit failures with flexible patching
Plan/Build separation - OpenCode, Claude Code
Sandboxing - Devin, Goose for safety
MCP - Standardized tool integration
Git-native - Aider's commit workflow
Context management - Claude Code compaction

What Doesn't Work:

Custom edit formats - LLMs don't follow
Single agent - Need specialized agents
No error recovery - Fail fast = fail often
Unbounded execution - Need sandbox + timeout
Single provider - Rate limits happen

iterate's Competitive Advantage:

Autonomous - Self-evolving (unique)
Go-native - Built in Go, for Go projects
Evolution - Can modify itself (unique)
GitHub-native - Built-in CI/CD integration

Research compiled from: Aider docs, Claude Code docs, OpenCode source, Gemini CLI docs, Devin technical deep-dives, Goose architecture, Codex CLI, Amp, and 15+ comparison articles.
