Open Code CLI Agents Guide

How OpenCode Works: The Agentic Loop

OpenCode (and tools like Claude Code) are agentic frameworks — they wrap a language model with tools and an execution layer. Understanding this explains why OpenCode can write files when ollama run cannot.

You → OpenCode → Model (with tool definitions in context)
                    ↓
              Model outputs a tool call (e.g., write_file)
                    ↓
              OpenCode executes it on your filesystem
                    ↓
              Result fed back to model
                    ↓
              Model continues / calls next tool

The model never directly touches your filesystem — it emits structured tool calls and OpenCode executes them. This is why:

ollama run is a raw chat interface: no tools are defined, no execution layer exists
opencode is an agentic loop: tools are defined in the system prompt, and OpenCode executes whatever the model requests

The same pattern applies to Claude Code (Anthropic's models), Cursor, and other AI coding tools. What differs is the model, the available tools, and how reliably the model produces valid tool calls. Local models like ministral-3:8b support tool calling but are less consistent than cloud models trained specifically for it — which is why tool-use confirmation testing matters (see scripts/tool-call-test.sh).

What are Agents in Open Code CLI?

Open Code CLI supports agent mode, which allows the LLM to autonomously plan and execute multi-step tasks. Agents can:

Break down complex tasks into smaller steps
Execute commands and tools iteratively
Self-correct based on feedback
Maintain context across multiple operations

Important: Only models with confirmed tool use support can act as agents. Read-only models (mistral-nemo, granite3.1-moe, qwen3.5 family) can analyze and plan but cannot create or modify files.

Agent Modes

Open Code CLI has two primary agents that you can switch between using the Tab key:

1. Build Agent (Default)

Optimized for construction tasks and code generation.

opencode
# Build agent is active by default
> Create a REST API with authentication

Switch to build agent: Press Tab key

Best for:

Creating new projects from scratch
Generating boilerplate code
File creation and modification
Rapid prototyping
Tasks requiring action over analysis

2. Plan Agent

Focuses on analysis, planning, and understanding.

opencode
# Press Tab to switch to plan agent
> How should I refactor this authentication system?

Switch to plan agent: Press Tab key

Best for:

Complex architectural decisions
Code review and analysis
Understanding large codebases
Planning multi-step implementations
Exploring codebase structure

Note: The /mode command does not exist. Use Tab to switch between build and plan agents.

Agent Capabilities by Model

Only models with confirmed tool use are listed here. For read-only models, see MODEL-SELECTION.md.

ministral-3:8b-16k (6.0 GB, 16k context) — Recommended

Agent suitability: ⭐⭐⭐⭐⭐ Excellent

Fastest warm inference (~4s) with no think-mode overhead
16k context enables multi-file planning and execution
Consistent, reliable tool calls
Best overall choice for agentic workflows

Example tasks:

> Analyze the authentication system across multiple files and refactor it
> Create a complete REST API with error handling and tests
> Review and update all TypeScript types across the project

ministral-3:8b (6.0 GB, 4k context)

Agent suitability: ⭐⭐⭐⭐☆ Good

Same speed and reliability as the 16k variant (~4s warm)
No think-mode overhead
Limited to single-file or small-scope tasks due to 4k context

Example tasks:

> Add error handling to routes/auth.py
> Create a utility function for date formatting
> Fix the validation bug in models/user.py

qwen3:8b-16k (5.2 GB, 16k context)

Agent suitability: ⭐⭐⭐⭐☆ Good

Extended context enables multi-file analysis and planning
Enters verbose thinking mode (adds latency, ~26s+)
Reliable tool use; strong reasoning

Example tasks:

> Analyze the entire authentication system and suggest improvements
> Refactor the API layer to use async/await throughout
> Review all error handling and add comprehensive logging

qwen3:8b (5.2 GB, 8k context)

Agent suitability: ⭐⭐⭐☆☆ Moderate

Handles 2-3 step tasks reliably
Verbose thinking mode adds overhead (~26s)
Suitable for standard single-module workflows

Example tasks:

> Create a Flask app with user authentication
> Refactor this module to use dependency injection
> Write tests for the UserService class

qwen3:4b (2.5 GB, 4k context)

Agent suitability: ⭐⭐☆☆☆ Limited

Can handle simple single-step tasks
Struggles with complex multi-step planning
Use for quick, straightforward operations

Example tasks:

> Create a simple Python function to calculate factorial
> Add error handling to this code snippet
> Format this JSON file

Agent Workflow Patterns

Pattern 1: Autonomous Task Execution

Let the agent break down and execute complex tasks independently.

opencode
# Build agent is default, or press Tab to ensure build mode
> Create a complete REST API for a todo application with:
> - Express.js backend
> - PostgreSQL database
> - JWT authentication
> - Full CRUD operations
> - Input validation
> - Error handling
> - Unit tests

Best models: ministral-3:8b-16k, qwen3:8b-16k

Pattern 2: Iterative Refinement

Work with the agent to progressively improve code quality.

opencode
> Create a user authentication system
# Agent creates basic implementation

> Add password strength validation
# Agent enhances the implementation

> Add rate limiting for login attempts
# Agent adds security features

> Write comprehensive tests
# Agent completes with test coverage

Best models: ministral-3:8b-16k, ministral-3:8b, qwen3:8b

Pattern 3: Analysis-Then-Action

Use plan agent first, then switch to build agent to execute.

opencode
# Press Tab to switch to plan agent
> Analyze my database schema and suggest optimizations

# Review the suggested plan

# Press Tab to switch to build agent
> Implement the suggested index optimizations from the analysis

Best models: qwen3:8b-16k or ministral-3:8b-16k (both phases)

Pattern 4: Batch Operations

Run multiple tasks in sequence within one session.

opencode
> Create models/user.py with User model
# Wait for completion, then:
> Create routes/auth.py with authentication routes
# Wait for completion, then:
> Create tests/test_auth.py with auth tests

Best models: ministral-3:8b (fast turnaround per task), qwen3:8b

Controlling Agent Behavior

Think Mode Behavior

Observation: Qwen3 models enter verbose thinking mode during code generation tasks. Ministral-3 models do not.

Understanding:

This is model behavior, not an Open Code CLI issue
Build mode is already the default mode in Open Code CLI
Tasks complete correctly despite verbosity

Best approach:

Use ministral-3:8b or ministral-3:8b-16k to avoid think-mode overhead entirely
If using Qwen3, accept the think mode — it completes correctly and provides insight into reasoning

Temperature and Sampling

Control via Ollama Modelfile:

FROM ministral-3:8b
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20

For agent tasks:

Lower temperature (0.3-0.5): More deterministic, better for code generation
Higher temperature (0.7-0.9): More creative, better for brainstorming

Context Window Management

4k context (ministral-3:8b, qwen3:4b):

Keep tasks focused on single files
Avoid multi-file analysis
Break complex tasks into smaller chunks

8k context (qwen3:8b):

Can handle 1-2 medium files simultaneously
Suitable for most single-module development tasks

16k context (ministral-3:8b-16k, qwen3:8b-16k):

Can analyze 3-5 medium files
Suitable for cross-module refactoring
Better for architecture analysis

Performance Benchmarks

Based on real-world usage on M1 16GB:

Task Type	ministral-3:8b	ministral-3:8b-16k	qwen3:4b	qwen3:8b	qwen3:8b-16k
Simple file creation	4-8s	4-10s	5-15s	15-30s	45-90s
Code review (1 file)	8-15s	8-20s	10-25s	20-45s	60-120s
Multi-file analysis	N/A	15-40s	N/A	40-90s	90-180s
Complex refactoring	N/A	20-60s	N/A	60-120s	120-240s
Test generation	8-20s	10-25s	15-30s	30-60s	60-120s

Notes:

Qwen3 times include think-mode overhead; Ministral-3 has none
Claude API (cloud) is typically 3-10x faster for equivalent tasks
Performance varies based on hardware (Apple Silicon M-series is optimal)

Best Practices

1. Choose the Right Model for the Task

Quick edits / daily driver: ministral-3:8b (~4s warm, no think-mode tax)
Multi-file agentic work: ministral-3:8b-16k
Standard development: qwen3:8b
Multi-file analysis: qwen3:8b-16k

2. Provide Clear Context

# Bad
> Fix the bug

# Good
> Fix the authentication bug in routes/auth.py where users can login without password verification

3. Break Down Complex Tasks

# Instead of one massive prompt
> Create a complete e-commerce application

# Break it down
> Create the product model with fields: id, name, price, description
# Then in next prompt
> Create the shopping cart functionality that uses the Product model

4. Use Agent Switching Strategically

# Press Tab to switch to plan agent
> How should I refactor this authentication system?
# Review the plan
# Press Tab to switch to build agent
> Implement the refactoring plan from above

5. Validate Agent Output

Always review generated code for:

Security vulnerabilities (SQL injection, XSS, etc.)
Logic errors and edge cases
Performance implications
Compliance with project standards

Troubleshooting Agent Issues

Agent Gets Stuck in Thinking Mode

Symptoms: Verbose analysis, slow response, excessive planning

Understanding:

Build agent is already the default
Thinking mode is a Qwen3 model behavior, not a mode setting
Tasks complete successfully despite verbosity

Best approach:

Switch to ministral-3:8b or ministral-3:8b-16k — no think-mode overhead
Or accept Qwen3 think-mode as part of using those models

Agent Loses Context

Symptoms: Forgets previous steps, contradicts earlier decisions

Solutions:

Use models with larger context windows (ministral-3:8b-16k, qwen3:8b-16k)
Break tasks into smaller, independent chunks
Explicitly reference earlier steps in prompts
Press Tab to switch to plan agent to establish a clear plan before execution

Agent Produces Low-Quality Code

Symptoms: Bugs, security issues, poor practices

Solutions:

Provide more specific requirements
Ask agent to add tests and validation
Switch to a larger model (qwen3:8b-16k, ministral-3:8b-16k)

Agent is Too Slow

Symptoms: Long wait times, frustration with iteration speed

Solutions:

Switch to ministral-3:8b — fastest warm inference, no think-mode overhead
Break tasks into smaller pieces
Use standard context when extended context isn't needed
Consider switching to cloud models (Claude API) for time-sensitive work

Agent Hallucinates or Creates Invalid Code

Symptoms: References non-existent libraries, invalid syntax

Solutions:

Use lower temperature (via Modelfile)
Provide explicit examples of desired output
Validate all generated code before committing

When to Use Cloud Models vs Local Agents

Use Local Models (Ollama + Open Code)

Working offline
Processing sensitive/proprietary code
Running batch operations overnight
Privacy requirements mandate local processing
Learning/experimenting without API costs
Tasks where speed is not critical

Use Cloud Models (Claude API + Open Code)

Real-time interactive development
Complex multi-file operations requiring fast iteration
Time-sensitive tasks
Working with very large codebases (200k+ context)
When you need the absolute best code quality
When speed is more important than cost

Uh oh!

FilesExpand file tree

AGENTS-USAGE.md

Latest commit

History

AGENTS-USAGE.md

File metadata and controls

Open Code CLI Agents Guide

How OpenCode Works: The Agentic Loop

What are Agents in Open Code CLI?

Agent Modes

1. Build Agent (Default)

2. Plan Agent

Agent Capabilities by Model

ministral-3:8b-16k (6.0 GB, 16k context) — Recommended

ministral-3:8b (6.0 GB, 4k context)

qwen3:8b-16k (5.2 GB, 16k context)

qwen3:8b (5.2 GB, 8k context)

qwen3:4b (2.5 GB, 4k context)

Agent Workflow Patterns

Pattern 1: Autonomous Task Execution

Pattern 2: Iterative Refinement

Pattern 3: Analysis-Then-Action

Pattern 4: Batch Operations

Controlling Agent Behavior

Think Mode Behavior

Temperature and Sampling

Context Window Management

Performance Benchmarks

Best Practices

1. Choose the Right Model for the Task

2. Provide Clear Context

3. Break Down Complex Tasks

4. Use Agent Switching Strategically

5. Validate Agent Output

Troubleshooting Agent Issues

Agent Gets Stuck in Thinking Mode

Agent Loses Context

Agent Produces Low-Quality Code

Agent is Too Slow

Agent Hallucinates or Creates Invalid Code

When to Use Cloud Models vs Local Agents

Use Local Models (Ollama + Open Code)

Use Cloud Models (Claude API + Open Code)

Resources