Model Configuration

Complete reference for configuring AI models in JOC.

Overview
Default Models
Model Properties
Provider Configuration
Custom Models
Best Practices
Troubleshooting

Overview

JOC uses Ollama cloud models by default, providing a balance of capability and cost-effectiveness. Models are configured in opencode.jsonc.

Default Models

Model	Context	Output	Best For	Notes
glm-5.1:cloud	202K	131K	General purpose, most tasks	Default for most agents
kimi-k2.5:cloud	262K	262K	Extended context, long documents	Same input/output context
minimax-m2.7:cloud	205K	128K	High performance tasks	Balanced performance
qwen3.5:cloud	262K	32K	Long document processing	Limited output

Model Selection by Task

Task Type	Recommended Model	Reason
Implementation	glm-5.1:cloud	Balanced, cost-effective
Architecture	opus	Deep reasoning (override)
Search/Explore	haiku	Fast, efficient
Documentation	haiku	Simple generation
Security Review	opus	Critical analysis (override)
Long documents	kimi-k2.5:cloud	Extended context
Complex analysis	minimax-m2.7:cloud	High performance
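
The table above can be read as a lookup with a fallback to the default model. The following is a minimal sketch of that idea; the task keys and the `modelForTask` helper are illustrative names and are not part of JOC.

```typescript
// Hypothetical lookup: task type -> recommended model, taken from the table above.
const RECOMMENDED_MODEL: Record<string, string> = {
  "implementation": "glm-5.1:cloud",
  "architecture": "opus",
  "search": "haiku",
  "documentation": "haiku",
  "security-review": "opus",
  "long-documents": "kimi-k2.5:cloud",
  "complex-analysis": "minimax-m2.7:cloud",
}

// The general-purpose default named in the Default Models table.
const DEFAULT_MODEL = "glm-5.1:cloud"

// Returns the recommended model for a task, or the default for unknown tasks.
function modelForTask(task: string): string {
  const known = Object.prototype.hasOwnProperty.call(RECOMMENDED_MODEL, task)
  return (known ? RECOMMENDED_MODEL[task] : undefined) ?? DEFAULT_MODEL
}
```

For example, `modelForTask("security-review")` returns `"opus"`, while an unlisted task such as `"refactoring"` falls back to `"glm-5.1:cloud"`.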

Model Properties

Context Window

The maximum input context size in tokens.

Context = System Prompt + User Message + Conversation History

Example:
System: 1,000 tokens
User: 500 tokens
History: 10,000 tokens
Total: 11,500 tokens (must be < context_limit)
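
The same arithmetic can be checked in code before sending a request. This is a sketch only: `ContextUsage`, `totalContext`, and `fitsContext` are hypothetical names, and the token counts are assumed to come from whatever tokenizer you use.

```typescript
// Token counts for one request, mirroring the formula above.
interface ContextUsage {
  system: number
  user: number
  history: number
}

// Sum of all input components, in tokens.
function totalContext(usage: ContextUsage): number {
  return usage.system + usage.user + usage.history
}

// True when the request fits strictly inside the model's context limit.
function fitsContext(usage: ContextUsage, contextLimit: number): boolean {
  return totalContext(usage) < contextLimit
}

const exampleUsage: ContextUsage = { system: 1000, user: 500, history: 10000 }
console.log(totalContext(exampleUsage))         // 11500
console.log(fitsContext(exampleUsage, 202752))  // true
```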

Output Limit

The maximum output size in tokens.

Output = Generated Response

For code generation:
- Short functions: ~500 tokens
- Full files: ~2,000 tokens
- Multiple files: bounded by the model's output limit

When a response would exceed the output limit, it is cut off and must be continued in a follow-up request.
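
As a rough planning aid, the number of requests a long output needs can be estimated from the output limit. The sketch below assumes each continuation can use the full output limit and ignores the context that earlier parts consume; `requestsNeeded` is an illustrative name.

```typescript
// Estimates how many requests are needed to produce `expectedTokens` of output
// when each response is capped at `outputLimit` tokens.
function requestsNeeded(expectedTokens: number, outputLimit: number): number {
  if (outputLimit <= 0) throw new RangeError("outputLimit must be positive")
  return Math.max(1, Math.ceil(expectedTokens / outputLimit))
}

console.log(requestsNeeded(2000, 32000))    // 1
console.log(requestsNeeded(100000, 32000))  // 4
```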

Launch Behavior

The _launch property controls automatic model startup:

{
  "glm-5.1:cloud": {
    "_launch": true  // Auto-start on first use
  }
}

Provider Configuration

Basic Configuration

{
  "provider": {
    "opencode": {
      "options": {}
    },
    "ollama": {
      "models": {
        "glm-5.1:cloud": {
          "_launch": true,
          "limit": {
            "context": 202752,
            "output": 131072
          },
          "name": "glm-5.1:cloud"
        }
      }
    }
  }
}
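
A quick sanity check on the limits can catch typos before the configuration is used. The sketch below assumes the entry shape shown in the example above; `ModelEntry` and `validateModel` are hypothetical and do not describe how JOC itself validates configuration.

```typescript
// Assumed shape of one model entry, inferred from the example above.
interface ModelEntry {
  _launch?: boolean
  limit?: { context: number; output: number }
  name?: string
}

// Returns a list of problems found in a model entry; an empty list means
// the limits (if present) are positive whole numbers.
function validateModel(id: string, entry: ModelEntry): string[] {
  const problems: string[] = []
  const limit = entry.limit
  if (limit === undefined) return problems // limits are omitted in some examples
  if (!Number.isInteger(limit.context) || limit.context <= 0) {
    problems.push(`${id}: limit.context must be a positive integer`)
  }
  if (!Number.isInteger(limit.output) || limit.output <= 0) {
    problems.push(`${id}: limit.output must be a positive integer`)
  }
  return problems
}
```

Calling `validateModel("glm-5.1:cloud", { limit: { context: 202752, output: 131072 } })` returns an empty list.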

Multiple Providers

{
  "provider": {
    "opencode": {
      "options": {}
    },
    "ollama": {
      "models": {
        "glm-5.1:cloud": { "_launch": true },
        "kimi-k2.5:cloud": { "_launch": true }
      }
    },
    "openrouter": {
      "models": {
        "glm-5:cloud": {
          "limit": { "context": 200000, "output": 131072 }
        }
      }
    }
  }
}

Environment Variables

{
  "provider": {
    "ollama": {
      "models": {
        "glm-5.1:cloud": {
          "_launch": true,
          "env": {
            "OLLAMA_HOST": "${OLLAMA_HOST}",
            "OLLAMA_API_KEY": "${OLLAMA_API_KEY}"
          }
        }
      }
    }
  }
}
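
The `${NAME}` placeholders above suggest that values are filled in from the environment when the configuration is loaded; the source does not spell out the mechanism. The following sketch shows one way such substitution could work. `expandEnv` is a hypothetical helper, and the environment is passed in explicitly so the function stays easy to test.

```typescript
// Replaces each ${NAME} placeholder with the matching value from `env`.
// Throws if a referenced variable is missing, so misconfiguration fails early.
function expandEnv(value: string, env: Record<string, string | undefined>): string {
  const placeholder = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g
  return value.replace(placeholder, (_match: string, varName: string) => {
    const resolved = env[varName]
    if (resolved === undefined) {
      throw new Error(`Environment variable ${varName} is not set`)
    }
    return resolved
  })
}
```

In a Node.js setting, `process.env` could be passed as the second argument.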

Custom Models

Adding a New Model

Define in opencode.jsonc:

{
  "provider": {
    "ollama": {
      "models": {
        "my-custom-model": {
          "_launch": true,
          "limit": {
            "context": 128000,
            "output": 4096
          },
          "name": "my-custom-model"
        }
      }
    }
  }
}

Assign to Agents:

---
name: my-agent
description: Uses custom model
model: ollama/my-custom-model
mode: subagent
---

Use in Skills:

<Configuration>
model: my-custom-model
</Configuration>

Model-Specific Settings

{
  "provider": {
    "ollama": {
      "models": {
        "code-specialist": {
          "_launch": true,
          "limit": { "context": 100000, "output": 16000 },
          "parameters": {
            "temperature": 0.1,      // Lower = more focused
            "top_p": 0.95,
            "frequency_penalty": 0.1,
            "presence_penalty": 0.1
          },
          "defaults": {
            "system_prompt": "You are a code specialist...",
            "stop_sequences": ["```", "---END---"]
          }
        }
      }
    }
  }
}

Tiered Model Strategy

Configure different models for different agent types:

{
  "provider": {
    "ollama": {
      "models": {
        // Fast tier - reading, searching, simple generation
        "fast-model": {
          "limit": { "context": 50000, "output": 2000 },
          "tier": "fast"
        },
        // Standard tier - most operations
        "glm-5.1:cloud": {
          "limit": { "context": 200000, "output": 100000 },
          "tier": "standard"
        },
        // Deep tier - complex reasoning
        "deep-model": {
          "limit": { "context": 300000, "output": 20000 },
          "tier": "deep"
        }
      }
    }
  }
}
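
How the `tier` field is consumed is not described here, so the following is only a sketch of one plausible use: selecting the models that belong to a given tier. `TieredModel` and `modelsInTier` are illustrative names.

```typescript
type Tier = "fast" | "standard" | "deep"

interface TieredModel {
  limit: { context: number; output: number }
  tier: Tier
}

// Same entries as the configuration above.
const tieredModels: Record<string, TieredModel> = {
  "fast-model": { limit: { context: 50000, output: 2000 }, tier: "fast" },
  "glm-5.1:cloud": { limit: { context: 200000, output: 100000 }, tier: "standard" },
  "deep-model": { limit: { context: 300000, output: 20000 }, tier: "deep" },
}

// Returns the names of all models configured for the given tier.
function modelsInTier(models: Record<string, TieredModel>, tier: Tier): string[] {
  return Object.keys(models).filter((id) => {
    const model = models[id]
    return model !== undefined && model.tier === tier
  })
}
```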

Model Routing

{
  "model_routing": {
    "default": "glm-5.1:cloud",
    "routing": {
      "explore": "fast-model",
      "executor": "glm-5.1:cloud",
      "architect": "deep-model",
      "security-reviewer": "deep-model"
    }
  }
}
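
The routing block reads naturally as "use the agent's entry if there is one, otherwise the default". A minimal sketch of that resolution, assuming exactly this semantics (`resolveModel` is a hypothetical helper, not a JOC API):

```typescript
interface ModelRouting {
  default: string
  routing: Record<string, string>
}

// Same values as the configuration above.
const modelRouting: ModelRouting = {
  default: "glm-5.1:cloud",
  routing: {
    "explore": "fast-model",
    "executor": "glm-5.1:cloud",
    "architect": "deep-model",
    "security-reviewer": "deep-model",
  },
}

// Resolves the model for an agent, falling back to the default when unrouted.
function resolveModel(config: ModelRouting, agent: string): string {
  const routed = Object.prototype.hasOwnProperty.call(config.routing, agent)
    ? config.routing[agent]
    : undefined
  return routed ?? config.default
}
```

With this configuration, `resolveModel(modelRouting, "architect")` returns `"deep-model"` and an agent without an entry resolves to `"glm-5.1:cloud"`.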

Best Practices

Context Management

Monitor context usage:

// Large files consume context
// Split into chunks if needed
const chunks = await splitLargeFile(file, maxChunkSize)
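
`splitLargeFile` is not defined in this document. The sketch below shows one way such a helper could split text, using a character budget as a rough stand-in for tokens and breaking on line boundaries; `splitIntoChunks` is a hypothetical name and a synchronous simplification.

```typescript
// Splits text into chunks of at most `maxChunkChars` characters, breaking on
// line boundaries. A single line longer than the budget is hard-split.
function splitIntoChunks(text: string, maxChunkChars: number): string[] {
  if (maxChunkChars <= 0) throw new RangeError("maxChunkChars must be positive")
  const chunks: string[] = []
  let current: string[] = [] // lines of the chunk being built
  let size = 0               // characters in `current`, including newlines
  const flush = () => {
    if (current.length > 0) chunks.push(current.join("\n"))
    current = []
    size = 0
  }
  for (const line of text.split("\n")) {
    let rest = line
    // Hard-split lines that cannot fit in a chunk on their own.
    while (rest.length > maxChunkChars) {
      flush()
      chunks.push(rest.slice(0, maxChunkChars))
      rest = rest.slice(maxChunkChars)
    }
    // Joining to an existing line costs one extra character for the newline.
    const added = current.length === 0 ? rest.length : rest.length + 1
    if (size + added > maxChunkChars) flush()
    size += current.length === 0 ? rest.length : rest.length + 1
    current.push(rest)
  }
  flush()
  return chunks
}
```

For example, `splitIntoChunks("aaa\nbbb\nccc", 7)` yields two chunks: `"aaa\nbbb"` and `"ccc"`.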

Use conversation compaction:
- Let JOC compact when needed
- Key state is preserved

Prefer focused prompts:
- Be specific
- Avoid redundant context

Output Optimization

Request appropriate sizes:

// Bad: requesting the entire file when only one function is needed
"Write the entire UserService.ts file"

// Good: requesting only the part you need
"Write the authenticate method for UserService"

Use streaming for long outputs:

// Stream long responses
for await (const chunk of stream) {
  process(chunk)
}

Break down complex tasks:
- Multiple smaller requests
- Assemble results

Cost Optimization

Use tiered models:

{
  "model_routing": {
    "default": "glm-5.1:cloud",   // Standard
    "routing": {
      "explore": "haiku",         // Fast, cheap
      "architect": "opus"         // Expensive, use sparingly
    }
  }
}

Cache frequently used context:

// Store common context once
await agentContext({
  action: "setMemory",
  data: { techStack: { ... } }
})

Batch similar operations:

// Bad: Multiple agent calls
await agent("fix", { file: "a.ts" })
await agent("fix", { file: "b.ts" })

// Good: Single call with multiple files
await agent("fix", { files: ["a.ts", "b.ts"] })
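
When the list of files is too long for a single call, the same idea extends to fixed-size batches. A minimal sketch; `toBatches` is an illustrative helper, and a suitable batch size depends on how much context each file consumes.

```typescript
// Groups items into consecutive batches of at most `batchSize` elements.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer")
  }
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize))
  }
  return batches
}
```

Each batch can then be passed as the `files` argument of one agent call, as in the "Good" example above.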

Troubleshooting

Model Not Found

Error: Model 'glm-5.1:cloud' not found

Solutions:

- Check model name spelling
- Verify provider configuration
- Ensure _launch: true is set

Context Exceeded

Error: Context window exceeded (150000 > 100000)

Solutions:

- Reduce input size
- Use a model with a larger context window
- Clear conversation history

// Clear history to free context
await clearHistory()
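
Clearing the whole history is the bluntest option. A gentler alternative is to keep only the most recent messages that fit a token budget. The sketch below assumes each message carries a precomputed token count; `Message` and `trimHistory` are hypothetical names, not JOC APIs.

```typescript
interface Message {
  role: "user" | "assistant"
  tokens: number
}

// Keeps the newest messages whose combined size fits in `budget` tokens,
// dropping older ones. The result stays in chronological order.
function trimHistory(history: Message[], budget: number): Message[] {
  const kept: Message[] = []
  let used = 0
  for (let i = history.length - 1; i >= 0; i--) {
    const message = history[i]
    if (message === undefined || used + message.tokens > budget) break
    used += message.tokens
    kept.unshift(message)
  }
  return kept
}
```

A reasonable budget is the context limit minus the system prompt and the current user message, following the formula in the Context Window section.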

Output Truncated

Error: Output truncated at 8000 tokens

Solutions:

- Use a model with a larger output limit
- Request smaller chunks
- Use streaming

Rate Limiting

Error: Rate limit exceeded

Solutions:

- Implement backoff
- Reduce concurrent requests
- Use multiple API keys (rotate)

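// Exponential backoff: retry on HTTP 429, doubling the delay each time
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

async function generateWithBackoff(prompt, maxRetries = 5) {
  let delay = 1000 // initial wait in milliseconds
  for (let retries = 0; retries <= maxRetries; retries++) {
    try {
      return await generate(prompt)
    } catch (e) {
      // Rethrow anything that is not a rate limit, or once retries run out
      if (e.status !== 429 || retries === maxRetries) throw e
      await sleep(delay)
      delay *= 2
    }
  }
}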

Slow Response

Causes:

- Large context
- Complex reasoning required
- Network latency
- Model under load

Solutions:

- Use a faster-tier model
- Reduce context
- Use streaming for progress feedback
- Check network connectivity
