Skip to content

Feature Request: Token-Efficient Mode for OpenCode #4600

@boyingliu01

Description

@boyingliu01

Feature Request: Token-Efficient Mode for OpenCode

Repository: code-yeongyu/oh-my-openagent
Author: @submitting-user
Labels: enhancement, performance, token-optimization


Summary

Add configurable token-efficient modes that allow users to reduce request JSON size by controlling verbosity of tool descriptions, message structure, context blocks, and system prompt caching.


Problem Statement

OpenCode's request JSON is significantly bloated, with typical requests reaching ~181KB. This causes:

Issue Impact Measured Overhead
Verbose tool descriptions Every tool includes full docs, examples, regex explanations, and enum value lists ~40-60% of request size
Structured message wrapper [{type:"text", text:"..."}] array syntax instead of simple strings ~5-10% overhead
[search-mode]/[analyze-mode] blocks Inserted into every user prompt ~7KB per request
XML wrapper tags <skill-instruction> and <user-request> add parsing overhead ~1-2KB overhead
Full system prompt repetition Entire system prompt (~64KB) sent every request ~35% of total request

Specific Example: skill Tool

The skill tool description alone contains 38,854 characters containing all skill listings, their descriptions, parameters, and usage patterns. This is sent with every request even when only 2-3 tools are actively used.


Proposed Solution

1. Tool Description Minimal Mode

Goal: Send only essential tool metadata without verbose documentation.

Current behavior:

{
  "name": "skill",
  "description": "Load a skill or execute a slash command to get detailed instructions for a specific task.\n\nSkills and commands provide specialized knowledge and step-by-step guidance.\n\nUse this when a task matches an available skill's or command's description.\n\nTool API:\n- name: skill name (e.g., 'review-work' or 'publish'). Use without leading slash for commands.\n...",
  "parameters": {
    "properties": {
      "name": {"description": "The skill or command name...", "type": "string"},
      "user_message": {"description": "Optional arguments or context...", "type": "string"}
    },
    "required": ["name"]
  }
}

Proposed minimal mode (opencode.json config):

{
  "features": {
    "token_efficient": {
      "tool_descriptions": "minimal"  // "full" | "minimal" | "parameters-only"
    }
  }
}

Minimal output would be:

{
  "name": "skill",
  "description": "Execute slash commands or load specialized skills for specific tasks",
  "parameters": {
    "properties": {
      "name": {"type": "string"},
      "user_message": {"type": "string", "optional": true}
    },
    "required": ["name"]
  }
}

2. Tool List Filtering

Goal: Only send tools that are actually used per request.

{
  "features": {
    "token_efficient": {
      "enabled_tools": ["skill", "read", "edit", "bash"]  // whitelist, or...
      "disabled_tools": ["legacy_*", "debug_*"]  // blacklist patterns
    }
  }
}

Impact: Reducing from 25 tools to 8-10 active tools saves ~15-25KB per request.

3. Structured Message Toggle

Goal: Allow simple string content instead of [{type:"text", text:"..."}] arrays.

Current:

{
  "role": "user", 
  "content": [{"type": "text", "text": "Fix the login bug"}]
}

Proposed (when simple_messages: true):

{
  "role": "user",
  "content": "Fix the login bug"
}

4. Context Mode Block Toggle

Goal: Option to disable [search-mode] and [analyze-mode] block insertion.

{
  "features": {
    "token_efficient": {
      "context_blocks": false  // default: true
    }
  }
}

Impact: Saves ~7KB per request when disabled.

5. System Prompt Caching / Differential Updates

Goal: Send only changed portions of system prompt.

The system prompt is ~64KB. Currently sent in full every request.

Proposed options:

  1. Hash-based caching: Send prompt hash first, only send full prompt if hash differs from previous
  2. Incremental updates: Send only changed AGENTS.md sections or config diffs
  3. Template caching: Model-side caching with invalidation on config change

Token Reduction Estimates

Mode Before After Reduction
Baseline 181KB - -
+ Minimal tool descriptions 181KB ~120-140KB ~25-30%
+ Tool filtering (8-10 tools) 140KB ~100-115KB ~15-20%
+ Simple messages 115KB ~100-108KB ~5-10%
+ Disable context blocks 100KB ~93KB ~7KB
+ System prompt caching 93KB ~50-70KB ~30-40%
Combined (all options) 181KB ~50-80KB ~55-65%

Impact Assessment

Positive Impacts

  • Cost reduction: Fewer tokens = lower API costs
  • Faster requests: Smaller JSON = faster serialization/deserialization
  • Better model focus: Less noise allows model to focus on actual task

Risk: Code Quality Degradation

Verbose tool descriptions contain critical information:

Information Type Example Risk if Removed
Usage patterns "Use when: 'implement X'" Agent may misapply tools
Edge cases "Do NOT use for file operations" Incorrect tool selection
Examples ast_grep pattern syntax Pattern errors
Parameter details "required vs optional" Missing required params

Mitigation Strategy

Phase 1 (Low risk): Remove only truly redundant information:

  • Regex explanations (verbose syntax docs)
  • Enum value lists (can be inferred from schema)
  • Duplicate examples (keep 1-2 essential ones)

Phase 2 (Medium risk): Offer minimal mode as opt-in:

  • Remove all documentation prose
  • Keep only name + parameters schema + return type
  • Allow users to test and validate

Phase 3 (Higher risk): Consider parameters-only mode:

  • Only schema, no descriptions at all
  • Requires extensive testing and user feedback

Implementation Guidance

Configuration Schema

// Proposed config in opencode.json
interface TokenEfficientConfig {
  enabled: boolean;                    // Master switch
  tool_descriptions: 'full' | 'minimal' | 'parameters-only';
  tool_filtering: {
    mode: 'all' | 'whitelist' | 'blacklist';
    tools: string[];
  };
  simple_messages: boolean;
  context_blocks: boolean;
  system_prompt: 'full' | 'cached' | 'incremental';
}

Phase 1 Implementation (Recommended Starting Point)

  1. Add tool_descriptions: "minimal" option that strips:

    • Full documentation paragraphs
    • Regex explanations
    • Extended examples
    • Duplicate enum lists
  2. Keep essential information:

    • Tool name and purpose (1-line)
    • Parameter names and types
    • Required vs optional markers
    • Return type information
  3. Add disabled_tools config to filter unused tools

This achieves ~30-40% reduction with minimal risk to code quality.

Files Likely to Modify

  • src/core/request-builder.ts - Request serialization logic
  • src/tools/registry.ts - Tool metadata handling
  • src/config/schema.ts - Configuration definitions
  • src/agents/sisyphus.ts - Agent prompt construction

Phased Rollout Plan

Phase 1: Documentation Stripping (v4.x)

  • tool_descriptions: "minimal" option
  • Remove verbose docs, keep essentials
  • Target: 25-30% token reduction, low risk

Phase 2: Tool Filtering (v4.x)

  • enabled_tools / disabled_tools config
  • Per-request tool list optimization
  • Target: Additional 15-20% reduction

Phase 3: Structural Optimizations (v5.0)

  • simple_messages: true option
  • [search-mode] / [analyze-mode] toggle
  • Target: Additional 10-15% reduction

Phase 4: System Prompt Caching (v5.x)

  • Hash-based prompt caching
  • Incremental update mechanism
  • Target: Additional 30-40% reduction

Alternatives Considered

  1. Automatic token budgeting: Let OpenCode auto-truncate. Risk: Unpredictable behavior, may cut critical info.

  2. Per-tool granularity controls: Individual tool description configs. Risk: Config complexity, most users want simple presets.

  3. Server-side prompt caching: Model provider handles caching. Risk: Not universally supported, privacy concerns.


Appendix: Measurement Methodology

To reproduce the token measurements:

  1. Enable request logging in opencode.json:
{
  "debug": {
    "log_requests": true,
    "log_file": "./request-log.jsonl"
  }
}
  1. Run typical tasks and collect request JSONs

  2. Analyze using token counter or token-optimize skill

  3. Compare before/after enabling proposed options


Request for Comments

Maintainers: Would you accept PRs implementing this feature? Should we start with Phase 1 only, or implement all phases together?

Community: Who else is experiencing high token costs? What additional optimizations would be valuable?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions