Feature Request: Token-Efficient Mode for OpenCode
Repository: code-yeongyu/oh-my-openagent
Author: @submitting-user
Labels: enhancement, performance, token-optimization
Summary
Add configurable token-efficient modes that allow users to reduce request JSON size by controlling verbosity of tool descriptions, message structure, context blocks, and system prompt caching.
Problem Statement
OpenCode's request JSON is significantly bloated, with typical requests reaching ~181KB. The main contributors are:
| Issue | Impact | Measured Overhead |
|---|---|---|
| Verbose tool descriptions | Every tool includes full docs, examples, regex explanations, and enum value lists | ~40-60% of request size |
| Structured message wrapper | [{type:"text", text:"..."}] array syntax instead of simple strings | ~5-10% overhead |
| [search-mode]/[analyze-mode] blocks | Inserted into every user prompt | ~7KB per request |
| XML wrapper tags | <skill-instruction> and <user-request> add parsing overhead | ~1-2KB overhead |
| Full system prompt repetition | Entire system prompt (~64KB) sent every request | ~35% of total request |
Specific Example: skill Tool
The skill tool description alone is 38,854 characters long and includes every skill listing with its description, parameters, and usage patterns. It is sent with every request, even when only 2-3 tools are actively used.
Proposed Solution
1. Tool Description Minimal Mode
Goal: Send only essential tool metadata without verbose documentation.
Current behavior:
{
  "name": "skill",
  "description": "Load a skill or execute a slash command to get detailed instructions for a specific task.\n\nSkills and commands provide specialized knowledge and step-by-step guidance.\n\nUse this when a task matches an available skill's or command's description.\n\nTool API:\n- name: skill name (e.g., 'review-work' or 'publish'). Use without leading slash for commands.\n...",
  "parameters": {
    "properties": {
      "name": {"description": "The skill or command name...", "type": "string"},
      "user_message": {"description": "Optional arguments or context...", "type": "string"}
    },
    "required": ["name"]
  }
}
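Proposed minimal mode (opencode.json config):
{
  "features": {
    "token_efficient": {
      "tool_descriptions": "minimal" // "full" | "minimal" | "parameters-only"
    }
  }
}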
Minimal output would be:
{
  "name": "skill",
  "description": "Execute slash commands or load specialized skills for specific tasks",
  "parameters": {
    "properties": {
      "name": {"type": "string"},
      "user_message": {"type": "string", "optional": true}
    },
    "required": ["name"]
  }
}
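A minimal sketch of how a tool definition could be reduced to this shape. The `ToolDefinition` and `ParamSchema` types and the `toMinimalTool` and `firstSentence` helpers are hypothetical names for illustration, not OpenCode's actual API. This sketch derives the short description from the first sentence of the existing one, whereas the example above uses a hand-written summary, and it does not add the "optional" marker.

```typescript
// Hypothetical shapes; the real OpenCode tool types may differ.
interface ParamSchema {
  type: string;
  description?: string;
}

interface ToolDefinition {
  name: string;
  description: string;
  parameters: {
    properties: Record<string, ParamSchema>;
    required: string[];
  };
}

// Keep only the first sentence of the first line of a long description.
function firstSentence(text: string): string {
  const firstLine = (text.split("\n")[0] ?? "").trim();
  const end = firstLine.indexOf(". ");
  return end === -1 ? firstLine : firstLine.slice(0, end + 1);
}

// Strip per-parameter descriptions and shorten the tool description.
function toMinimalTool(tool: ToolDefinition): ToolDefinition {
  const properties: Record<string, ParamSchema> = {};
  for (const key of Object.keys(tool.parameters.properties)) {
    const schema = tool.parameters.properties[key];
    if (schema !== undefined) {
      properties[key] = { type: schema.type };
    }
  }
  return {
    name: tool.name,
    description: firstSentence(tool.description),
    parameters: { properties, required: tool.parameters.required.slice() },
  };
}
```

Because the function returns a new object, the full definition stays available if the user switches back to "full".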
2. Tool List Filtering
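Goal: Only send tools that are actually used per request.
{
  "features": {
    "token_efficient": {
      "enabled_tools": ["skill", "read", "edit", "bash"], // whitelist, or...
      "disabled_tools": ["legacy_*", "debug_*"] // blacklist patterns
    }
  }
}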
Impact: Reducing from 25 tools to 8-10 active tools saves ~15-25KB per request.
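One way the filtering could work, as a sketch only. The `ToolFilter` type and the `patternToRegExp`, `matchesAny`, and `filterTools` helpers are hypothetical, and treating `*` as a wildcard is an assumption based on the `legacy_*` style patterns in this proposal.

```typescript
// Hypothetical filter config mirroring the enabled_tools / disabled_tools idea.
interface ToolFilter {
  enabled_tools?: string[]; // whitelist: exact names or "*" patterns
  disabled_tools?: string[]; // blacklist: exact names or "*" patterns
}

// Convert a simple "*" wildcard pattern into an anchored RegExp.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function matchesAny(name: string, patterns: string[]): boolean {
  return patterns.some((p) => patternToRegExp(p).test(name));
}

// Keep a tool only if it passes the whitelist (when present)
// and is not caught by the blacklist (when present).
function filterTools<T extends { name: string }>(tools: T[], filter: ToolFilter): T[] {
  return tools.filter((tool) => {
    if (filter.enabled_tools && !matchesAny(tool.name, filter.enabled_tools)) {
      return false;
    }
    if (filter.disabled_tools && matchesAny(tool.name, filter.disabled_tools)) {
      return false;
    }
    return true;
  });
}
```

In this sketch the blacklist wins when both lists are given. Whether that precedence is the desired behavior is a design question for maintainers.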
3. Structured Message Toggle
Goal: Allow simple string content instead of [{type:"text", text:"..."}] arrays.
Current:
{
  "role": "user",
  "content": [{"type": "text", "text": "Fix the login bug"}]
}
Proposed (when simple_messages: true):
{
  "role": "user",
  "content": "Fix the login bug"
}
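A sketch of the collapse step, assuming the message shape shown above. The `Message` and `ContentPart` types and the `simplifyMessage` function are hypothetical names. The sketch only collapses a message whose content is exactly one text part, so messages that carry other part types are passed through unchanged. Whether a given provider accepts plain-string content would need to be checked per provider.

```typescript
// Hypothetical message shapes based on the examples in this proposal.
interface ContentPart {
  type: string;
  text?: string;
}

interface Message {
  role: string;
  content: string | ContentPart[];
}

// Collapse [{type:"text", text:"..."}] into a plain string when that is lossless.
function simplifyMessage(message: Message): Message {
  const content = message.content;
  if (typeof content === "string" || content.length !== 1) {
    return message;
  }
  const only = content[0];
  if (only === undefined || only.type !== "text" || typeof only.text !== "string") {
    return message;
  }
  return { role: message.role, content: only.text };
}
```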
4. Context Mode Block Toggle
Goal: Option to disable [search-mode] and [analyze-mode] block insertion.
{
  "features": {
    "token_efficient": {
      "context_blocks": false // default: true
    }
  }
}
Impact: Saves ~7KB per request when disabled.
5. System Prompt Caching / Differential Updates
Goal: Send only the changed portions of the system prompt.
The system prompt is ~64KB and is currently sent in full with every request.
Proposed options:
- Hash-based caching: Send prompt hash first, only send full prompt if hash differs from previous
- Incremental updates: Send only changed AGENTS.md sections or config diffs
- Template caching: Model-side caching with invalidation on config change
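The hash-based option could look roughly like the sketch below on the sending side. It assumes a receiving side (for example a proxy or a provider feature) that stores the prompt keyed by its hash, which this proposal does not specify. The `PromptPayload` fields and the `SystemPromptCache` class are hypothetical, and the FNV-1a hash is a stand-in for a real digest such as SHA-256.

```typescript
// FNV-1a 32-bit hash, used here only as a dependency-free stand-in.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts to stay within 32 bits.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Hypothetical payload: the full prompt is omitted when the hash is unchanged.
interface PromptPayload {
  system_prompt_hash: string;
  system_prompt?: string;
}

class SystemPromptCache {
  private lastSentHash: string | null = null;

  // Returns the hash alone if the prompt matches the last one sent,
  // otherwise the hash together with the full prompt.
  buildPayload(systemPrompt: string): PromptPayload {
    const hash = fnv1a(systemPrompt);
    if (hash === this.lastSentHash) {
      return { system_prompt_hash: hash };
    }
    this.lastSentHash = hash;
    return { system_prompt_hash: hash, system_prompt: systemPrompt };
  }
}
```

A real implementation would also need a fallback for the case where the receiver no longer holds the prompt for a given hash, such as resending the full prompt on a cache miss.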
Token Reduction Estimates
| Mode | Before | After | Reduction |
|---|---|---|---|
| Baseline | 181KB | - | - |
| + Minimal tool descriptions | 181KB | ~120-140KB | ~25-30% |
| + Tool filtering (8-10 tools) | 140KB | ~100-115KB | ~15-20% |
| + Simple messages | 115KB | ~100-108KB | ~5-10% |
| + Disable context blocks | 100KB | ~93KB | ~7KB |
| + System prompt caching | 93KB | ~50-70KB | ~30-40% |
| Combined (all options) | 181KB | ~50-80KB | ~55-65% |
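The percentages in the table are rough estimates. As a sanity check, the reduction for any row can be recomputed from its before and after sizes. The `percentReduction` helper below is illustrative only.

```typescript
// Percentage reduction from a before/after size, rounded to a whole percent.
function percentReduction(beforeKB: number, afterKB: number): number {
  return Math.round(((beforeKB - afterKB) / beforeKB) * 100);
}

// Combined row: 181KB down to somewhere between 80KB and 50KB.
const combinedLow = percentReduction(181, 80); // 56
const combinedHigh = percentReduction(181, 50); // 72

// Minimal tool descriptions row: 181KB down to 140KB or 120KB.
const minimalLow = percentReduction(181, 140); // 23
const minimalHigh = percentReduction(181, 120); // 34
```

Recomputed this way, the combined range works out to roughly 56-72%, so the ~55-65% figure sits toward the conservative end of the stated sizes.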
Impact Assessment
Positive Impacts
- Cost reduction: Fewer tokens = lower API costs
- Faster requests: Smaller JSON = faster serialization/deserialization
- Better model focus: Less noise allows model to focus on actual task
Risk: Code Quality Degradation
Verbose tool descriptions contain critical information:
| Information Type | Example | Risk if Removed |
|---|---|---|
| Usage patterns | "Use when: 'implement X'" | Agent may misapply tools |
| Edge cases | "Do NOT use for file operations" | Incorrect tool selection |
| Examples | ast_grep pattern syntax | Pattern errors |
| Parameter details | "required vs optional" | Missing required params |
Mitigation Strategy
Phase 1 (Low risk): Remove only truly redundant information:
- Regex explanations (verbose syntax docs)
- Enum value lists (can be inferred from schema)
- Duplicate examples (keep 1-2 essential ones)
Phase 2 (Medium risk): Offer minimal mode as opt-in:
- Remove all documentation prose
- Keep only name + parameters schema + return type
- Allow users to test and validate
Phase 3 (Higher risk): Consider parameters-only mode:
- Only schema, no descriptions at all
- Requires extensive testing and user feedback
Implementation Guidance
Configuration Schema
// Proposed config in opencode.json
interface TokenEfficientConfig {
  enabled: boolean; // Master switch
  tool_descriptions: 'full' | 'minimal' | 'parameters-only';
  tool_filtering: {
    mode: 'all' | 'whitelist' | 'blacklist';
    tools: string[];
  };
  simple_messages: boolean;
  context_blocks: boolean;
  system_prompt: 'full' | 'cached' | 'incremental';
}
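A sketch of how a partial user config might be resolved against defaults. The default values and the rule that the master switch overrides every other field are assumptions, chosen so that an absent or disabled config preserves today's behavior. `defaultConfig` and `resolveConfig` are hypothetical names.

```typescript
interface TokenEfficientConfig {
  enabled: boolean;
  tool_descriptions: "full" | "minimal" | "parameters-only";
  tool_filtering: {
    mode: "all" | "whitelist" | "blacklist";
    tools: string[];
  };
  simple_messages: boolean;
  context_blocks: boolean;
  system_prompt: "full" | "cached" | "incremental";
}

// Assumed defaults: everything behaves as it does without this feature.
function defaultConfig(): TokenEfficientConfig {
  return {
    enabled: false,
    tool_descriptions: "full",
    tool_filtering: { mode: "all", tools: [] },
    simple_messages: false,
    context_blocks: true,
    system_prompt: "full",
  };
}

// Merge user settings over the defaults; ignore them all when the
// master switch is off.
function resolveConfig(user: Partial<TokenEfficientConfig>): TokenEfficientConfig {
  const base = defaultConfig();
  if (user.enabled !== true) {
    return base;
  }
  return {
    enabled: true,
    tool_descriptions: user.tool_descriptions ?? base.tool_descriptions,
    tool_filtering: user.tool_filtering ?? base.tool_filtering,
    simple_messages: user.simple_messages ?? base.simple_messages,
    context_blocks: user.context_blocks ?? base.context_blocks,
    system_prompt: user.system_prompt ?? base.system_prompt,
  };
}
```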
Phase 1 Implementation (Recommended Starting Point)
- Add tool_descriptions: "minimal" option that strips:
  - Full documentation paragraphs
  - Regex explanations
  - Extended examples
  - Duplicate enum lists
- Keep essential information:
  - Tool name and purpose (1-line)
  - Parameter names and types
  - Required vs optional markers
  - Return type information
- Add disabled_tools config to filter unused tools
This is estimated to achieve a ~30-40% reduction with minimal risk to code quality.
Files Likely to Modify
src/core/request-builder.ts - Request serialization logic
src/tools/registry.ts - Tool metadata handling
src/config/schema.ts - Configuration definitions
src/agents/sisyphus.ts - Agent prompt construction
Phased Rollout Plan
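Phase 1: Documentation Stripping (v4.x)
- tool_descriptions: "minimal" option
Phase 2: Tool Filtering (v4.x)
- enabled_tools / disabled_tools config
Phase 3: Structural Optimizations (v5.0)
- simple_messages: true option
- [search-mode]/[analyze-mode] toggle
Phase 4: System Prompt Caching (v5.x)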
Alternatives Considered
- Automatic token budgeting: Let OpenCode auto-truncate. Risk: Unpredictable behavior, may cut critical info.
- Per-tool granularity controls: Individual tool description configs. Risk: Config complexity, most users want simple presets.
- Server-side prompt caching: Model provider handles caching. Risk: Not universally supported, privacy concerns.
Appendix: Measurement Methodology
To reproduce the token measurements:
- Enable request logging in opencode.json:
  { "debug": { "log_requests": true, "log_file": "./request-log.jsonl" } }
- Run typical tasks and collect request JSONs
- Analyze using a token counter or the token-optimize skill
- Compare before/after enabling proposed options
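For the analysis step, a small helper like the one below can show which parts of one logged request dominate its size. This is a sketch: `sizeBreakdown` is a hypothetical helper, it measures JSON characters rather than tokens, and the top-level keys in the usage note (`system`, `tools`, `messages`) are assumed names that depend on the provider's request format.

```typescript
interface SizeEntry {
  key: string;
  chars: number;
  percent: number; // share of the whole serialized request, one decimal place
}

// Report the serialized size of each top-level field, largest first.
function sizeBreakdown(request: { [key: string]: unknown }): SizeEntry[] {
  const total = JSON.stringify(request).length;
  const entries = Object.keys(request).map((key) => {
    // JSON.stringify yields undefined for values it cannot serialize.
    const json: string | undefined = JSON.stringify(request[key]);
    const chars = json === undefined ? 0 : json.length;
    return { key, chars, percent: Math.round((chars / total) * 1000) / 10 };
  });
  return entries.sort((a, b) => b.chars - a.chars);
}
```

Each line of the JSONL log can be parsed with `JSON.parse` and passed to `sizeBreakdown` to compare, for instance, the `tools` share before and after enabling minimal descriptions.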
Request for Comments
Maintainers: Would you accept PRs implementing this feature? Should we start with Phase 1 only, or implement all phases together?
Community: Who else is experiencing high token costs? What additional optimizations would be valuable?