Reference document for the plugin's architecture, data flow, API wire format, and known gaps. Avoids redundant file discovery across sessions.
```
User types in OpenCode TUI
→ OpenCode constructs OpenAI-compatible request to provider URL
  (e.g. POST /v1/chat/completions with model "kimicode-kimi-k2.5")
→ Plugin intercepts fetch() call
→ rewriteToKimi(): rewrites URL to https://api.kimi.com/coding/v1/chat/completions
→ resolveKimiModelAlias(): "kimicode-kimi-k2.5" → "kimi-for-coding"
→ Injects headers: Authorization (Bearer), User-Agent (KimiCLI/<ver>), X-Msh-* device headers
→ Body sent essentially UNMODIFIED (OpenAI-compatible JSON)
→ Response streamed back to OpenCode
```
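The interception step above can be sketched as a thin wrapper around `fetch` — a minimal sketch assuming Node 18+ globals; `interceptFetch` is an illustrative name, and the real implementation in src/plugin.ts also handles retries, account rotation, and the X-Msh-* headers:

```typescript
// Illustrative sketch of the fetch interception + URL rewrite step.
const KIMI_API_BASE_URL = "https://api.kimi.com/coding/v1";

function rewriteToKimi(url: string): string {
  // Strip everything up to and including the /v1 prefix, keep the rest,
  // and prepend the Kimi coding API base.
  const path = new URL(url).pathname.replace(/^.*\/v1/, "");
  return `${KIMI_API_BASE_URL}${path}`;
}

function interceptFetch(realFetch: typeof fetch, accessToken: string): typeof fetch {
  return async (input, init) => {
    const url =
      typeof input === "string" ? input : input instanceof URL ? input.toString() : input.url;
    const headers = new Headers(init?.headers);
    headers.set("Authorization", `Bearer ${accessToken}`);
    return realFetch(rewriteToKimi(url), { ...init, headers });
  };
}
```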
| File | Role |
|---|---|
| `src/plugin.ts` | Main fetch interceptor, URL rewrite, auth, retry loop |
| `src/constants.ts` | API URLs, device headers, User-Agent, OAuth endpoints |
| `src/plugin/accounts.ts` | Account rotation, rate limits, cooldowns |
| `src/plugin/token.ts` | OAuth token refresh |
| `src/plugin/config/models.ts` | Model definitions written to opencode.json |
| `src/plugin/config/updater.ts` | Writes model defs to ~/.config/opencode/opencode.json |
URL rewrites:

- `/v1/chat/completions` → `https://api.kimi.com/coding/v1/chat/completions`
- `/v1/models` → `https://api.kimi.com/coding/v1/models`
- Generic: strips the `/v1` prefix, prepends `KIMI_API_BASE_URL`

Model alias resolution:

- `"kimicode-kimi-k2.5"` → `"kimi-for-coding"` (hardcoded)
- Any `"kimicode-<X>"` → `"<X>"` (prefix strip)
- All other names pass through unchanged
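The alias rules can be sketched as a pure function (illustrative; the plugin's actual `resolveKimiModelAlias` lives in src/plugin.ts, and note that per §3 the `-thinking` model is also mapped to `kimi-for-coding`, which is not shown here):

```typescript
// Sketch of the three alias rules, in order of precedence.
function resolveKimiModelAlias(model: string): string {
  if (model === "kimicode-kimi-k2.5") return "kimi-for-coding"; // hardcoded alias
  if (model.startsWith("kimicode-")) return model.slice("kimicode-".length); // prefix strip
  return model; // everything else passes through unchanged
}
```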
```
Authorization: Bearer <access_token>
User-Agent: KimiCLI/<version>   (default 1.12.0, env: KIMI_CODE_CLI_VERSION)
X-Msh-Platform: kimi_cli
X-Msh-Version: <version>
X-Msh-Device-Name: <hostname>
X-Msh-Device-Model: macOS <ver> <arch>
X-Msh-Os-Version: <os.version()>
X-Msh-Device-Id: <per-account fingerprint or generated>
```
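Building this header set can be sketched with Node's `os` module — `buildKimiHeaders` is a hypothetical name, and the exact device-model string format is an assumption (Node reports e.g. "Darwin", not "macOS"):

```typescript
import os from "node:os";

// Sketch: assemble the injected header set from the access token, a per-account
// device id, and host details. Version default/override matches the constants
// section (1.12.0, KIMI_CODE_CLI_VERSION).
function buildKimiHeaders(accessToken: string, deviceId: string): Record<string, string> {
  const version = process.env.KIMI_CODE_CLI_VERSION ?? "1.12.0";
  return {
    "Authorization": `Bearer ${accessToken}`,
    "User-Agent": `KimiCLI/${version}`,
    "X-Msh-Platform": "kimi_cli",
    "X-Msh-Version": version,
    "X-Msh-Device-Name": os.hostname(),
    // Approximation of "macOS <ver> <arch>"; os.type() yields "Darwin"/"Linux".
    "X-Msh-Device-Model": `${os.type()} ${os.release()} ${os.arch()}`,
    "X-Msh-Os-Version": os.version(),
    "X-Msh-Device-Id": deviceId,
  };
}
```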
Source: kimi-cli/packages/kosong/src/kosong/chat_provider/kimi.py
The Kimi API is OpenAI-compatible. kimi-cli uses the openai Python SDK
(AsyncOpenAI) with these parameters:
```json
{
  "model": "kimi-for-coding",
  "messages": [...],
  "tools": [...],
  "stream": true,
  "stream_options": { "include_usage": true },
  "max_tokens": 32000,
  "reasoning_effort": "high",
  "extra_body": {
    "thinking": { "type": "enabled" }
  },
  "prompt_cache_key": "<session-id>"
}
```

| Parameter | Default | Notes |
|---|---|---|
| `max_tokens` | `32000` | Hard default in kimi.py |
| `stream` | `true` | Always streams |
| `stream_options` | `{ "include_usage": true }` | Only when streaming |
| `temperature` | Not set | Env: `KIMI_MODEL_TEMPERATURE` |
| `top_p` | Not set | Env: `KIMI_MODEL_TOP_P` |
| `prompt_cache_key` | `session_id` | Enables Kimi's prompt caching |
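The env-driven sampling overrides follow an "omit unless set" rule, sketched here in TypeScript for consistency with the rest of this doc (`samplingOverrides` is a hypothetical name; kimi-cli reads these env vars in Python):

```typescript
// Sketch: only emit temperature/top_p when the corresponding env var is
// present, mirroring the parameter table above.
function samplingOverrides(env: Record<string, string | undefined>): {
  temperature?: number;
  top_p?: number;
} {
  const out: { temperature?: number; top_p?: number } = {};
  if (env.KIMI_MODEL_TEMPERATURE !== undefined) {
    out.temperature = Number(env.KIMI_MODEL_TEMPERATURE);
  }
  if (env.KIMI_MODEL_TOP_P !== undefined) {
    out.top_p = Number(env.KIMI_MODEL_TOP_P);
  }
  return out;
}
```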
kimi-cli sends two parameters simultaneously via with_thinking():

- `reasoning_effort` (top-level body field, legacy): `"low"` / `"medium"` / `"high"` / `null` (off)
- `extra_body.thinking` (new mechanism): `{ "type": "enabled" }` or `{ "type": "disabled" }`

Both are sent together. The with_thinking(effort) method:

```python
# effort = "high" → reasoning_effort="high", thinking.type="enabled"
# effort = "off"  → reasoning_effort=None,   thinking.type="disabled"
```

Streaming response fields:

- Text content: standard `choices[0].delta.content`
- Thinking content: `choices[0].delta.reasoning_content` (NOT Anthropic-style thinking blocks)
- Tool calls: standard OpenAI format
- Usage: `usage.prompt_tokens`, `usage.completion_tokens`, `usage.cached_tokens` (Kimi-specific)
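The with_thinking mapping described above can be sketched in TypeScript (`thinkingParams` is a hypothetical name; the real logic is Python in kimi-cli):

```typescript
// Sketch: both the legacy field and the new mechanism are emitted together.
type Effort = "low" | "medium" | "high" | "off";

function thinkingParams(effort: Effort): {
  reasoning_effort: string | null;
  thinking: { type: "enabled" | "disabled" };
} {
  if (effort === "off") {
    return { reasoning_effort: null, thinking: { type: "disabled" } };
  }
  return { reasoning_effort: effort, thinking: { type: "enabled" } };
}
```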
For kimi-for-coding / kimi-code, the /models endpoint reports these capabilities:

- `thinking` (toggleable on/off)
- `image_in` (image input)
- `video_in` (video input)

Context length: reported by the /models endpoint's `context_length` field (currently 262144).
Two separate models (no OpenCode variants), matching kimi-cli / web GUI modes:
```ts
"kimicode-kimi-k2.5": {
  name: "Kimi Code (K2.5)",
  limit: { context: 262144, output: 32000 },
  modalities: { input: ["text", "image"], output: ["text"] },
},
"kimicode-kimi-k2.5-thinking": {
  name: "Kimi Code (K2.5) Thinking",
  limit: { context: 262144, output: 32000 },
  modalities: { input: ["text", "image"], output: ["text"] },
}
```

Both map to `model: "kimi-for-coding"` on the wire. The plugin detects which
model was requested and injects the corresponding thinking parameters.
The plugin rewrites the JSON request body for /chat/completions:
kimicode-kimi-k2.5 (thinking OFF):

```json
{ "model": "kimi-for-coding", "thinking": { "type": "disabled" } }
```

kimicode-kimi-k2.5-thinking (thinking ON):

```json
{ "model": "kimi-for-coding", "reasoning_effort": "high", "thinking": { "type": "enabled" } }
```

This precisely mirrors kimi-cli's with_thinking("off") / with_thinking("high").
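The rewrite can be sketched as a pure transform over the parsed body (`rewriteBody` is a hypothetical name; the real plugin also resolves other `kimicode-*` aliases and leaves non-chat endpoints alone). `PLUGIN_SESSION_ID` mirrors the prompt-cache section: a UUID stable for the process lifetime:

```typescript
import { randomUUID } from "node:crypto";

// Stable for the lifetime of the process, so every request shares one cache key.
const PLUGIN_SESSION_ID = randomUUID();

// Sketch: detect which OpenCode model was requested, then emit the wire model
// name, the thinking parameters, and the prompt cache key.
function rewriteBody(body: Record<string, unknown>): Record<string, unknown> {
  const thinking = body.model === "kimicode-kimi-k2.5-thinking";
  return {
    ...body,
    model: "kimi-for-coding",
    prompt_cache_key: PLUGIN_SESSION_ID,
    thinking: { type: thinking ? "enabled" : "disabled" },
    ...(thinking ? { reasoning_effort: "high" } : {}),
  };
}
```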
The antigravity plugin uses OpenCode's variant system (providerOptions.google) because it serves multiple model families (Gemini, Claude) with varying thinking mechanisms. Kimi has exactly two modes — thinking on / off — matching the web GUI. Two separate models is simpler, avoids variant plumbing, and makes model selection explicit in the OpenCode TUI.
All identified gaps between this plugin and kimi-cli have been addressed.
| Gap | Description | Resolution |
|---|---|---|
| Thinking controls | kimi-cli sends `reasoning_effort` + `thinking.type`; plugin didn't | Two models surface thinking on/off; plugin injects parameters. See §3. |
| Output limit | Plugin had `output: 16384`; kimi-cli uses `max_tokens: 32000` | Both models now define `output: 32000`. |
| Prompt cache key | kimi-cli sends `prompt_cache_key: <session_id>` for server-side caching | Plugin generates a stable per-instance UUID (`PLUGIN_SESSION_ID`) and injects `prompt_cache_key` into every request body. |
| "I'm Claude" identity | Model responds as Claude | Not a plugin issue; kimi-for-coding model behavior. No plugin fix needed. |
| Video input | kimi-cli reports `video_in` capability | OpenCode does not support video input modality. Non-actionable. |
kimi-cli passes session.id as prompt_cache_key — a top-level field in the
chat completions JSON body. This tells the Kimi API to cache prompt tokens for
the given key, avoiding re-processing of earlier messages across turns.
OpenCode's plugin interface does not expose a conversation-level session ID to
the fetch interceptor. The session.created event provides info.parentID
(for subagent detection) but not the session ID itself.
The plugin generates a stable randomUUID() at module load time
(PLUGIN_SESSION_ID in src/plugin.ts). This mirrors the antigravity plugin's
approach (PLUGIN_SESSION_ID = crypto.randomUUID()). The UUID is stable for
the lifetime of the OpenCode process, enabling prompt caching across all turns
within a session.
| Endpoint | URL |
|---|---|
| Device Authorization | https://auth.kimi.com/api/oauth/device_authorization |
| Token Exchange | https://auth.kimi.com/api/oauth/token |
| API Base | https://api.kimi.com/coding/v1 |
- Device Auth: POST to `device_authorization` with `client_id` + `scope=kimi_for_coding`
- User Approval: user visits verification URL and authorizes
- Token Exchange: poll token endpoint with `device_code` until approved
- Access Token: JWT containing `user_id`, used as Bearer token
- Refresh: POST to token endpoint with `grant_type=refresh_token`
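The polling step can be sketched with an injectable HTTP helper so it runs without the network. The `grant_type` URN and field names follow RFC 8628's device grant; Kimi's exact wire format is an assumption here, and `pollForToken` is a hypothetical name:

```typescript
// Sketch of the device-flow polling loop against the token endpoint.
type Json = Record<string, unknown>;
type Fetcher = (url: string, init: { method: string; body: string }) => Promise<Json>;

const TOKEN_URL = "https://auth.kimi.com/api/oauth/token";

async function pollForToken(
  post: Fetcher,
  clientId: string,
  deviceCode: string,
  maxAttempts = 3,
): Promise<string> {
  for (let i = 0; i < maxAttempts; i++) {
    const res = await post(TOKEN_URL, {
      method: "POST",
      body: JSON.stringify({
        grant_type: "urn:ietf:params:oauth:grant-type:device_code",
        client_id: clientId,
        device_code: deviceCode,
      }),
    });
    if (typeof res.access_token === "string") return res.access_token; // user approved
    if (res.error !== "authorization_pending") throw new Error(String(res.error));
    // real code would sleep for the server-suggested interval before retrying
  }
  throw new Error("timed out waiting for user approval");
}
```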
- Client ID: `17e5f671-d194-4dfb-9706-5516cb48c098`
- Compat Version: `1.12.0` (overridable via `KIMI_CODE_CLI_VERSION`)
- Refresh Threshold: 300s before expiry
- Max Accounts: 10
Single-version JSON at ~/.config/opencode/kimicode-accounts.json:
```json
{
  "version": 1,
  "accounts": [{
    "email": "user@example.com",
    "refreshToken": "...",
    "addedAt": 1234567890,
    "lastUsed": 1234567890,
    "enabled": true,
    "rateLimitResetTimes": { "kimi": 1234567890 },
    "fingerprint": { "deviceId": "..." },
    "fingerprintHistory": []
  }],
  "activeIndex": 0,
  "activeIndexByFamily": { "kimi": 0 }
}
```

The file uses proper-lockfile for concurrent-access safety. Writes are atomic via temp file + rename. Permissions: 0600 on POSIX.
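The atomic-write pattern can be sketched as follows (`writeAccountsAtomic` is a hypothetical name; the real code also takes the proper-lockfile lock first):

```typescript
import { writeFileSync, renameSync, readFileSync, mkdtempSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
// (readFileSync/mkdtempSync/tmpdir are only needed by the usage example.)

// Sketch: write to a temp file beside the target, then rename over it.
// rename is atomic on POSIX when source and destination share a filesystem;
// mode 0600 keeps refresh tokens private.
function writeAccountsAtomic(path: string, data: unknown): void {
  const tmp = `${path}.tmp-${process.pid}`;
  writeFileSync(tmp, JSON.stringify(data, null, 2), { mode: 0o600 });
  renameSync(tmp, path);
}
```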