This file is the single source of truth for any AI agent (or human) modifying this repo. Read it top-to-bottom before touching code. If something you learn here contradicts what you see in the code, the code wins — update this file in the same commit.
User-facing install / usage documentation lives in README.md. Do not duplicate it here.
One plugin, one job: make opencode talk to Kimi's kimi-for-coding endpoint exactly the way the official kimi-cli does. Everything in this repo exists to minimize drift from upstream kimi-cli.
Moonshot's coding backend is entitlement-sensitive: the model-name string alone is not the whole story.
Every design decision here follows from that: we do device-flow OAuth to mirror official kimi-cli, we do not accept API keys in this plugin, and we do not let the upstream SDK attach its own Authorization header.
- No support for any non-`kimi-for-coding` model. opencode already handles other Moonshot / Baseten / Alibaba-CN / etc. entries itself.
- No support for static API keys. Users who want that can use a different opencode provider entry.
- No custom SSE parser, tool-call normalizer, or message rewriter. `@ai-sdk/openai-compatible` already does SSE/`reasoning_content` correctly.
Each source file has one job. Do not add new files unless the existing ones genuinely can't hold a new concern.
| File | Responsibility |
|---|---|
| `src/constants.ts` | Pinned strings that must mirror upstream kimi-cli (version, endpoints, client id). |
| `src/headers.ts` | The seven `X-Msh-*` / UA headers + the persistent `~/.kimi/device_id` file. |
| `src/oauth.ts` | Device-code start, device-code poll, refresh-token exchange, and `GET /coding/v1/models` discovery. |
| `src/auth-store.ts` | Read/write opencode's `auth.json` entries for this provider. |
| `src/auth-refresh.ts` | Lock-based token refresh with cross-instance coordination, `ensureFreshStoredAuth` for standalone callers. |
| `src/index.ts` | Plugin entry (v1 `PluginModule` format). Wires auth (login + loader) plus the Kimi chat hooks/body rewrite. |
| `src/usage.ts` | Fetch and parse Kimi subscription usage (`/coding/v1/usages`). |
| `src/tui.tsx` | TUI slash command `/kimi:usage`, which renders usage in an opencode dialog. |
Data flow on a chat request:
1. opencode asks the `@ai-sdk/openai-compatible` provider for a language model.
2. Before instantiating it, opencode calls our `auth.loader`. We return `{ apiKey, fetch }`.
3. The SDK uses our `fetch` for every HTTP call (models, chat, whatever).
4. Our `fetch` calls `ensureFresh()` → prefers the live opencode auth-store entry over stale `OPENCODE_AUTH_CONTENT` snapshots → maybe refreshes (sharing one in-flight promise in-process and a lock across plugin instances so they don't race the same refresh token) → lazily discovers `/coding/v1/models` when needed → sets `Authorization` + the seven `X-Msh-*` headers → on 401 refreshes once and retries.
5. Separately, opencode runs `chat.headers` and `chat.params`. `chat.headers` computes `thinking`, `reasoning_effort`, and `prompt_cache_key` from `input.model.options` plus the selected `input.message.model.variant`, then passes them to `loader.fetch` via private `x-opencode-kimi-*` headers. `loader.fetch` strips those headers and injects the wire fields into the JSON body. `chat.params` mirrors the same keys into `output.options` only as a forward-compat fallback if opencode later fixes its openai-compatible providerOptions namespace mismatch.
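Step 5 and the `Authorization` handoff in step 4 can be sketched as a pure function. This is a minimal illustration, not the plugin's actual `loader.fetch`: the name `prepareRequest`, the rule that each private header carries a JSON-encoded value, and the header-name-to-field-name conversion are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and the header encoding are assumptions,
// not the plugin's real implementation.
const PRIVATE_PREFIX = "x-opencode-kimi-";

interface PreparedRequest {
  headers: Record<string, string>;
  body: string;
}

function prepareRequest(
  incoming: Record<string, string>,
  jsonBody: string,
  accessToken: string,
): PreparedRequest {
  const headers: Record<string, string> = {};
  const wireFields: Record<string, unknown> = {};

  for (const [name, value] of Object.entries(incoming)) {
    const lower = name.toLowerCase();
    if (lower.startsWith(PRIVATE_PREFIX)) {
      // Private transport header: decode it into a body field and never
      // forward it upstream. Assumed encoding: JSON value, dashes -> underscores.
      const field = lower.slice(PRIVATE_PREFIX.length).replace(/-/g, "_");
      wireFields[field] = JSON.parse(value);
    } else if (lower !== "authorization") {
      // Drop any caller-supplied Authorization header, whatever its casing.
      headers[name] = value;
    }
  }

  // The loader owns Authorization: set it last so nothing can override it.
  headers["Authorization"] = `Bearer ${accessToken}`;

  const body = JSON.stringify({ ...JSON.parse(jsonBody), ...wireFields });
  return { headers, body };
}
```

Keeping the function pure (no network, no token refresh) makes it easy to check offline that no `x-opencode-kimi-*` header survives and that only one `Authorization` value is ever sent.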
These are the invariants that, if broken, silently route requests onto the wrong auth/backend path or produce fingerprint-based throttling. Do not "clean them up" without reading the linked upstream.
1. `X-Msh-Version` and `User-Agent` must track `kimi-cli`. Bumping involves exactly one line in `src/constants.ts`. See upstream `research/kimi-cli/src/kimi_cli/constant.py`. The UA prefix is `KimiCLI/` (not `KimiCodeCLI/`): Moonshot's `kimi-for-coding` backend 403s with `access_terminated_error: only available for Coding Agents such as Kimi CLI, Claude Code, Roo Code…` on any other prefix. Likewise, `X-Msh-Device-Model` must mirror kimi-cli's `_device_model()` shape, including the Darwin/Windows special cases (`macOS <version> <arch>`, `Windows 10/11 <arch>`, Linux `"{system} {release} {machine}"`), NOT just `{arch}`, and `X-Msh-Os-Version` is the kernel build string from `os.version()`, NOT `"{type} {release}"`. Tested live against `api.kimi.com/coding/v1` on 2026-04-17: any of those three fields off-spec → 403.
2. `X-Msh-Device-Id` must be stable across runs. Never regenerate a fresh UUID at import time. `getDeviceId()` reads/writes `~/.kimi/device_id`; that path is shared with `kimi-cli` on purpose.
3. `Authorization` header is owned by `loader.fetch`. Anything else (opencode core, the SDK, future hooks) must be overridden. Our `loader` deletes both `authorization` and `Authorization` before setting its own. The private `x-opencode-kimi-*` transport headers are also consumed and stripped there; they must never leak upstream.
4. Effort ↔ fields mapping (kimi-cli `llm.py` / `kosong/chat_provider/kimi.py`):

   | Effort | `reasoning_effort` | `thinking` |
   |---|---|---|
   | `auto` | (omitted) | (omitted) |
   | `off` | (omitted) | `{type:"disabled"}` |
   | `low` | `"low"` | `{type:"enabled"}` |
   | `medium` | `"medium"` | `{type:"enabled"}` |
   | `high` | `"high"` | `{type:"enabled"}` |
   | `xhigh` | `"high"` (clamped) | `{type:"enabled"}` |
   | `max` | `"high"` (clamped) | `{type:"enabled"}` |

   `auto` is the "let the server decide dynamically" variant: neither field is sent, matching kimi-cli's "nothing passed" default. `xhigh` and `max` are clamped to `"high"` because Kimi's backend does not support higher tiers (kimi-cli's `Kimi.with_thinking()` does the same). When no effort is set at all, the plugin still emits `thinking: {type: "enabled"}` because the model is a reasoner. Compute this from `input.model.options` plus `input.model.variants[input.message.model.variant]`, not from `input.provider.info.id`. The `@opencode-ai/plugin` `ProviderContext` type claims `.info.id` exists, but the runtime shape opencode passes (see `research/opencode/packages/opencode/src/session/llm.ts::stream`, ~line 168, `provider: item`) is the flat `ProviderConfig` (`.id`). `input.model.providerID` is what every first-party plugin uses (`cloudflare.ts`, `codex.ts`, `github-copilot/copilot.ts`) and it avoids the runtime crash "undefined is not an object (evaluating 'input.provider.info.id')". Tested live 2026-04-17.
5. `prompt_cache_key` only for `kimi-for-coding`. Never attach it to unrelated models. The check is `input.model.id === MODEL_ID` in the Kimi chat hooks, and the actual wire injection happens in `loader.fetch`.
6. Wire model id comes from `/coding/v1/models`, not from user config. The opencode-side model id is a stable alias (`MODEL_ID = "kimi-for-coding"`); the plugin calls `GET /coding/v1/models` at login and on every token refresh (mirroring kimi-cli's `refresh_managed_models` in `research/kimi-cli/src/kimi_cli/auth/platforms.py`), caches the first returned `{id, context_length, display_name, supports_image_in, supports_video_in}` in loader memory, rewrites the JSON body `model` field inside `loader.fetch` whenever the discovered id differs from `MODEL_ID`, and backfills runtime model metadata from the same discovery response. A new loader instance re-discovers on first use if needed. Do not strip the `kimi-` prefix; send whatever the server returned. Discovery failures are non-fatal (warm cached id still works; 401 retry flushes broken tokens).
7. Auth store is opencode's, not kimi-cli's. We use opencode's auth store for tokens under the `kimi-for-coding-oauth` provider id. Do not read/write `~/.kimi/credentials/kimi-code.json`; that's kimi-cli's file and sharing it across independent apps causes token-race bugs. The plugin may live-read opencode's `auth.json` entry for this provider to bypass stale `OPENCODE_AUTH_CONTENT` workspace snapshots, but writes still go through opencode's auth store (`client.auth.set`). Also note that opencode's SDK auth schema only persists the standard oauth fields, so model discovery metadata cannot be stored there durably.
8. Provider id must not collide with any id in the models.dev catalog. models.dev publishes `kimi-for-coding` as a separate API-key-driven integration. If we registered under that same id, `opencode auth login kimi-for-coding` would surface two methods under one entry and users could silently land on the wrong integration path. We deliberately use `kimi-for-coding-oauth` instead; `MODEL_ID` on the wire stays `kimi-for-coding` (rule 6).
9. `src/index.ts` must have exactly one export: the default `PluginModule` object `{ id, server }`. opencode's plugin loader (`research/opencode/packages/opencode/src/plugin/index.ts`) first tries `readV1Plugin` (detect mode) on the default export. If it finds an object with `server` (and optional `id`), it uses the v1 path directly. The older legacy path (`getLegacyPlugins`) iterates every export and throws `Plugin export is not a function` on any non-callable value, a problem that surfaced on Windows where Bun's standalone-binary dynamic imports can produce module namespace objects with unexpected non-function metadata. The v1 format bypasses `getLegacyPlugins` entirely. Keep constants in `src/constants.ts` and import them in `src/index.ts` rather than re-exporting. `test/exports.test.ts` guards this. The failure mode of a broken export is silent in the CLI (the provider just doesn't appear in `opencode auth login`); the error only surfaces in `~/.local/share/opencode/log/*.log`.
10. The post-login config hint must not emit a partial `limit` object. opencode's live config schema at `https://opencode.ai/config.json` requires both `limit.context` and `limit.output` whenever `limit` is present, while Kimi's `GET /coding/v1/models` only gives us `context_length`. Therefore `buildConfigBlock()` omits `limit` entirely and leaves `provider.models` to backfill `limit.context` at runtime. Do not invent `output` or set `input` heuristically; opencode's overflow logic treats `limit.input` as authoritative (`research/opencode/packages/opencode/src/session/overflow.ts`).
11. Concurrent refreshes must collapse to one in-flight OAuth exchange, even across plugin instances. `provider.models` and `auth.loader` can both notice an expiring token at about the same time, and separate opencode workspace/plugin instances can inherit stale auth snapshots. `refreshAuth()` in `src/index.ts` therefore shares one promise across overlapping callers, takes a provider-scoped auth-store lock before refreshing, re-reads opencode's live auth-store entry under that lock, and treats a changed on-disk token chain as authoritative. `test/plugin.test.ts` covers loader-vs-loader, provider.models-vs-loader, cross-instance lock reuse, and the `invalid_grant` self-heal path where another process already rotated the refresh token.
12. Media-input capabilities must be backfilled from `/coding/v1/models`. `supports_image_in` and `supports_video_in` from Kimi discovery are not cosmetic metadata: opencode's provider transform (`research/opencode/packages/opencode/src/provider/transform.ts::unsupportedParts`) rewrites every image part into local `ERROR: Cannot read ... (this model does not support image input)` text before the request reaches our loader when `capabilities.input.image` is false. Therefore `provider.models` must patch runtime model metadata for `kimi-for-coding`, and `buildConfigBlock()` must include `attachment: true` plus appropriate `modalities.input` / `modalities.output` when discovery says images/video are supported. `test/plugin.test.ts` covers both paths.
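The effort mapping in rule 4 is small enough to express as one exhaustive function. The sketch below is illustrative only: `effortToWireFields`, `Effort`, and `WireFields` are hypothetical names, and the real plugin reads the effort from `input.model.options` and the selected variant rather than taking it as a plain argument.

```typescript
// Hypothetical sketch of the rule 4 table; names are not the plugin's own.
type Effort = "auto" | "off" | "low" | "medium" | "high" | "xhigh" | "max";

interface WireFields {
  reasoning_effort?: "low" | "medium" | "high";
  thinking?: { type: "enabled" | "disabled" };
}

function effortToWireFields(effort: Effort | undefined): WireFields {
  switch (effort) {
    case undefined:
      // No effort configured at all: still enable thinking (the model is a reasoner).
      return { thinking: { type: "enabled" } };
    case "auto":
      // Let the server decide: send neither field.
      return {};
    case "off":
      return { thinking: { type: "disabled" } };
    case "low":
    case "medium":
    case "high":
      return { reasoning_effort: effort, thinking: { type: "enabled" } };
    case "xhigh":
    case "max":
      // Clamped: the backend does not support tiers above "high".
      return { reasoning_effort: "high", thinking: { type: "enabled" } };
  }
}
```

Writing the mapping as a `switch` over a closed union lets the compiler flag a new effort tier that has no row yet, which is one way to keep the table and the code from drifting apart.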
- Code style: see `tsconfig.json` (strict, `noUncheckedIndexedAccess`, ES2022). Prefer small pure functions, avoid `try/catch` except where we genuinely convert one error shape to another.
- Comments: match the existing density and only explain non-obvious upstream-parity reasoning. Do not narrate the obvious ("// refresh the token"); instead reference upstream files when the reasoning is "because kimi-cli does it that way".
- Dependencies: runtime deps are limited to `@opentui/core` and `@opentui/solid` (for the TUI slash command). The only dev/peer dep is `@opencode-ai/plugin` for types. Do not add further runtime deps.
- Git commits: small, logical, imperative subject ("Add oauth device flow"). Do not add a `Co-authored-by` trailer.
- Upstream research: the `research/` directory is a read-only git-ignored pair of shallow clones (opencode + kimi-cli) for grep. Never edit files there; re-clone if you suspect drift. When citing upstream in a comment, use the `research/…` path so the reference is resolvable.
- Version bumps: when kimi-cli bumps, (1) pull a fresh `research/kimi-cli`, (2) update `KIMI_CLI_VERSION` in `src/constants.ts`, (3) re-diff `_kimi_default_headers()` / `oauth.py` against `src/headers.ts` and `src/oauth.ts`, (4) smoke-test with `opencode auth login kimi-for-coding-oauth` and a one-turn chat, (5) tag release.
- Tests: `test/` holds one file per source file plus `test/exports.test.ts` (the rule-9 guard). Tests mock `fetch` via `test/_util/fetchMock.ts`; no real credentials or network. They use the real `~/.kimi/device_id` on purpose: it is shared with kimi-cli by design and `getDeviceId` is idempotent, so tests don't clobber state. When adding a new contract to the list above, add the matching offline check to the corresponding test file rather than creating new ones.
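The idempotence that the Tests convention relies on for `getDeviceId` follows a read-or-create pattern. The sketch below shows that pattern against an injected store so it runs without touching disk; the `TextStore` interface, the `memoryStore` helper, and the two-argument signature are hypothetical, since the real function is described only as reading and writing `~/.kimi/device_id`.

```typescript
// Hypothetical sketch: the real getDeviceId() works on ~/.kimi/device_id;
// the injected store here is an assumption made to keep the example offline.
interface TextStore {
  read(): string | undefined;
  write(value: string): void;
}

function getDeviceId(store: TextStore, generate: () => string): string {
  const existing = store.read()?.trim();
  if (existing) {
    // An id already exists: reuse it, never regenerate.
    return existing;
  }
  const fresh = generate();
  store.write(fresh);
  return fresh;
}

// In-memory stand-in for the on-disk file, for illustration only.
function memoryStore(initial?: string): TextStore {
  let value = initial;
  return {
    read: () => value,
    write: (next) => {
      value = next;
    },
  };
}
```

Because a second call finds the value written by the first, repeated calls return the same id and the generator runs at most once per store, which is why tests can share the real file without clobbering it.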
- ❌ Don't add heuristics that look at the model id outside of the Kimi chat hooks / `loader.fetch`. The auth loader is already scoped to this provider; only the chat hooks and the body rewrite need to match on `kimi-for-coding`.
- ❌ Don't rename the provider id back to `kimi-for-coding` or to anything else listed in models.dev. See rule 8.
- ❌ Don't add new header values that kimi-cli doesn't send. The fingerprint matters.
- ❌ Don't call out to other files to "share" the kimi-cli credentials. Different OAuth consumers must have independent refresh-token chains or one will invalidate the other.
- ❌ Don't introduce a build step. The plugin ships as `.ts` and opencode's bun-based loader handles it.
- ❌ Don't add tests that require real Kimi credentials and check them in. If you add offline unit tests, put them under `test/` and mock `fetch`.
- ❌ Don't add named exports to `src/index.ts` or change the default export away from the `{ id, server }` PluginModule shape. See rule 9.
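The last item above (and rule 9) amounts to a shape check on the module namespace. A guard in the spirit of `test/exports.test.ts` might look like the following sketch; `checkPluginModuleShape` is a hypothetical helper, and treating `server` as a function and `id` as a string are assumptions, since this file only says the default export is an object with `server` and an optional `id`.

```typescript
// Hypothetical sketch of an export-shape guard; not the actual test file.
// Returns a list of problems; an empty list means the shape is acceptable.
function checkPluginModuleShape(ns: Record<string, unknown>): string[] {
  const problems: string[] = [];

  // Anything other than `default` is a named export, which rule 9 forbids.
  const extra = Object.keys(ns).filter((name) => name !== "default");
  if (extra.length > 0) {
    problems.push(`unexpected named exports: ${extra.join(", ")}`);
  }

  const def = ns["default"];
  if (typeof def !== "object" || def === null) {
    problems.push("default export is not an object");
    return problems;
  }

  const mod = def as { id?: unknown; server?: unknown };
  // Assumption: `server` is callable and `id`, when present, is a string.
  if (typeof mod.server !== "function") {
    problems.push("default.server is not a function");
  }
  if (mod.id !== undefined && typeof mod.id !== "string") {
    problems.push("default.id is present but not a string");
  }
  return problems;
}
```

In a real test the argument would be the namespace object from a dynamic `import()` of `src/index.ts`; returning a list of problems instead of throwing keeps the failure message specific.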
Offline:

    bunx tsc --noEmit                                   # type-check
    bunx tsc --noEmit --project tsconfig.tests.json     # type-check tests/helpers
    bun build --target=node --no-bundle src/index.ts    # syntax check
    bun test                                            # offline unit tests

Online (requires a real Kimi-for-coding account):

1. Install the local checkout via opencode's plugin flow (`opencode plugin /path/to/this/repo --global`) or point the `plugin` array in your opencode config at the repo root, as shown in `README.md`.
2. Paste the provider block from `README.md` into your opencode config.
3. Run `opencode auth login kimi-for-coding-oauth` and confirm a token lands in opencode's `auth.json` with `type: "oauth"`, a JWT `access`, and `expires` ~15 min in the future.
4. Start opencode, select `kimi-for-coding-oauth/kimi-for-coding`, and ask the model to self-identify. It should claim to be `kimi-for-coding` / Kimi Code.
5. Confirm `reasoning_content` deltas render as thinking content (not assistant text).
6. In a second turn of the same session, confirm the response comes back faster (cache hit via `prompt_cache_key`).

If any of 3–6 fails, diff research/kimi-cli against the contracts above.
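The manual check in step 3 can be approximated in code. The sketch below is an illustration under stated assumptions: `checkOauthEntry` is a hypothetical helper, `expires` is assumed to be a Unix timestamp in milliseconds, the 20-minute upper bound is an arbitrary tolerance around the "~15 min" figure, and "looks like a JWT" only means three dot-separated non-empty segments (no signature verification).

```typescript
// Hypothetical sketch for eyeballing a stored auth entry; field semantics
// (expires in epoch milliseconds) are assumptions, not confirmed behavior.
interface StoredAuthEntry {
  type?: unknown;
  access?: unknown;
  expires?: unknown;
}

function checkOauthEntry(entry: StoredAuthEntry, nowMs: number): string[] {
  const problems: string[] = [];

  if (entry.type !== "oauth") {
    problems.push('type is not "oauth"');
  }

  const access = entry.access;
  const looksLikeJwt =
    typeof access === "string" &&
    access.split(".").length === 3 &&
    access.split(".").every((segment) => segment.length > 0);
  if (!looksLikeJwt) {
    problems.push("access does not look like a JWT");
  }

  if (typeof entry.expires !== "number") {
    problems.push("expires is not a number");
  } else {
    const minutesLeft = (entry.expires - nowMs) / 60_000;
    // Arbitrary tolerance: anything already expired or far beyond ~15 min is suspicious.
    if (minutesLeft <= 0 || minutesLeft > 20) {
      problems.push(`expires is ${minutesLeft.toFixed(1)} min away, expected roughly 15`);
    }
  }
  return problems;
}
```

Passing `nowMs` explicitly instead of calling `Date.now()` inside the function keeps the check deterministic.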
- Read this file first. Every time.
- Don't grow the dependency footprint to "simplify" something; this plugin's value is being small and audit-able.
- When in doubt, mirror kimi-cli exactly, then comment the upstream reference. "We used to deviate, it broke" — document it here.
- Keep `README.md` user-focused and this file contributor-focused. If you catch yourself duplicating, move content here and link from the README.
- Any new rule you add here must have a real incident or a grep-verified upstream source behind it. No speculative "best practices".