Skip to content

Latest commit

 

History

History
128 lines (94 loc) · 16.8 KB

File metadata and controls

128 lines (94 loc) · 16.8 KB

AGENTS.md — working notes for coding agents (and humans)

This file is the single source of truth for any AI agent (or human) modifying this repo. Read it top-to-bottom before touching code. If something you learn here contradicts what you see in the code, the code wins — update this file in the same commit.

User-facing install / usage documentation lives in README.md. Do not duplicate it here.


Purpose

One plugin, one job: make opencode talk to Kimi's kimi-for-coding endpoint exactly the way the official kimi-cli does. Everything in this repo exists to minimize drift from upstream kimi-cli.

The one rule that matters

Moonshot's coding backend is entitlement-sensitive: the model-name string alone is not the whole story.

Every design decision here follows from that: we do device-flow OAuth to mirror official kimi-cli, we do not accept API keys in this plugin, and we do not let the upstream SDK attach its own Authorization header.

Non-goals

  • No support for any non-kimi-for-coding model. opencode already handles other Moonshot / Baseten / Alibaba-CN / etc. entries itself.
  • No support for static API keys. Users who want that can use a different opencode provider entry.
  • No custom SSE parser, tool-call normalizer, or message rewriter. @ai-sdk/openai-compatible already does SSE/reasoning_content correctly.

Architecture

Each source file has one job. Do not add new files unless the existing ones genuinely can't hold a new concern.

File Responsibility
src/constants.ts Pinned strings that must mirror upstream kimi-cli (version, endpoints, client id).
src/headers.ts The seven X-Msh-* / UA headers + the persistent ~/.kimi/device_id file.
src/oauth.ts Device-code start, device-code poll, refresh-token exchange, and GET /coding/v1/models discovery.
src/auth-store.ts Read/write opencode's auth.json entries for this provider.
src/auth-refresh.ts Lock-based token refresh with cross-instance coordination, ensureFreshStoredAuth for standalone callers.
src/index.ts Plugin entry (v1 PluginModule format). Wires auth (login + loader) plus the Kimi chat hooks/body rewrite.
src/usage.ts Fetch and parse Kimi subscription usage (/coding/v1/usages).
src/tui.tsx TUI slash command /kimi:usage — renders usage in an opencode dialog.

Data flow on a chat request:

  1. opencode asks the @ai-sdk/openai-compatible provider for a language model.
  2. Before instantiating it, opencode calls our auth.loader. We return { apiKey, fetch }.
  3. The SDK uses our fetch for every HTTP call (models, chat, whatever).
  4. Our fetch calls ensureFresh() → prefers the live opencode auth-store entry over stale OPENCODE_AUTH_CONTENT snapshots → maybe refreshes (sharing one in-flight promise in-process and a lock across plugin instances so they don't race the same refresh token) → lazily discovers /coding/v1/models when needed → sets Authorization + the seven X-Msh-* headers → on 401 refreshes once and retries.
  5. Separately, opencode runs chat.headers and chat.params. chat.headers computes thinking, reasoning_effort, and prompt_cache_key from input.model.options plus the selected input.message.model.variant, then passes them to loader.fetch via private x-opencode-kimi-* headers. loader.fetch strips those headers and injects the wire fields into the JSON body. chat.params mirrors the same keys into output.options only as a forward-compat fallback if opencode later fixes its openai-compatible providerOptions namespace mismatch.

Contracts to keep intact

These are the invariants that, if broken, silently route requests onto the wrong auth/backend path or produce fingerprint-based throttling. Do not "clean them up" without reading the linked upstream.

  1. X-Msh-Version and User-Agent must track kimi-cli. Bumping involves exactly one line in src/constants.ts. See upstream research/kimi-cli/src/kimi_cli/constant.py. The UA prefix is KimiCLI/ (not KimiCodeCLI/) — Moonshot's kimi-for-coding backend 403s with access_terminated_error: only available for Coding Agents such as Kimi CLI, Claude Code, Roo Code… on any other prefix. Likewise, X-Msh-Device-Model must mirror kimi-cli's _device_model() shape, including the Darwin/Windows special cases (macOS <version> <arch>, Windows 10/11 <arch>, Linux "{system} {release} {machine}") — NOT just {arch} — and X-Msh-Os-Version is the kernel build string from os.version(), NOT "{type} {release}". Tested live against api.kimi.com/coding/v1 on 2026-04-17 — any of those three fields off-spec → 403.

  2. X-Msh-Device-Id must be stable across runs. Never regenerate a fresh UUID at import time. getDeviceId() reads/writes ~/.kimi/device_id; that path is shared with kimi-cli on purpose.

  3. Authorization header is owned by loader.fetch. Anything else (opencode core, the SDK, future hooks) must be overridden. Our loader deletes both authorization and Authorization before setting its own. The private x-opencode-kimi-* transport headers are also consumed and stripped there; they must never leak upstream.

  4. Effort ↔ fields mapping (kimi-cli llm.py / kosong/chat_provider/kimi.py):

    Effort reasoning_effort thinking
    auto (omitted) (omitted)
    off (omitted) {type:"disabled"}
    low "low" {type:"enabled"}
    medium "medium" {type:"enabled"}
    high "high" {type:"enabled"}
    xhigh "high" (clamped) {type:"enabled"}
    max "high" (clamped) {type:"enabled"}

    auto is the "let the server decide dynamically" variant — neither field is sent, matching kimi-cli's "nothing passed" default. xhigh and max are clamped to "high" because Kimi's backend does not support higher tiers (kimi-cli's Kimi.with_thinking() does the same). When no effort is set at all, the plugin still emits thinking: {type: "enabled"} because the model is a reasoner. Compute this from input.model.options plus input.model.variants[input.message.model.variant], not from input.provider.info.id. The @opencode-ai/plugin ProviderContext type claims .info.id exists, but the runtime shape opencode passes (see research/opencode/packages/opencode/src/session/llm.ts::stream, ~line 168, provider: item) is the flat ProviderConfig (.id). input.model.providerID is what every first-party plugin uses (cloudflare.ts, codex.ts, github-copilot/copilot.ts) and it avoids the runtime crash "undefined is not an object (evaluating 'input.provider.info.id')". Tested live 2026-04-17.

  5. prompt_cache_key only for kimi-for-coding. Never attach it to unrelated models. The check is input.model.id === MODEL_ID in the Kimi chat hooks, and the actual wire injection happens in loader.fetch.

  6. Wire model id comes from /coding/v1/models, not from user config. The opencode-side model id is a stable alias (MODEL_ID = "kimi-for-coding"); the plugin calls GET /coding/v1/models at login and on every token refresh (mirroring kimi-cli's refresh_managed_models in research/kimi-cli/src/kimi_cli/auth/platforms.py), caches the first returned {id, context_length, display_name, supports_image_in, supports_video_in} in loader memory, rewrites the JSON body model field inside loader.fetch whenever the discovered id differs from MODEL_ID, and backfills runtime model metadata from the same discovery response. A new loader instance re-discovers on first use if needed. Do not strip the kimi- prefix; send whatever the server returned. Discovery failures are non-fatal (warm cached id still works; 401 retry flushes broken tokens).

  7. Auth store is opencode's, not kimi-cli's. We use opencode's auth store for tokens under the kimi-for-coding-oauth provider id. Do not read/write ~/.kimi/credentials/kimi-code.json; that's kimi-cli's file and sharing it across independent apps causes token-race bugs. The plugin may live-read opencode's auth.json entry for this provider to bypass stale OPENCODE_AUTH_CONTENT workspace snapshots, but writes still go through opencode's auth store (client.auth.set). Also note that opencode's SDK auth schema only persists the standard oauth fields, so model discovery metadata cannot be stored there durably.

  8. Provider id must not collide with any id in the models.dev catalog. models.dev publishes kimi-for-coding as a separate API-key-driven integration. If we registered under that same id, opencode auth login kimi-for-coding would surface two methods under one entry and users could silently land on the wrong integration path. We deliberately use kimi-for-coding-oauth instead; MODEL_ID on the wire stays kimi-for-coding (rule 6).

  9. src/index.ts must have exactly one export — the default PluginModule object { id, server }. opencode's plugin loader (research/opencode/packages/opencode/src/plugin/index.ts) first tries readV1Plugin (detect mode) on the default export. If it finds an object with server (and optional id), it uses the v1 path directly. The older legacy path (getLegacyPlugins) iterates every export and throws Plugin export is not a function on any non-callable value — a problem that surfaced on Windows where Bun's standalone-binary dynamic imports can produce module namespace objects with unexpected non-function metadata. The v1 format bypasses getLegacyPlugins entirely. Keep constants in src/constants.ts and import them in src/index.ts rather than re-exporting. test/exports.test.ts guards this. The failure mode of a broken export is silent in the CLI (the provider just doesn't appear in opencode auth login); the error only surfaces in ~/.local/share/opencode/log/*.log.

  10. The post-login config hint must not emit a partial limit object. opencode's live config schema at https://opencode.ai/config.json requires both limit.context and limit.output whenever limit is present, while Kimi's GET /coding/v1/models only gives us context_length. Therefore buildConfigBlock() omits limit entirely and leaves provider.models to backfill limit.context at runtime. Do not invent output or set input heuristically; opencode's overflow logic treats limit.input as authoritative (research/opencode/packages/opencode/src/session/overflow.ts).

  11. Concurrent refreshes must collapse to one in-flight OAuth exchange, even across plugin instances. provider.models and auth.loader can both notice an expiring token at about the same time, and separate opencode workspace/plugin instances can inherit stale auth snapshots. refreshAuth() in src/index.ts therefore shares one promise across overlapping callers, takes a provider-scoped auth-store lock before refreshing, re-reads opencode's live auth-store entry under that lock, and treats a changed on-disk token chain as authoritative. test/plugin.test.ts covers loader-vs-loader, provider.models-vs-loader, cross-instance lock reuse, and the invalid_grant self-heal path where another process already rotated the refresh token.

  12. Media-input capabilities must be backfilled from /coding/v1/models. supports_image_in and supports_video_in from Kimi discovery are not cosmetic metadata: opencode's provider transform (research/opencode/packages/opencode/src/provider/transform.ts::unsupportedParts) rewrites every image part into local ERROR: Cannot read ... (this model does not support image input) text before the request reaches our loader when capabilities.input.image is false. Therefore provider.models must patch runtime model metadata for kimi-for-coding, and buildConfigBlock() must include attachment: true plus appropriate modalities.input / modalities.output when discovery says images/video are supported. test/plugin.test.ts covers both paths.

Working on this repo

  • Code style: see tsconfig.json (strict, noUncheckedIndexedAccess, ES2022). Prefer small pure functions, avoid try/catch except where we genuinely convert one error shape to another.
  • Comments: match the existing density — only explain non-obvious upstream-parity reasoning. Do not narrate the obvious ("// refresh the token"); instead reference upstream files when the reasoning is "because kimi-cli does it that way".
  • Dependencies: runtime deps are limited to @opentui/core and @opentui/solid (for the TUI slash command). The only dev/peer dep is @opencode-ai/plugin for types. Do not add further runtime deps.
  • Git commits: small, logical, imperative subject ("Add oauth device flow"). Do not add a Co-authored-by trailer.
  • Upstream research: the research/ directory is a read-only git-ignored pair of shallow clones (opencode + kimi-cli) for grep. Never edit files there; re-clone if you suspect drift. When citing upstream in a comment, use the research/… path so the reference is resolvable.
  • Version bumps: when kimi-cli bumps, (1) pull a fresh research/kimi-cli, (2) update KIMI_CLI_VERSION in src/constants.ts, (3) re-diff _kimi_default_headers() / oauth.py against src/headers.ts and src/oauth.ts, (4) smoke-test with opencode auth login kimi-for-coding-oauth and a one-turn chat, (5) tag release.
  • Tests: test/ holds one file per source file plus test/exports.test.ts (the rule-9 guard). Tests mock fetch via test/_util/fetchMock.ts; no real credentials or network. They use the real ~/.kimi/device_id on purpose — it is shared with kimi-cli by design and getDeviceId is idempotent, so tests don't clobber state. When adding a new contract to the list above, add the matching offline check to the corresponding test file rather than creating new ones.

What not to do

  • ❌ Don't add heuristics that look at the model id outside of the Kimi chat hooks / loader.fetch. The auth loader is already scoped to this provider; only the chat hooks and the body rewrite need to match on kimi-for-coding.
  • ❌ Don't rename the provider id back to kimi-for-coding or to anything else listed in models.dev. See rule 8.
  • ❌ Don't add new header values that kimi-cli doesn't send. The fingerprint matters.
  • ❌ Don't call out to other files to "share" the kimi-cli credentials. Different OAuth consumers must have independent refresh-token chains or one will invalidate the other.
  • ❌ Don't introduce a build step. The plugin ships as .ts and opencode's bun-based loader handles it.
  • ❌ Don't add tests that require real Kimi credentials and check them in. If you add offline unit tests, put them under test/ and mock fetch.
  • ❌ Don't add named exports to src/index.ts or change the default export away from the { id, server } PluginModule shape. See rule 9.

How to verify a change

Offline:

bunx tsc --noEmit                                  # type-check
bunx tsc --noEmit --project tsconfig.tests.json    # type-check tests/helpers
bun build --target=node --no-bundle src/index.ts   # syntax check
bun test                                           # offline unit tests

Online (requires a real Kimi-for-coding account):

  1. Install the local checkout via opencode's plugin flow (opencode plugin /path/to/this/repo --global) or point the plugin array in your opencode config at the repo root, as shown in README.md.
  2. Paste the provider block from README.md into your opencode config.
  3. opencode auth login kimi-for-coding-oauth — confirm a token lands in opencode's auth.json with type: "oauth", a JWT access, and expires ~15 min in the future.
  4. Start opencode, select kimi-for-coding-oauth/kimi-for-coding, and ask the model to self-identify. It should claim to be kimi-for-coding / Kimi Code.
  5. Confirm reasoning_content deltas render as thinking content (not assistant text).
  6. In a second turn of the same session, confirm the response comes back faster (cache hit via prompt_cache_key).

If any of 3–6 fails, diff research/kimi-cli against the contracts above.

House rules for AI agents

  • Read this file first. Every time.
  • Don't grow the dependency footprint to "simplify" something; this plugin's value is being small and audit-able.
  • When in doubt, mirror kimi-cli exactly, then comment the upstream reference. "We used to deviate, it broke" — document it here.
  • Keep README.md user-focused and this file contributor-focused. If you catch yourself duplicating, move content here and link from the README.
  • Any new rule you add here must have a real incident or a grep-verified upstream source behind it. No speculative "best practices".