Add subscription/usage-aware providers: OpenCode Go, MiniMax M3, GLM-5.2 (API cost + limit tracking) #4010

Description

@myagizmaktav

Summary

Add three subscription / pay-as-you-go provider integrations that the proxy currently routes only through generic openai-compatibility blocks (PR #3990 already mentions minimax-m3, kimi-k2.7-code, glm-5.2 passing through):

  1. OpenCode Go — $5 first month / $10/mo subscription proxy exposing 12+ open coding models behind a single API key.
  2. MiniMax M3 — MiniMax's frontier 1M-context MSA model (Token Plan + Pay-as-you-go).
  3. GLM-5.2 — Z.ai's MIT-licensed 1M-context coding model (ZCode Coding Plan + Pay-as-you-go; OAuth login is work in progress in PR #3925, "feat(zai): Z.AI / ZCode (GLM) OAuth login with API-key provisioning", and CPAMC #323, "feat(registry): add support for Claude Opus 4.5 model").

For each: API-key auth flow, model alias, pricing/limit metadata, and rate-limit/quota telemetry exposed through the Management API.


Research: API cost, limits, and example calls

1. OpenCode Go

  • Subscription: $5 first month, then $10/month. Top-up credit available. Only one member per workspace can subscribe.
  • Endpoints:
    • https://opencode.ai/zen/go/v1/chat/completions — OpenAI-compatible (GLM, Kimi, DeepSeek, MiMo)
    • https://opencode.ai/zen/go/v1/messages — Anthropic-compatible (MiniMax, Qwen)
    • https://opencode.ai/zen/go/v1/models — model list
  • Models (June 2026):
    • chat/completions: glm-5.2, glm-5.1, kimi-k2.7, kimi-k2.6, deepseek-v4-pro, deepseek-v4-flash, mimo-v2.5, mimo-v2.5-pro
    • messages: minimax-m3, minimax-m2.7, minimax-m2.5, qwen3.7-max, qwen3.7-plus, qwen3.6-plus
  • Limits: "Generous limits" advertised; MiniMax M3 currently gets 3× usage limits during a limited-time promo.
  • Config ID prefix: opencode-go/<model-id>.
  • Docs: https://opencode.ai/docs/go/

Example (OpenAI-compat path):

curl https://opencode.ai/zen/go/v1/chat/completions \
  -H "Authorization: Bearer $OPENCODE_GO_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"opencode-go/glm-5.2","messages":[{"role":"user","content":"hi"}]}'

Example (Anthropic-compat path):

curl https://opencode.ai/zen/go/v1/messages \
  -H "x-api-key: $OPENCODE_GO_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"opencode-go/minimax-m3","max_tokens":1024,"messages":[{"role":"user","content":"hi"}]}'
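
The split between the two compatibility paths can be sketched as a small lookup. This is a minimal illustration, assuming the model lists above are complete; the function name `opencode_go_endpoint` and the choice to accept an optional `opencode-go/` prefix are hypothetical, not part of the proxy.

```python
# Hypothetical routing helper built from the June 2026 model lists above.
OPENCODE_GO_BASE = "https://opencode.ai/zen/go/v1"

# Models served on the OpenAI-compatible path.
CHAT_COMPLETIONS_MODELS = {
    "glm-5.2", "glm-5.1", "kimi-k2.7", "kimi-k2.6",
    "deepseek-v4-pro", "deepseek-v4-flash", "mimo-v2.5", "mimo-v2.5-pro",
}

# Models served on the Anthropic-compatible path.
MESSAGES_MODELS = {
    "minimax-m3", "minimax-m2.7", "minimax-m2.5",
    "qwen3.7-max", "qwen3.7-plus", "qwen3.6-plus",
}


def opencode_go_endpoint(model_id: str) -> str:
    """Return the upstream URL for a model ID, with or without the opencode-go/ prefix."""
    name = model_id.removeprefix("opencode-go/")  # Python 3.9+
    if name in CHAT_COMPLETIONS_MODELS:
        return f"{OPENCODE_GO_BASE}/chat/completions"
    if name in MESSAGES_MODELS:
        return f"{OPENCODE_GO_BASE}/messages"
    raise ValueError(f"unknown OpenCode Go model: {model_id}")
```

A static table like this goes stale as the catalogue changes, so a real executor would more likely derive it from the /v1/models response.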

2. MiniMax M3

  • Provider: MiniMax (Open Platform).
  • Direct endpoint: https://api.minimax.io/v1/text/chatcompletion_v2
  • Models list: https://api.minimax.io/v1/models
  • OpenRouter slug: minimax/minimax-m3 (also served by Parasail MXFP8, Together AI, Novita, SiliconFlow, MiniMax, Makora MXFP8, GMI).
  • Context: 1M tokens (guaranteed 512K); max output 131,072.
  • Pay-as-you-go pricing per 1M tokens (permanent 50% discount already applied):

    Tier            Input    Output   Cache read
    ≤ 512K input    $0.30    $1.20    $0.06
    > 512K input    $0.60    $2.40    $0.12

    Priority tier (service_tier: "priority", 1.5×): $0.45 input / $1.80 output for ≤ 512K input, $0.90 input / $3.60 output for > 512K input.

  • Subscriptions: Token Plan (monthly) plus a 10% invite rebate; Audio Subscription tiers (Starter / Standard / Pro / Scale) cover the voice products.
  • How to learn remaining quota: GET https://api.minimax.io/v1/dashboard/billing/credit_grants with the same API key returns grants[].remaining, expires_at.
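
A quota widget could reduce the credit_grants response to a single number. The sketch below assumes the payload looks like `{"grants": [{"remaining": ..., "expires_at": ...}]}` with `expires_at` in Unix seconds; only the field names come from the description above, so the nesting and the timestamp format are guesses to verify against a real response.

```python
import time


def remaining_credit(payload, now=None):
    """Sum `remaining` over grants that have not expired yet.

    Assumed shape (unverified): {"grants": [{"remaining": 12.5, "expires_at": 1780000000}]}
    where expires_at is a Unix timestamp in seconds.
    """
    now = time.time() if now is None else now
    return sum(
        grant["remaining"]
        for grant in payload.get("grants", [])
        # A grant without expires_at is treated as never expiring.
        if grant.get("expires_at") is None or grant["expires_at"] > now
    )
```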

Example call:

curl https://api.minimax.io/v1/text/chatcompletion_v2 \
  -H "Authorization: Bearer $MINIMAX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniMax-M3",
    "messages": [{"role":"user","content":"Hello"}],
    "max_tokens": 1024
  }'
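
Given the token counts of a call like the one above, the pay-as-you-go rates listed earlier in this section translate into a cost estimate. This is a rough sketch under stated assumptions: "512K" is read as 512,000 tokens, the tier is chosen from fresh plus cached input, and the 1.5× priority multiplier is applied to input and output only, since no priority cache-read price is listed. The function name is hypothetical.

```python
MILLION = 1_000_000
TIER_BOUNDARY = 512_000  # assumption: "512K" means 512,000 tokens
PRIORITY_MULTIPLIER = 1.5

# USD per 1M tokens: (input, output, cache read), from the pricing table.
RATES = {
    "standard": (0.30, 1.20, 0.06),  # <= 512K input
    "long": (0.60, 2.40, 0.12),      # > 512K input
}


def minimax_m3_cost(input_tokens, output_tokens, cached_tokens=0, priority=False):
    """Estimate the USD cost of one MiniMax M3 request."""
    # Assumption: cached tokens count toward the tier threshold.
    total_input = input_tokens + cached_tokens
    tier = "standard" if total_input <= TIER_BOUNDARY else "long"
    in_rate, out_rate, cache_rate = RATES[tier]
    if priority:
        # Assumption: the cache-read rate is not scaled by the priority tier.
        in_rate *= PRIORITY_MULTIPLIER
        out_rate *= PRIORITY_MULTIPLIER
    total = (
        input_tokens * in_rate
        + output_tokens * out_rate
        + cached_tokens * cache_rate
    )
    return total / MILLION
```

For example, 100,000 input tokens and 50,000 output tokens on the standard tier come to roughly $0.09.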

3. GLM-5.2 (Z.ai / Zhipu AI)

  • Provider: Z.ai (international), BigModel / 智谱 (China).
  • Release: 2026-06-16, MIT open weights on Hugging Face, 1M-token stable context.
  • Endpoints (OpenAI-compatible):
    • https://api.z.ai/api/paas/v4/chat/completions
    • https://open.bigmodel.cn/api/paas/v4/chat/completions
  • API pricing per 1M tokens:

    Item            Price
    Input           $1.40
    Output          $4.40
    Cached input    $0.26 (storage free, limited time)

  • Subscription (GLM Coding Plan, annual billing):

    Tier    Price         Allowance
    Lite    $12.60/mo     light repos
    Pro     $50.40/mo     mid repos (5× Lite)
    Max     $112.00/mo    heavy (20× Lite, peak priority)

Example call:

curl https://api.z.ai/api/paas/v4/chat/completions \
  -H "Authorization: Bearer $ZAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5.2",
    "messages": [{"role":"user","content":"Hello"}],
    "thinking": {"type": "enabled"}
  }'
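
To decide between the pay-as-you-go key and a Coding Plan, the two price lists above can be compared for a given monthly token volume. The sketch is price-only: the plan allowances are stated qualitatively, so it cannot tell whether a plan actually covers the workload. Function names are hypothetical.

```python
MILLION = 1_000_000

# USD per 1M tokens, from the API pricing table.
GLM_INPUT, GLM_OUTPUT, GLM_CACHED = 1.40, 4.40, 0.26

# Monthly GLM Coding Plan prices under annual billing.
CODING_PLANS = {"Lite": 12.60, "Pro": 50.40, "Max": 112.00}


def glm_payg_cost(input_tokens, output_tokens, cached_tokens=0):
    """Pay-as-you-go cost in USD for the given token counts."""
    total = (
        input_tokens * GLM_INPUT
        + output_tokens * GLM_OUTPUT
        + cached_tokens * GLM_CACHED
    )
    return total / MILLION


def plans_cheaper_than(monthly_payg_cost):
    """Names of plans priced below the monthly pay-as-you-go cost.

    Price comparison only; says nothing about whether the plan's
    allowance is large enough for the same workload.
    """
    return [name for name, price in CODING_PLANS.items() if price < monthly_payg_cost]
```

For instance, 10M input and 2M output tokens in a month cost about $22.80 pay-as-you-go, which is above the Lite price but below Pro.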

How to read remaining limits in general

  1. First-party: provider status page + X-RateLimit-* / x-ratelimit-* response headers.
  2. CPA docs: https://help.router-for.me/ (Management API → quota endpoints).
  3. OAuth providers (Claude / Codex / Grok / Kimi / ZCode): per-account quota already exposed via /v0/management/quota.
  4. API-key providers: per-key cooldown/disable state in the proxy logs. Config supports quota-exceeded.on-payment-required: "disable" (PR #3978, "feat: add on-payment-required option to auto-disable key on 402").
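
For the first-party header route in item 1, the casing differs between providers, so a collector should match case-insensitively. A minimal sketch, assuming the response headers are available as a plain name-to-value mapping; the suffixes after the prefix vary by provider and are passed through unchanged rather than interpreted.

```python
RATELIMIT_PREFIX = "x-ratelimit-"


def ratelimit_headers(headers):
    """Collect X-RateLimit-* headers regardless of case.

    Returns a dict keyed by the lowercased suffix after the prefix,
    for example {"remaining": "42"} for "X-RateLimit-Remaining: 42".
    """
    return {
        name.lower()[len(RATELIMIT_PREFIX):]: value
        for name, value in headers.items()
        if name.lower().startswith(RATELIMIT_PREFIX)
    }
```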

Proposal

A. Backend — router-for-me/CLIProxyAPI

  1. OpenCode Go provider
    • New executor under internal/runtime/executor/opencode_go/ using both openai-compat and anthropic-compat paths.
    • Registers a single API-key provider with base URL https://opencode.ai/zen/go, models auto-fetched from /v1/models on start.
  2. MiniMax M3 provider
    • OpenAI-compatible provider with base URL https://api.minimax.io/v1; first-class aliases MiniMax-M3, MiniMax-M2.7, MiniMax-M2.5. Optional OpenRouter passthrough config (base_url=https://openrouter.ai/api/v1, header HTTP-Referer/X-Title).
  3. GLM-5.2 (extends PR #3925, "feat(zai): Z.AI / ZCode (GLM) OAuth login with API-key provisioning")
    • Model alias GLM-5.2 and GLM-5.2[1m] for 1M context.
    • Pay-as-you-go key variant via OpenAI-compat with base URL https://api.z.ai/api/paas/v4; thinking-effort selector (high/max); peak-hour quota multiplier surfaced.
  4. Usage-limit telemetry
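
One way to picture how the three backend providers fit together is a resolver from a client-facing model ID to a provider and base URL. This is an illustrative sketch only: the provider keys `minimax` and `zai`, the `resolve_model` function, and the decision to pass aliases upstream unchanged are all assumptions; the base URLs and aliases are the ones proposed above.

```python
# Base URLs from the proposal; the "minimax" and "zai" keys are hypothetical.
PROVIDER_BASE_URLS = {
    "opencode-go": "https://opencode.ai/zen/go",
    "minimax": "https://api.minimax.io/v1",
    "zai": "https://api.z.ai/api/paas/v4",
}

# First-class aliases proposed above, mapped to their provider.
ALIASES = {
    "MiniMax-M3": "minimax",
    "MiniMax-M2.7": "minimax",
    "MiniMax-M2.5": "minimax",
    "GLM-5.2": "zai",
    "GLM-5.2[1m]": "zai",
}


def resolve_model(model_id):
    """Return (provider, base_url, upstream_model) for a client-facing model ID."""
    prefix = "opencode-go/"
    if model_id.startswith(prefix):
        # The prefix selects the provider; the remainder is the model name.
        return "opencode-go", PROVIDER_BASE_URLS["opencode-go"], model_id[len(prefix):]
    provider = ALIASES.get(model_id)
    if provider is None:
        raise KeyError(f"no provider registered for model: {model_id}")
    # Assumption: aliases are sent upstream as-is; a real mapping may rewrite them.
    return provider, PROVIDER_BASE_URLS[provider], model_id
```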

B. UI — router-for-me/Cli-Proxy-API-Management-Center

  1. Three new entries under AI Providers → OpenAI-compatible with pre-filled base URLs and a "fetch models" button that hits /v1/models on the upstream.
  2. New "Quota & Limits" widget on the Quota Management page fed by the new endpoint.
  3. i18n strings for en / zh-CN / zh-TW / ru.

C. Pricing data

Add a models-pricing.json (or extend the existing example) covering the three model families, so that the CPA-Manager-Plus LiteLLM / OpenRouter sync matches them without manual edits.


Acceptance criteria

  • A new provider entry can be created through CPAMC and survives a restart.
  • GET /v1/models surfaces opencode-go/*, minimax/*, glm-5.2, glm-5.2-1m.
  • A request to each provider returns a successful response via curl http://localhost:8317/v1/chat/completions.
  • /v0/management/usage/limits reports at least the OAuth-backed quota providers.
  • No regression in existing Gemini / Codex / Claude / Grok / Kimi / Z.AI OAuth flows.
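
The GET /v1/models criterion lends itself to an automated check. A minimal sketch, assuming the model IDs have already been extracted from the response into a list; the helper name `missing_patterns` is hypothetical, and the patterns are copied from the criterion above.

```python
from fnmatch import fnmatchcase

# Patterns taken from the acceptance criterion for GET /v1/models.
EXPECTED_PATTERNS = ["opencode-go/*", "minimax/*", "glm-5.2", "glm-5.2-1m"]


def missing_patterns(model_ids, patterns=EXPECTED_PATTERNS):
    """Return the expected patterns that no listed model ID matches."""
    return [
        pattern
        for pattern in patterns
        if not any(fnmatchcase(model_id, pattern) for model_id in model_ids)
    ]
```

An empty result means every expected family is present; anything else names the gap directly.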
