You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Add three subscription / pay-as-you-go provider integrations that the proxy currently routes only through generic openai-compatibility blocks (PR #3990 already mentions minimax-m3, kimi-k2.7-code, glm-5.2 passing through):
OpenCode Go — $5 first month / $10/mo subscription proxy exposing 12+ open coding models behind a single API key.
MiniMax M3 — MiniMax's frontier 1M-context MSA model (Token Plan + Pay-as-you-go).
Subscriptions: Token Plan (monthly) + 10% invite rebate; Audio Subscription tiers (Starter / Standard / Pro / Scale) for voice side.
How to learn remaining quota: GET https://api.minimax.io/v1/dashboard/billing/credit_grants with the same API key returns grants[].remaining, expires_at.
B. UI — router-for-me/Cli-Proxy-API-Management-Center
Three new entries under AI Providers → OpenAI-compatible with pre-filled base URLs and a "fetch models" button that hits /v1/models on the upstream.
New "Quota & Limits" widget on the Quota Management page fed by the new endpoint.
i18n strings for en / zh-CN / zh-TW / ru.
C. Pricing data
Add a models-pricing.json (or extend the existing example) with the three model families so CPA-Manager-Plus LiteLLM / OpenRouter sync matches without manual edits.
Acceptance criteria
A new provider entry can be created through CPAMC and survive a restart.
GET /v1/models surfaces opencode-go/*, minimax/*, glm-5.2, glm-5.2-1m.
A request to each provider returns a successful response via curl http://localhost:8317/v1/chat/completions.
/v0/management/usage/limits reports at least the OAuth-backed quota providers.
No regression in existing Gemini / Codex / Claude / Grok / Kimi / Z.AI OAuth flows.
Summary
Add three subscription / pay-as-you-go provider integrations that the proxy currently routes only through generic
openai-compatibilityblocks (PR #3990 already mentionsminimax-m3,kimi-k2.7-code,glm-5.2passing through):For each: API-key auth flow, model alias, pricing/limit metadata, and rate-limit/quota telemetry exposed through the Management API.
Research: API cost, limits, and example calls
1. OpenCode Go
https://opencode.ai/zen/go/v1/chat/completions— OpenAI-compatible (GLM, Kimi, DeepSeek, MiMo)https://opencode.ai/zen/go/v1/messages— Anthropic-compatible (MiniMax, Qwen)https://opencode.ai/zen/go/v1/models— model listglm-5.2,glm-5.1,kimi-k2.7,kimi-k2.6,deepseek-v4-pro,deepseek-v4-flash,mimo-v2.5,mimo-v2.5-prominimax-m3,minimax-m2.7,minimax-m2.5,qwen3.7-max,qwen3.7-plus,qwen3.6-plusopencode-go/<model-id>.Example (OpenAI-compat path):
Example (Anthropic-compat path):
2. MiniMax M3
https://api.minimax.io/v1/text/chatcompletion_v2https://api.minimax.io/v1/modelsminimax/minimax-m3(also served by Parasail MXFP8, Together AI, Novita, SiliconFlow, MiniMax, Makora MXFP8, GMI).Priority tier (
service_tier: "priority", 1.5×): $0.45/$1.80, $0.90/$3.60.GET https://api.minimax.io/v1/dashboard/billing/credit_grantswith the same API key returnsgrants[].remaining,expires_at.Example call:
3. GLM-5.2 (Z.ai / Zhipu AI)
https://api.z.ai/api/paas/v4/chat/completionshttps://open.bigmodel.cn/api/paas/v4/chat/completions/zai-auth-url,/bigmodel-auth-url).Example call:
How to read remaining limits in general
X-RateLimit-*/x-ratelimit-*response headers./v0/management/quota.quota-exceeded.on-payment-required: "disable"(PR feat: add on-payment-required option to auto-disable key on 402 #3978).Proposal
A. Backend —
router-for-me/CLIProxyAPIinternal/runtime/executor/opencode_go/using bothopenai-compatandanthropic-compatpaths.https://opencode.ai/zen/go, models auto-fetched from/v1/modelson start.https://api.minimax.io/v1; first-class aliasesMiniMax-M3,MiniMax-M2.7,MiniMax-M2.5. Optional OpenRouter passthrough config (base_url=https://openrouter.ai/api/v1, headerHTTP-Referer/X-Title).GLM-5.2andGLM-5.2[1m]for 1M context.https://api.z.ai/api/paas/v4; thinking-effort selector (high/max); peak-hour quota multiplier surfaced.service_tier,quota_window(5h/weekly), token totals, and cache hit/miss split.GET /v0/management/usage/limitsreturning per-provider{limit, used, window, reset_at}so CPAMC can render bars (companion to issue 偶尔会弹出无效API key提示,“400 API key not valid. Please pass a valid API key.” #2 of this report).B. UI —
router-for-me/Cli-Proxy-API-Management-Center/v1/modelson the upstream.C. Pricing data
Add a
models-pricing.json(or extend the existing example) with the three model families so CPA-Manager-Plus LiteLLM / OpenRouter sync matches without manual edits.Acceptance criteria
GET /v1/modelssurfacesopencode-go/*,minimax/*,glm-5.2,glm-5.2-1m.curl http://localhost:8317/v1/chat/completions./v0/management/usage/limitsreports at least the OAuth-backed quota providers.