Add subscription/usage-aware providers: OpenCode Go, MiniMax M3, GLM-5.2 (API cost + limit tracking) #4010

## Summary

Add three subscription / pay-as-you-go provider integrations that the proxy currently routes only through generic `openai-compatibility` blocks (PR #3990 already mentions `minimax-m3`, `kimi-k2.7-code`, `glm-5.2` passing through):

1. **OpenCode Go** — $5 first month / $10/mo subscription proxy exposing 12+ open coding models behind a single API key.
2. **MiniMax M3** — MiniMax's frontier 1M-context MSA model (Token Plan + Pay-as-you-go).
3. **GLM-5.2** — Z.ai's MIT-licensed 1M-context coding model (ZCode Coding Plan + Pay-as-you-go, OAuth login work-in-progress in PR #3925 / CPAMC #323).

Each integration covers the API-key auth flow, a model alias, pricing/limit metadata, and rate-limit/quota telemetry exposed through the Management API.

---

## Research: API cost, limits, and example calls

### 1. OpenCode Go

- Subscription: **$5 first month, then $10/month**. Top-up credit available. Only one member per workspace can subscribe.
- Endpoints:
  - `https://opencode.ai/zen/go/v1/chat/completions` — OpenAI-compatible (GLM, Kimi, DeepSeek, MiMo)
  - `https://opencode.ai/zen/go/v1/messages` — Anthropic-compatible (MiniMax, Qwen)
  - `https://opencode.ai/zen/go/v1/models` — model list
- Models (June 2026):
  - chat/completions: `glm-5.2`, `glm-5.1`, `kimi-k2.7`, `kimi-k2.6`, `deepseek-v4-pro`, `deepseek-v4-flash`, `mimo-v2.5`, `mimo-v2.5-pro`
  - messages: `minimax-m3`, `minimax-m2.7`, `minimax-m2.5`, `qwen3.7-max`, `qwen3.7-plus`, `qwen3.6-plus`
- Limits: "Generous limits" advertised; MiniMax M3 currently gets **3× usage limits** during a limited-time promo.
- Config ID prefix: `opencode-go/<model-id>`.
- Docs: <https://opencode.ai/docs/go/>

Example (OpenAI-compat path):

```bash
curl https://opencode.ai/zen/go/v1/chat/completions \
  -H "Authorization: Bearer $OPENCODE_GO_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"opencode-go/glm-5.2","messages":[{"role":"user","content":"hi"}]}'
```

Example (Anthropic-compat path):

```bash
curl https://opencode.ai/zen/go/v1/messages \
  -H "x-api-key: $OPENCODE_GO_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"opencode-go/minimax-m3","max_tokens":1024,"messages":[{"role":"user","content":"hi"}]}'
```
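The split between the two paths could be handled by a small routing helper. The sketch below is illustrative only: `route`, `MESSAGES_PREFIXES`, and `CHAT_PREFIXES` are hypothetical names, and the prefix match is inferred from the June 2026 model list above rather than taken from OpenCode's documentation.

```python
BASE_URL = "https://opencode.ai/zen/go/v1"

# Families grouped by the endpoint that serves them, per the model list above.
MESSAGES_PREFIXES = ("minimax-", "qwen")                 # Anthropic-compatible
CHAT_PREFIXES = ("glm-", "kimi-", "deepseek-", "mimo-")  # OpenAI-compatible


def route(model_id: str, api_key: str) -> tuple[str, dict[str, str]]:
    """Return (url, auth headers) for a model ID, with or without the opencode-go/ prefix."""
    name = model_id.removeprefix("opencode-go/")
    if name.startswith(MESSAGES_PREFIXES):
        # Same headers as the Anthropic-compat curl example.
        headers = {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
        return f"{BASE_URL}/messages", headers
    if name.startswith(CHAT_PREFIXES):
        # Same header as the OpenAI-compat curl example.
        return f"{BASE_URL}/chat/completions", {"Authorization": f"Bearer {api_key}"}
    raise ValueError(f"unknown OpenCode Go model: {model_id}")
```

A real executor would more likely build this mapping from the `/v1/models` response than from hard-coded prefixes.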

### 2. MiniMax M3

- Provider: MiniMax (Open Platform).
- Direct endpoint: `https://api.minimax.io/v1/text/chatcompletion_v2`
- Models list: `https://api.minimax.io/v1/models`
- OpenRouter slug: `minimax/minimax-m3` (also served by Parasail MXFP8, Together AI, Novita, SiliconFlow, MiniMax, Makora MXFP8, GMI).
- Context: 1M tokens (guaranteed 512K); max output 131,072.
- Pay-as-you-go pricing per 1M tokens (the permanent 50% discount is already applied):

| Tier | Input | Output | Cache read |
| --- | --- | --- | --- |
| ≤ 512K input | $0.30 | $1.20 | $0.06 |
| > 512K input | $0.60 | $2.40 | $0.12 |

Priority tier (`service_tier: "priority"`, 1.5×): input/output is $0.45/$1.80 for ≤ 512K input and $0.90/$3.60 for > 512K input.
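As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. It rests on several assumptions not confirmed by the source: "512K" is read as 512,000 tokens, the tier is chosen from uncached plus cached prompt tokens, and the 1.5× priority multiplier also applies to cache reads. `estimate_cost` is a hypothetical helper name.

```python
# USD per 1M tokens, copied from the pay-as-you-go table above.
RATES = {
    "short": {"input": 0.30, "output": 1.20, "cache_read": 0.06},  # <= 512K input
    "long": {"input": 0.60, "output": 2.40, "cache_read": 0.12},   # > 512K input
}
TIER_BOUNDARY = 512_000     # assumption: "512K" means 512,000 tokens
PRIORITY_MULTIPLIER = 1.5   # from the priority-tier note above


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0, priority: bool = False) -> float:
    """Estimate the USD cost of one MiniMax M3 request."""
    # Assumption: cached prompt tokens count toward the tier boundary.
    prompt_total = input_tokens + cached_tokens
    rates = RATES["short"] if prompt_total <= TIER_BOUNDARY else RATES["long"]
    cost = (
        input_tokens * rates["input"]
        + output_tokens * rates["output"]
        + cached_tokens * rates["cache_read"]
    ) / 1_000_000
    # Assumption: the priority multiplier scales every component, cache reads included.
    return cost * PRIORITY_MULTIPLIER if priority else cost
```

For example, 100,000 uncached input tokens and 10,000 output tokens on the standard tier come to about $0.042 under these assumptions.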

- Subscriptions: Token Plan (monthly) with a 10% invite rebate; Audio Subscription tiers (Starter / Standard / Pro / Scale) cover the voice products.
- Remaining quota: `GET https://api.minimax.io/v1/dashboard/billing/credit_grants` with the same API key returns `grants[].remaining` and `expires_at`.

Example call:

```bash
curl https://api.minimax.io/v1/text/chatcompletion_v2 \
  -H "Authorization: Bearer $MINIMAX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniMax-M3",
    "messages": [{"role":"user","content":"Hello"}],
    "max_tokens": 1024
  }'
```

### 3. GLM-5.2 (Z.ai / Zhipu AI)

- Provider: Z.ai (international), BigModel / 智谱 (China).
- Release: 2026-06-16, MIT open weights on Hugging Face, 1M-token stable context.
- Endpoints (OpenAI-compatible):
  - `https://api.z.ai/api/paas/v4/chat/completions`
  - `https://open.bigmodel.cn/api/paas/v4/chat/completions`
- API pricing per 1M tokens:

| Item | Price |
| --- | --- |
| Input | $1.40 |
| Output | $4.40 |
| Cached input | $0.26 (storage free, limited time) |

- Subscription (GLM Coding Plan, annual billing):

| Tier | Price | Allowance |
| --- | --- | --- |
| Lite | $12.60/mo | light repos |
| Pro | $50.40/mo | mid repos (5× Lite) |
| Max | $112.00/mo | heavy (20× Lite, peak priority) |

- Peak-hour quota multiplier: requests between 14:00 and 18:00 (UTC+8) count **3×** against quota; off-peak requests count 2×, reduced to **1×** by a promo running through the end of September 2026.
- ZCode desktop agent: 1.5× effective quota through June 30, 2026.
- Companion backend PR: #3925 (`/zai-auth-url`, `/bigmodel-auth-url`).
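To surface the peak-hour multiplier in telemetry, the proxy would need to compute it per request. The following is a minimal sketch, assuming the multiplier is the rate at which a request draws down plan quota, that the peak window is half-open (14:00 inclusive, 18:00 exclusive), and that the promo ends when September 2026 ends in UTC+8. `quota_multiplier` and `PROMO_END` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

UTC8 = timezone(timedelta(hours=8))
# Assumption: the off-peak promo stops at midnight UTC+8 on 2026-10-01.
PROMO_END = datetime(2026, 10, 1, tzinfo=UTC8)


def quota_multiplier(at: datetime) -> int:
    """Return the quota multiplier for a timezone-aware request timestamp."""
    local = at.astimezone(UTC8)
    if 14 <= local.hour < 18:   # peak window, 14:00-18:00 UTC+8
        return 3
    # Off-peak: 1x while the promo runs, 2x afterwards.
    return 1 if local < PROMO_END else 2
```

Pass timezone-aware timestamps only; `astimezone` treats a naive `datetime` as system-local time, which would shift the window.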

Example call:

```bash
curl https://api.z.ai/api/paas/v4/chat/completions \
  -H "Authorization: Bearer $ZAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5.2",
    "messages": [{"role":"user","content":"Hello"}],
    "thinking": {"type": "enabled"}
  }'
```

### How to read remaining limits in general

1. First-party: provider status page + `X-RateLimit-*` / `x-ratelimit-*` response headers.
2. CPA docs: <https://help.router-for.me/> (Management API → quota endpoints).
3. OAuth providers (Claude / Codex / Grok / Kimi / ZCode): per-account quota already exposed via `/v0/management/quota`.
4. API-key providers: per-key cooldown/disable state in the proxy logs. Config supports `quota-exceeded.on-payment-required: "disable"` (PR #3978).
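Because providers differ in header casing (`X-RateLimit-*` vs. `x-ratelimit-*`), any header-based reader should match names case-insensitively. A minimal sketch follows; `ratelimit_headers` is a hypothetical helper, and the suffixes each provider actually sends are not specified here.

```python
def ratelimit_headers(headers: dict[str, str]) -> dict[str, str]:
    """Collect rate-limit headers, keyed by the lowercased suffix after the prefix."""
    prefix = "x-ratelimit-"
    return {
        name.lower()[len(prefix):]: value
        for name, value in headers.items()
        if name.lower().startswith(prefix)   # case-insensitive match
    }
```

Given a response carrying, say, `X-RateLimit-Remaining: 42` (an illustrative header name, not one confirmed for these providers), the helper would return `{"remaining": "42"}`.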

---

## Proposal

### A. Backend — `router-for-me/CLIProxyAPI`

1. **OpenCode Go provider**
   - New executor under `internal/runtime/executor/opencode_go/` using both `openai-compat` and `anthropic-compat` paths.
   - Registers a single API-key provider with base URL `https://opencode.ai/zen/go`, with models auto-fetched from `/v1/models` at startup.
2. **MiniMax M3 provider**
   - OpenAI-compatible provider with base URL `https://api.minimax.io/v1`; first-class aliases `MiniMax-M3`, `MiniMax-M2.7`, `MiniMax-M2.5`. Optional OpenRouter passthrough config (`base_url=https://openrouter.ai/api/v1`, headers `HTTP-Referer` / `X-Title`).
3. **GLM-5.2 (extend #3925)**
   - Model alias `GLM-5.2` and `GLM-5.2[1m]` for 1M context.
   - Pay-as-you-go key variant via OpenAI-compat with base URL `https://api.z.ai/api/paas/v4`; thinking-effort selector (`high`/`max`); peak-hour quota multiplier surfaced.
4. **Usage-limit telemetry**
   - Extend the per-request log line with `service_tier`, `quota_window` (`5h` / `weekly`), token totals, and cache hit/miss split.
   - New endpoint `GET /v0/management/usage/limits` returning per-provider `{limit, used, window, reset_at}` so CPAMC can render bars (companion to issue #2 of this report).
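To make the proposed `{limit, used, window, reset_at}` shape concrete, here is a sketch of how a client such as CPAMC might consume it. The response layout is an assumption: the payload is taken to be an object keyed by provider name, `reset_at` an ISO 8601 string with an explicit offset, and `limit`/`used` plain integers in whatever unit the provider reports. `UsageLimit` and `parse_limits` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class UsageLimit:
    provider: str
    limit: int
    used: int
    window: str          # e.g. "5h" or "weekly", as in the log-line proposal
    reset_at: datetime

    @property
    def fraction_used(self) -> float:
        """Share of the window consumed, clamped to [0, 1] for rendering a bar."""
        if self.limit <= 0:
            return 0.0
        return min(self.used / self.limit, 1.0)


def parse_limits(payload: dict) -> list[UsageLimit]:
    """Parse an assumed {provider: {limit, used, window, reset_at}} payload."""
    return [
        UsageLimit(
            provider=name,
            limit=entry["limit"],
            used=entry["used"],
            window=entry["window"],
            reset_at=datetime.fromisoformat(entry["reset_at"]),
        )
        for name, entry in payload.items()
    ]
```

Clamping `fraction_used` keeps the bar sensible when a provider reports usage above its nominal limit.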

### B. UI — `router-for-me/Cli-Proxy-API-Management-Center`

1. Three new entries under **AI Providers → OpenAI-compatible** with pre-filled base URLs and a "fetch models" button that hits `/v1/models` on the upstream.
2. New "Quota & Limits" widget on the **Quota Management** page fed by the new endpoint.
3. i18n strings for en / zh-CN / zh-TW / ru.

### C. Pricing data

Add a `models-pricing.json` (or extend the existing example) with the three model families so CPA-Manager-Plus LiteLLM / OpenRouter sync matches without manual edits.
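One possible shape for those entries is sketched below, generated from Python so the figures can be checked against the tables in the research section. The field names (`input_per_mtok`, `tiers`, `billing`, and so on) are hypothetical and would need to match whatever schema the existing example file and the CPA-Manager-Plus sync actually use.

```python
import json

# USD figures copied from the research section; field names are illustrative only.
PRICING = {
    "MiniMax-M3": {
        "billing": "pay-as-you-go",
        "tiers": [
            {"max_input_tokens": 512_000,  # assumption: "512K" = 512,000
             "input_per_mtok": 0.30, "output_per_mtok": 1.20, "cache_read_per_mtok": 0.06},
            {"max_input_tokens": None,     # no upper bound: applies above 512K
             "input_per_mtok": 0.60, "output_per_mtok": 2.40, "cache_read_per_mtok": 0.12},
        ],
    },
    "glm-5.2": {
        "billing": "pay-as-you-go",
        "input_per_mtok": 1.40,
        "output_per_mtok": 4.40,
        "cache_read_per_mtok": 0.26,
    },
    "opencode-go": {
        "billing": "subscription",
        "first_month_usd": 5.0,
        "monthly_usd": 10.0,
    },
}


def render_pricing() -> str:
    """Serialize the pricing table as pretty-printed JSON."""
    return json.dumps(PRICING, indent=2, sort_keys=True)
```

Writing `render_pricing()` to `models-pricing.json` would produce the file; OpenCode Go is recorded as a flat subscription because the source lists no per-token price for it.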

---

## Acceptance criteria

- A new provider entry can be created through CPAMC and survives a restart.
- `GET /v1/models` surfaces `opencode-go/*`, `minimax/*`, `glm-5.2`, `glm-5.2-1m`.
- A request to each provider returns a successful response via `curl http://localhost:8317/v1/chat/completions`.
- `/v0/management/usage/limits` reports at least the OAuth-backed quota providers.
- No regression in existing Gemini / Codex / Claude / Grok / Kimi / Z.AI OAuth flows.

