Skip to content

Commit 3bbef49

Browse files
committed
feat(coding): 新增 Coding Plan 窗口预热工具
1 parent 3c11df8 commit 3bbef49

21 files changed

Lines changed: 2283 additions & 1 deletion

File tree

Lines changed: 127 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,127 @@
1+
# Coding Plan Window Warmer Spec
2+
3+
> 本规范记录 `ai/coding/window-warmer` 的预热调度、直连上游、依赖管理和 PM2 管理约定。修改窗口预热工具、默认 TOML、PM2 配置或相关启动文档时必须先阅读。
4+
5+
---
6+
7+
## Scenario: Direct Coding Plan Window Warmup
8+
9+
### 1. Scope / Trigger
10+
11+
- Trigger: 修改 `ai/coding/window-warmer/**`、窗口预热启动命令、默认 `window-warmer.toml`、PM2 ecosystem 配置,或 LiteLLM 网关文档中的窗口预热说明。
12+
- Scope: 宿主机侧独立脚本按多个 `[[plans]]``fixed_times``interval` 调度发送轻量 completion 请求,用于把 Coding Plan 额度窗口尽量锁定到可预期的时间段。
13+
- Design intent: 预热是独立运维工具,不属于 LiteLLM callback、LiteLLM Proxy 路由或 Docker Compose sidecar;默认请求必须直连上游 Coding Plan 服务端点,避免被 LiteLLM Proxy fallback 到 DeepSeek 或其它兜底路由。
14+
15+
### 2. Signatures
16+
17+
- Direct run:
18+
- `uv run --script ai/coding/window-warmer/window_warmer.py --config ai/coding/window-warmer/window-warmer.toml`
19+
- `uv run --script ai/coding/window-warmer/window_warmer.py --config ai/coding/window-warmer/window-warmer.toml --print-next`
20+
- `uv run --script ai/coding/window-warmer/window_warmer.py --config ai/coding/window-warmer/window-warmer.toml --once --dry-run`
21+
- PM2:
22+
- `pm2 start ai/coding/window-warmer/window-warmer.pm2.config.cjs`
23+
- PM2 app name: `coding-window-warmer`
24+
- Python script:
25+
- Entry file: `ai/coding/window-warmer/window_warmer.py`
26+
- Dependency declaration: PEP 723 script metadata with `litellm>=1.81.0`
27+
- Helper package: `ai/coding/window-warmer/window_warmer_lib/`
28+
29+
### 3. Contracts
30+
31+
- Target config `[target]`:
32+
- `name`: log-only target name.
33+
- `base_url`: direct upstream OpenAI-compatible API base URL. Default points to `https://open.bigmodel.cn/api/coding/paas/v4`, not local LiteLLM Proxy.
34+
- `container_name`: optional local Docker readiness gate. When set to `litellm`, it only proves the local gateway container is running; it must not change the warm request destination.
35+
- `api_key_env`: optional environment variable for upstream API key. Default for Z.ai Coding Plan is `Z_AI_API_KEY`.
36+
- `env_file`: optional dotenv-style file path, resolved relative to the TOML file.
37+
- `health_path`: optional direct target health path. Default is `/models`.
38+
- `request_timeout_seconds`: timeout used by health check and LiteLLM SDK completion.
39+
- Plan config `[[plans]]`:
40+
- `model`: LiteLLM SDK model string. For direct OpenAI-compatible upstreams, use `openai/<provider-model>`, for example `openai/GLM-5.1`.
41+
- `prompt`: light warmup prompt. Logs must not print prompt text.
42+
- `schedule_mode`: `fixed_times` or `interval`.
43+
- `times`: required for `fixed_times`.
44+
- `start_time` or `start_at` plus `window`: required for `interval`.
45+
- `jitter_seconds`, `retry_count`, `retry_delay_seconds`: per-plan overrides.
46+
- Request contract:
47+
- Warm requests use `litellm.completion(model=plan.model, messages=[...], api_base=target.base_url, api_key=api_key, timeout=..., max_tokens=..., temperature=...)`.
48+
- The warmer must not call local LiteLLM Proxy `/v1/chat/completions` for default GLM warmup.
49+
- Health checks may use direct HTTP GET because they are a readiness probe, not the warmup completion.
50+
51+
### 4. Validation & Error Matrix
52+
53+
| Condition | Expected Behavior |
54+
|-----------|-------------------|
55+
| `scheduler.enabled=false` | Script exits successfully without scheduling warmups |
56+
| No enabled plans | `--once` / watch mode logs `没有启用的 plan` and does not send requests |
57+
| `container_name` configured but Docker missing | Warmup is skipped with `未找到 docker 命令` diagnostic |
58+
| `container_name` configured but container not running | Warmup is skipped before reading/sending completion |
59+
| `api_key_env` configured but missing from env and `env_file` | Warmup is skipped with missing key diagnostic |
60+
| `health_path` configured but direct target health check fails | Warmup is skipped before completion request |
61+
| `--dry-run` or `scheduler.dry_run=true` | Docker/API readiness checks and completion request are skipped |
62+
| LiteLLM SDK completion fails | Failure is logged without prompt/key/body; retry up to `retry_count` |
63+
| Multiple plans share the same base time | Each plan remains in the event queue and is executed independently |
64+
65+
### 5. Good/Base/Bad Cases
66+
67+
- Good: Default config checks optional local `litellm` container but sends `openai/GLM-5.1` to `https://open.bigmodel.cn/api/coding/paas/v4` through LiteLLM SDK.
68+
- Good: `uv run --script` handles LiteLLM SDK dependency without creating repo-level `requirements.txt`, `pyproject.toml`, or a committed virtual environment.
69+
- Good: Time calculation is pure and unit-tested separately from HTTP/LiteLLM SDK calls.
70+
- Base: `fixed_times = ["08:00", "13:00", "18:00", "23:00"]` with `jitter_seconds = 120` schedules each event within two minutes after the base time.
71+
- Bad: Pointing `[target].base_url` at `http://127.0.0.1:34000` for default GLM warmup, because the request can enter LiteLLM Proxy fallback chains.
72+
- Bad: Using model `GLM-5.1` without an explicit provider prefix for direct upstream calls, because LiteLLM SDK provider inference can be ambiguous.
73+
- Bad: Logging prompt text, API key, full headers, or full request body.
74+
- Bad: Re-merging all helper modules into a single thousand-line script.
75+
76+
### 6. Tests Required
77+
78+
- Unit tests for `fixed_times` next-day rollover.
79+
- Unit tests for `interval` continuous-window rollover across midnight.
80+
- Unit tests for multiple plans with simultaneous base time remaining independently executable.
81+
- Config parse tests for multiple `[[plans]]`.
82+
- SDK call test mocking the local wrapper around `litellm.completion`, asserting:
83+
- `model` keeps the configured provider-prefixed model.
84+
- `api_base` is the direct target URL.
85+
- prompt, max tokens, temperature and timeout are passed.
86+
- Dry-run test asserting readiness checks are skipped.
87+
- Smoke commands:
88+
- `uv run --script ai/coding/window-warmer/window_warmer.py --config ai/coding/window-warmer/window-warmer.toml --print-next`
89+
- `uv run --script ai/coding/window-warmer/window_warmer.py --config ai/coding/window-warmer/window-warmer.toml --once --dry-run`
90+
- `node -c ai/coding/window-warmer/window-warmer.pm2.config.cjs`
91+
92+
### 7. Wrong vs Correct
93+
94+
#### Wrong
95+
96+
```toml
97+
[target]
98+
name = "local-litellm"
99+
base_url = "http://127.0.0.1:34000"
100+
api_key_env = "LITELLM_MASTER_KEY"
101+
health_path = "/health"
102+
103+
[[plans]]
104+
model = "GLM-5.1"
105+
endpoint = "/v1/chat/completions"
106+
```
107+
108+
问题:这会把 warmup 请求送进 LiteLLM Proxy;如果 GLM 429 或被 callback 标记冷却,请求可能 fallback 到 DeepSeek,既不能锁定 GLM Coding Plan 窗口,也会消耗兜底额度。
109+
110+
#### Correct
111+
112+
```toml
113+
[target]
114+
name = "z-ai-coding-plan"
115+
base_url = "https://open.bigmodel.cn/api/coding/paas/v4"
116+
container_name = "litellm"
117+
api_key_env = "Z_AI_API_KEY"
118+
health_path = "/models"
119+
120+
[[plans]]
121+
name = "glm-coding-plan"
122+
model = "openai/GLM-5.1"
123+
schedule_mode = "fixed_times"
124+
times = ["08:00", "13:00", "18:00", "23:00"]
125+
```
126+
127+
理由:`container_name` 只是可选本机启动条件;真实 warmup completion 由 LiteLLM SDK 直连 `target.base_url`,不会进入 LiteLLM Proxy 路由/fallback。

.trellis/spec/infra/index.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,6 @@
88

99
| Guide | Description | Status |
1010
|-------|-------------|--------|
11+
| [Coding Plan Window Warmer](./coding-plan-window-warmer.md) | 独立窗口预热脚本、直连上游、uv 依赖和 PM2 管理约定 | Active |
1112
| [LiteLLM Gateway](./litellm-gateway.md) | LiteLLM 路由、fallback、参数兼容和验证边界 | Active |
1213
| [Node/Vitest Scripts](./node-vitest-scripts.md) | 根目录 Vitest 发现的 Node 脚本测试与 shebang 行尾约定 | Active |
13-
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
{"_example": "Fill with {\"file\": \"<path>\", \"reason\": \"<why>\"}. Put spec/research files only — no code paths. Run `python3 .trellis/scripts/get_context.py --mode packages` to list available specs. Delete this line once real entries are added."}
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
{"_example": "Fill with {\"file\": \"<path>\", \"reason\": \"<why>\"}. Put spec/research files only — no code paths. Run `python3 .trellis/scripts/get_context.py --mode packages` to list available specs. Delete this line once real entries are added."}

0 commit comments

Comments
 (0)