ZCode CLI coding agent (z.ai), running GLM-5.2 over the z.ai start-plan.
- Source: `src/providers/zcode.ts`
- Loading: lazy (`src/providers/index.ts`). Lazy because we read ZCode's SQLite database with `node:sqlite`.
- Test: `tests/providers/zcode.test.ts` (3 tests, fixture-based)
ZCode keeps a single global SQLite database for the CLI.
| Source | Path |
|---|---|
| ZCode CLI db | ~/.zcode/cli/db/db.sqlite |
The desktop app dir (~/Library/Application Support/ZCode) only holds Electron runtime state, and the JSONL activity log (~/.zcode/cli/log/*.jsonl) redacts token counts, so neither is used.
SQLite. Schema verified against CLI db v0.14.8. Three tables matter:
```sql
CREATE TABLE session (
  id TEXT PRIMARY KEY,
  directory TEXT NOT NULL,
  ...
);
CREATE TABLE model_usage (
  id TEXT PRIMARY KEY,
  session_id TEXT NOT NULL,
  turn_id TEXT,
  model_id TEXT NOT NULL,
  input_tokens INTEGER NOT NULL DEFAULT 0,
  output_tokens INTEGER NOT NULL DEFAULT 0,
  reasoning_tokens INTEGER NOT NULL DEFAULT 0,
  cache_creation_input_tokens INTEGER NOT NULL DEFAULT 0,
  cache_read_input_tokens INTEGER NOT NULL DEFAULT 0,
  started_at INTEGER NOT NULL,
  completed_at INTEGER,
  ...
);
CREATE TABLE tool_usage (
  session_id TEXT NOT NULL,
  turn_id TEXT,
  tool_name TEXT NOT NULL,
  started_at INTEGER NOT NULL,
  ...
);
```

None at the provider level.
Per `zcode:<model_usage.id>` (`zcode.ts`). `model_usage.id` is the row primary key, unique per request.
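A minimal sketch of how a `zcode:<model_usage.id>` key could be used to skip rows that were already seen. This reads the key as a deduplication key, which is an assumption; `UsageRow`, `dedupKey`, and `dedupe` are hypothetical names and not taken from `zcode.ts`.

```typescript
// Hypothetical helpers; only the `zcode:<model_usage.id>` key shape comes from the doc.
interface UsageRow {
  id: string; // model_usage.id, the row primary key
}

// Build the provider-prefixed key for one row.
function dedupKey(row: UsageRow): string {
  return `zcode:${row.id}`;
}

// Keep the first row seen for each key and drop later repeats.
// `seen` can be shared across calls so repeats are skipped between parses.
function dedupe<T extends UsageRow>(rows: T[], seen: Set<string> = new Set<string>()): T[] {
  const fresh: T[] = [];
  for (const row of rows) {
    const key = dedupKey(row);
    if (seen.has(key)) continue;
    seen.add(key);
    fresh.push(row);
  }
  return fresh;
}
```

Because the key is built from the primary key, two distinct requests can never collide, and re-reading the same database yields the same keys.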
| codeburn field | ZCode source |
|---|---|
| `inputTokens` | `model_usage.input_tokens` minus cache-read and cache-creation tokens (see quirks) |
| `outputTokens` | `model_usage.output_tokens` |
| `reasoningTokens` | `model_usage.reasoning_tokens` |
| `cacheCreationInputTokens` | `model_usage.cache_creation_input_tokens` |
| `cacheReadInputTokens` | `model_usage.cache_read_input_tokens` |
| `costUSD` | computed by `calculateCost` (ZCode stores no cost) |
| `model` | `model_usage.model_id` (e.g. `GLM-5.2`) |
| `timestamp` | `model_usage.completed_at` if set, otherwise `started_at` (epoch ms) |
| `tools` | `tool_usage.tool_name` for the turn, attached to one request per turn |
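The mapping can be sketched as a single row-to-record function. Column names follow the `model_usage` schema; the `ParsedCall` shape, the `toParsedCall` name, the ISO-string timestamp, and the clamp at zero are illustrative assumptions rather than the actual parser code. Cost and tools are left out because they come from `calculateCost` and `tool_usage` respectively.

```typescript
// Row shape mirroring the model_usage columns used by the mapping.
interface ModelUsageRow {
  id: string;
  model_id: string;
  input_tokens: number;
  output_tokens: number;
  reasoning_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
  started_at: number;          // epoch ms
  completed_at: number | null; // epoch ms, null when not set
}

// Hypothetical output record; field names follow the mapping table.
interface ParsedCall {
  inputTokens: number;
  outputTokens: number;
  reasoningTokens: number;
  cacheCreationInputTokens: number;
  cacheReadInputTokens: number;
  model: string;
  timestamp: string; // ISO 8601 (assumed output format)
}

function toParsedCall(row: ModelUsageRow): ParsedCall {
  const cached = row.cache_read_input_tokens + row.cache_creation_input_tokens;
  return {
    // input_tokens includes cached tokens, so subtract them to get fresh input.
    // The clamp is a defensive assumption for rows that report inconsistent counts.
    inputTokens: Math.max(0, row.input_tokens - cached),
    outputTokens: row.output_tokens,
    reasoningTokens: row.reasoning_tokens,
    cacheCreationInputTokens: row.cache_creation_input_tokens,
    cacheReadInputTokens: row.cache_read_input_tokens,
    model: row.model_id,
    // Prefer completed_at; fall back to started_at. Both are epoch ms, so no scaling.
    timestamp: new Date(row.completed_at ?? row.started_at).toISOString(),
  };
}
```

With `input_tokens = 100`, `cache_read_input_tokens = 64`, and `cache_creation_input_tokens = 0`, this yields `inputTokens = 36`, matching the 36 fresh + 64 cached split described in this document.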
- Cached tokens are folded into `input_tokens` (OpenAI-style). The row's `input_tokens` is the full prompt size including cache reads/writes, and `provider_total_tokens = input_tokens + output_tokens`. The parser subtracts `cache_read_input_tokens` and `cache_creation_input_tokens` from `input_tokens` so fresh input bills at the input rate and cached at the cache-read rate. Confirmed against the nested Anthropic usage in `provider_metadata_json` (e.g. 100 input = 36 fresh + 64 cached).
- No cost is stored anywhere. GLM-5.2 runs on z.ai's `start-plan` subscription, so ZCode logs tokens only. CodeBurn computes a notional cost from the pricing table.
- GLM-5.2 is priced via an alias. LiteLLM does not list GLM-5.2 yet, so `GLM-5.2` maps to `glm-5p1` (GLM-5.1) in `BUILTIN_ALIASES` (`src/models.ts`). Reports therefore show the model as `glm-5p1`, the same way any aliased model displays as its priced-as target. Drop the alias once LiteLLM adds GLM-5.2.
- Timestamps are milliseconds. Unlike Crush (seconds), ZCode stores epoch ms; the parser passes them straight to `Date`.
- Tools are attached per turn, not per request. `tool_usage` links to a turn, not a specific `model_usage` row, so each turn's tools are attached to its first request to avoid double-counting. Bash command text is not stored, so `bashCommands` is always empty.
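The per-turn tool attachment can be sketched as below. This is an illustration, not the parser's code: `RequestRow`, `ToolRow`, and `attachTools` are hypothetical names, "first request" is assumed to mean the earliest `started_at` within the turn, and tool rows without a `turn_id` are assumed to be dropped.

```typescript
interface RequestRow {
  id: string;             // model_usage.id
  turn_id: string | null; // model_usage.turn_id
  started_at: number;     // epoch ms
}

interface ToolRow {
  turn_id: string | null; // tool_usage.turn_id
  tool_name: string;      // tool_usage.tool_name
}

// Returns request id -> tool names. Only one request per turn receives the
// turn's tools; every other request in that turn gets an empty list.
function attachTools(requests: RequestRow[], tools: ToolRow[]): Map<string, string[]> {
  // Group tool names by turn.
  const byTurn = new Map<string, string[]>();
  for (const tool of tools) {
    if (tool.turn_id === null) continue; // assumption: unlinked tools are skipped
    const names = byTurn.get(tool.turn_id) ?? [];
    names.push(tool.tool_name);
    byTurn.set(tool.turn_id, names);
  }

  // Walk requests oldest-first so the earliest request in a turn claims its tools.
  const claimed = new Set<string>();
  const result = new Map<string, string[]>();
  const ordered = requests.slice().sort((a, b) => a.started_at - b.started_at);
  for (const request of ordered) {
    let names: string[] = [];
    if (request.turn_id !== null && !claimed.has(request.turn_id)) {
      claimed.add(request.turn_id);
      names = byTurn.get(request.turn_id) ?? [];
    }
    result.set(request.id, names);
  }
  return result;
}
```

Attaching the list to a single request keeps per-tool totals equal to the number of `tool_usage` rows, whereas attaching it to every request in the turn would multiply the counts by the number of requests.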
- Confirm the schema against a real ZCode install; copy `~/.zcode/cli/db/db.sqlite` to a temp file before querying so you do not lock the live db.
- If costs are $0, check that `GLM-5.2` (or the current model id) still resolves through `BUILTIN_ALIASES` to a priced model.
- If tokens look ~8x too high, someone likely removed the cache-subtraction in the input normalization; the row's `input_tokens` already includes cached tokens.
- New fixtures go under the inline schema in `tests/providers/zcode.test.ts`.
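For the copy-before-querying step, a small helper along these lines may be convenient. It is a sketch under assumptions: `snapshotDb` is a hypothetical name, and copying a `-wal` sidecar is included only because SQLite databases in WAL mode keep recent writes there; whether ZCode uses WAL mode is not confirmed here.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Copy a SQLite file into a fresh temp directory and return the copy's path.
// Queries then run against the copy, so the live database is never opened.
function snapshotDb(source: string): string {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "zcode-db-"));
  const target = path.join(dir, path.basename(source));
  fs.copyFileSync(source, target);

  // Assumption: if a write-ahead log exists next to the db, copy it as well
  // so the snapshot includes writes that have not been checkpointed yet.
  const wal = source + "-wal";
  if (fs.existsSync(wal)) {
    fs.copyFileSync(wal, target + "-wal");
  }
  return target;
}
```

A typical call would be `snapshotDb(path.join(os.homedir(), ".zcode", "cli", "db", "db.sqlite"))`; the returned path can then be opened with `node:sqlite` and the temp directory removed afterwards.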