Commit 2edc4de

🤖 feat: first-class DeepSeek V4 support (#3237)
## Summary

Promote DeepSeek to a first-class provider in the curated model registry, with explicit V4 Pro and V4 Flash entries (1M context, 384K max output, full pricing + cache pricing) and proper "DeepSeek" branding in the display formatter.

## Background

DeepSeek was already wired up as a provider (`@ai-sdk/deepseek` factory, settings UI, OpenRouter route, API-key requirements), but no DeepSeek model was in `knownModels.ts`, so users had no curated alias, no tokenizer override, and no warm-up entry. With V4 in preview and the legacy `deepseek-chat` / `deepseek-reasoner` IDs scheduled for retirement, V4 should be the default DeepSeek anchor going forward.

## Implementation

- `src/common/constants/knownModels.ts`
  - Extend `ModelProvider` to include `"deepseek"`.
  - Add `DEEPSEEK_V4_PRO` (id `deepseek:deepseek-v4-pro`) as the flagship; the bare `deepseek` alias points here, matching the convention `gemini` → Gemini Pro, `grok` → Grok 4.1.
  - Add `DEEPSEEK_V4_FLASH` (id `deepseek:deepseek-v4-flash`) as the fast/cheap tier, reachable via `deepseek-flash`.
  - Both reuse `tokenizerOverride: "deepseek/deepseek-v3.1"` (latest DeepSeek tokenizer published in `ai-tokenizer`) until V4's `encoding_dsv4` lands upstream, the same pattern OPUS/SONNET use.
- `src/common/utils/tokens/models-extra.ts`
  - V4-Pro: 1M context, 384K output, $1.74/M input, $3.48/M output, cache-hit input at 1/10 of input price.
  - V4-Flash: 1M context, 384K output, $0.14/M input, $0.28/M output, cache-hit input at 1/10 of input price.
  - Recorded the post-promo (full) prices, not the launch-window 75% discount, so cost forecasts don't silently regress when the promo ends.
- `src/common/utils/ai/modelDisplay.ts`
  - Added a DeepSeek branch so ids render as `DeepSeek V4 Pro` instead of the fallback `Deepseek V4 Pro` (mis-cased brand). Version-tag tokens like `v4` / `r1` are uppercased; anything else is title-cased.
  - Tests cover `deepseek-v4-pro`, `deepseek-v4-flash`, `deepseek-r1`, `deepseek-chat`, and the gateway-scoped `deepseek/deepseek-v4-pro` form (validates the existing slash-stripping branch routes through the new handler).
- `docs/config/models.mdx` and the matching `builtInSkillContent.generated.ts`: regenerated by `make fmt`.

## Validation

- `make fmt`, `make typecheck`, `make lint`, `make static-check`: green.
- `bun test src/common/utils/ai/modelDisplay.test.ts`: passes new DeepSeek cases.
- Jest `knownModels.test.ts`: passes (verifies every curated model resolves in `models.json` or `models-extra.ts` and that aliases stay unique).
- Bun-runner regression sweep on neighboring tests: `contextLimit`, `modelStats`, `normalizeModelInput`, `modelPreferenceRepair`, `aiService`.

## Risks

Low. Changes are additive in the model registry (new provider variant, new entries) and the display formatter adds a leading branch with a tight prefix guard (`deepseek-`); existing Claude/GPT/Gemini/Ollama/fallback paths are untouched. Pricing values are sourced from DeepSeek's published pricing page; if numbers move, only `models-extra.ts` needs a refresh.

---

_Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking: `max` • Cost: `$2.65`_

<!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=max costs=2.65 -->
1 parent 2792deb commit 2edc4de

7 files changed

Lines changed: 168 additions & 35 deletions

docs/config/models.mdx

Lines changed: 19 additions & 17 deletions
@@ -11,23 +11,25 @@ Mux ships with curated models kept up to date with the frontier. Use any custom
 
 {/* BEGIN KNOWN_MODELS_TABLE */}
 
-| Model | ID | Aliases | Default |
-| ---------------------- | ----------------------------- | ---------------------------------------- | ------- |
-| Opus 4.7 | anthropic:claude-opus-4-7 | `opus` | ✓ |
-| Sonnet 4.6 | anthropic:claude-sonnet-4-6 | `sonnet` | |
-| Haiku 4.5 | anthropic:claude-haiku-4-5 | `haiku` | |
-| GPT-5.5 | openai:gpt-5.5 | `gpt`, `gpt-5.5` | |
-| GPT-5.5 Pro | openai:gpt-5.5-pro | `gpt-pro`, `gpt-5.5-pro` | |
-| GPT-5.4 Mini | openai:gpt-5.4-mini | `gpt-mini` | |
-| GPT-5.4 Nano | openai:gpt-5.4-nano | `gpt-nano` | |
-| Codex 5.3 | openai:gpt-5.3-codex | `codex`, `codex-5.3` | |
-| Spark 5.3 | openai:gpt-5.3-codex-spark | `spark` | |
-| Codex Mini 5.1 | openai:gpt-5.1-codex-mini | `codex-mini` | |
-| Codex Max 5.1 | openai:gpt-5.1-codex-max | `codex-max` | |
-| Gemini 3.1 Pro Preview | google:gemini-3.1-pro-preview | `gemini`, `gemini-pro` | |
-| Gemini 3 Flash Preview | google:gemini-3-flash-preview | `gemini-flash` | |
-| Grok 4 1 Fast | xai:grok-4-1-fast | `grok`, `grok-4`, `grok-4.1`, `grok-4-1` | |
-| Grok Code Fast 1 | xai:grok-code-fast-1 | `grok-code` | |
+| Model | ID | Aliases | Default |
+| ---------------------- | ----------------------------- | ------------------------------------------------------------ | ------- |
+| Opus 4.7 | anthropic:claude-opus-4-7 | `opus` | ✓ |
+| Sonnet 4.6 | anthropic:claude-sonnet-4-6 | `sonnet` | |
+| Haiku 4.5 | anthropic:claude-haiku-4-5 | `haiku` | |
+| GPT-5.5 | openai:gpt-5.5 | `gpt`, `gpt-5.5` | |
+| GPT-5.5 Pro | openai:gpt-5.5-pro | `gpt-pro`, `gpt-5.5-pro` | |
+| GPT-5.4 Mini | openai:gpt-5.4-mini | `gpt-mini` | |
+| GPT-5.4 Nano | openai:gpt-5.4-nano | `gpt-nano` | |
+| Codex 5.3 | openai:gpt-5.3-codex | `codex`, `codex-5.3` | |
+| Spark 5.3 | openai:gpt-5.3-codex-spark | `spark` | |
+| Codex Mini 5.1 | openai:gpt-5.1-codex-mini | `codex-mini` | |
+| Codex Max 5.1 | openai:gpt-5.1-codex-max | `codex-max` | |
+| Gemini 3.1 Pro Preview | google:gemini-3.1-pro-preview | `gemini`, `gemini-pro` | |
+| Gemini 3 Flash Preview | google:gemini-3-flash-preview | `gemini-flash` | |
+| Grok 4 1 Fast | xai:grok-4-1-fast | `grok`, `grok-4`, `grok-4.1`, `grok-4-1` | |
+| Grok Code Fast 1 | xai:grok-code-fast-1 | `grok-code` | |
+| DeepSeek V4 Pro | deepseek:deepseek-v4-pro | `deepseek`, `deepseek-pro`, `deepseek-v4`, `deepseek-v4-pro` | |
+| DeepSeek V4 Flash | deepseek:deepseek-v4-flash | `deepseek-flash`, `deepseek-v4-flash` | |
 
 {/* END KNOWN_MODELS_TABLE */}
 

src/common/constants/knownModels.ts

Lines changed: 22 additions & 1 deletion
@@ -4,7 +4,7 @@
 
 import { formatModelDisplayName } from "../utils/ai/modelDisplay";
 
-type ModelProvider = "anthropic" | "openai" | "google" | "xai";
+type ModelProvider = "anthropic" | "openai" | "google" | "xai" | "deepseek";
 
 interface KnownModelDefinition {
   /** Provider identifier used by SDK factories */
@@ -131,6 +131,27 @@ const MODEL_DEFINITIONS = {
     providerModelId: "grok-code-fast-1",
     aliases: ["grok-code"],
   },
+  // DeepSeek V4 Pro is the flagship V4 tier (1.6T total / 49B active params, 1M context,
+  // 384K max output). Bare `deepseek` alias points here per the convention that the
+  // shortest alias tracks each provider's flagship model (mirrors `gemini` → Gemini Pro,
+  // `grok` → Grok 4.1).
+  DEEPSEEK_V4_PRO: {
+    provider: "deepseek",
+    providerModelId: "deepseek-v4-pro",
+    aliases: ["deepseek", "deepseek-pro", "deepseek-v4", "deepseek-v4-pro"],
+    // V4 ships a custom `encoding_dsv4` tokenizer that isn't published upstream yet;
+    // reuse v3.1 (the latest available DeepSeek tokenizer in ai-tokenizer) for
+    // approximate token counting until V4 weights land in the registry.
+    tokenizerOverride: "deepseek/deepseek-v3.1",
+  },
+  // DeepSeek V4 Flash is the fast/economical V4 tier (284B total / 13B active params).
+  // Same 1M context + 384K output as Pro; lower cost, smaller scale.
+  DEEPSEEK_V4_FLASH: {
+    provider: "deepseek",
+    providerModelId: "deepseek-v4-flash",
+    aliases: ["deepseek-flash", "deepseek-v4-flash"],
+    tokenizerOverride: "deepseek/deepseek-v3.1",
+  },
 } as const satisfies Record<string, KnownModelDefinition>;
 
 export type KnownModelKey = keyof typeof MODEL_DEFINITIONS;
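
To make the alias convention above concrete, here is a minimal sketch of how alias lists of this shape could be resolved to a `provider:modelId` string. The trimmed two-entry registry and the `buildAliasMap` / `resolveAlias` helpers are hypothetical and written for illustration only; the lookup code in `knownModels.ts` is not part of this diff and may work differently.

```typescript
// Illustrative only: a trimmed registry mirroring the two entries added in this commit.
interface KnownModelDefinition {
  provider: string;
  providerModelId: string;
  aliases: readonly string[];
}

const MODEL_DEFINITIONS: Record<string, KnownModelDefinition> = {
  DEEPSEEK_V4_PRO: {
    provider: "deepseek",
    providerModelId: "deepseek-v4-pro",
    aliases: ["deepseek", "deepseek-pro", "deepseek-v4", "deepseek-v4-pro"],
  },
  DEEPSEEK_V4_FLASH: {
    provider: "deepseek",
    providerModelId: "deepseek-v4-flash",
    aliases: ["deepseek-flash", "deepseek-v4-flash"],
  },
};

// Hypothetical helper: build an alias -> "provider:modelId" map once,
// throwing if two models claim the same alias.
function buildAliasMap(defs: Record<string, KnownModelDefinition>): Map<string, string> {
  const map = new Map<string, string>();
  for (const def of Object.values(defs)) {
    const id = `${def.provider}:${def.providerModelId}`;
    for (const alias of def.aliases) {
      if (map.has(alias)) throw new Error(`Duplicate alias: ${alias}`);
      map.set(alias, id);
    }
  }
  return map;
}

const ALIAS_MAP = buildAliasMap(MODEL_DEFINITIONS);

// Hypothetical helper: resolve an alias, or return the input unchanged
// when it is not a known alias (for example, an already-qualified id).
function resolveAlias(input: string): string {
  return ALIAS_MAP.get(input) ?? input;
}
```

Under these assumptions, `resolveAlias("deepseek")` yields the Pro id and `resolveAlias("deepseek-flash")` yields the Flash id. Rejecting duplicates at build time mirrors the uniqueness property that the commit message says `knownModels.test.ts` checks.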

src/common/utils/ai/modelDisplay.test.ts

Lines changed: 24 additions & 0 deletions
@@ -49,6 +49,30 @@ describe("formatModelDisplayName", () => {
     });
   });
 
+  describe("DeepSeek models", () => {
+    test("preserves DeepSeek camel-case branding and uppercases version tags", () => {
+      expect(formatModelDisplayName("deepseek-v4-pro")).toBe("DeepSeek V4 Pro");
+      expect(formatModelDisplayName("deepseek-v4-flash")).toBe("DeepSeek V4 Flash");
+      expect(formatModelDisplayName("deepseek-r1")).toBe("DeepSeek R1");
+      expect(formatModelDisplayName("deepseek-chat")).toBe("DeepSeek Chat");
+    });
+
+    test("strips provider prefix when DeepSeek model is gateway-scoped", () => {
+      // OpenRouter exposes the same models under "deepseek/deepseek-v4-pro"; the
+      // existing slash-stripping branch should route through the DeepSeek handler.
+      expect(formatModelDisplayName("deepseek/deepseek-v4-pro")).toBe("DeepSeek V4 Pro");
+    });
+
+    test("colon-suffixed Ollama IDs preserve DeepSeek branding and size", () => {
+      // Locally-pulled DeepSeek models use Ollama tags like "deepseek-r1:8b".
+      // Both the DeepSeek brand casing and the parenthesized size suffix must
+      // be preserved; the generic digit-split formatter would otherwise render
+      // "Deepseek-r 1 (8B)".
+      expect(formatModelDisplayName("deepseek-r1:8b")).toBe("DeepSeek R1 (8B)");
+      expect(formatModelDisplayName("deepseek-coder:6.7b")).toBe("DeepSeek Coder (6.7B)");
+    });
+  });
+
   describe("Ollama models", () => {
     test("formats Llama models with size", () => {
       expect(formatModelDisplayName("llama3.2:7b")).toBe("Llama 3.2 (7B)");

src/common/utils/ai/modelDisplay.ts

Lines changed: 28 additions & 0 deletions
@@ -143,10 +143,38 @@ export function formatModelDisplayName(modelName: string): string {
     }
   }
 
+  // DeepSeek models - keep camel-cased "DeepSeek" branding and uppercase the
+  // version segment (e.g. "v4-pro" -> "V4 Pro") since "Deepseek V4 Pro" mis-cases
+  // the brand name.
+  //
+  // Skip when the name carries a colon-suffixed size tag like "deepseek-r1:8b" —
+  // those are Ollama-style local model IDs and must fall through to the colon-size
+  // handler below so the size renders as "(8B)" rather than being concatenated
+  // verbatim.
+  if (lower.startsWith("deepseek-") && !modelName.includes(":")) {
+    const parts = lower.replace("deepseek-", "").split("-");
+    const formatted = parts
+      .map((part) => {
+        // Uppercase short tokens that look like a version tag (e.g. "v4", "r1").
+        if (/^[a-z]\d+(?:\.\d+)?$/.test(part)) return part.toUpperCase();
+        return capitalize(part);
+      })
+      .join(" ");
+    return formatted ? `DeepSeek ${formatted}` : "DeepSeek";
+  }
+
   // Ollama models - handle format like "llama3.2:7b" or "codellama:13b"
   // Split by colon to handle quantization/size suffix
   const [baseName, size] = modelName.split(":");
   if (size) {
+    // DeepSeek IDs published as Ollama tags (e.g. "deepseek-r1:8b") need to
+    // preserve the DeepSeek brand casing before the size suffix is appended.
+    // Recurse into the formatter for the colon-stripped base so the DeepSeek
+    // branch above produces "DeepSeek R1", then append "(8B)". Without this,
+    // the generic digit-split below would render "Deepseek-r 1 (8B)".
+    if (baseName.toLowerCase().startsWith("deepseek-")) {
+      return `${formatModelDisplayName(baseName)} (${size.toUpperCase()})`;
+    }
     // "llama3.2:7b" -> "Llama 3.2 (7B)"
     // "codellama:13b" -> "Codellama (13B)"
     const formatted = baseName
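
Because the two DeepSeek branches are split across separate hunks, the combined behavior can be easier to follow in a single standalone function. The sketch below is a simplified stand-in, not the real `formatModelDisplayName`: the `formatDeepSeekName` name, the local `capitalize` helper, and the way the gateway prefix is stripped are all assumptions made for illustration. It reuses the same version-tag regex as the diff.

```typescript
// Capitalize the first character of a token ("pro" -> "Pro").
function capitalize(part: string): string {
  return part.charAt(0).toUpperCase() + part.slice(1);
}

// Hypothetical, simplified stand-in for the DeepSeek handling in this commit.
// Returns null for names that are not DeepSeek models.
function formatDeepSeekName(modelName: string): string | null {
  // Drop a gateway prefix such as "deepseek/" (assumed to mirror the
  // existing slash-stripping branch mentioned in the tests).
  const unscoped = modelName.slice(modelName.lastIndexOf("/") + 1);

  // Separate an Ollama-style size tag, e.g. "deepseek-r1:8b" -> ["deepseek-r1", "8b"].
  const [baseName = "", size] = unscoped.split(":");
  const lower = baseName.toLowerCase();
  if (!lower.startsWith("deepseek-")) return null;

  const formatted = lower
    .slice("deepseek-".length)
    .split("-")
    .filter((part) => part.length > 0)
    // Uppercase version-like tokens ("v4", "r1"); title-case everything else.
    .map((part) => (/^[a-z]\d+(?:\.\d+)?$/.test(part) ? part.toUpperCase() : capitalize(part)))
    .join(" ");

  const name = formatted ? `DeepSeek ${formatted}` : "DeepSeek";
  return size ? `${name} (${size.toUpperCase()})` : name;
}
```

With these assumptions, the inputs exercised in `modelDisplay.test.ts` (`deepseek-v4-pro`, `deepseek/deepseek-v4-pro`, `deepseek-r1:8b`, `deepseek-coder:6.7b`) produce the same strings the tests expect. The real implementation reaches the same result by recursing into the formatter from the colon-size handler instead of handling the size tag inline.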

src/common/utils/tokens/modelStats.test.ts

Lines changed: 19 additions & 0 deletions
@@ -76,6 +76,25 @@ describe("getModelStats", () => {
     expect(uncached.cache_read_input_token_cost).toBeUndefined();
   });
 
+  test("resolves DeepSeek V4 pricing and limits via direct and gateway forms", () => {
+    // Direct provider id wires up to the modelsExtra entry.
+    const pro = expectStats("deepseek:deepseek-v4-pro");
+    expect(pro.max_input_tokens).toBe(1_000_000);
+    expect(pro.max_output_tokens).toBe(384_000);
+    expect(pro.input_cost_per_token).toBe(0.00000174);
+    expect(pro.output_cost_per_token).toBe(0.00000348);
+    expect(pro.cache_read_input_token_cost).toBe(0.000000174);
+
+    // OpenRouter routes "deepseek/deepseek-v4-pro" back to the direct DeepSeek
+    // entry via normalizeToCanonical, so pricing must match the direct lookup.
+    expect(expectStats("openrouter:deepseek/deepseek-v4-pro")).toEqual(pro);
+
+    const flash = expectStats("deepseek:deepseek-v4-flash");
+    expect(flash.input_cost_per_token).toBe(0.00000014);
+    expect(flash.output_cost_per_token).toBe(0.00000028);
+    expect(flash.cache_read_input_token_cost).toBe(0.000000014);
+  });
+
   test("returns null for unknown models across direct and gateway forms", () => {
     expect(getModelStats("unknown:fake-model-9000")).toBeNull();
     expect(getModelStats("ollama:this-model-does-not-exist")).toBeNull();
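
The test above relies on the gateway form `openrouter:deepseek/deepseek-v4-pro` resolving to the same stats as the direct id. A rough sketch of that kind of mapping follows. The `toCanonical` function is a hypothetical stand-in written for this note; the actual `normalizeToCanonical` is not shown in this diff, and its rules (including which gateways it recognizes) may differ.

```typescript
// Hypothetical stand-in for normalizeToCanonical: rewrites
// "openrouter:<vendor>/<model>" to "<vendor>:<model>" and leaves
// every other model string unchanged.
function toCanonical(modelString: string): string {
  const colon = modelString.indexOf(":");
  if (colon === -1) return modelString;

  const provider = modelString.slice(0, colon);
  const modelId = modelString.slice(colon + 1);
  if (provider !== "openrouter") return modelString;

  // Gateway ids without a vendor segment have nothing to map back to.
  const slash = modelId.indexOf("/");
  if (slash === -1) return modelString;

  return `${modelId.slice(0, slash)}:${modelId.slice(slash + 1)}`;
}
```

Under this sketch, a stats lookup that first canonicalizes its input would find the single `deepseek-v4-pro` entry for both forms, which is the equality the `toEqual(pro)` assertion checks.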

src/common/utils/tokens/models-extra.ts

Lines changed: 37 additions & 0 deletions
@@ -342,4 +342,41 @@ export const modelsExtra: Record<string, ModelData> = {
     supports_response_schema: true,
     supported_endpoints: ["/v1/responses"],
   },
+
+  // DeepSeek V4 Pro - Released April 24, 2026 (Preview)
+  // 1.6T total / 49B active MoE params; 1M context, 384K max output.
+  // Standard pricing: $1.74/M input, $3.48/M output (full price; an introductory 75%
+  // discount runs through 2026/05/05 but we record the post-discount baseline so
+  // billing/forecasts don't silently regress when the promo ends).
+  // Cache-hit input pricing is documented at 1/10 of input price.
+  "deepseek-v4-pro": {
+    max_input_tokens: 1000000,
+    max_output_tokens: 384000,
+    input_cost_per_token: 0.00000174, // $1.74 per million input tokens
+    output_cost_per_token: 0.00000348, // $3.48 per million output tokens
+    cache_read_input_token_cost: 0.000000174, // 1/10 of input price
+    litellm_provider: "deepseek",
+    mode: "chat",
+    supports_function_calling: true,
+    supports_reasoning: true,
+    supports_response_schema: true,
+  },
+
+  // DeepSeek V4 Flash - Released April 24, 2026 (Preview)
+  // 284B total / 13B active MoE params; 1M context, 384K max output.
+  // Pricing: $0.14/M input, $0.28/M output. Cache-hit input is 1/10 of input price.
+  // Legacy `deepseek-chat` (non-thinking) and `deepseek-reasoner` (thinking) currently
+  // route to V4-Flash compatibility modes and retire 2026-07-24.
+  "deepseek-v4-flash": {
+    max_input_tokens: 1000000,
+    max_output_tokens: 384000,
+    input_cost_per_token: 0.00000014, // $0.14 per million input tokens
+    output_cost_per_token: 0.00000028, // $0.28 per million output tokens
+    cache_read_input_token_cost: 0.000000014, // 1/10 of input price
+    litellm_provider: "deepseek",
+    mode: "chat",
+    supports_function_calling: true,
+    supports_reasoning: true,
+    supports_response_schema: true,
+  },
 };
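
As a worked example of how these per-token rates translate into a request cost, the sketch below applies the V4 Pro numbers from the entry above. The `ModelPricing` shape reuses field names from the diff, but the `estimateCostUsd` helper and its `Usage` argument are hypothetical; how Mux actually aggregates usage is not shown in this commit.

```typescript
// Pricing fields as they appear in the models-extra.ts entries.
interface ModelPricing {
  input_cost_per_token: number;
  output_cost_per_token: number;
  cache_read_input_token_cost?: number;
}

// Hypothetical usage shape; cachedInputTokens is the cache-hit subset of inputTokens.
interface Usage {
  inputTokens: number;
  cachedInputTokens: number;
  outputTokens: number;
}

// Values copied from the "deepseek-v4-pro" entry in this diff.
const V4_PRO: ModelPricing = {
  input_cost_per_token: 0.00000174, // $1.74 per million input tokens
  output_cost_per_token: 0.00000348, // $3.48 per million output tokens
  cache_read_input_token_cost: 0.000000174, // 1/10 of input price
};

// Hypothetical helper: estimate the USD cost of one request.
// Cache hits fall back to the normal input rate when no cache rate is recorded.
function estimateCostUsd(pricing: ModelPricing, usage: Usage): number {
  const cachedRate = pricing.cache_read_input_token_cost ?? pricing.input_cost_per_token;
  const uncachedInput = usage.inputTokens - usage.cachedInputTokens;
  return (
    uncachedInput * pricing.input_cost_per_token +
    usage.cachedInputTokens * cachedRate +
    usage.outputTokens * pricing.output_cost_per_token
  );
}

const example = estimateCostUsd(V4_PRO, {
  inputTokens: 1_000_000,
  cachedInputTokens: 500_000,
  outputTokens: 100_000,
});
```

For this usage the estimate is approximately $1.305: 500K uncached input tokens at $1.74/M ($0.87), 500K cache-hit tokens at one tenth of that rate ($0.087), and 100K output tokens at $3.48/M ($0.348). Floating-point rounding means the computed value may differ from 1.305 in the last few digits.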

src/node/services/agentSkills/builtInSkillContent.generated.ts

Lines changed: 19 additions & 17 deletions
@@ -2043,23 +2043,25 @@ export const BUILTIN_SKILL_FILES: Record<string, Record<string, string>> = {
       "",
       "{/* BEGIN KNOWN_MODELS_TABLE */}",
       "",
-      "| Model | ID | Aliases | Default |",
-      "| ---------------------- | ----------------------------- | ---------------------------------------- | ------- |",
-      "| Opus 4.7 | anthropic:claude-opus-4-7 | `opus` | ✓ |",
-      "| Sonnet 4.6 | anthropic:claude-sonnet-4-6 | `sonnet` | |",
-      "| Haiku 4.5 | anthropic:claude-haiku-4-5 | `haiku` | |",
-      "| GPT-5.5 | openai:gpt-5.5 | `gpt`, `gpt-5.5` | |",
-      "| GPT-5.5 Pro | openai:gpt-5.5-pro | `gpt-pro`, `gpt-5.5-pro` | |",
-      "| GPT-5.4 Mini | openai:gpt-5.4-mini | `gpt-mini` | |",
-      "| GPT-5.4 Nano | openai:gpt-5.4-nano | `gpt-nano` | |",
-      "| Codex 5.3 | openai:gpt-5.3-codex | `codex`, `codex-5.3` | |",
-      "| Spark 5.3 | openai:gpt-5.3-codex-spark | `spark` | |",
-      "| Codex Mini 5.1 | openai:gpt-5.1-codex-mini | `codex-mini` | |",
-      "| Codex Max 5.1 | openai:gpt-5.1-codex-max | `codex-max` | |",
-      "| Gemini 3.1 Pro Preview | google:gemini-3.1-pro-preview | `gemini`, `gemini-pro` | |",
-      "| Gemini 3 Flash Preview | google:gemini-3-flash-preview | `gemini-flash` | |",
-      "| Grok 4 1 Fast | xai:grok-4-1-fast | `grok`, `grok-4`, `grok-4.1`, `grok-4-1` | |",
-      "| Grok Code Fast 1 | xai:grok-code-fast-1 | `grok-code` | |",
+      "| Model | ID | Aliases | Default |",
+      "| ---------------------- | ----------------------------- | ------------------------------------------------------------ | ------- |",
+      "| Opus 4.7 | anthropic:claude-opus-4-7 | `opus` | ✓ |",
+      "| Sonnet 4.6 | anthropic:claude-sonnet-4-6 | `sonnet` | |",
+      "| Haiku 4.5 | anthropic:claude-haiku-4-5 | `haiku` | |",
+      "| GPT-5.5 | openai:gpt-5.5 | `gpt`, `gpt-5.5` | |",
+      "| GPT-5.5 Pro | openai:gpt-5.5-pro | `gpt-pro`, `gpt-5.5-pro` | |",
+      "| GPT-5.4 Mini | openai:gpt-5.4-mini | `gpt-mini` | |",
+      "| GPT-5.4 Nano | openai:gpt-5.4-nano | `gpt-nano` | |",
+      "| Codex 5.3 | openai:gpt-5.3-codex | `codex`, `codex-5.3` | |",
+      "| Spark 5.3 | openai:gpt-5.3-codex-spark | `spark` | |",
+      "| Codex Mini 5.1 | openai:gpt-5.1-codex-mini | `codex-mini` | |",
+      "| Codex Max 5.1 | openai:gpt-5.1-codex-max | `codex-max` | |",
+      "| Gemini 3.1 Pro Preview | google:gemini-3.1-pro-preview | `gemini`, `gemini-pro` | |",
+      "| Gemini 3 Flash Preview | google:gemini-3-flash-preview | `gemini-flash` | |",
+      "| Grok 4 1 Fast | xai:grok-4-1-fast | `grok`, `grok-4`, `grok-4.1`, `grok-4-1` | |",
+      "| Grok Code Fast 1 | xai:grok-code-fast-1 | `grok-code` | |",
+      "| DeepSeek V4 Pro | deepseek:deepseek-v4-pro | `deepseek`, `deepseek-pro`, `deepseek-v4`, `deepseek-v4-pro` | |",
+      "| DeepSeek V4 Flash | deepseek:deepseek-v4-flash | `deepseek-flash`, `deepseek-v4-flash` | |",
       "",
       "{/* END KNOWN_MODELS_TABLE */}",
       "",
