9 changes: 9 additions & 0 deletions docs/specs/ai-sdk-runtime/plan.md
@@ -0,0 +1,9 @@
# AI SDK Runtime Plan

1. Add hidden runtime resolution and keep `legacy` available for rollback.
2. Introduce shared AI SDK runtime modules without changing upper-layer interfaces.
3. Migrate OpenAI-compatible and OpenAI responses providers first.
4. Migrate Anthropic / Gemini / Vertex / Bedrock / Ollama to the shared runtime.
5. Keep routing providers (`new-api`, `zenmux`) as thin delegates over migrated providers.
6. Freeze `LLMCoreStreamEvent` behavior with adapter-focused tests.
7. Remove legacy state machines only after the rollback window closes.
90 changes: 90 additions & 0 deletions docs/specs/ai-sdk-runtime/spec.md
@@ -0,0 +1,90 @@
# AI SDK Runtime Spec

## Goal

Unify DeepChat's low-level LLM request pipeline on the Vercel AI SDK while keeping the upper-layer contracts unchanged:

- `BaseLLMProvider`
- `LLMProviderPresenter`
- `LLMCoreStreamEvent`
- existing provider IDs, model configs, and conversation history

The AI SDK runtime becomes the default implementation. A hidden runtime switch keeps `legacy` available as a rollback path.

## Non-Negotiable Compatibility

- No functional regression in text streaming, reasoning streaming, tool call streaming, image output, prompt cache, proxy handling, request tracing, routing, and embeddings.
- `LLMCoreStreamEvent` event names, field names, and stop reasons remain unchanged (a partial sketch of these events follows this list).
- Existing `function_call_record` history must stay reusable across providers and across runtime switches.
- Existing provider list / model list / provider check / key status responsibilities remain in provider classes.
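
For orientation, the sketch below shows the event variants and field names that the accumulator changes in this PR actually read. It is an inferred, partial illustration, not the authoritative `LLMCoreStreamEvent` definition.

```ts
// Partial, illustrative sketch only. Field names are taken from how the
// accumulator in this PR reads events; the authoritative union lives in
// @shared/types/core/llm-events and contains more variants than shown here.
// ChatMessageProviderOptions is treated as an opaque record for this sketch.
type ChatMessageProviderOptions = Record<string, unknown>

type LLMCoreStreamEventSketch =
  | { type: 'text'; content: string; provider_options?: ChatMessageProviderOptions }
  | { type: 'reasoning'; reasoning_content: string; provider_options?: ChatMessageProviderOptions }
  | {
      type: 'tool_call_start'
      tool_call_id: string
      tool_call_name: string
      provider_options?: ChatMessageProviderOptions
    }
// Further variants (tool-call argument chunks and completion, usage, stop
// reasons, image output, ...) exist in the real type and are omitted here.
```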

## Runtime Modes

- Default: `ai-sdk`
- Hidden fallback: `legacy`
- Resolution order (sketched below):
1. `DEEPCHAT_LLM_RUNTIME`
2. config setting `llmRuntimeMode`
3. default `ai-sdk`
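
A minimal sketch of this resolution order, assuming a generic config reader; apart from `DEEPCHAT_LLM_RUNTIME`, `llmRuntimeMode`, and the two mode names, every identifier here is illustrative rather than the actual implementation:

```ts
// Illustrative sketch of the resolution order above; helper names are assumed.
type LlmRuntimeMode = 'ai-sdk' | 'legacy'

function resolveRuntimeMode(readConfig: (key: string) => string | undefined): LlmRuntimeMode {
  // 1. Environment variable wins.
  const fromEnv = process.env.DEEPCHAT_LLM_RUNTIME
  if (fromEnv === 'ai-sdk' || fromEnv === 'legacy') return fromEnv

  // 2. Hidden config setting.
  const fromConfig = readConfig('llmRuntimeMode')
  if (fromConfig === 'ai-sdk' || fromConfig === 'legacy') return fromConfig

  // 3. Default.
  return 'ai-sdk'
}
```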

## Scope

Shared runtime under `src/main/presenter/llmProviderPresenter/aiSdk/` provides:

- provider factory
- model / message mapper
- MCP tool mapper
- streaming adapter
- image runtime
- embedding runtime
- provider-options mapper
- reasoning middleware
- legacy function-call compatibility middleware

## Provider Rollout

Phase 1:

- `OpenAICompatibleProvider`
- `OpenAIResponsesProvider`
- all providers that extend `OpenAICompatibleProvider`

Phase 2:

- `AnthropicProvider`
- `GeminiProvider`
- `VertexProvider`
- `AwsBedrockProvider`
- `OllamaProvider`

Phase 3:

- `NewApiProvider`
- `ZenmuxProvider`

Out of scope for the first unification pass:

- `AcpProvider`
- `VoiceAIProvider`

## Validation Matrix

- pure text
- reasoning native
- reasoning via `<think>` (illustrated after this list)
- native tool streaming
- legacy `<function_call>` fallback
- multi-tool history replay
- image input
- image output
- usage mapping
- prompt cache mapping
- proxy / trace / abort
- embeddings
- hidden rollback path
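
To make the `<think>` row concrete: some models emit reasoning inline in the text channel wrapped in `<think>…</think>` tags, and the runtime has to split that into reasoning output versus normal text. The sketch below is a minimal whole-string illustration of the idea, not the actual reasoning middleware in this PR (which must also handle tags split across stream chunks):

```ts
// Minimal illustration of <think>-tag splitting on a complete string.
// The real reasoning middleware must work incrementally on stream chunks.
function splitThinkTags(raw: string): { reasoning: string; text: string } {
  const match = /<think>([\s\S]*?)<\/think>/.exec(raw)
  if (!match) return { reasoning: '', text: raw }
  const idx = match.index
  return {
    reasoning: match[1].trim(),
    text: (raw.slice(0, idx) + raw.slice(idx + match[0].length)).trim()
  }
}

// Example: splitThinkTags('<think>check units</think>The answer is 42.')
// -> { reasoning: 'check units', text: 'The answer is 42.' }
```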

## Legacy Removal Exit Criteria

- AI SDK runtime passes the provider regression matrix
- rollback path remains unused for at least one release cycle
- duplicated legacy stream parsers / tool parsers have no remaining callers
14 changes: 14 additions & 0 deletions docs/specs/ai-sdk-runtime/tasks.md
@@ -0,0 +1,14 @@
# AI SDK Runtime Tasks

- [x] Add hidden `llmRuntimeMode` default and env override contract.
- [x] Freeze migration scope in SDD docs.
- [x] Add shared AI SDK runtime modules. (implemented in 4c8345a7)
- [x] Integrate OpenAI-compatible runtime path. (implemented in 4c8345a7)
- [x] Integrate OpenAI responses runtime path. (implemented in 4c8345a7)
- [x] Integrate Anthropic runtime path. (implemented in 4c8345a7)
- [x] Integrate Gemini runtime path. (implemented in 4c8345a7)
- [x] Integrate Vertex runtime path. (implemented in 4c8345a7)
- [x] Integrate Bedrock runtime path. (implemented in 4c8345a7)
- [x] Integrate Ollama runtime path. (implemented in 4c8345a7)
- [x] Add regression tests for runtime adapter behavior. (implemented in 4c8345a7)
- [ ] Run format, i18n, lint, and targeted tests.
11 changes: 10 additions & 1 deletion package.json
@@ -63,6 +63,13 @@
},
"dependencies": {
"@agentclientprotocol/sdk": "^0.16.1",
"@ai-sdk/amazon-bedrock": "^4.0.92",
"@ai-sdk/anthropic": "^3.0.68",
"@ai-sdk/google": "^3.0.61",
"@ai-sdk/google-vertex": "^4.0.106",
"@ai-sdk/openai": "^3.0.52",
"@ai-sdk/openai-compatible": "^2.0.41",
"@ai-sdk/provider": "^3.0.8",
"@anthropic-ai/sdk": "^0.53.0",
"@aws-sdk/client-bedrock": "^3.958.0",
"@aws-sdk/client-bedrock-runtime": "^3.958.0",
@@ -74,6 +81,7 @@
"@jxa/run": "^1.4.0",
"@larksuiteoapi/node-sdk": "^1.60.0",
"@modelcontextprotocol/sdk": "^1.28.0",
"ai": "^6.0.157",
"axios": "^1.13.6",
"better-sqlite3-multiple-ciphers": "12.8.0",
"cheerio": "^1.2.0",
@@ -96,6 +104,7 @@
"nanoid": "^5.1.7",
"node-pty": "^1.1.0",
"ollama": "^0.5.18",
"ollama-ai-provider": "^1.2.0",
"openai": "^6.33.0",
"pdf-parse-new": "^1.4.1",
"run-applescript": "^7.1.0",
@@ -110,6 +119,7 @@
"zod": "^3.25.76"
},
"devDependencies": {
"@antv/infographic": "^0.2.7",
"@electron-toolkit/tsconfig": "^1.0.1",
"@electron/notarize": "^3.1.1",
"@iconify-json/lucide": "^1.2.99",
@@ -189,7 +199,6 @@
"vue-virtual-scroller": "^2.0.0-beta.10",
"vuedraggable": "^4.1.0",
"yaml": "^2.8.3",
"@antv/infographic": "^0.2.7",
"zod-to-json-schema": "^3.25.1"
},
"simple-git-hooks": {
63 changes: 49 additions & 14 deletions src/main/presenter/agentRuntimePresenter/accumulator.ts
@@ -1,5 +1,6 @@
import type { AssistantMessageBlock } from '@shared/types/agent-interface'
import type { LLMCoreStreamEvent } from '@shared/types/core/llm-events'
import type { ChatMessageProviderOptions } from '@shared/types/core/chat-message'
import type { StreamState } from './types'

export function finalizeTrailingPendingNarrativeBlocks(blocks: AssistantMessageBlock[]): void {
@@ -17,32 +18,49 @@ export function finalizeTrailingPendingNarrativeBlocks(blocks: AssistantMessageB

function getCurrentBlock(
blocks: AssistantMessageBlock[],
type: 'content' | 'reasoning_content'
type: 'content' | 'reasoning_content',
providerOptions?: ChatMessageProviderOptions
): AssistantMessageBlock {
const providerOptionsJson = serializeProviderOptions(providerOptions)
const last = blocks[blocks.length - 1]
if (
last &&
last.status === 'pending' &&
(last.type === 'content' || last.type === 'reasoning_content') &&
last.type !== type
(last.type === 'content' || last.type === 'reasoning_content')
) {
const lastProviderOptionsJson =
typeof last.extra?.providerOptionsJson === 'string'
? last.extra.providerOptionsJson
: undefined

if (last.type === type && lastProviderOptionsJson === providerOptionsJson) {
return last
}

last.status = 'success'
}

const current = blocks[blocks.length - 1]
if (current && current.type === type && current.status === 'pending') {
return current
}
const block: AssistantMessageBlock = {
type,
content: '',
status: 'pending',
timestamp: Date.now()
timestamp: Date.now(),
...(providerOptionsJson ? { extra: { providerOptionsJson } } : {})
}
blocks.push(block)
return block
}

function serializeProviderOptions(
providerOptions?: ChatMessageProviderOptions
): string | undefined {
if (!providerOptions) {
return undefined
}

return JSON.stringify(providerOptions)
}

function updateReasoningMetadata(state: StreamState, start: number, end: number): void {
const relativeStart = Math.max(0, start - state.startTime)
const relativeEnd = Math.max(0, end - state.startTime)
@@ -61,15 +79,15 @@ export function accumulate(state: StreamState, event: LLMCoreStreamEvent): void
switch (event.type) {
case 'text': {
if (state.firstTokenTime === null) state.firstTokenTime = Date.now()
const block = getCurrentBlock(state.blocks, 'content')
const block = getCurrentBlock(state.blocks, 'content', event.provider_options)
block.content += event.content
state.dirty = true
break
}
case 'reasoning': {
const currentTime = Date.now()
if (state.firstTokenTime === null) state.firstTokenTime = currentTime
const block = getCurrentBlock(state.blocks, 'reasoning_content')
const block = getCurrentBlock(state.blocks, 'reasoning_content', event.provider_options)
block.content += event.reasoning_content
if (
typeof block.reasoning_time !== 'object' ||
@@ -91,6 +109,7 @@ export function accumulate(state: StreamState, event: LLMCoreStreamEvent): void
}
case 'tool_call_start': {
finalizeTrailingPendingNarrativeBlocks(state.blocks)
const providerOptionsJson = serializeProviderOptions(event.provider_options)
const toolBlock: AssistantMessageBlock = {
type: 'tool_call',
content: '',
@@ -101,13 +120,15 @@ export function accumulate(state: StreamState, event: LLMCoreStreamEvent): void
name: event.tool_call_name,
params: '',
response: ''
}
},
...(providerOptionsJson ? { extra: { providerOptionsJson } } : {})
}
state.blocks.push(toolBlock)
state.pendingToolCalls.set(event.tool_call_id, {
name: event.tool_call_name,
arguments: '',
blockIndex: state.blocks.length - 1
blockIndex: state.blocks.length - 1,
providerOptions: event.provider_options
})
state.dirty = true
break
@@ -116,9 +137,18 @@ export function accumulate(state: StreamState, event: LLMCoreStreamEvent): void
const pending = state.pendingToolCalls.get(event.tool_call_id)
if (pending) {
pending.arguments += event.tool_call_arguments_chunk
if (!pending.providerOptions && event.provider_options) {
pending.providerOptions = event.provider_options
}
const block = state.blocks[pending.blockIndex]
if (block?.tool_call) {
block.tool_call.params = pending.arguments
if (event.provider_options) {
block.extra = {
...block.extra,
providerOptionsJson: serializeProviderOptions(event.provider_options)
}
}
}
state.dirty = true
}
@@ -128,19 +158,24 @@ export function accumulate(state: StreamState, event: LLMCoreStreamEvent): void
const pending = state.pendingToolCalls.get(event.tool_call_id)
if (pending) {
const finalArgs = event.tool_call_arguments_complete ?? pending.arguments
const providerOptions = event.provider_options ?? pending.providerOptions
pending.arguments = finalArgs
const block = state.blocks[pending.blockIndex]
if (block?.tool_call) {
block.tool_call.params = finalArgs
block.extra = {
...block.extra,
toolCallArgsComplete: true
toolCallArgsComplete: true,
...(providerOptions
? { providerOptionsJson: serializeProviderOptions(providerOptions) }
: {})
}
}
state.completedToolCalls.push({
id: event.tool_call_id,
name: pending.name,
arguments: finalArgs
arguments: finalArgs,
...(providerOptions ? { providerOptions } : {})
})
state.pendingToolCalls.delete(event.tool_call_id)
state.dirty = true