Skip to content

Commit ada0904

Browse files
author
shijiashuai
committed
feat: 引入多 LLM Provider 抽象并优化前端交互防抖与调试日志
在 `DialogueService` 中新增 `LLM_PROVIDER` 和 `LLM_BASE_URL` 环境变量,添加 `_call_llm` 私有方法统一封装 HTTP 请求逻辑,当前实现为 OpenAI Chat Completions 接口,为后续对接其他 Provider 预留扩展点。在 `_append_session_messages` 中添加历史截断调试日志,记录 session_id 和最终长度。 在 `AdvancedDigitalHumanPage` 中优化 Chat Dock 交互体验:输入框回
1 parent 1980f93 commit ada0904

3 files changed

Lines changed: 80 additions & 19 deletions

File tree

changelog/2025-11-24-voice-and-audio-integration.md

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -82,3 +82,36 @@
8282
- 一般说话但希望有一定口型/动态时用 `speak`
8383
- 只有在没有合适动作或需要静止时才用 `idle`
8484
- 强调:**无论何种情况严禁输出 JSON 以外的任何文字、注释或解释**,确保前端解析稳定。
85+
86+
## 多 LLM Provider 抽象(预留扩展点)
87+
88+
-`DialogueService` 中引入轻量级 Provider 抽象:
89+
- 新增环境变量:
90+
- `LLM_PROVIDER`:当前使用的 LLM 提供方标识,默认 `openai`
91+
- `LLM_BASE_URL`:可选,覆盖默认的 OpenAI Chat Completions URL,方便对接 OpenAI 兼容网关。
92+
- 新增私有方法 `_call_llm(messages)`
93+
- 统一封装 HTTP 请求逻辑,当前实现为调用 OpenAI Chat Completions 接口;
94+
- 记录调试日志:`provider``model``messages` 数量等;
95+
-`LLM_PROVIDER` 不是 `openai` 时,会输出告警日志并暂时回退到 OpenAI,实现“先有接口,再慢慢接其他 Provider”的策略。
96+
97+
## 前端交互与调试体验微调
98+
99+
- 高级页面 Chat Dock:
100+
- 输入框回车发送逻辑增加防抖:在 `isChatLoading``isRecording` 时禁止再次触发 `handleChatSend`,避免重复请求。
101+
- 输入框占位文案根据状态切换:
102+
- 录音中:显示 `Listening... press mic again to stop`
103+
- 加载中:显示 `Thinking...`
104+
- 其他情况:保持原有 `Type a message to interact...`
105+
- 发送按钮:
106+
-`isChatLoading``true` 时禁用按钮,防止重复发送;
107+
- 同时保留加载态的圆形 spinner。
108+
- 录音按钮:
109+
-`isChatLoading` 时禁用,避免在模型回复过程中开启新的录音;
110+
- 增加 `disabled` 的视觉反馈(透明度和光标样式)。
111+
- 调试日志:
112+
- 在前端 `AdvancedDigitalHumanPage` 中:
113+
- 对每次 LLM 返回的 `emotion`/`action` 输出 `console.debug`,便于在 DevTools 中观察映射效果;
114+
- 在切换录音状态时输出 `console.debug`,方便排查麦克风交互问题。
115+
- 在后端 `DialogueService` 中:
116+
- 每次调用 LLM 时输出 provider、model 与消息数量;
117+
- 在会话历史被截断时输出包含 `session_id` 和最终长度的调试日志,便于观察内存行为。

server/app/services/dialogue.py

Lines changed: 34 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,8 @@ class DialogueService:
1313
def __init__(self) -> None:
1414
self.api_key = os.getenv("OPENAI_API_KEY")
1515
self.model = os.getenv("OPENAI_MODEL", "gpt-3.5-turbo")
16+
self.provider = os.getenv("LLM_PROVIDER", "openai").lower()
17+
self.base_url = os.getenv("LLM_BASE_URL")
1618
self._session_messages: dict[str, list[dict[str, str]]] = {}
1719
try:
1820
self.max_session_messages = int(os.getenv("DIALOGUE_MAX_SESSION_MESSAGES", "10"))
@@ -87,21 +89,7 @@ async def generate_reply(
8789
)
8890

8991
try:
90-
async with httpx.AsyncClient(timeout=20.0) as client:
91-
resp = await client.post(
92-
"https://api.openai.com/v1/chat/completions",
93-
headers={
94-
"Authorization": f"Bearer {self.api_key}",
95-
"Content-Type": "application/json",
96-
},
97-
json={
98-
"model": self.model,
99-
"messages": messages,
100-
"temperature": 0.7,
101-
},
102-
)
103-
resp.raise_for_status()
104-
data = resp.json()
92+
data = await self._call_llm(messages)
10593
content = data["choices"][0]["message"]["content"]
10694

10795
try:
@@ -146,6 +134,29 @@ async def generate_reply(
146134
"action": "idle",
147135
}
148136

137+
async def _call_llm(self, messages: list[dict[str, str]]) -> Dict[str, Any]:
138+
provider = (self.provider or "openai").lower()
139+
logger.debug("Calling LLM provider=%s model=%s messages=%d", provider, self.model, len(messages))
140+
141+
if provider != "openai":
142+
logger.warning("LLM_PROVIDER=%s 未实现,暂时使用 openai 作为回退", provider)
143+
144+
url = self.base_url or "https://api.openai.com/v1/chat/completions"
145+
headers = {
146+
"Authorization": f"Bearer {self.api_key}",
147+
"Content-Type": "application/json",
148+
}
149+
payload = {
150+
"model": self.model,
151+
"messages": messages,
152+
"temperature": 0.7,
153+
}
154+
155+
async with httpx.AsyncClient(timeout=20.0) as client:
156+
resp = await client.post(url, headers=headers, json=payload)
157+
resp.raise_for_status()
158+
return resp.json()
159+
149160
def _get_session_messages(self, session_id: str) -> list[dict[str, str]]:
150161
return self._session_messages.get(session_id, [])
151162

@@ -158,9 +169,17 @@ def _append_session_messages(
158169
return
159170
history = self._session_messages.get(session_id, [])
160171
history.extend(new_messages)
172+
truncated = False
161173
if len(history) > self.max_session_messages:
162174
history = history[-self.max_session_messages :]
175+
truncated = True
163176
self._session_messages[session_id] = history
177+
logger.debug(
178+
"Session %s history size=%d%s",
179+
session_id,
180+
len(history),
181+
" (truncated)" if truncated else "",
182+
)
164183

165184

166185
dialogue_service = DialogueService()

src/pages/AdvancedDigitalHumanPage.tsx

Lines changed: 13 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -72,6 +72,7 @@ export default function AdvancedDigitalHumanPage() {
7272
setIsChatLoading(true);
7373
try {
7474
const res = await sendUserInput({ userText: content, sessionId: 'demo-session' });
75+
console.debug('LLM response', { emotion: res.emotion, action: res.action });
7576
const assistantMessage = { id: Date.now() + 1, role: 'assistant' as const, text: res.replyText };
7677
setChatMessages((prev) => [...prev, assistantMessage]);
7778

@@ -94,6 +95,7 @@ export default function AdvancedDigitalHumanPage() {
9495
};
9596

9697
const handleToggleRecording = () => {
98+
console.debug('Toggle recording', { from: isRecording });
9799
if (isRecording) {
98100
asrService.stop();
99101
setRecording(false);
@@ -282,15 +284,22 @@ export default function AdvancedDigitalHumanPage() {
282284
type="text"
283285
value={chatInput}
284286
onChange={(e) => setChatInput(e.target.value)}
285-
onKeyDown={(e) => e.key === 'Enter' && handleChatSend()}
286-
placeholder="Type a message to interact..."
287+
onKeyDown={(e) => e.key === 'Enter' && !isChatLoading && !isRecording && handleChatSend()}
288+
placeholder={
289+
isRecording
290+
? 'Listening... press mic again to stop'
291+
: isChatLoading
292+
? 'Thinking...'
293+
: 'Type a message to interact...'
294+
}
287295
className="flex-1 bg-transparent border-none outline-none text-white placeholder-white/30 text-sm h-10"
288296
/>
289297

290298
<div className="flex items-center gap-2 pr-1">
291299
<button
292300
onClick={handleToggleRecording}
293-
className={`p-3 rounded-xl transition-all duration-300 ${
301+
disabled={isChatLoading}
302+
className={`p-3 rounded-xl transition-all duration-300 disabled:opacity-50 disabled:cursor-not-allowed ${
294303
isRecording
295304
? 'bg-red-500 text-white shadow-[0_0_15px_rgba(239,68,68,0.5)]'
296305
: 'hover:bg-white/10 text-white/70 hover:text-white'
@@ -301,7 +310,7 @@ export default function AdvancedDigitalHumanPage() {
301310

302311
<button
303312
onClick={() => handleChatSend()}
304-
disabled={!chatInput.trim() && !isChatLoading}
313+
disabled={isChatLoading || !chatInput.trim()}
305314
className="p-3 bg-white/10 hover:bg-white/20 disabled:opacity-50 disabled:cursor-not-allowed rounded-xl text-white transition-colors"
306315
>
307316
{isChatLoading ? (

0 commit comments

Comments
 (0)