Commit 4cf833e

yaojin3616 and yaojin authored
fix: reduce DB connection pool exhaustion during LLM calls (#650)
* fix: reduce DB connection pool exhaustion by shortening connection hold time during LLM calls

  All channel message handlers (Feishu, DingTalk, WeCom, Slack, Discord, WeChat) now follow a three-phase pattern:
  - Phase 1: Short transaction — load config, save user message → commit + close
  - Phase 2: LLM call with no DB session held
  - Phase 3: New short transaction — save assistant reply

  Previously, a single DB connection was held for the entire request lifecycle, including slow LLM calls (30s+). This caused connection pool exhaustion under concurrent load.

  Changes:
  - Split _call_agent_llm into _load_agent_and_model + _call_llm_with_config
  - Each channel handler manages its own connection lifecycle
  - New short transactions for saving assistant replies reload ChatSession
  - Background tasks (heartbeat, trigger_daemon) get new trace IDs for log correlation

* fix(release): update GitHub Models to use available openai/gpt-5 model
* fix(release): use MODELS_TOKEN secret for GitHub Models API access
* fix(heartbeat): add semaphore concurrency limit and atomic heartbeat claim to prevent duplicate triggers
* fix(startup): remove blanket A2A async enable on every restart
* openai 4.1
* fix(release): use personal models endpoint and gpt-5 for release notes

  Organization-level GitHub Models endpoint returns 403 on free plan. Switch to personal endpoint to bypass org restriction.

* fix(release): switch to gpt-4.1 for larger token limit
* fix(release): include release notes in PR body
* fix(release): sort release notes by importance in prompt
* fix(publish_page): use storage backend instead of local filesystem for S3 compatibility

---------

Co-authored-by: yaojin <yaojin@58.com>
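
The three-phase pattern described in the commit message can be sketched in isolation. This is a minimal illustration, not the project's code: `FakePool`, `fake_llm`, `handle_message`, and `run_demo` are hypothetical names, and an `asyncio.Semaphore` stands in for a real connection pool. It only shows that a small pool is enough when no connection is held across the slow call.

```python
import asyncio


class FakePool:
    """Hypothetical stand-in for a DB connection pool; records peak usage."""

    def __init__(self, size: int) -> None:
        self._sem = asyncio.Semaphore(size)
        self.in_use = 0
        self.peak = 0

    async def __aenter__(self) -> "FakePool":
        await self._sem.acquire()
        self.in_use += 1
        self.peak = max(self.peak, self.in_use)
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self.in_use -= 1
        self._sem.release()


class LLMStats:
    """Counts how many fake LLM calls are in flight at once."""

    def __init__(self) -> None:
        self.running = 0
        self.peak = 0


async def fake_llm(stats: LLMStats, delay: float = 0.05) -> str:
    # Simulates a slow model call; the real one can take 30s or more.
    stats.running += 1
    stats.peak = max(stats.peak, stats.running)
    await asyncio.sleep(delay)
    stats.running -= 1
    return "reply"


async def handle_message(pool: FakePool, stats: LLMStats, saved: list) -> None:
    # Phase 1: short transaction (load config, save the user message).
    async with pool:
        await asyncio.sleep(0.001)  # simulated query time
        saved.append("user")
    # Phase 2: slow LLM call with no connection held.
    reply = await fake_llm(stats)
    # Phase 3: new short transaction to save the assistant reply.
    async with pool:
        await asyncio.sleep(0.001)
        saved.append(reply)


async def run_demo(requests: int = 10, pool_size: int = 2):
    pool, stats, saved = FakePool(pool_size), LLMStats(), []
    await asyncio.gather(*(handle_message(pool, stats, saved) for _ in range(requests)))
    return pool.peak, stats.peak, len(saved)


if __name__ == "__main__":
    peak_conn, peak_llm, n_saved = asyncio.run(run_demo())
    print(f"peak connections={peak_conn}, peak LLM calls={peak_llm}, rows saved={n_saved}")
```

In this sketch the number of LLM calls in flight can exceed the pool size, because the pool is only touched in the two short phases. If the connection were held across `fake_llm`, concurrency would be capped at the pool size.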
1 parent 34ae4e5 commit 4cf833e

16 files changed

Lines changed: 421 additions & 439 deletions

.github/workflows/release.yml

Lines changed: 18 additions & 4 deletions
@@ -196,6 +196,7 @@ jobs:
 1. Title: Start with a top-level heading exactly formatted as: # {target_tag} — <Concise title summarizing the main theme of this release>
 2. ## What's New:
 - Group related changes into thematic subheadings (e.g., ### Core Features, ### UI/UX Enhancements, ### Optimizations).
+- Sort subheadings and items within each group by importance: major features first, then enhancements, then minor tweaks.
 - Specifically list all newly added features and optimization items. Explain what value they add.
 3. ## Bug Fixes:
 - List resolved bugs, issues, or stability improvements.
@@ -211,6 +212,7 @@ jobs:
 - Prefer grouping related commits and summarizing the feature/improvement instead of listing every commit verbatim.
 - Do NOT invent or hallucinate any features or fixes that are not present or strongly implied in the commit list.
 - Keep the language clean and consistent with previous release notes.
+- IMPORTANT: Within each section (What's New, Bug Fixes, etc.), sort items by impact and importance in descending order. New core features and major enhancements come first, followed by smaller improvements. Bug fixes that affect stability or data integrity come before minor UI tweaks.

 Style Reference (Mimic this structure and formatting):
 ---
@@ -244,7 +246,7 @@ jobs:
 if: ${{ inputs.use_ai_notes }}
 shell: bash
 env:
-  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+  GITHUB_TOKEN: ${{ secrets.MODELS_TOKEN }}
 run: |
   set -euo pipefail

@@ -254,7 +256,7 @@ jobs:

 prompt = Path(".github/release-artifacts/release-prompt.txt").read_text(encoding="utf-8")
 payload = {
-    "model": "openai/gpt-5.4",
+    "model": "openai/gpt-4.1",
     "messages": [
         {
             "role": "system",
@@ -281,7 +283,7 @@ jobs:
   -H "Authorization: Bearer $GITHUB_TOKEN" \
   -H "X-GitHub-Api-Version: 2026-03-10" \
   -H "Content-Type: application/json" \
-  https://models.github.ai/orgs/dataelement/inference/chat/completions \
+  https://models.github.ai/inference/chat/completions \
   -d @.github/release-artifacts/openai-payload.json
 )

@@ -441,9 +443,21 @@ jobs:
 run: |
   set -euo pipefail

+  release_notes=""
+  if [ -s .github/release-artifacts/release-notes.generated.md ]; then
+    release_notes="$(cat .github/release-artifacts/release-notes.generated.md)"
+  fi
+
   gh pr create \
     --title "chore(release): cut ${TARGET_TAG}" \
-    --body "Automated release PR for ${TARGET_TAG}. Merging this PR will automatically tag the release and publish it." \
+    --body "$(cat <<EOF
+  Automated release PR for ${TARGET_TAG}. Merging this PR will automatically tag the release and publish it.
+
+  ---
+
+  ${release_notes}
+  EOF
+  )" \
     --head "release/${TARGET_TAG}" \
     --base "${{ github.ref_name }}"

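
For readers who prefer to follow the PR-body change outside of shell quoting, the same composition can be sketched in Python. This is only an illustration of what the heredoc above appears to produce, not part of the workflow: `build_pr_body` and `HEADER` are hypothetical names, and the trailing-newline stripping mirrors how `$(...)` command substitution behaves in bash.

```python
from pathlib import Path

# Header text taken from the workflow; {tag} stands in for ${TARGET_TAG}.
HEADER = (
    "Automated release PR for {tag}. Merging this PR will automatically "
    "tag the release and publish it."
)


def build_pr_body(target_tag: str, notes_path: Path) -> str:
    """Compose the PR body: header, a horizontal rule, then optional notes."""
    notes = ""
    # Equivalent of `[ -s file ]`: the file exists and is non-empty.
    if notes_path.is_file() and notes_path.stat().st_size > 0:
        # Command substitution drops trailing newlines, so do the same here.
        notes = notes_path.read_text(encoding="utf-8").rstrip("\n")
    body = HEADER.format(tag=target_tag) + "\n\n---\n\n" + notes
    return body.rstrip("\n")


if __name__ == "__main__":
    # The path matches the workflow; the tag is a made-up example value.
    path = Path(".github/release-artifacts/release-notes.generated.md")
    print(build_pr_body("v0.0.0-example", path))
```

When the generated notes file is missing or empty, the body ends with the `---` separator and nothing after it, which matches the shell version leaving `release_notes` as an empty string.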

backend/app/api/dingtalk.py

Lines changed: 43 additions & 26 deletions
@@ -158,7 +158,6 @@ async def process_dingtalk_message(
         sender_nick: Display name of the sender from DingTalk.
         message_id: DingTalk message ID (used for reactions).
     """
-    import json
     import httpx
     from datetime import datetime, timezone
     from sqlalchemy import select as _select
@@ -167,7 +166,6 @@ async def process_dingtalk_message(
     from app.models.audit import ChatMessage
     from app.services.channel_session import find_or_create_channel_session
     from app.services.channel_user_service import channel_user_service
-    from app.api.feishu import _call_agent_llm

     async with async_session() as db:
         sender_staff_id = (sender_staff_id or "").strip()
@@ -181,7 +179,6 @@ async def process_dingtalk_message(
         if not sender_staff_id:
             logger.warning("[DingTalk] Skip message attribution because sender_staff_id is empty")
             return
-        creator_id = agent_obj.creator_id
         from app.models.agent import DEFAULT_CONTEXT_WINDOW_SIZE
         ctx_size = (agent_obj.context_window_size or DEFAULT_CONTEXT_WINDOW_SIZE) if agent_obj else DEFAULT_CONTEXT_WINDOW_SIZE

@@ -245,7 +242,28 @@ async def process_dingtalk_message(
             conversation_id=session_conv_id,
         ))
         sess.last_message_at = datetime.now(timezone.utc)
+
+        # Also load DingTalk credentials and agent/model config in this transaction
+        _dt_cfg_r = await db.execute(
+            _select(ChannelConfig).where(
+                ChannelConfig.agent_id == agent_id,
+                ChannelConfig.channel_type == "dingtalk",
+            )
+        )
+        _dt_cfg = _dt_cfg_r.scalar_one_or_none()
+        _dt_app_key = _dt_cfg.app_id if _dt_cfg else None
+        _dt_app_secret = _dt_cfg.app_secret if _dt_cfg else None
+
+        # Pre-load agent/model for LLM call
+        from app.api.feishu import _load_agent_and_model
+        _agent_model, _llm_model, _fallback_model = await _load_agent_and_model(db, agent_id)
+
+        # Extract agent name before closing session
+        _agent_name = agent_obj.name
+
         await db.commit()
+        # ── Phase 1 complete: release connection before slow LLM/HTTP work ──
+        await db.close()

         # Build LLM input text: for images, inject base64 markers so vision models can see them
         llm_user_text = user_text
@@ -262,17 +280,6 @@ async def process_dingtalk_message(
             _send_dingtalk_media_message,
         )

-        # Load DingTalk credentials from ChannelConfig
-        _dt_cfg_r = await db.execute(
-            _select(ChannelConfig).where(
-                ChannelConfig.agent_id == agent_id,
-                ChannelConfig.channel_type == "dingtalk",
-            )
-        )
-        _dt_cfg = _dt_cfg_r.scalar_one_or_none()
-        _dt_app_key = _dt_cfg.app_id if _dt_cfg else None
-        _dt_app_secret = _dt_cfg.app_secret if _dt_cfg else None
-
         _cfs_token = None
         if _dt_app_key and _dt_app_secret:
             # Determine send target: group -> conversation_id, P2P -> sender_staff_id
@@ -337,10 +344,12 @@ async def _dingtalk_file_sender(file_path: str, msg: str = ""):

             _cfs_token = _cfs.set(_dingtalk_file_sender)

-        # Call LLM
+        # Call LLM (no DB session needed)
+        from app.api.feishu import _call_llm_with_config
         try:
-            reply_text = await _call_agent_llm(
-                db, agent_id, llm_user_text,
+            reply_text = await _call_llm_with_config(
+                _agent_model, _llm_model, _fallback_model,
+                agent_id, llm_user_text,
                 history=history, user_id=platform_user_id,
             )
         finally:
@@ -370,7 +379,7 @@ async def _dingtalk_file_sender(file_path: str, msg: str = ""):
                 await client.post(session_webhook, json={
                     "msgtype": "markdown",
                     "markdown": {
-                        "title": agent_obj.name or "AI Reply",
+                        "title": _agent_name or "AI Reply",
                         "text": reply_text,
                     },
                 })
@@ -386,14 +395,22 @@ async def _dingtalk_file_sender(file_path: str, msg: str = ""):
             except Exception as e2:
                 logger.error(f"[DingTalk] Fallback text reply also failed: {e2}")

-        # Save assistant reply
-        db.add(ChatMessage(
-            agent_id=agent_id, user_id=platform_user_id,
-            role="assistant", content=reply_text,
-            conversation_id=session_conv_id,
-        ))
-        sess.last_message_at = datetime.now(timezone.utc)
-        await db.commit()
+        # Save assistant reply (new short transaction)
+        async with async_session() as _save_db:
+            _save_db.add(ChatMessage(
+                agent_id=agent_id, user_id=platform_user_id,
+                role="assistant", content=reply_text,
+                conversation_id=session_conv_id,
+            ))
+            # Reload session object to update last_message_at
+            from app.models.chat_session import ChatSession
+            _sess_r = await _save_db.execute(
+                _select(ChatSession).where(ChatSession.id == uuid.UUID(session_conv_id))
+            )
+            _sess_fresh = _sess_r.scalar_one_or_none()
+            if _sess_fresh:
+                _sess_fresh.last_message_at = datetime.now(timezone.utc)
+            await _save_db.commit()

         # Log activity
         from app.services.activity_logger import log_activity
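
The "new short transaction" used for saving the assistant reply can be illustrated without the project's ORM. The following is a rough sketch under stated assumptions: it uses the standard-library `sqlite3` module instead of SQLAlchemy, the table and column names (`chat_sessions`, `chat_messages`, `last_message_at`) are invented for the example, and `save_assistant_reply` is a hypothetical helper. The point it shows is that Phase 3 opens its own connection, looks the session up again by id rather than reusing an object from Phase 1, commits, and closes.

```python
import os
import sqlite3
import tempfile
from datetime import datetime, timezone


def save_assistant_reply(db_path: str, session_id: str, reply_text: str) -> bool:
    """Phase 3 sketch: fresh connection, write reply, touch session, commit, close.

    Returns True if a session row with this id existed and was updated.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "INSERT INTO chat_messages (session_id, role, content) "
            "VALUES (?, 'assistant', ?)",
            (session_id, reply_text),
        )
        # Address the session by id; nothing loaded in Phase 1 is reused.
        cur = conn.execute(
            "UPDATE chat_sessions SET last_message_at = ? WHERE id = ?",
            (datetime.now(timezone.utc).isoformat(), session_id),
        )
        conn.commit()
        return cur.rowcount > 0
    finally:
        conn.close()  # connection is held only for these two statements


def demo():
    """Creates a throwaway database and exercises the helper."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "chat.db")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE chat_sessions (id TEXT PRIMARY KEY, last_message_at TEXT)")
        conn.execute("CREATE TABLE chat_messages (session_id TEXT, role TEXT, content TEXT)")
        conn.execute("INSERT INTO chat_sessions (id) VALUES ('s1')")
        conn.commit()
        conn.close()

        touched = save_assistant_reply(db_path, "s1", "hello")
        missing = save_assistant_reply(db_path, "unknown", "orphan reply")

        conn = sqlite3.connect(db_path)
        count = conn.execute("SELECT COUNT(*) FROM chat_messages").fetchone()[0]
        stamp = conn.execute(
            "SELECT last_message_at FROM chat_sessions WHERE id = 's1'"
        ).fetchone()[0]
        conn.close()
        return touched, missing, count, stamp is not None


if __name__ == "__main__":
    print(demo())
```

As in the diff, the reply is still saved when the session row cannot be found; only the timestamp update is skipped.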

backend/app/api/discord_bot.py

Lines changed: 44 additions & 30 deletions
@@ -281,11 +281,11 @@ async def discord_interaction_webhook(
     async def handle_in_background():
         from app.models.audit import ChatMessage
         from app.models.agent import Agent as AgentModel
-        from app.api.feishu import _call_agent_llm
         from app.services.channel_session import find_or_create_channel_session
         from app.database import async_session
         from datetime import datetime, timezone

+        # ── Phase 1: Short transaction — load configs, save user message ──
         async with async_session() as bg_db:
             # Load agent
             agent_r = await bg_db.execute(select(AgentModel).where(AgentModel.id == agent_id))
@@ -296,19 +296,19 @@ async def handle_in_background():

             # Find-or-create platform user for this Discord sender via unified service
             from app.services.channel_user_service import channel_user_service
-
+
             _discord_username = body.get("member", {}).get("user", {}).get("username") or body.get("user", {}).get("username", "")
             _display = _discord_username or f"Discord User {sender_id[:8]}"
             _extra_info = {"name": _display}
-
+
             _platform_user = await channel_user_service.resolve_channel_user(
                 db=bg_db,
                 agent=agent_obj,
                 channel_type="discord",
                 external_user_id=sender_id,
                 extra_info=_extra_info,
             )
-
+
             # Update display_name if we now have a better name
             if _discord_username and _platform_user.display_name and _platform_user.display_name.startswith("Discord User ") and _platform_user.display_name != _discord_username:
                 _platform_user.display_name = _discord_username
@@ -340,40 +340,54 @@ async def handle_in_background():
             # Save user message
             bg_db.add(ChatMessage(agent_id=agent_id, user_id=platform_user_id, role="user", content=user_text, conversation_id=session_conv_id))
             sess.last_message_at = datetime.now(timezone.utc)
-            await bg_db.commit()

-            # Call LLM
-            reply_text = await _call_agent_llm(
-                bg_db,
-                agent_id,
-                user_text,
-                history=history,
-                user_id=platform_user_id,
-                session_id=session_conv_id,
-            )
-            logger.info(f"[Discord] LLM reply: {reply_text[:80]}")
+            # Pre-load agent/model for LLM call and extract config values
+            from app.api.feishu import _load_agent_and_model
+            _agent_model, _llm_model, _fallback_model = await _load_agent_and_model(bg_db, agent_id)

-            # Save reply
-            bg_db.add(ChatMessage(agent_id=agent_id, user_id=platform_user_id, role="assistant", content=reply_text, conversation_id=session_conv_id))
-            sess.last_message_at = datetime.now(timezone.utc)
-            await bg_db.commit()
-
-            # Bot token stored in config — read from DB to avoid detached ORM issues
             from sqlalchemy import select as _sel
             cfg_r = await bg_db.execute(_sel(ChannelConfig).where(
                 ChannelConfig.agent_id == agent_id,
                 ChannelConfig.channel_type == "discord",
             ))
             cfg = cfg_r.scalar_one_or_none()
-            bot_token_bg = cfg.app_secret if cfg else ""
-            app_id_bg = cfg.app_id if cfg else ""
-
-            # Send chunked reply via Discord follow-up
-            if bot_token_bg and interaction_token and app_id_bg:
-                try:
-                    await _send_discord_followup(app_id_bg, bot_token_bg, interaction_token, reply_text)
-                except Exception as e:
-                    logger.error(f"[Discord] Failed to send follow-up: {e}")
+            _bot_token_bg = cfg.app_secret if cfg else ""
+            _app_id_bg = cfg.app_id if cfg else ""
+
+            await bg_db.commit()
+        # ── Phase 1 complete: release connection ──
+
+        # ── Phase 2: LLM call (no DB session needed) ──
+        from app.api.feishu import _call_llm_with_config
+        reply_text = await _call_llm_with_config(
+            _agent_model, _llm_model, _fallback_model,
+            agent_id,
+            user_text,
+            history=history,
+            user_id=platform_user_id,
+            session_id=session_conv_id,
+        )
+        logger.info(f"[Discord] LLM reply: {reply_text[:80]}")
+
+        # ── Phase 3: Save reply + send (new short transaction) ──
+        async with async_session() as _save_db:
+            _save_db.add(ChatMessage(agent_id=agent_id, user_id=platform_user_id, role="assistant", content=reply_text, conversation_id=session_conv_id))
+            # Reload session object to update last_message_at
+            from app.models.chat_session import ChatSession
+            _sess_r = await _save_db.execute(
+                select(ChatSession).where(ChatSession.id == uuid.UUID(session_conv_id))
+            )
+            _sess_fresh = _sess_r.scalar_one_or_none()
+            if _sess_fresh:
+                _sess_fresh.last_message_at = datetime.now(timezone.utc)
+            await _save_db.commit()
+
+        # Send chunked reply via Discord follow-up
+        if _bot_token_bg and interaction_token and _app_id_bg:
+            try:
+                await _send_discord_followup(_app_id_bg, _bot_token_bg, interaction_token, reply_text)
+            except Exception as e:
+                logger.error(f"[Discord] Failed to send follow-up: {e}")

     asyncio.create_task(handle_in_background())
     # Return DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE — shows "thinking..." to user
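
Both handlers copy the values they will need later (`_agent_name`, `_bot_token_bg`, `_app_id_bg`) into plain variables while the Phase 1 session is still open. The sketch below illustrates why, under loud assumptions: `FakeOrmRow` is a toy object that refuses attribute reads after its "session" closes, loosely imitating how ORM objects can become unusable once detached. It is not SQLAlchemy's actual behaviour in every configuration, and `ChannelSnapshot` and `snapshot_channel` are hypothetical names.

```python
from dataclasses import dataclass


class DetachedError(RuntimeError):
    """Raised by the fake row when it is read after its session closed."""


class FakeOrmRow:
    """Hypothetical ORM-like row; not a SQLAlchemy class."""

    def __init__(self, **values):
        self._values = values
        self._session_open = True

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the column values.
        if name.startswith("_"):
            raise AttributeError(name)
        if not self._session_open:
            raise DetachedError(f"{name} read after session close")
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None


@dataclass(frozen=True)
class ChannelSnapshot:
    """Plain values that stay readable after the connection is released."""

    app_id: str
    app_secret: str
    agent_name: str


def snapshot_channel(cfg, agent) -> ChannelSnapshot:
    # Must run during Phase 1, while the rows are still attached.
    return ChannelSnapshot(
        app_id=cfg.app_id if cfg else "",
        app_secret=cfg.app_secret if cfg else "",
        agent_name=agent.name or "AI Reply",
    )


def demo():
    cfg = FakeOrmRow(app_id="app-1", app_secret="secret-1")
    agent = FakeOrmRow(name="Helper")
    snap = snapshot_channel(cfg, agent)  # Phase 1: session open
    cfg._session_open = agent._session_open = False  # session released
    try:
        agent.name  # what Phase 2/3 code must avoid
        raised = False
    except DetachedError:
        raised = True
    return snap, raised


if __name__ == "__main__":
    print(demo())
```

A frozen dataclass is one option for carrying these values across phases; the diff itself simply uses local variables, which works equally well inside a single function.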
