perf(ltm): pre-caption images outside session lock to unblock subsequent messages#7228
perf(ltm): pre-caption images outside session lock to unblock subsequent messages#7228Reisenbug wants to merge 4 commits intoAstrBotDevs:masterfrom
Conversation
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
pre_caption_imagesthe config is fetched again even thoughprocess()has already retrievedprovider_settings; consider passing the relevant config or provider id intopre_caption_imagesto avoid duplicate lookups and keep configuration handling in one place. - When catching the broad
Exceptioninpre_caption_images, log withexc_info=True(or similar) so that stack traces are preserved and diagnosing caption failures is easier.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `pre_caption_images` the config is fetched again even though `process()` has already retrieved `provider_settings`; consider passing the relevant config or provider id into `pre_caption_images` to avoid duplicate lookups and keep configuration handling in one place.
- When catching the broad `Exception` in `pre_caption_images`, log with `exc_info=True` (or similar) so that stack traces are preserved and diagnosing caption failures is easier.
## Individual Comments
### Comment 1
<location path="astrbot/core/astr_main_agent.py" line_range="530-541" />
<code_context>
plugin_context: Context,
image_caption_provider: str,
) -> None:
+ pre_caption = event.get_extra(_PRE_CAPTION_RESULT_KEY)
+ if pre_caption is not None:
+ if pre_caption:
+ req.extra_user_content_parts.append(
+ TextPart(text=f"<image_caption>{pre_caption}</image_caption>")
+ )
+ req.image_urls = []
+ return
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Clarify behavior when pre-caption result is an empty string vs not set vs None.
Right now, an empty-string pre-caption is treated as a successful caption: it suppresses normal captioning and clears `image_urls`, so images are dropped with no caption. If the intent is to short-circuit only on real captions, consider gating both appending the text and clearing `image_urls` behind `if pre_caption:` and leaving `image_urls` unchanged when `pre_caption` is falsy (including `""`).
```suggestion
plugin_context: Context,
image_caption_provider: str,
) -> None:
+ # If pre-captioning produced a non-empty caption, use it and skip normal captioning.
+ # For None or empty-string results, fall through to the normal captioning path.
+ pre_caption = event.get_extra(_PRE_CAPTION_RESULT_KEY)
+ if pre_caption:
+ req.extra_user_content_parts.append(
+ TextPart(text=f"<image_caption>{pre_caption}</image_caption>")
+ )
+ # We already have a valid caption; skip further image captioning.
+ req.image_urls = []
+ return
+
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request introduces a pre-captioning mechanism for images to avoid blocking the session lock during slow LLM calls. It adds a new configuration setting, image_caption_wait_for_context_order, allowing users to choose between strict context ordering and faster response times. Feedback was provided regarding the non-blocking mode, noting that the current implementation still falls back to synchronous captioning within the session lock when the setting is disabled; a code suggestion was offered to skip captioning in this scenario to truly achieve non-blocking behavior.
astrbot/core/pipeline/process_stage/method/agent_sub_stages/internal.py
Outdated
Show resolved
Hide resolved
|
这个或许算预期行为?是否需要开放这一配置项我个人认为有待商榷,但仍然感谢您做出的贡献 |
开启图片转述功能后,若用户发送图片后紧接着发送文字消息 ,文字消息会被阻塞,必须等待图片转述完全 结束后才能得到回复。
小模型调用的时候延迟尤其明显。
根本原因:图片转述的 LLM 调用在 session_lock 内执行,占用锁期间同 session 的后续消息无法进入处理流程。
Modifications / 改动点
astrbot/core/astr_main_agent.py:新增 pre_caption_images() 函数,在获取 session lock 之前完成图片下载、压缩、转述,结果写入 event extra;_ensure_img_caption() 优先读取预处理结果,避免在 lock 内重复调用小模型。
astrbot/core/pipeline/process_stage/method/agent_sub_ stages/internal.py:在 session_lock_manager.acquire_lock() 之前调用 pre_caption_images(),根据配置决定是否等待。
astrbot/core/config/default.py:新增配置项 provider_s ettings.image_caption_wait_for_context_order(默认
True),位于”默认图片转述模型“下方。
开启(默认):后续消息等待图片转述完成后再处理,上 下文顺序正确(图片描述在后续消息之前)。也就是现在的行为。
关闭:后续消息立即响应,图片转述并发执行,但上下文 中图片描述可能出现在后续消息之后。
dashboard/src/i18n/:新增 zh-CN 和 en-US 翻译。
This is NOT a breaking change. / 这不是一个破坏性变更。
Screenshots or Test Results / 运行截图或测试结果
改后并关闭:
$+ΛʀƊ∪$ τ: 03-31 16:54:48
[图片]
黑
Baka!: 03-31 16:54:58
什么黑?我的蝴蝶结是蓝的!
Baka!: 03-31 16:55:02
这个帽子比我的翅膀还大!他这样能看到路吗?
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Allow image captioning to run before acquiring the session lock so that slow caption LLMs no longer block subsequent messages in the same session.
New Features:
Bug Fixes:
Enhancements: