fix: improve knowledge base initialization error handling by piexian · Pull Request #7243 · AstrBotDevs/AstrBot

piexian · 2026-03-31T16:26:57Z

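This change mainly addresses error degradation in the knowledge base when a model fails to load (see issue #7218).

Previously, when a model provider misbehaved, the knowledge base could behave poorly: a failure in the rerank model could make the entire knowledge base unavailable.
This change makes the behavior more predictable and easier to troubleshoot: when the rerank model is unavailable, reranking is skipped and retrieval continues (the knowledge base stays usable), and clear warning/error messages are written to the log to help locate the problem.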

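Modifications

Adjusted the exception-handling strategy in the knowledge base initialization and retrieval flows.
Added degradation logic for the "rerank unavailable" case:
- log a Warning;
- skip reranking;
- keep the embedding retrieval results and return them.
Adjusted handling of the "embedding unavailable" case:
- raise an error directly during retrieval;
- avoid mistakenly returning empty results.
Improved state management for knowledge base instances:
- an initialization failure can be recorded and surfaced;
- the related lifecycle methods gained defensive handling to avoid secondary exceptions.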

This is NOT a breaking change.

Screenshots or Test Results

Checklist

😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
😮 My changes do not introduce malicious code.

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

- In KBManager.retrieve, failing the whole request when any kb_helper.init_error is set might be too aggressive for multi-KB queries; consider skipping only the unavailable knowledge bases while still returning results from the remaining ones.
- In list_kbs, calling await kb_manager.get_kb(kb.kb_id) inside the loop introduces an N+1 access pattern; consider exposing init_error in a way that can be fetched in bulk or cached to avoid repeated awaits per KB.
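
As a rough illustration of the first suggestion, the sketch below skips knowledge bases whose `init_error` is set and still returns results from the healthy ones. `FakeKBHelper` and `retrieve_from_available` are hypothetical stand-ins written for this example; only the `init_error` attribute name comes from the review.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import List, Optional, Tuple

logger = logging.getLogger("kb_sketch")


@dataclass
class FakeKBHelper:
    """Hypothetical stand-in for KBHelper with only the fields the sketch needs."""

    name: str
    init_error: Optional[str] = None

    async def search(self, query: str) -> List[str]:
        return [f"{self.name}:{query}"]


async def retrieve_from_available(
    helpers: List[FakeKBHelper], query: str
) -> Tuple[List[str], List[str]]:
    results: List[str] = []
    skipped: List[str] = []
    for helper in helpers:
        if helper.init_error:
            # Skip only the broken KB and record why, instead of failing the request.
            logger.warning("skipping KB %s: %s", helper.name, helper.init_error)
            skipped.append(helper.name)
            continue
        results.extend(await helper.search(query))
    # If nothing at all was usable, an explicit error is clearer than an empty list.
    if helpers and len(skipped) == len(helpers):
        raise RuntimeError("all requested knowledge bases are unavailable")
    return results, skipped
```

Returning the skipped names alongside the results is one possible way to keep the partial failure visible to the caller; whether that fits the actual API is a design decision for the maintainers.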

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `KBManager.retrieve`, failing the whole request when any `kb_helper.init_error` is set might be too aggressive for multi-KB queries; consider skipping only the unavailable knowledge bases while still returning results from the remaining ones.
- In `list_kbs`, calling `await kb_manager.get_kb(kb.kb_id)` inside the loop introduces an N+1 access pattern; consider exposing `init_error` in a way that can be fetched in bulk or cached to avoid repeated awaits per KB.

## Individual Comments

### Comment 1
<location path="astrbot/dashboard/routes/knowledge_base.py" line_range="317-322" />
<code_context>
             kb_list = []
             for kb in kbs:
-                kb_list.append(kb.model_dump())
+                kb_dict = kb.model_dump()
+                # include init_error from KBHelper if present
+                kb_helper = await kb_manager.get_kb(kb.kb_id)
+                if kb_helper and kb_helper.init_error:
+                    kb_dict["init_error"] = kb_helper.init_error
+                kb_list.append(kb_dict)

             return (
</code_context>
<issue_to_address>
**suggestion (performance):** Per-KB `get_kb` calls in a loop can introduce N+1 async performance overhead.

`await kb_manager.get_kb(kb.kb_id)` inside the loop creates an N+1 I/O pattern for the list endpoint, which will add latency as the number of KBs grows or if `get_kb` hits DB/disk. To avoid this, consider exposing `init_error` directly on `KnowledgeBase`, adding a bulk `kb_manager` API to fetch all needed helpers in one call, or collecting the coroutines and using `asyncio.gather` to run them in parallel.
</issue_to_address>

sourcery-ai · 2026-03-31T16:28:45Z

astrbot/dashboard/routes/knowledge_base.py

+                kb_dict = kb.model_dump()
+                # include init_error from KBHelper if present
+                kb_helper = await kb_manager.get_kb(kb.kb_id)
+                if kb_helper and kb_helper.init_error:
+                    kb_dict["init_error"] = kb_helper.init_error
+                kb_list.append(kb_dict)


suggestion (performance): Per-KB get_kb calls in a loop can introduce N+1 async performance overhead.

await kb_manager.get_kb(kb.kb_id) inside the loop creates an N+1 I/O pattern for the list endpoint, which will add latency as the number of KBs grows or if get_kb hits DB/disk. To avoid this, consider exposing init_error directly on KnowledgeBase, adding a bulk kb_manager API to fetch all needed helpers in one call, or collecting the coroutines and using asyncio.gather to run them in parallel.
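
One of the options mentioned above, running the per-KB lookups concurrently with `asyncio.gather`, might look roughly like this. `FakeKBManager`, `FakeHelper`, and `list_kbs_with_errors` are hypothetical names for this sketch, and plain dicts stand in for the output of `kb.model_dump()`.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FakeHelper:
    """Hypothetical stand-in for KBHelper."""

    init_error: Optional[str] = None


class FakeKBManager:
    """Hypothetical stand-in for the KB manager; get_kb simulates awaited I/O."""

    def __init__(self, helpers: Dict[str, FakeHelper]):
        self._helpers = helpers

    async def get_kb(self, kb_id: str) -> Optional[FakeHelper]:
        await asyncio.sleep(0)
        return self._helpers.get(kb_id)


async def list_kbs_with_errors(kb_manager, kb_dicts: List[dict]) -> List[dict]:
    # Start all lookups at once; gather returns results in the input order.
    helpers = await asyncio.gather(
        *(kb_manager.get_kb(kb["kb_id"]) for kb in kb_dicts)
    )
    kb_list = []
    for kb, helper in zip(kb_dicts, helpers):
        item = dict(kb)
        if helper and helper.init_error:
            item["init_error"] = helper.init_error
        kb_list.append(item)
    return kb_list
```

This still issues one call per knowledge base, so it reduces latency rather than the number of lookups; a bulk manager API or a field on the model would remove the N+1 pattern entirely.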

gemini-code-assist

Code Review

This pull request introduces improved error handling and status tracking for knowledge base initialization. Key changes include the addition of an init_error field to track failures, graceful degradation when reranking providers are unavailable, and exposing these initialization errors to the dashboard. Review feedback suggests adopting standard Python logging practices by using exc_info=True instead of manual traceback formatting. Additionally, there is a recommendation to reconsider the new behavior of raising exceptions when a single knowledge base fails during retrieval, suggesting that skipping faulty nodes might better preserve system availability in multi-knowledge base scenarios.
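
The logging suggestion can be shown with the standard library alone. In this small sketch the logger name, the `log_init_failure` helper, and the simulated error are made up for the example; the only real API used is the `exc_info` argument of the `logging` module.

```python
import io
import logging


def log_init_failure(logger: logging.Logger, kb_name: str) -> None:
    try:
        raise ValueError("embedding provider not found")  # simulated failure
    except ValueError:
        # exc_info=True asks the logging module to append the active traceback,
        # so no manual traceback.format_exc() call is needed.
        logger.error("failed to initialize knowledge base %s", kb_name, exc_info=True)


# Capture the output in memory so the effect is easy to inspect.
stream = io.StringIO()
demo_logger = logging.getLogger("kb_logging_sketch")
demo_logger.addHandler(logging.StreamHandler(stream))
demo_logger.propagate = False
log_init_failure(demo_logger, "docs")
captured = stream.getvalue()
```

Inside an `except` block, `logger.exception(...)` is an equivalent shorthand for `logger.error(..., exc_info=True)`.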

astrbot/core/knowledge_base/kb_mgr.py

astrbot/core/knowledge_base/retrieval/manager.py

Copilot

Pull request overview

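This PR adjusts how the knowledge base behaves when a model/provider fails to initialize or execute, combining availability degradation with better observability: when the rerank model is unavailable, retrieval continues and a clear log message is recorded; when embedding retrieval is unavailable, an error is raised explicitly during retrieval to avoid silently returning empty results; and the initialization failure state of a knowledge base instance is surfaced so it can be diagnosed in the Dashboard.

Changes:

The knowledge base list endpoint now exposes init_error (taken from the KBHelper instance's initialization state).
In the retrieval flow, a rerank failure now degrades by skipping rerank (keeping the fused results), and a dense retrieval failure now raises an error directly.
Initialization exceptions are caught when a knowledge base is loaded or updated and recorded in the instance state (init_error); before retrieval, an unavailable KB is reported with an explicit error.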

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

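| File | Description |
| --- | --- |
| astrbot/dashboard/routes/knowledge_base.py | The KB list endpoint attaches initialization error information to each returned item, making problems easier to diagnose |
| astrbot/core/knowledge_base/retrieval/manager.py | Rerank failures degrade by skipping rerank; dense retrieval failures now raise an error instead of being silently swallowed |
| astrbot/core/knowledge_base/kb_mgr.py | Records and surfaces the KB initialization failure state; after a KB update, tries to re-initialize and updates `init_error` |
| astrbot/core/knowledge_base/kb_helper.py | Adds the `init_error` field; falls back to running without rerank when the rerank provider is unavailable; makes terminate more defensive |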

astrbot/core/knowledge_base/kb_mgr.py

astrbot/core/knowledge_base/retrieval/manager.py

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a3fba97b71

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

astrbot/core/knowledge_base/kb_mgr.py

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 164640a9c9

astrbot/core/knowledge_base/kb_mgr.py

- Display a dedicated error state for knowledge base cards that fail initialization, including a visible badge and error details.
- Prevent navigation and edit actions for failed cards while keeping delete available, and hide normal stats/description for error items.
- Add list.initError locale strings for en-US, ru-RU, and zh-CN.

RC-CHN · 2026-04-02T07:30:41Z

I noticed that the backend's init_error field was not used in the frontend, so I added a component to display it.

Initialize a new KB helper before swapping instances so a failed re-init does not break the active knowledge base service. If initialization fails, restore in-memory KB settings and keep the existing helper and previous init error state. Also clear stale init_error after successful vector DB initialization to prevent outdated error reporting.
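
The "initialize first, then swap" idea described in this commit message can be sketched as follows. `FakeHelper` and `FakeManager` are hypothetical classes for this example, and the `"broken"` provider value is just a way to trigger a failure; the actual kb_mgr.py logic is not reproduced here.

```python
import asyncio
from typing import Optional


class FakeHelper:
    """Hypothetical stand-in for KBHelper."""

    def __init__(self, settings: dict):
        self.settings = settings
        self.init_error: Optional[str] = None

    async def initialize(self) -> None:
        if self.settings.get("provider") == "broken":
            raise RuntimeError("provider unavailable")
        # A successful init clears any stale error from an earlier attempt.
        self.init_error = None


class FakeManager:
    """Hypothetical manager holding the currently active helper."""

    def __init__(self, helper: FakeHelper):
        self.helper = helper

    async def update(self, new_settings: dict) -> bool:
        # Build and initialize the replacement before touching the active helper.
        candidate = FakeHelper(new_settings)
        try:
            await candidate.initialize()
        except Exception:
            # Re-init failed: keep the existing helper, its settings, and its state.
            return False
        self.helper = candidate
        return True
```

Because the active helper is only replaced after the candidate initializes successfully, a bad configuration update cannot take a working knowledge base offline.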

piexian added 2 commits March 31, 2026 23:36

fix: improve initialization error handling in KnowledgeBaseManager and KBHelper

987f9f6

fix: improve knowledge base initialization and rerank error handling, and enhance logging

a3fba97

Copilot AI review requested due to automatic review settings March 31, 2026 16:26

auto-assign bot requested review from Raven95676 and Soulter March 31, 2026 16:27

dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Mar 31, 2026

Copilot started reviewing on behalf of piexian March 31, 2026 16:27

dosubot bot added the feature:knowledge-base The bug / feature is about knowledge base label Mar 31, 2026

sourcery-ai bot reviewed Mar 31, 2026

gemini-code-assist bot reviewed Mar 31, 2026

Copilot AI reviewed Mar 31, 2026

astrbot/core/knowledge_base/kb_mgr.py (resolved)

astrbot/core/knowledge_base/retrieval/manager.py (resolved)

chatgpt-codex-connector bot reviewed Mar 31, 2026

astrbot/core/knowledge_base/kb_mgr.py (outdated, resolved)

fix: improve initialization and retrieval error handling in the knowledge base module

164640a

chatgpt-codex-connector bot reviewed Mar 31, 2026

astrbot/core/knowledge_base/kb_mgr.py (resolved)

astrbot/core/knowledge_base/kb_mgr.py (outdated, resolved)

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Apr 2, 2026

RC-CHN added 2 commits April 2, 2026 15:40

test(kb): add kb manager resilience tests

86e50b8

Cover initialization failure and recovery scenarios to guard against regressions in kb error handling.
Include reference assets under refs for test validation.

dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Apr 2, 2026

RC-CHN approved these changes Apr 2, 2026

dosubot bot added the lgtm This PR has been approved by a maintainer label Apr 2, 2026

RC-CHN merged commit 9d4472c into AstrBotDevs:master Apr 2, 2026
7 checks passed

piexian deleted the fix/kb-init-error-handlin branch April 5, 2026 01:05

RC-CHN merged 6 commits into AstrBotDevs:master from piexian:fix/kb-init-error-handlin

3 participants