Conversation
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
KBManager.retrieve, failing the whole request when anykb_helper.init_erroris set might be too aggressive for multi-KB queries; consider skipping only the unavailable knowledge bases while still returning results from the remaining ones. - In
list_kbs, callingawait kb_manager.get_kb(kb.kb_id)inside the loop introduces an N+1 access pattern; consider exposinginit_errorin a way that can be fetched in bulk or cached to avoid repeated awaits per KB.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `KBManager.retrieve`, failing the whole request when any `kb_helper.init_error` is set might be too aggressive for multi-KB queries; consider skipping only the unavailable knowledge bases while still returning results from the remaining ones.
- In `list_kbs`, calling `await kb_manager.get_kb(kb.kb_id)` inside the loop introduces an N+1 access pattern; consider exposing `init_error` in a way that can be fetched in bulk or cached to avoid repeated awaits per KB.
## Individual Comments
### Comment 1
<location path="astrbot/dashboard/routes/knowledge_base.py" line_range="317-322" />
<code_context>
kb_list = []
for kb in kbs:
- kb_list.append(kb.model_dump())
+ kb_dict = kb.model_dump()
+ # include init_error from KBHelper if present
+ kb_helper = await kb_manager.get_kb(kb.kb_id)
+ if kb_helper and kb_helper.init_error:
+ kb_dict["init_error"] = kb_helper.init_error
+ kb_list.append(kb_dict)
return (
</code_context>
<issue_to_address>
**suggestion (performance):** Per-KB `get_kb` calls in a loop can introduce N+1 async performance overhead.
`await kb_manager.get_kb(kb.kb_id)` inside the loop creates an N+1 I/O pattern for the list endpoint, which will add latency as the number of KBs grows or if `get_kb` hits DB/disk. To avoid this, consider exposing `init_error` directly on `KnowledgeBase`, adding a bulk `kb_manager` API to fetch all needed helpers in one call, or collecting the coroutines and using `asyncio.gather` to run them in parallel.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| kb_dict = kb.model_dump() | ||
| # include init_error from KBHelper if present | ||
| kb_helper = await kb_manager.get_kb(kb.kb_id) | ||
| if kb_helper and kb_helper.init_error: | ||
| kb_dict["init_error"] = kb_helper.init_error | ||
| kb_list.append(kb_dict) |
There was a problem hiding this comment.
suggestion (performance): Per-KB get_kb calls in a loop can introduce N+1 async performance overhead.
await kb_manager.get_kb(kb.kb_id) inside the loop creates an N+1 I/O pattern for the list endpoint, which will add latency as the number of KBs grows or if get_kb hits DB/disk. To avoid this, consider exposing init_error directly on KnowledgeBase, adding a bulk kb_manager API to fetch all needed helpers in one call, or collecting the coroutines and using asyncio.gather to run them in parallel.
There was a problem hiding this comment.
Code Review
This pull request introduces improved error handling and status tracking for knowledge base initialization. Key changes include the addition of an init_error field to track failures, graceful degradation when reranking providers are unavailable, and exposing these initialization errors to the dashboard. Review feedback suggests adopting standard Python logging practices by using exc_info=True instead of manual traceback formatting. Additionally, there is a recommendation to reconsider the new behavior of raising exceptions when a single knowledge base fails during retrieval, suggesting that skipping faulty nodes might better preserve system availability in multi-knowledge base scenarios.
There was a problem hiding this comment.
Pull request overview
该 PR 针对知识库在模型/Provider 初始化或执行失败时的行为进行了“可用性降级 + 可观测性增强”的调整:当重排序模型不可用时继续检索并记录明确日志;当嵌入检索不可用时在检索阶段明确抛错,避免静默返回空结果;同时透出知识库实例初始化失败状态,便于在 Dashboard 排查。
Changes:
- 知识库列表接口新增透出
init_error(来自 KBHelper 实例初始化状态)。 - 检索流程中 rerank 失败改为降级跳过(保留融合结果),并调整稠密检索失败为直接抛错。
- 知识库加载/更新时捕获初始化异常并记录到实例状态(
init_error),检索前对不可用 KB 明确报错。
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| astrbot/dashboard/routes/knowledge_base.py | KB 列表接口在返回项中附加初始化错误信息,提升可排查性 |
| astrbot/core/knowledge_base/retrieval/manager.py | rerank 失败降级跳过;稠密检索失败改为抛错,避免静默吞错 |
| astrbot/core/knowledge_base/kb_mgr.py | KB 初始化失败状态记录与透出;更新 KB 后尝试重新初始化并更新 init_error |
| astrbot/core/knowledge_base/kb_helper.py | 新增 init_error 字段;rerank provider 不可用时降级为不启用 rerank;terminate 增强防御性 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a3fba97b71
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 164640a9c9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
display a dedicated error state for knowledge base cards that fail initialization, including a visible badge and error details prevent navigation and edit actions for failed cards while keeping delete available, and hide normal stats/description for error items add list.initError locale strings for en-US, ru-RU, and zh-CN
Initialize a new KB helper before swapping instances so a failed re-init does not break the active knowledge base service. If initialization fails, restore in-memory KB settings and keep the existing helper and previous init error state. Also clear stale init_error after successful vector DB initialization to prevent outdated error reporting.
cover initialization failure and recovery scenarios to guard against regressions in kb error handling include reference assets under refs for test validation

本次修改主要解决知识库在模型加载失败时的错误降级问题(对应 issue #7218)。
此前当模型提供方异常时,知识库可能出现不理想行为: 因重排序模型异常导致知识库整体不可用。
本次改动将行为调整为更符合预期且更易排查:重排序模型不可用时,跳过重排序并继续检索(知识库仍可用),在日志中输出清晰的警告/错误信息,便于定位问题。
Modifications / 改动点
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。