From a8e2ee0f00bfe6dc1293fe35ab32d27913faee0b Mon Sep 17 00:00:00 2001 From: yanhe1 Date: Sat, 4 Apr 2026 17:13:20 +0800 Subject: [PATCH 1/7] docs(spec): add KB RAG retrieval design --- .../2026-04-04-kb-rag-retrieval-design.md | 229 ++++++++++++++++++ 1 file changed, 229 insertions(+) create mode 100644 docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md diff --git a/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md b/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md new file mode 100644 index 0000000000..74f15fb539 --- /dev/null +++ b/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md @@ -0,0 +1,229 @@ +--- +sidebar_position: 1 +--- + +# KB RAG Retrieval Design + +## Overview + +This design improves knowledge-base retrieval behavior for AI agents without adding new tools, new persistent configuration, or new backend I/O requests. + +The current system already provides three complementary KB capabilities: + +- `kb_ls` for document-level overview +- `knowledge_base_search` for fast retrieval +- `kb_head` for direct content reading + +The main gap is not missing infrastructure. The gap is that agents do not have enough guidance or enough scoped-search controls to use these capabilities proactively on large or complex knowledge bases. + +This design keeps the current intent-routing model, but changes the routing outcome for KB content questions from a mostly fixed "search first" path to a more flexible "decide the next best retrieval action" path. + +## Goals + +- Preserve intent routing as the first decision step for KB interactions. +- Let the agent choose between direct search, document overview, scoped search, and manual reading based on the current question and KB metadata. +- Allow `knowledge_base_search` to narrow search scope by specific documents. +- Reuse existing retrieval infrastructure and call-limit behavior. +- Avoid extra network requests or new long-lived configuration. + +## Non-Goals + +- No new "smart KB" tool that wraps `kb_ls`, `knowledge_base_search`, and `kb_head`. +- No backend-enforced size thresholds such as small/medium/large KB routing. +- No new generic metadata filter object exposed to the model in the first version. +- No changes to existing call-limit policies. + +## Current Problems + +1. Prompt guidance over-biases KB question answering toward `knowledge_base_search`, which weakens retrieval quality on larger or more heterogeneous KBs. +2. `knowledge_base_search` already supports document filtering internally through `document_ids`, but that capability is not exposed to the model as part of the tool schema. +3. Dynamic KB metadata injected into the conversation does not currently include retrieval-relevant facts such as document count or whether RAG is available. +4. `kb_ls` is described mainly as a fallback path instead of a proactive range-discovery tool. + +## Proposed Behavior + +### Intent Routing + +Intent routing remains the first step. + +For KB-related requests, the model should still first classify whether the user is asking about: + +- KB metadata or selection +- KB document overview +- KB content question +- KB management operations +- Manual content reading in no-RAG situations + +The change is in the next-step guidance for KB content questions: + +- If the question is precise and direct retrieval is likely enough, the model may call `knowledge_base_search` immediately. +- If the question is broad, ambiguous, or likely to depend on document structure or document choice, the model should prefer `kb_ls` first. +- After `kb_ls`, the model may call `knowledge_base_search` again with scoped document arguments. +- If RAG is unavailable, the model should use `kb_ls` and `kb_head`. + +### Dynamic KB Metadata + +The request-scoped KB metadata block should remain lightweight and objective. + +Each KB entry should include: + +- `kb_name` +- `kb_id` +- `document_count` +- `rag_enabled` +- existing optional summary/topic hints when available + +No derived size buckets should be added. Exact values are simpler to maintain and give the model enough signal to decide whether it should first inspect document scope. + +Example shape: + +```text +Knowledge Bases In Scope: +- KB Name: Product Docs, KB ID: 12, Documents: 128, RAG: enabled + - Summary: Internal product documentation and runbooks + - Topics: release process, deployment, alerts +``` + +### Scoped Search + +`knowledge_base_search` should support narrowing retrieval to specific documents with: + +- `document_ids: number[]` +- `document_names: string[]` + +Expected usage: + +- If the model already knows exact document names from the user prompt, previous `kb_ls` output, or conversation history, it may use `document_names` directly. +- If the model does not know the target documents, it should use `kb_ls` first. +- If the model has document IDs, it should prefer `document_ids`. + +## Interface Design + +### Shared Prompt Layer + +Update the KB prompt templates in `shared/prompts/knowledge_base.py`. + +Key prompt adjustments: + +- Keep `Intent Routing (DO THIS FIRST)`. +- Keep the current strict / relaxed / no-RAG / restricted mode structure. +- Change KB content-question guidance from effectively "search first" to "decide whether to inspect scope first". +- Reframe `kb_ls` as a proactive document-overview tool, not only a fallback tool. +- Document that scoped `knowledge_base_search` can use `document_ids` or `document_names` when the target documents are known. + +Restricted mode remains search-only. It does not expose `kb_ls` or `kb_head`. + +### Dynamic KB Meta Prompt + +Update the backend KB meta formatter so each entry includes `document_count` and `rag_enabled`. + +This change should stay formatting-only from the prompt module's perspective. The preprocessing layer remains responsible for assembling the meta list from already available backend state. + +### Chat Shell Tool Schema + +Extend the `KnowledgeBaseInput` schema used by `knowledge_base_search`: + +- add optional `document_ids` +- add optional `document_names` + +The tool continues to call the same backend internal retrieve endpoint. Existing calls with only `query` and `max_results` remain valid and unchanged. + +### Backend Internal Retrieve Surface + +Extend the internal retrieval request model to accept: + +- `document_ids` +- `document_names` + +Resolution rules: + +- If `document_ids` is provided, retrieval uses those IDs directly. +- If `document_names` is provided, backend resolves them to document IDs within the currently scoped KB set. +- If both are provided, `document_ids` takes precedence and `document_names` is ignored. +- Name resolution uses exact matching only. +- If no names resolve, return a structured error instructing the caller to use `kb_ls` first. + +The backend should resolve `document_names` into `document_ids` before entering the existing retrieval/filtering flow so that the downstream retrieval path remains unchanged. + +## Matching Rules + +### `document_names` + +- Matching is exact. +- Matching is scoped to the KBs already attached to the current tool invocation. +- If multiple KBs contain the same exact document name, all matching documents are included. +- No fuzzy matching or contains matching is added in the first version. + +### Ambiguity + +If multiple exact matches exist across KBs, that is acceptable and should not be treated as an error. The current system already supports multi-KB retrieval, so the scoped result set can include all exact matches. + +## Error Handling + +- If `document_names` resolves to no documents, return a structured error and guidance to call `kb_ls`. +- If `document_ids` or `document_names` is an empty list, treat it as absent input. +- If RAG is not configured, preserve the current no-RAG response and guide the model toward `kb_ls` and `kb_head`. +- Restricted mode continues to allow only `knowledge_base_search`, even when scoped arguments are used. +- Do not silently fall back from failed scoped search to global search. Silent widening would reduce precision and make tool behavior unpredictable. + +## Compatibility + +- Existing callers that use only `query` and `max_results` keep the current behavior. +- Existing retrieval logic based on `document_ids -> metadata_condition -> backend storage retrieval` remains the main implementation path. +- No new tool is introduced. +- No new persistent KB configuration is introduced. +- No new call-limit behavior is introduced. + +## Implementation Boundaries + +### Shared + +Responsible for updating the KB prompt instructions only. + +### Chat Shell + +Responsible for exposing scoped-search inputs to the model and forwarding them to backend retrieval. + +### Backend + +Responsible for: + +- enriching KB metadata with `document_count` and `rag_enabled` +- resolving `document_names` to `document_ids` +- keeping retrieval execution on the existing filtered retrieval path + +## Testing + +### Shared + +- Update prompt tests to reflect the new KB guidance wording. +- Add or update tests for KB meta prompt formatting with `document_count` and `rag_enabled`. + +### Chat Shell + +- Verify `knowledge_base_search` schema includes `document_ids`. +- Verify `knowledge_base_search` schema includes `document_names`. +- Verify scoped arguments are forwarded in HTTP mode. +- Verify legacy unscoped usage remains unchanged. + +### Backend + +- Verify internal retrieve accepts `document_names`. +- Verify exact-match name resolution within KB scope. +- Verify unresolved names return the expected structured error. +- Verify multiple exact matches across KBs are included. +- Verify direct `document_ids` filtering still works. +- Verify the downstream retrieval path still uses document-based metadata filtering. + +## Rollout Notes + +This design is intentionally incremental. + +The first version should stop at: + +1. prompt guidance updates +2. dynamic metadata enrichment +3. scoped search support through `document_ids` and `document_names` +4. backend exact-match document-name resolution + +Do not expand the first version into generic filter support, fuzzy matching, or backend hard-coded KB-size routing. From dbc5695df59dfea486752e03770ed8623e350509 Mon Sep 17 00:00:00 2001 From: yanhe1 Date: Sat, 4 Apr 2026 18:44:57 +0800 Subject: [PATCH 2/7] feat(knowledge): enrich dynamic KB retrieval metadata --- .../services/chat/preprocessing/contexts.py | 22 ++++ .../services/chat/preprocessing/kb_meta.py | 19 ++- .../services/knowledge/knowledge_service.py | 46 ++++++- backend/tests/services/test_kb_meta_prompt.py | 117 ++++++++++++++++++ 4 files changed, 202 insertions(+), 2 deletions(-) diff --git a/backend/app/services/chat/preprocessing/contexts.py b/backend/app/services/chat/preprocessing/contexts.py index c6f338322f..927b6e30da 100644 --- a/backend/app/services/chat/preprocessing/contexts.py +++ b/backend/app/services/chat/preprocessing/contexts.py @@ -1319,6 +1319,7 @@ def _build_kb_meta_prompt( format_restricted_kb_meta_prompt, select_kb_summary_text, ) + from app.services.knowledge import KnowledgeService from app.services.knowledge.task_knowledge_base_service import ( task_knowledge_base_service, ) @@ -1326,6 +1327,9 @@ def _build_kb_meta_prompt( kb_map = task_knowledge_base_service.get_knowledge_bases_by_ids( db, knowledge_base_ids ) + kb_prompt_stats = KnowledgeService.get_document_prompt_stats( + db, knowledge_base_ids + ) kb_meta_list: list[dict[str, Any]] = [] for kb_id in knowledge_base_ids: @@ -1335,6 +1339,10 @@ def _build_kb_meta_prompt( { "kb_id": kb_id, "kb_name": "Unknown", + "search_available": False, + "total_document_count": 0, + "searchable_document_count": 0, + "spreadsheet_document_count": 0, "summary_text": "", "topics": [], } @@ -1365,10 +1373,24 @@ def _build_kb_meta_prompt( exc_info=True, ) + stats = kb_prompt_stats.get( + kb_id, + { + "total_document_count": 0, + "searchable_document_count": 0, + "spreadsheet_document_count": 0, + }, + ) + retrieval_config = kb_spec.get("retrievalConfig") or {} + kb_meta_list.append( { "kb_id": kb_id, "kb_name": kb_name, + "search_available": bool(retrieval_config.get("retriever_name")), + "total_document_count": stats["total_document_count"], + "searchable_document_count": stats["searchable_document_count"], + "spreadsheet_document_count": stats["spreadsheet_document_count"], "summary_text": summary_text, "topics": topics, } diff --git a/backend/app/services/chat/preprocessing/kb_meta.py b/backend/app/services/chat/preprocessing/kb_meta.py index 7437d793ca..580a69efd3 100644 --- a/backend/app/services/chat/preprocessing/kb_meta.py +++ b/backend/app/services/chat/preprocessing/kb_meta.py @@ -105,7 +105,24 @@ def format_kb_meta_prompt(kb_meta_list: list[dict[str, Any]]) -> str: kb_meta.get("kb_name", "Unknown"), "Unknown" ) kb_id = sanitize_prompt_identifier(kb_meta.get("kb_id", "N/A"), "N/A") - kb_lines.append(f"- KB Name: {kb_name}, KB ID: {kb_id}") + search_available = ( + "available" if kb_meta.get("search_available") else "unavailable" + ) + total_document_count = int(kb_meta.get("total_document_count", 0) or 0) + searchable_document_count = int( + kb_meta.get("searchable_document_count", 0) or 0 + ) + spreadsheet_document_count = int( + kb_meta.get("spreadsheet_document_count", 0) or 0 + ) + + kb_lines.append( + f"- KB Name: {kb_name}, KB ID: {kb_id}, " + f"Search: {search_available}, " + f"Total Docs: {total_document_count}, " + f"Searchable Docs: {searchable_document_count}, " + f"Spreadsheets: {spreadsheet_document_count}" + ) summary_text = kb_meta.get("summary_text") or "" topics = kb_meta.get("topics") or [] diff --git a/backend/app/services/knowledge/knowledge_service.py b/backend/app/services/knowledge/knowledge_service.py index 06898e58e1..c6dddf7f4b 100644 --- a/backend/app/services/knowledge/knowledge_service.py +++ b/backend/app/services/knowledge/knowledge_service.py @@ -9,7 +9,7 @@ from dataclasses import dataclass from typing import Optional -from sqlalchemy import and_, func +from sqlalchemy import and_, case, func from sqlalchemy.orm import Session from sqlalchemy.orm.attributes import flag_modified @@ -845,6 +845,50 @@ def get_active_document_counts( return {kb_id: count for kb_id, count in results} + @staticmethod + def get_document_prompt_stats( + db: Session, + knowledge_base_ids: list[int], + ) -> dict[int, dict[str, int]]: + """Get prompt-oriented document stats for multiple knowledge bases.""" + if not knowledge_base_ids: + return {} + + spreadsheet_exts = ["csv", "xls", "xlsx"] + + results = ( + db.query( + KnowledgeDocument.kind_id, + func.count(KnowledgeDocument.id).label("total_count"), + func.sum(case((KnowledgeDocument.is_active == True, 1), else_=0)).label( + "searchable_count" + ), + func.sum( + case( + ( + func.lower(KnowledgeDocument.file_extension).in_( + spreadsheet_exts + ), + 1, + ), + else_=0, + ) + ).label("spreadsheet_count"), + ) + .filter(KnowledgeDocument.kind_id.in_(knowledge_base_ids)) + .group_by(KnowledgeDocument.kind_id) + .all() + ) + + return { + kb_id: { + "total_document_count": int(total_count or 0), + "searchable_document_count": int(searchable_count or 0), + "spreadsheet_document_count": int(spreadsheet_count or 0), + } + for kb_id, total_count, searchable_count, spreadsheet_count in results + } + @staticmethod def get_active_document_text_length_stats( db: Session, diff --git a/backend/tests/services/test_kb_meta_prompt.py b/backend/tests/services/test_kb_meta_prompt.py index 30089a4f93..6f2e92f38d 100644 --- a/backend/tests/services/test_kb_meta_prompt.py +++ b/backend/tests/services/test_kb_meta_prompt.py @@ -7,6 +7,8 @@ All comments must be written in English. """ +from unittest.mock import MagicMock, patch + import pytest @@ -25,12 +27,20 @@ def test_format_kb_meta_prompt_with_summary_and_topics(self): { "kb_id": 1, "kb_name": "KB1", + "search_available": True, + "total_document_count": 3, + "searchable_document_count": 2, + "spreadsheet_document_count": 1, "summary_text": "S1", "topics": ["t1", "t2"], }, { "kb_id": 2, "kb_name": "KB2", + "search_available": False, + "total_document_count": 0, + "searchable_document_count": 0, + "spreadsheet_document_count": 0, "summary_text": "", "topics": [], }, @@ -40,9 +50,14 @@ def test_format_kb_meta_prompt_with_summary_and_topics(self): assert "Knowledge Bases In Scope" in prompt assert "KB Name: KB1" in prompt assert "KB ID: 1" in prompt + assert "Search: available" in prompt + assert "Total Docs: 3" in prompt + assert "Searchable Docs: 2" in prompt + assert "Spreadsheets: 1" in prompt assert "Summary: S1" in prompt assert "Topics: t1, t2" in prompt assert "KB Name: KB2" in prompt + assert "Search: unavailable" in prompt assert "request-scoped metadata only" in prompt def test_format_kb_meta_prompt_marks_single_selected_kb_as_target(self): @@ -67,6 +82,54 @@ def test_format_kb_meta_prompt_marks_single_selected_kb_as_target(self): assert "list_knowledge_bases" not in prompt assert "clarifying questions" not in prompt + def test_format_kb_meta_prompt_includes_runtime_retrieval_fields(self): + from app.services.chat.preprocessing.kb_meta import format_kb_meta_prompt + + prompt = format_kb_meta_prompt( + [ + { + "kb_id": 12, + "kb_name": "Product Docs", + "search_available": True, + "total_document_count": 128, + "searchable_document_count": 113, + "spreadsheet_document_count": 9, + "summary_text": "Internal product documentation and runbooks", + "topics": ["release process", "deployment"], + } + ] + ) + + assert "KB Name: Product Docs" in prompt + assert "KB ID: 12" in prompt + assert "Search: available" in prompt + assert "Total Docs: 128" in prompt + assert "Searchable Docs: 113" in prompt + assert "Spreadsheets: 9" in prompt + + def test_format_kb_meta_prompt_marks_search_unavailable(self): + from app.services.chat.preprocessing.kb_meta import format_kb_meta_prompt + + prompt = format_kb_meta_prompt( + [ + { + "kb_id": 7, + "kb_name": "Ops KB", + "search_available": False, + "total_document_count": 10, + "searchable_document_count": 6, + "spreadsheet_document_count": 2, + "summary_text": "", + "topics": [], + } + ] + ) + + assert "Search: unavailable" in prompt + assert "Total Docs: 10" in prompt + assert "Searchable Docs: 6" in prompt + assert "Spreadsheets: 2" in prompt + def test_select_kb_summary_text_prefers_long_for_small_list(self): from app.services.chat.preprocessing.kb_meta import select_kb_summary_text @@ -113,3 +176,57 @@ def test_format_restricted_kb_meta_prompt_includes_safe_routing_hints(self): assert "retrieval guidance only" in prompt assert "document structure" not in prompt assert "high-level analysis" not in prompt + + def test_build_kb_meta_prompt_passes_retrieval_metadata(self): + from app.services.chat.preprocessing.contexts import _build_kb_meta_prompt + + kb_kind = MagicMock() + kb_kind.json = { + "spec": { + "name": "Product Docs", + "retrievalConfig": {"retriever_name": "retriever-a"}, + "summaryEnabled": False, + } + } + captured_meta = {} + + def _capture_meta(kb_meta_list): + captured_meta["value"] = kb_meta_list + return "captured" + + with ( + patch( + "app.services.knowledge.task_knowledge_base_service.task_knowledge_base_service.get_knowledge_bases_by_ids", + return_value={12: kb_kind}, + ), + patch( + "app.services.chat.preprocessing.kb_meta.format_kb_meta_prompt", + side_effect=_capture_meta, + ), + patch( + "app.services.knowledge.KnowledgeService.get_document_prompt_stats", + return_value={ + 12: { + "total_document_count": 128, + "searchable_document_count": 113, + "spreadsheet_document_count": 9, + } + }, + create=True, + ), + ): + result = _build_kb_meta_prompt(MagicMock(), [12]) + + assert result == "captured" + assert captured_meta["value"] == [ + { + "kb_id": 12, + "kb_name": "Product Docs", + "search_available": True, + "total_document_count": 128, + "searchable_document_count": 113, + "spreadsheet_document_count": 9, + "summary_text": "", + "topics": [], + } + ] From 83630b71ccdb171e311e97fd63a75559dcf9fd59 Mon Sep 17 00:00:00 2001 From: yanhe1 Date: Sat, 4 Apr 2026 18:47:40 +0800 Subject: [PATCH 3/7] feat(chat-shell): add scoped KB search arguments --- .../tools/builtin/knowledge_base.py | 43 ++++++++++- .../test_knowledge_base_scoped_search.py | 71 +++++++++++++++++++ 2 files changed, 113 insertions(+), 1 deletion(-) create mode 100644 chat_shell/tests/test_knowledge_base_scoped_search.py diff --git a/chat_shell/chat_shell/tools/builtin/knowledge_base.py b/chat_shell/chat_shell/tools/builtin/knowledge_base.py index 5076de47a5..ff5d6b77d1 100644 --- a/chat_shell/chat_shell/tools/builtin/knowledge_base.py +++ b/chat_shell/chat_shell/tools/builtin/knowledge_base.py @@ -10,7 +10,8 @@ import json import logging -from typing import Any, Dict, List, Optional +from contextlib import contextmanager +from typing import Any, Dict, Iterator, List, Optional from langchain_core.callbacks import CallbackManagerForToolRun from langchain_core.tools import BaseTool @@ -46,6 +47,14 @@ class KnowledgeBaseInput(BaseModel): default=20, description="Maximum number of results to return. Increased from 5 to 20 for better RAG coverage.", ) + document_ids: list[int] = Field( + default_factory=list, + description="Optional document IDs to restrict retrieval scope.", + ) + document_names: list[str] = Field( + default_factory=list, + description="Optional exact document names to restrict retrieval scope when document IDs are not known.", + ) class KnowledgeBaseTool(BaseTool): @@ -73,6 +82,7 @@ class KnowledgeBaseTool(BaseTool): # Document IDs to filter (optional, for searching specific documents only) # When set, only chunks from these documents will be returned document_ids: list[int] = Field(default_factory=list) + document_names: list[str] = Field(default_factory=list) # User ID for access control user_id: int = 0 @@ -468,7 +478,36 @@ def _run( """Synchronous run - not implemented, use async version.""" raise NotImplementedError("KnowledgeBaseTool only supports async execution") + @contextmanager + def _apply_call_scoped_filters( + self, + document_ids: Optional[list[int]] = None, + document_names: Optional[list[str]] = None, + ) -> Iterator[None]: + """Temporarily apply per-call document filters.""" + original_document_ids = self.document_ids + original_document_names = self.document_names + self.document_ids = document_ids or original_document_ids + self.document_names = document_names or original_document_names + try: + yield + finally: + self.document_ids = original_document_ids + self.document_names = original_document_names + async def _arun( + self, + query: str, + max_results: int = 20, + document_ids: Optional[list[int]] = None, + document_names: Optional[list[str]] = None, + run_manager: CallbackManagerForToolRun | None = None, + ) -> str: + """Execute knowledge base search with optional per-call scoped filters.""" + with self._apply_call_scoped_filters(document_ids, document_names): + return await self._arun_impl(query, max_results, run_manager) + + async def _arun_impl( self, query: str, max_results: int = 20, @@ -900,6 +939,8 @@ async def _retrieve_with_strategy_via_http( payload["mediation_context"] = mediation_context if self.document_ids: payload["document_ids"] = self.document_ids + if self.document_names: + payload["document_names"] = self.document_names if self.user_name is not None: payload["user_name"] = self.user_name diff --git a/chat_shell/tests/test_knowledge_base_scoped_search.py b/chat_shell/tests/test_knowledge_base_scoped_search.py new file mode 100644 index 0000000000..2c36288152 --- /dev/null +++ b/chat_shell/tests/test_knowledge_base_scoped_search.py @@ -0,0 +1,71 @@ +# SPDX-FileCopyrightText: 2026 Weibo, Inc. +# +# SPDX-License-Identifier: Apache-2.0 + +from unittest.mock import AsyncMock, MagicMock, patch + +import pytest + +from chat_shell.tools.builtin import KnowledgeBaseTool + + +def test_knowledge_base_input_supports_document_names(): + from chat_shell.tools.builtin.knowledge_base import KnowledgeBaseInput + + payload = KnowledgeBaseInput( + query="checklist", + max_results=5, + document_names=["release.md"], + ) + + assert payload.document_names == ["release.md"] + + +@pytest.mark.asyncio +async def test_http_mode_forwards_document_names_and_ids(): + tool = KnowledgeBaseTool( + knowledge_base_ids=[1, 2], + user_id=7, + ) + mock_response = MagicMock() + mock_response.status_code = 200 + mock_response.json.return_value = { + "mode": "rag_retrieval", + "records": [], + "total": 0, + "total_estimated_tokens": 0, + } + + with ( + patch("httpx.AsyncClient") as mock_client, + patch.object( + tool, + "_get_kb_info", + AsyncMock( + return_value={ + "items": [ + { + "id": 1, + "name": "Test KB", + "rag_enabled": True, + "max_calls_per_conversation": 10, + "exempt_calls_before_check": 5, + } + ] + } + ), + ), + ): + post = AsyncMock(return_value=mock_response) + mock_client.return_value.__aenter__.return_value.post = post + + await tool._arun( + query="release checklist", + max_results=8, + document_ids=[101], + document_names=["release.md"], + ) + + payload = post.call_args.kwargs["json"] + assert payload["document_ids"] == [101] + assert payload["document_names"] == ["release.md"] From d7d81406a5f2c017d429e84cb0c0e6c83483c91a Mon Sep 17 00:00:00 2001 From: yanhe1 Date: Sat, 4 Apr 2026 19:17:11 +0800 Subject: [PATCH 4/7] feat(knowledge): improve KB retrieval routing --- backend/app/api/endpoints/internal/rag.py | 49 ++++++++++++-- .../services/knowledge/knowledge_service.py | 26 ++++++++ .../internal/test_rag_retrieve_endpoint.py | 51 +++++++++++++++ .../services/rag/test_retrieval_service.py | 64 +++++++++++++++++++ .../tools/builtin/knowledge_base.py | 33 +++++++++- .../test_knowledge_base_scoped_search.py | 47 ++++++++++++++ shared/prompts/knowledge_base.py | 27 +++++--- shared/tests/test_prompts.py | 9 ++- 8 files changed, 289 insertions(+), 17 deletions(-) diff --git a/backend/app/api/endpoints/internal/rag.py b/backend/app/api/endpoints/internal/rag.py index 777f6381aa..ae4ce6667f 100644 --- a/backend/app/api/endpoints/internal/rag.py +++ b/backend/app/api/endpoints/internal/rag.py @@ -118,6 +118,10 @@ class InternalRetrieveRequest(BaseModel): default=None, description="Optional list of document IDs to filter. Only chunks from these documents will be returned.", ) + document_names: Optional[list[str]] = Field( + default=None, + description="Optional exact document names to resolve into document IDs before retrieval.", + ) route_mode: Literal["auto", "direct_injection", "rag_retrieval"] = Field( default="auto", description="Routing mode: auto decides in Backend, direct_injection forces all-chunks, rag_retrieval forces standard retrieval", @@ -164,6 +168,22 @@ class InternalRetrieveResponse(BaseModel): records: list[RetrieveRecord] total: int total_estimated_tokens: int = 0 + message: Optional[str] = None + + +def _resolve_document_names( + db: Session, + knowledge_base_ids: list[int], + document_names: list[str], +) -> list[int]: + """Resolve exact document names into document IDs within KB scope.""" + from app.services.knowledge import KnowledgeService + + return KnowledgeService.resolve_document_ids_by_names( + db=db, + knowledge_base_ids=knowledge_base_ids, + document_names=document_names, + ) @router.post( @@ -193,11 +213,27 @@ async def internal_retrieve( if request.knowledge_base_id is not None: knowledge_base_ids = [request.knowledge_base_id] - if request.document_ids: + resolved_document_ids = request.document_ids or [] + if not resolved_document_ids and request.document_names: + resolved_document_ids = _resolve_document_names( + db=db, + knowledge_base_ids=knowledge_base_ids, + document_names=request.document_names, + ) + if not resolved_document_ids: + return { + "mode": "rag_retrieval", + "records": [], + "total": 0, + "total_estimated_tokens": 0, + "message": "Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.", + } + + if resolved_document_ids: logger.info( "[internal_rag] Filtering by %d documents: %s", - len(request.document_ids), - request.document_ids, + len(resolved_document_ids), + resolved_document_ids, ) runtime_context = request.runtime_context @@ -210,7 +246,7 @@ async def internal_retrieve( knowledge_base_ids=knowledge_base_ids, query=request.query, max_results=request.max_results, - document_ids=request.document_ids, + document_ids=resolved_document_ids or None, route_mode=request.route_mode, user_id=persistence_context.user_id if persistence_context else None, user_name=request.user_name, @@ -277,8 +313,8 @@ async def internal_retrieve( available_injection_tokens, request.query[:50], ( - f", filtered by {len(request.document_ids)} docs" - if request.document_ids + f", filtered by {len(resolved_document_ids)} docs" + if resolved_document_ids else "" ), ) @@ -328,6 +364,7 @@ async def internal_retrieve( ], total=len(records), total_estimated_tokens=total_estimated_tokens, + message=result.get("message"), ) except ValueError as e: diff --git a/backend/app/services/knowledge/knowledge_service.py b/backend/app/services/knowledge/knowledge_service.py index c6dddf7f4b..ceee3f3eda 100644 --- a/backend/app/services/knowledge/knowledge_service.py +++ b/backend/app/services/knowledge/knowledge_service.py @@ -889,6 +889,32 @@ def get_document_prompt_stats( for kb_id, total_count, searchable_count, spreadsheet_count in results } + @staticmethod + def resolve_document_ids_by_names( + db: Session, + knowledge_base_ids: list[int], + document_names: list[str], + ) -> list[int]: + """Resolve exact document names within the provided knowledge-base scope.""" + if not knowledge_base_ids or not document_names: + return [] + + normalized_names = [ + name.strip() for name in document_names if name and name.strip() + ] + if not normalized_names: + return [] + + rows = ( + db.query(KnowledgeDocument.id) + .filter( + KnowledgeDocument.kind_id.in_(knowledge_base_ids), + KnowledgeDocument.name.in_(normalized_names), + ) + .all() + ) + return [row.id for row in rows] + @staticmethod def get_active_document_text_length_stats( db: Session, diff --git a/backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py b/backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py index 7d51d22a98..8e936fa723 100644 --- a/backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py +++ b/backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py @@ -111,3 +111,54 @@ def test_internal_retrieve_keeps_user_subtask_id_out_of_gateway(test_client): assert response.status_code == 200 mock_query.assert_awaited_once_with(ANY, db=ANY) mock_persist.assert_called_once() + + +def test_internal_retrieve_resolves_document_names_before_query(test_client): + with ( + patch( + "app.api.endpoints.internal.rag._resolve_document_names", + return_value=[101, 102], + ) as mock_resolve, + patch( + "app.api.endpoints.internal.rag.LocalRagGateway.query", + new_callable=AsyncMock, + return_value={ + "mode": "rag_retrieval", + "records": [], + "total": 0, + "total_estimated_tokens": 0, + }, + ) as mock_query, + ): + response = test_client.post( + "/api/internal/rag/retrieve", + json={ + "query": "release checklist", + "knowledge_base_ids": [12], + "document_names": ["release.md"], + }, + ) + + assert response.status_code == 200 + mock_resolve.assert_called_once() + mock_query.assert_awaited_once() + + +def test_internal_retrieve_returns_error_when_document_names_not_found(test_client): + with patch( + "app.api.endpoints.internal.rag._resolve_document_names", + return_value=[], + ): + response = test_client.post( + "/api/internal/rag/retrieve", + json={ + "query": "release checklist", + "knowledge_base_ids": [12], + "document_names": ["missing.md"], + }, + ) + + assert response.status_code == 200 + assert response.json()["mode"] == "rag_retrieval" + assert response.json()["records"] == [] + assert response.json()["message"].startswith("Document names not found") diff --git a/backend/tests/services/rag/test_retrieval_service.py b/backend/tests/services/rag/test_retrieval_service.py index 37e463e5d5..1d7da646d4 100644 --- a/backend/tests/services/rag/test_retrieval_service.py +++ b/backend/tests/services/rag/test_retrieval_service.py @@ -383,6 +383,70 @@ async def test_force_rag_route_uses_standard_retrieval(self): assert result["records"][0]["score"] == 0.9 assert result["records"][0]["knowledge_base_id"] == 123 + @pytest.mark.asyncio + async def test_package_mode_resolves_document_names_before_retrieval(self): + """Package mode should resolve document_names before calling retrieval service.""" + from chat_shell.tools.builtin import KnowledgeBaseTool + + tool = KnowledgeBaseTool( + knowledge_base_ids=[1], + db_session=MagicMock(), + user_id=7, + ) + + with ( + patch.object( + tool, + "_get_kb_info", + AsyncMock( + return_value={ + "items": [ + { + "id": 1, + "name": "Test KB", + "rag_enabled": True, + "max_calls_per_conversation": 10, + "exempt_calls_before_check": 5, + } + ] + } + ), + ), + patch( + "app.services.knowledge.KnowledgeService.resolve_document_ids_by_names", + return_value=[301], + create=True, + ) as mock_resolve, + patch( + "app.services.rag.retrieval_service.RetrievalService.retrieve_for_chat_shell", + new_callable=AsyncMock, + return_value={ + "mode": "rag_retrieval", + "records": [ + { + "content": "match", + "score": 0.9, + "title": "release.md", + "knowledge_base_id": 1, + } + ], + "total": 1, + "total_estimated_tokens": 0, + }, + ) as mock_retrieve, + ): + await tool._arun( + query="release checklist", + document_names=["release.md"], + ) + + mock_resolve.assert_called_once_with( + db=tool.db_session, + knowledge_base_ids=[1], + document_names=["release.md"], + ) + assert mock_retrieve.await_args.kwargs["document_ids"] == [301] + @pytest.mark.asyncio async def test_force_rag_route_sorts_and_limits_results_globally(self): """Backend should return the final globally ranked RAG records.""" diff --git a/chat_shell/chat_shell/tools/builtin/knowledge_base.py b/chat_shell/chat_shell/tools/builtin/knowledge_base.py index ff5d6b77d1..bcec1c64e6 100644 --- a/chat_shell/chat_shell/tools/builtin/knowledge_base.py +++ b/chat_shell/chat_shell/tools/builtin/knowledge_base.py @@ -628,11 +628,12 @@ async def _arun_impl( raw_result.get("records", []) ) if not kb_chunks: - message = ( + default_message = ( "No documents found in the knowledge base." if route_mode == InjectionMode.DIRECT_INJECTION else "No relevant information found in the knowledge base for this query." ) + message = raw_result.get("message") or default_message return json.dumps( { "query": query, @@ -840,6 +841,34 @@ async def _retrieve_with_strategy_from_all_kbs( ) else: try: + resolved_document_ids = self.document_ids or None + if not resolved_document_ids and self.document_names: + from app.services.knowledge import KnowledgeService + + resolved_document_ids = ( + KnowledgeService.resolve_document_ids_by_names( + db=self.db_session, + knowledge_base_ids=self.knowledge_base_ids, + document_names=self.document_names, + ) + or None + ) + if not resolved_document_ids: + result = { + "mode": InjectionMode.RAG_ONLY, + "records": [], + "total": 0, + "total_estimated_tokens": 0, + "message": "Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.", + } + mode = result.get("mode", InjectionMode.RAG_ONLY) + logger.info( + "[KnowledgeBaseTool] Retrieved %d records via Backend route mode=%s", + len(result.get("records", [])), + mode, + ) + return mode, result + from app.services.rag.retrieval_service import RetrievalService retrieval_service = RetrievalService() @@ -848,7 +877,7 @@ async def _retrieve_with_strategy_from_all_kbs( knowledge_base_ids=self.knowledge_base_ids, db=self.db_session, max_results=max_results, - document_ids=self.document_ids or None, + document_ids=resolved_document_ids, user_name=self.user_name, route_mode=route_mode, user_id=self.user_id, diff --git a/chat_shell/tests/test_knowledge_base_scoped_search.py b/chat_shell/tests/test_knowledge_base_scoped_search.py index 2c36288152..39ae9459e0 100644 --- a/chat_shell/tests/test_knowledge_base_scoped_search.py +++ b/chat_shell/tests/test_knowledge_base_scoped_search.py @@ -2,6 +2,7 @@ # # SPDX-License-Identifier: Apache-2.0 +import json from unittest.mock import AsyncMock, MagicMock, patch import pytest @@ -69,3 +70,49 @@ async def test_http_mode_forwards_document_names_and_ids(): payload = post.call_args.kwargs["json"] assert payload["document_ids"] == [101] assert payload["document_names"] == ["release.md"] + + +@pytest.mark.asyncio +async def test_arun_preserves_backend_scoped_search_error_message(): + tool = KnowledgeBaseTool( + knowledge_base_ids=[1], + user_id=7, + ) + + with ( + patch.object( + tool, + "_get_kb_info", + AsyncMock( + return_value={ + "items": [ + { + "id": 1, + "name": "Test KB", + "rag_enabled": True, + "max_calls_per_conversation": 10, + "exempt_calls_before_check": 5, + } + ] + } + ), + ), + patch.object( + tool, + "_retrieve_with_strategy_from_all_kbs", + AsyncMock( + return_value=( + "rag_retrieval", + { + "mode": "rag_retrieval", + "records": [], + "total": 0, + "message": "Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.", + }, + ) + ), + ), + ): + result = json.loads(await tool._arun(query="release checklist")) + + assert result["message"].startswith("Document names not found") diff --git a/shared/prompts/knowledge_base.py b/shared/prompts/knowledge_base.py index bcc611af31..d51c0c1a37 100644 --- a/shared/prompts/knowledge_base.py +++ b/shared/prompts/knowledge_base.py @@ -34,7 +34,11 @@ - Action: Call `kb_ls` for the selected knowledge base(s). Summarize the document list and ask which document(s) to open if needed. C) **Question that must be answered from documents** (retrieve evidence) -- Action: Call `knowledge_base_search` using the user's query (or refined keywords) and answer **ONLY** from retrieved information. +- Action: Choose the highest-signal retrieval path first: + - If the user already named the target file(s), prefer `knowledge_base_search` with exact `document_names`. + - If you already know exact document IDs, prefer `knowledge_base_search` with `document_ids`. + - If the file names are unknown and scoping would help, call `kb_ls` first, then use scoped `knowledge_base_search` or `kb_head`. + - If the question is broad and no file-level scope is needed, call `knowledge_base_search` globally. D) **Knowledge base management** (optional, only if tools exist) - Examples: "Create a KB", "Add/update a document", "List all my KBs" (management, not Q&A) @@ -43,14 +47,15 @@ ### Required Workflow: (ONLY for type C) -1. Call `knowledge_base_search` first -2. Wait for results -3. Answer **ONLY** from retrieved information -4. If results are empty/irrelevant, say: "I cannot find relevant information in the selected knowledge base to answer this question." -5. Do not use general knowledge or assumptions +1. Decide whether global search, scoped search, or `kb_ls` is the best first move +2. If you know exact files, use scoped `knowledge_base_search(document_names=[...])` or `knowledge_base_search(document_ids=[...])` +3. Wait for results or inspect documents with `kb_head` when browsing is required +4. Answer **ONLY** from retrieved or directly read knowledge-base content +5. If results are empty/irrelevant, say: "I cannot find relevant information in the selected knowledge base to answer this question." +6. Do not use general knowledge or assumptions ### Critical Rules: -- Type C: you MUST NOT answer without searching first +- Type C: you MUST retrieve evidence before answering, but you may choose `kb_ls` first when that is higher-signal - Type A/B: you MUST NOT force `knowledge_base_search` first (it is often low-signal) - Do not invent information not present in the knowledge base @@ -60,6 +65,7 @@ Use exploration tools when: - The user asks for an overview / document list (type B) +- You need exact document names before scoped search - `knowledge_base_search` is unavailable (rag_not_configured / rejected) or you hit call-limit warnings (⚠️/🚨) Do not use exploration tools just because RAG returned no results. @@ -86,7 +92,11 @@ - Action: Prefer `kb_ls` (then `kb_head` only when the user wants to open a specific document). C) **Content question** -- Action: Prefer `knowledge_base_search`. +- Action: Choose the highest-signal retrieval path first. + - Use `knowledge_base_search` globally for broad KB questions. + - Use scoped `knowledge_base_search(document_names=[...])` when the user already knows exact file names. + - Use scoped `knowledge_base_search(document_ids=[...])` when exact document IDs are already known. + - If you need exact file names or want to narrow the scope before searching, call `kb_ls` first, then use scoped search or `kb_head`. - If results are relevant: answer using KB content and cite sources. - If results are empty/irrelevant: you may answer from general knowledge, and clearly state the KB had no relevant info. - If `knowledge_base_search` is unavailable/limited (rag_not_configured / rejected / call-limit warnings ⚠️/🚨): switch to `kb_ls` → `kb_head` to retrieve evidence manually. @@ -99,6 +109,7 @@ - Prefer knowledge base content when relevant; cite sources when used - If the KB has no relevant content, say so and answer from general knowledge - For "what's in the KB" questions, `kb_ls` is usually higher-signal than `knowledge_base_search` +- `document_names` matching is exact, so use `kb_ls` first when you do not know the exact file names """ diff --git a/shared/tests/test_prompts.py b/shared/tests/test_prompts.py index 5dc7559faf..201076dcad 100644 --- a/shared/tests/test_prompts.py +++ b/shared/tests/test_prompts.py @@ -40,10 +40,14 @@ def test_kb_prompt_strict_contains_required_content(self): assert "Intent Routing" in KB_PROMPT_STRICT assert "knowledge base selection / metadata" in strict_lower assert "question that must be answered from documents" in strict_lower - assert "must not answer without searching first" in strict_lower + assert "must retrieve evidence before answering" in strict_lower assert "general knowledge" in strict_lower assert "selected knowledge base content only" in strict_lower assert "knowledge base management" in strict_lower + assert "document_names" in KB_PROMPT_STRICT + assert "document_ids" in KB_PROMPT_STRICT + assert "kb_ls" in KB_PROMPT_STRICT + assert "kb_head" in KB_PROMPT_STRICT def test_kb_prompt_relaxed_contains_required_content(self): """KB_PROMPT_RELAXED should contain relaxed mode instructions.""" @@ -55,6 +59,9 @@ def test_kb_prompt_relaxed_contains_required_content(self): assert "general knowledge" in KB_PROMPT_RELAXED # All KB prompts use Intent Routing pattern assert "Intent Routing" in KB_PROMPT_RELAXED + assert "document_names" in KB_PROMPT_RELAXED + assert "document_ids" in KB_PROMPT_RELAXED + assert "kb_ls" in KB_PROMPT_RELAXED def test_prompts_are_different(self): """Strict and relaxed prompts should be different.""" From eb7f72ad0c962e924981a2c46e4dc5b74ec8a824 Mon Sep 17 00:00:00 2001 From: yanhe1 Date: Tue, 7 Apr 2026 10:44:19 +0800 Subject: [PATCH 5/7] fix(knowledge): tighten KB scoped retrieval --- backend/app/api/endpoints/internal/rag.py | 14 ++-- .../services/knowledge/knowledge_service.py | 1 + .../knowledge/test_knowledge_service.py | 49 +++++++++++++ .../tools/builtin/knowledge_base.py | 69 ++++++++++++------- .../test_knowledge_base_scoped_search.py | 58 ++++++++++++++++ 5 files changed, 158 insertions(+), 33 deletions(-) diff --git a/backend/app/api/endpoints/internal/rag.py b/backend/app/api/endpoints/internal/rag.py index ae4ce6667f..66871619b8 100644 --- a/backend/app/api/endpoints/internal/rag.py +++ b/backend/app/api/endpoints/internal/rag.py @@ -221,13 +221,13 @@ async def internal_retrieve( document_names=request.document_names, ) if not resolved_document_ids: - return { - "mode": "rag_retrieval", - "records": [], - "total": 0, - "total_estimated_tokens": 0, - "message": "Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.", - } + return InternalRetrieveResponse( + mode="rag_retrieval", + records=[], + total=0, + total_estimated_tokens=0, + message="Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.", + ) if resolved_document_ids: logger.info( diff --git a/backend/app/services/knowledge/knowledge_service.py b/backend/app/services/knowledge/knowledge_service.py index ceee3f3eda..3a298543dd 100644 --- a/backend/app/services/knowledge/knowledge_service.py +++ b/backend/app/services/knowledge/knowledge_service.py @@ -910,6 +910,7 @@ def resolve_document_ids_by_names( .filter( KnowledgeDocument.kind_id.in_(knowledge_base_ids), KnowledgeDocument.name.in_(normalized_names), + KnowledgeDocument.is_active == True, ) .all() ) diff --git a/backend/tests/services/knowledge/test_knowledge_service.py b/backend/tests/services/knowledge/test_knowledge_service.py index 8793503631..40e980b5b9 100644 --- a/backend/tests/services/knowledge/test_knowledge_service.py +++ b/backend/tests/services/knowledge/test_knowledge_service.py @@ -7,6 +7,7 @@ import pytest +from app.models.knowledge import KnowledgeDocument from app.services.context import context_service from app.services.knowledge.knowledge_service import KnowledgeService @@ -62,3 +63,51 @@ def test_update_document_content_overwrites_attachment_binary(self) -> None: binary_data="# Updated release notes".encode("utf-8"), ) db.refresh.assert_called_once_with(document) + + +@pytest.mark.unit +class TestKnowledgeServiceResolveDocumentIdsByNames: + def test_resolve_document_ids_by_names_returns_only_active_docs_in_scope( + self, test_db, test_user + ) -> None: + """Document-name resolution should ignore inactive and out-of-scope rows.""" + active_doc = KnowledgeDocument( + kind_id=10, + attachment_id=0, + name="release.md", + file_extension="md", + file_size=10, + user_id=test_user.id, + is_active=True, + source_type="file", + ) + inactive_doc = KnowledgeDocument( + kind_id=10, + attachment_id=0, + name="release.md", + file_extension="md", + file_size=10, + user_id=test_user.id, + is_active=False, + source_type="file", + ) + other_kb_doc = KnowledgeDocument( + kind_id=11, + attachment_id=0, + name="release.md", + file_extension="md", + file_size=10, + user_id=test_user.id, + is_active=True, + source_type="file", + ) + test_db.add_all([active_doc, inactive_doc, other_kb_doc]) + test_db.commit() + + resolved_ids = KnowledgeService.resolve_document_ids_by_names( + db=test_db, + knowledge_base_ids=[10], + document_names=["release.md"], + ) + + assert resolved_ids == [active_doc.id] diff --git a/chat_shell/chat_shell/tools/builtin/knowledge_base.py b/chat_shell/chat_shell/tools/builtin/knowledge_base.py index bcec1c64e6..803080759e 100644 --- a/chat_shell/chat_shell/tools/builtin/knowledge_base.py +++ b/chat_shell/chat_shell/tools/builtin/knowledge_base.py @@ -10,8 +10,7 @@ import json import logging -from contextlib import contextmanager -from typing import Any, Dict, Iterator, List, Optional +from typing import Any, Dict, List, Optional from langchain_core.callbacks import CallbackManagerForToolRun from langchain_core.tools import BaseTool @@ -478,22 +477,19 @@ def _run( """Synchronous run - not implemented, use async version.""" raise NotImplementedError("KnowledgeBaseTool only supports async execution") - @contextmanager - def _apply_call_scoped_filters( + def _resolve_scoped_filters( self, document_ids: Optional[list[int]] = None, document_names: Optional[list[str]] = None, - ) -> Iterator[None]: - """Temporarily apply per-call document filters.""" - original_document_ids = self.document_ids - original_document_names = self.document_names - self.document_ids = document_ids or original_document_ids - self.document_names = document_names or original_document_names - try: - yield - finally: - self.document_ids = original_document_ids - self.document_names = original_document_names + ) -> tuple[list[int], list[str]]: + """Resolve per-call filters without mutating tool instance state.""" + effective_document_ids = ( + self.document_ids if document_ids is None else document_ids + ) + effective_document_names = ( + self.document_names if document_names is None else document_names + ) + return effective_document_ids, effective_document_names async def _arun( self, @@ -504,14 +500,25 @@ async def _arun( run_manager: CallbackManagerForToolRun | None = None, ) -> str: """Execute knowledge base search with optional per-call scoped filters.""" - with self._apply_call_scoped_filters(document_ids, document_names): - return await self._arun_impl(query, max_results, run_manager) + effective_document_ids, effective_document_names = self._resolve_scoped_filters( + document_ids=document_ids, + document_names=document_names, + ) + return await self._arun_impl( + query, + max_results, + run_manager, + document_ids=effective_document_ids, + document_names=effective_document_names, + ) async def _arun_impl( self, query: str, max_results: int = 20, run_manager: CallbackManagerForToolRun | None = None, + document_ids: Optional[list[int]] = None, + document_names: Optional[list[str]] = None, ) -> str: """Execute knowledge base search with intelligent injection strategy. @@ -533,6 +540,8 @@ async def _arun_impl( JSON string with search results or injected content """ try: + effective_document_ids = document_ids or [] + effective_document_names = document_names or [] if not self.knowledge_base_ids: return json.dumps( {"error": "No knowledge bases configured for this conversation."} @@ -597,8 +606,8 @@ async def _arun_impl( logger.info( f"[KnowledgeBaseTool] Searching {len(self.knowledge_base_ids)} knowledge bases with query: {query}" + ( - f", filtering by {len(self.document_ids)} documents" - if self.document_ids + f", filtering by {len(effective_document_ids)} documents" + if effective_document_ids else "" ) ) @@ -613,6 +622,8 @@ async def _arun_impl( query=query, max_results=max_results, route_mode=preferred_route_mode, + document_ids=effective_document_ids, + document_names=effective_document_names, ) logger.info( @@ -831,6 +842,8 @@ async def _retrieve_with_strategy_from_all_kbs( query: str, max_results: int, route_mode: str = "auto", + document_ids: Optional[list[int]] = None, + document_names: Optional[list[str]] = None, ) -> tuple[str, Dict[str, Any]]: """Retrieve KB data using Backend-side route selection.""" if self.db_session is None: @@ -838,18 +851,20 @@ async def _retrieve_with_strategy_from_all_kbs( query=query, max_results=max_results, route_mode=route_mode, + document_ids=document_ids, + document_names=document_names, ) else: try: - resolved_document_ids = self.document_ids or None - if not resolved_document_ids and self.document_names: + resolved_document_ids = document_ids or None + if not resolved_document_ids and document_names: from app.services.knowledge import KnowledgeService resolved_document_ids = ( KnowledgeService.resolve_document_ids_by_names( db=self.db_session, knowledge_base_ids=self.knowledge_base_ids, - document_names=self.document_names, + document_names=document_names, ) or None ) @@ -941,6 +956,8 @@ async def _retrieve_with_strategy_via_http( query: str, max_results: int, route_mode: str = "auto", + document_ids: Optional[list[int]] = None, + document_names: Optional[list[str]] = None, ) -> Dict[str, Any]: """Retrieve KB data from Backend internal retrieve endpoint.""" import httpx @@ -966,10 +983,10 @@ async def _retrieve_with_strategy_via_http( mediation_context = self._build_mediation_context() if mediation_context: payload["mediation_context"] = mediation_context - if self.document_ids: - payload["document_ids"] = self.document_ids - if self.document_names: - payload["document_names"] = self.document_names + if document_ids: + payload["document_ids"] = document_ids + if document_names: + payload["document_names"] = document_names if self.user_name is not None: payload["user_name"] = self.user_name diff --git a/chat_shell/tests/test_knowledge_base_scoped_search.py b/chat_shell/tests/test_knowledge_base_scoped_search.py index 39ae9459e0..d00fb4434d 100644 --- a/chat_shell/tests/test_knowledge_base_scoped_search.py +++ b/chat_shell/tests/test_knowledge_base_scoped_search.py @@ -116,3 +116,61 @@ async def test_arun_preserves_backend_scoped_search_error_message(): result = json.loads(await tool._arun(query="release checklist")) assert result["message"].startswith("Document names not found") + + +@pytest.mark.asyncio +async def test_arun_passes_scoped_filters_without_mutating_tool_defaults(): + tool = KnowledgeBaseTool( + knowledge_base_ids=[1], + user_id=7, + document_ids=[999], + document_names=["default.md"], + ) + + async def _fake_retrieve(**kwargs): + assert kwargs["document_ids"] == [101] + assert kwargs["document_names"] == ["release.md"] + assert tool.document_ids == [999] + assert tool.document_names == ["default.md"] + return ( + "rag_retrieval", + { + "mode": "rag_retrieval", + "records": [], + "total": 0, + "total_estimated_tokens": 0, + }, + ) + + with ( + patch.object( + tool, + "_get_kb_info", + AsyncMock( + return_value={ + "items": [ + { + "id": 1, + "name": "Test KB", + "rag_enabled": True, + "max_calls_per_conversation": 10, + "exempt_calls_before_check": 5, + } + ] + } + ), + ), + patch.object( + tool, + "_retrieve_with_strategy_from_all_kbs", + AsyncMock(side_effect=_fake_retrieve), + ), + ): + await tool._arun( + query="release checklist", + document_ids=[101], + document_names=["release.md"], + ) + + assert tool.document_ids == [999] + assert tool.document_names == ["default.md"] From dad292820693650a2518f9e33bce1eeec404e04c Mon Sep 17 00:00:00 2001 From: yanhe1 Date: Tue, 7 Apr 2026 10:51:32 +0800 Subject: [PATCH 6/7] chore(docs): remove local KB spec from branch --- .../2026-04-04-kb-rag-retrieval-design.md | 229 ------------------ shared/prompts/knowledge_base.py | 3 + shared/tests/test_prompts.py | 5 + 3 files changed, 8 insertions(+), 229 deletions(-) delete mode 100644 docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md diff --git a/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md b/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md deleted file mode 100644 index 74f15fb539..0000000000 --- a/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md +++ /dev/null @@ -1,229 +0,0 @@ ---- -sidebar_position: 1 ---- - -# KB RAG Retrieval Design - -## Overview - -This design improves knowledge-base retrieval behavior for AI agents without adding new tools, new persistent configuration, or new backend I/O requests. - -The current system already provides three complementary KB capabilities: - -- `kb_ls` for document-level overview -- `knowledge_base_search` for fast retrieval -- `kb_head` for direct content reading - -The main gap is not missing infrastructure. The gap is that agents do not have enough guidance or enough scoped-search controls to use these capabilities proactively on large or complex knowledge bases. - -This design keeps the current intent-routing model, but changes the routing outcome for KB content questions from a mostly fixed "search first" path to a more flexible "decide the next best retrieval action" path. - -## Goals - -- Preserve intent routing as the first decision step for KB interactions. -- Let the agent choose between direct search, document overview, scoped search, and manual reading based on the current question and KB metadata. -- Allow `knowledge_base_search` to narrow search scope by specific documents. -- Reuse existing retrieval infrastructure and call-limit behavior. -- Avoid extra network requests or new long-lived configuration. - -## Non-Goals - -- No new "smart KB" tool that wraps `kb_ls`, `knowledge_base_search`, and `kb_head`. -- No backend-enforced size thresholds such as small/medium/large KB routing. -- No new generic metadata filter object exposed to the model in the first version. -- No changes to existing call-limit policies. - -## Current Problems - -1. Prompt guidance over-biases KB question answering toward `knowledge_base_search`, which weakens retrieval quality on larger or more heterogeneous KBs. -2. `knowledge_base_search` already supports document filtering internally through `document_ids`, but that capability is not exposed to the model as part of the tool schema. -3. Dynamic KB metadata injected into the conversation does not currently include retrieval-relevant facts such as document count or whether RAG is available. -4. `kb_ls` is described mainly as a fallback path instead of a proactive range-discovery tool. - -## Proposed Behavior - -### Intent Routing - -Intent routing remains the first step. - -For KB-related requests, the model should still first classify whether the user is asking about: - -- KB metadata or selection -- KB document overview -- KB content question -- KB management operations -- Manual content reading in no-RAG situations - -The change is in the next-step guidance for KB content questions: - -- If the question is precise and direct retrieval is likely enough, the model may call `knowledge_base_search` immediately. -- If the question is broad, ambiguous, or likely to depend on document structure or document choice, the model should prefer `kb_ls` first. -- After `kb_ls`, the model may call `knowledge_base_search` again with scoped document arguments. -- If RAG is unavailable, the model should use `kb_ls` and `kb_head`. - -### Dynamic KB Metadata - -The request-scoped KB metadata block should remain lightweight and objective. - -Each KB entry should include: - -- `kb_name` -- `kb_id` -- `document_count` -- `rag_enabled` -- existing optional summary/topic hints when available - -No derived size buckets should be added. Exact values are simpler to maintain and give the model enough signal to decide whether it should first inspect document scope. - -Example shape: - -```text -Knowledge Bases In Scope: -- KB Name: Product Docs, KB ID: 12, Documents: 128, RAG: enabled - - Summary: Internal product documentation and runbooks - - Topics: release process, deployment, alerts -``` - -### Scoped Search - -`knowledge_base_search` should support narrowing retrieval to specific documents with: - -- `document_ids: number[]` -- `document_names: string[]` - -Expected usage: - -- If the model already knows exact document names from the user prompt, previous `kb_ls` output, or conversation history, it may use `document_names` directly. -- If the model does not know the target documents, it should use `kb_ls` first. -- If the model has document IDs, it should prefer `document_ids`. - -## Interface Design - -### Shared Prompt Layer - -Update the KB prompt templates in `shared/prompts/knowledge_base.py`. - -Key prompt adjustments: - -- Keep `Intent Routing (DO THIS FIRST)`. -- Keep the current strict / relaxed / no-RAG / restricted mode structure. -- Change KB content-question guidance from effectively "search first" to "decide whether to inspect scope first". -- Reframe `kb_ls` as a proactive document-overview tool, not only a fallback tool. -- Document that scoped `knowledge_base_search` can use `document_ids` or `document_names` when the target documents are known. - -Restricted mode remains search-only. It does not expose `kb_ls` or `kb_head`. - -### Dynamic KB Meta Prompt - -Update the backend KB meta formatter so each entry includes `document_count` and `rag_enabled`. - -This change should stay formatting-only from the prompt module's perspective. The preprocessing layer remains responsible for assembling the meta list from already available backend state. - -### Chat Shell Tool Schema - -Extend the `KnowledgeBaseInput` schema used by `knowledge_base_search`: - -- add optional `document_ids` -- add optional `document_names` - -The tool continues to call the same backend internal retrieve endpoint. Existing calls with only `query` and `max_results` remain valid and unchanged. - -### Backend Internal Retrieve Surface - -Extend the internal retrieval request model to accept: - -- `document_ids` -- `document_names` - -Resolution rules: - -- If `document_ids` is provided, retrieval uses those IDs directly. -- If `document_names` is provided, backend resolves them to document IDs within the currently scoped KB set. -- If both are provided, `document_ids` takes precedence and `document_names` is ignored. -- Name resolution uses exact matching only. -- If no names resolve, return a structured error instructing the caller to use `kb_ls` first. - -The backend should resolve `document_names` into `document_ids` before entering the existing retrieval/filtering flow so that the downstream retrieval path remains unchanged. - -## Matching Rules - -### `document_names` - -- Matching is exact. -- Matching is scoped to the KBs already attached to the current tool invocation. -- If multiple KBs contain the same exact document name, all matching documents are included. -- No fuzzy matching or contains matching is added in the first version. - -### Ambiguity - -If multiple exact matches exist across KBs, that is acceptable and should not be treated as an error. The current system already supports multi-KB retrieval, so the scoped result set can include all exact matches. - -## Error Handling - -- If `document_names` resolves to no documents, return a structured error and guidance to call `kb_ls`. -- If `document_ids` or `document_names` is an empty list, treat it as absent input. -- If RAG is not configured, preserve the current no-RAG response and guide the model toward `kb_ls` and `kb_head`. -- Restricted mode continues to allow only `knowledge_base_search`, even when scoped arguments are used. -- Do not silently fall back from failed scoped search to global search. Silent widening would reduce precision and make tool behavior unpredictable. - -## Compatibility - -- Existing callers that use only `query` and `max_results` keep the current behavior. -- Existing retrieval logic based on `document_ids -> metadata_condition -> backend storage retrieval` remains the main implementation path. -- No new tool is introduced. -- No new persistent KB configuration is introduced. -- No new call-limit behavior is introduced. - -## Implementation Boundaries - -### Shared - -Responsible for updating the KB prompt instructions only. - -### Chat Shell - -Responsible for exposing scoped-search inputs to the model and forwarding them to backend retrieval. - -### Backend - -Responsible for: - -- enriching KB metadata with `document_count` and `rag_enabled` -- resolving `document_names` to `document_ids` -- keeping retrieval execution on the existing filtered retrieval path - -## Testing - -### Shared - -- Update prompt tests to reflect the new KB guidance wording. -- Add or update tests for KB meta prompt formatting with `document_count` and `rag_enabled`. - -### Chat Shell - -- Verify `knowledge_base_search` schema includes `document_ids`. -- Verify `knowledge_base_search` schema includes `document_names`. -- Verify scoped arguments are forwarded in HTTP mode. -- Verify legacy unscoped usage remains unchanged. - -### Backend - -- Verify internal retrieve accepts `document_names`. -- Verify exact-match name resolution within KB scope. -- Verify unresolved names return the expected structured error. -- Verify multiple exact matches across KBs are included. -- Verify direct `document_ids` filtering still works. -- Verify the downstream retrieval path still uses document-based metadata filtering. - -## Rollout Notes - -This design is intentionally incremental. - -The first version should stop at: - -1. prompt guidance updates -2. dynamic metadata enrichment -3. scoped search support through `document_ids` and `document_names` -4. backend exact-match document-name resolution - -Do not expand the first version into generic filter support, fuzzy matching, or backend hard-coded KB-size routing. diff --git a/shared/prompts/knowledge_base.py b/shared/prompts/knowledge_base.py index d51c0c1a37..f564c0aca2 100644 --- a/shared/prompts/knowledge_base.py +++ b/shared/prompts/knowledge_base.py @@ -38,6 +38,7 @@ - If the user already named the target file(s), prefer `knowledge_base_search` with exact `document_names`. - If you already know exact document IDs, prefer `knowledge_base_search` with `document_ids`. - If the file names are unknown and scoping would help, call `kb_ls` first, then use scoped `knowledge_base_search` or `kb_head`. + - If the selected KBs include spreadsheet files, treat `knowledge_base_search` as lower-confidence for exact table details and prefer `kb_ls` plus targeted `kb_head` verification when precision matters. - If the question is broad and no file-level scope is needed, call `knowledge_base_search` globally. D) **Knowledge base management** (optional, only if tools exist) @@ -97,6 +98,7 @@ - Use scoped `knowledge_base_search(document_names=[...])` when the user already knows exact file names. - Use scoped `knowledge_base_search(document_ids=[...])` when exact document IDs are already known. - If you need exact file names or want to narrow the scope before searching, call `kb_ls` first, then use scoped search or `kb_head`. + - If the selected KBs include spreadsheet files, treat `knowledge_base_search` as lower-confidence for exact table details and prefer `kb_ls` plus targeted `kb_head` verification when precision matters. - If results are relevant: answer using KB content and cite sources. - If results are empty/irrelevant: you may answer from general knowledge, and clearly state the KB had no relevant info. - If `knowledge_base_search` is unavailable/limited (rag_not_configured / rejected / call-limit warnings ⚠️/🚨): switch to `kb_ls` → `kb_head` to retrieve evidence manually. @@ -110,6 +112,7 @@ - If the KB has no relevant content, say so and answer from general knowledge - For "what's in the KB" questions, `kb_ls` is usually higher-signal than `knowledge_base_search` - `document_names` matching is exact, so use `kb_ls` first when you do not know the exact file names +- If spreadsheet files are present, verify precision-sensitive table details with `kb_head` instead of relying on `knowledge_base_search` alone """ diff --git a/shared/tests/test_prompts.py b/shared/tests/test_prompts.py index 201076dcad..d1dde5711e 100644 --- a/shared/tests/test_prompts.py +++ b/shared/tests/test_prompts.py @@ -48,12 +48,15 @@ def test_kb_prompt_strict_contains_required_content(self): assert "document_ids" in KB_PROMPT_STRICT assert "kb_ls" in KB_PROMPT_STRICT assert "kb_head" in KB_PROMPT_STRICT + assert "spreadsheet files" in strict_lower + assert "lower-confidence" in strict_lower def test_kb_prompt_relaxed_contains_required_content(self): """KB_PROMPT_RELAXED should contain relaxed mode instructions.""" from shared.prompts import KB_PROMPT_RELAXED # Check for key phrases in relaxed mode + relaxed_lower = KB_PROMPT_RELAXED.lower() assert "knowledge_base_search" in KB_PROMPT_RELAXED # Relaxed mode must allow fallback to general knowledge assert "general knowledge" in KB_PROMPT_RELAXED @@ -62,6 +65,8 @@ def test_kb_prompt_relaxed_contains_required_content(self): assert "document_names" in KB_PROMPT_RELAXED assert "document_ids" in KB_PROMPT_RELAXED assert "kb_ls" in KB_PROMPT_RELAXED + assert "spreadsheet files" in relaxed_lower + assert "kb_head" in relaxed_lower def test_prompts_are_different(self): """Strict and relaxed prompts should be different.""" From 9ac8342bbb693033620670154134deda713da19b Mon Sep 17 00:00:00 2001 From: yanhe1 Date: Tue, 7 Apr 2026 11:41:08 +0800 Subject: [PATCH 7/7] fix(knowledge): refine spreadsheet retrieval guidance --- .../services/knowledge/knowledge_service.py | 7 ++- .../knowledge/test_knowledge_service.py | 51 +++++++++++++++++++ shared/prompts/knowledge_base.py | 6 +-- shared/tests/test_prompts.py | 3 +- 4 files changed, 61 insertions(+), 6 deletions(-) diff --git a/backend/app/services/knowledge/knowledge_service.py b/backend/app/services/knowledge/knowledge_service.py index 3a298543dd..5715d2d8b8 100644 --- a/backend/app/services/knowledge/knowledge_service.py +++ b/backend/app/services/knowledge/knowledge_service.py @@ -866,8 +866,11 @@ def get_document_prompt_stats( func.sum( case( ( - func.lower(KnowledgeDocument.file_extension).in_( - spreadsheet_exts + and_( + KnowledgeDocument.is_active == True, + func.lower(KnowledgeDocument.file_extension).in_( + spreadsheet_exts + ), ), 1, ), diff --git a/backend/tests/services/knowledge/test_knowledge_service.py b/backend/tests/services/knowledge/test_knowledge_service.py index 40e980b5b9..e5924acf21 100644 --- a/backend/tests/services/knowledge/test_knowledge_service.py +++ b/backend/tests/services/knowledge/test_knowledge_service.py @@ -111,3 +111,54 @@ def test_resolve_document_ids_by_names_returns_only_active_docs_in_scope( ) assert resolved_ids == [active_doc.id] + + +@pytest.mark.unit +class TestKnowledgeServiceGetDocumentPromptStats: + def test_get_document_prompt_stats_counts_only_active_spreadsheets( + self, test_db, test_user + ) -> None: + """Spreadsheet count should reflect searchable spreadsheet documents only.""" + active_csv = KnowledgeDocument( + kind_id=10, + attachment_id=0, + name="active-report.csv", + file_extension="csv", + file_size=10, + user_id=test_user.id, + is_active=True, + source_type="file", + ) + inactive_xlsx = KnowledgeDocument( + kind_id=10, + attachment_id=0, + name="inactive-report.xlsx", + file_extension="xlsx", + file_size=10, + user_id=test_user.id, + is_active=False, + source_type="file", + ) + active_markdown = KnowledgeDocument( + kind_id=10, + attachment_id=0, + name="guide.md", + file_extension="md", + file_size=10, + user_id=test_user.id, + is_active=True, + source_type="file", + ) + test_db.add_all([active_csv, inactive_xlsx, active_markdown]) + test_db.commit() + + stats = KnowledgeService.get_document_prompt_stats( + db=test_db, + knowledge_base_ids=[10], + ) + + assert stats[10] == { + "total_document_count": 3, + "searchable_document_count": 2, + "spreadsheet_document_count": 1, + } diff --git a/shared/prompts/knowledge_base.py b/shared/prompts/knowledge_base.py index f564c0aca2..ebf69a901c 100644 --- a/shared/prompts/knowledge_base.py +++ b/shared/prompts/knowledge_base.py @@ -38,7 +38,7 @@ - If the user already named the target file(s), prefer `knowledge_base_search` with exact `document_names`. - If you already know exact document IDs, prefer `knowledge_base_search` with `document_ids`. - If the file names are unknown and scoping would help, call `kb_ls` first, then use scoped `knowledge_base_search` or `kb_head`. - - If the selected KBs include spreadsheet files, treat `knowledge_base_search` as lower-confidence for exact table details and prefer `kb_ls` plus targeted `kb_head` verification when precision matters. + - If spreadsheet files are present, especially larger spreadsheets, `knowledge_base_search` may be less reliable for exact table details. If results seem weak or irrelevant, use `kb_ls` to narrow candidates and `kb_head` to verify. - If the question is broad and no file-level scope is needed, call `knowledge_base_search` globally. D) **Knowledge base management** (optional, only if tools exist) @@ -98,7 +98,7 @@ - Use scoped `knowledge_base_search(document_names=[...])` when the user already knows exact file names. - Use scoped `knowledge_base_search(document_ids=[...])` when exact document IDs are already known. - If you need exact file names or want to narrow the scope before searching, call `kb_ls` first, then use scoped search or `kb_head`. - - If the selected KBs include spreadsheet files, treat `knowledge_base_search` as lower-confidence for exact table details and prefer `kb_ls` plus targeted `kb_head` verification when precision matters. + - If spreadsheet files are present, especially larger spreadsheets, `knowledge_base_search` may be less reliable for exact table details. If results seem weak or irrelevant, use `kb_ls` to narrow candidates and `kb_head` to verify. - If results are relevant: answer using KB content and cite sources. - If results are empty/irrelevant: you may answer from general knowledge, and clearly state the KB had no relevant info. - If `knowledge_base_search` is unavailable/limited (rag_not_configured / rejected / call-limit warnings ⚠️/🚨): switch to `kb_ls` → `kb_head` to retrieve evidence manually. @@ -112,7 +112,7 @@ - If the KB has no relevant content, say so and answer from general knowledge - For "what's in the KB" questions, `kb_ls` is usually higher-signal than `knowledge_base_search` - `document_names` matching is exact, so use `kb_ls` first when you do not know the exact file names -- If spreadsheet files are present, verify precision-sensitive table details with `kb_head` instead of relying on `knowledge_base_search` alone +- If spreadsheet files are present, especially larger spreadsheets, and search results look weak or irrelevant, verify precision-sensitive table details with `kb_head` """ diff --git a/shared/tests/test_prompts.py b/shared/tests/test_prompts.py index d1dde5711e..a64dfc2217 100644 --- a/shared/tests/test_prompts.py +++ b/shared/tests/test_prompts.py @@ -49,7 +49,7 @@ def test_kb_prompt_strict_contains_required_content(self): assert "kb_ls" in KB_PROMPT_STRICT assert "kb_head" in KB_PROMPT_STRICT assert "spreadsheet files" in strict_lower - assert "lower-confidence" in strict_lower + assert "weak or irrelevant" in strict_lower def test_kb_prompt_relaxed_contains_required_content(self): """KB_PROMPT_RELAXED should contain relaxed mode instructions.""" @@ -67,6 +67,7 @@ def test_kb_prompt_relaxed_contains_required_content(self): assert "kb_ls" in KB_PROMPT_RELAXED assert "spreadsheet files" in relaxed_lower assert "kb_head" in relaxed_lower + assert "weak or irrelevant" in relaxed_lower def test_prompts_are_different(self): """Strict and relaxed prompts should be different."""