From a8e2ee0f00bfe6dc1293fe35ab32d27913faee0b Mon Sep 17 00:00:00 2001
From: yanhe1 <yanhe1@staff.weibo.com>
Date: Sat, 4 Apr 2026 17:13:20 +0800
Subject: [PATCH 1/7] docs(spec): add KB RAG retrieval design

---
 .../2026-04-04-kb-rag-retrieval-design.md     | 229 ++++++++++++++++++
 1 file changed, 229 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md

diff --git a/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md b/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md
new file mode 100644
index 0000000000..74f15fb539
--- /dev/null
+++ b/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md
@@ -0,0 +1,229 @@
+---
+sidebar_position: 1
+---
+
+# KB RAG Retrieval Design
+
+## Overview
+
+This design improves knowledge-base retrieval behavior for AI agents without adding new tools, new persistent configuration, or new backend I/O requests.
+
+The current system already provides three complementary KB capabilities:
+
+- `kb_ls` for document-level overview
+- `knowledge_base_search` for fast retrieval
+- `kb_head` for direct content reading
+
+The main gap is not missing infrastructure. The gap is that agents do not have enough guidance or enough scoped-search controls to use these capabilities proactively on large or complex knowledge bases.
+
+This design keeps the current intent-routing model, but changes the routing outcome for KB content questions from a mostly fixed "search first" path to a more flexible "decide the next best retrieval action" path.
+
+## Goals
+
+- Preserve intent routing as the first decision step for KB interactions.
+- Let the agent choose between direct search, document overview, scoped search, and manual reading based on the current question and KB metadata.
+- Allow `knowledge_base_search` to narrow search scope by specific documents.
+- Reuse existing retrieval infrastructure and call-limit behavior.
+- Avoid extra network requests or new long-lived configuration.
+
+## Non-Goals
+
+- No new "smart KB" tool that wraps `kb_ls`, `knowledge_base_search`, and `kb_head`.
+- No backend-enforced size thresholds such as small/medium/large KB routing.
+- No new generic metadata filter object exposed to the model in the first version.
+- No changes to existing call-limit policies.
+
+## Current Problems
+
+1. Prompt guidance over-biases KB question answering toward `knowledge_base_search`, which weakens retrieval quality on larger or more heterogeneous KBs.
+2. `knowledge_base_search` already supports document filtering internally through `document_ids`, but that capability is not exposed to the model as part of the tool schema.
+3. Dynamic KB metadata injected into the conversation does not currently include retrieval-relevant facts such as document count or whether RAG is available.
+4. `kb_ls` is described mainly as a fallback path instead of a proactive range-discovery tool.
+
+## Proposed Behavior
+
+### Intent Routing
+
+Intent routing remains the first step.
+
+For KB-related requests, the model should still first classify whether the user is asking about:
+
+- KB metadata or selection
+- KB document overview
+- KB content question
+- KB management operations
+- Manual content reading in no-RAG situations
+
+The change is in the next-step guidance for KB content questions:
+
+- If the question is precise and direct retrieval is likely enough, the model may call `knowledge_base_search` immediately.
+- If the question is broad, ambiguous, or likely to depend on document structure or document choice, the model should prefer `kb_ls` first.
+- After `kb_ls`, the model may call `knowledge_base_search` again with scoped document arguments.
+- If RAG is unavailable, the model should use `kb_ls` and `kb_head`.
+
+### Dynamic KB Metadata
+
+The request-scoped KB metadata block should remain lightweight and objective.
+
+Each KB entry should include:
+
+- `kb_name`
+- `kb_id`
+- `document_count`
+- `rag_enabled`
+- existing optional summary/topic hints when available
+
+No derived size buckets should be added. Exact values are simpler to maintain and give the model enough signal to decide whether it should first inspect document scope.
+
+Example shape:
+
+```text
+Knowledge Bases In Scope:
+- KB Name: Product Docs, KB ID: 12, Documents: 128, RAG: enabled
+  - Summary: Internal product documentation and runbooks
+  - Topics: release process, deployment, alerts
+```
+
+### Scoped Search
+
+`knowledge_base_search` should support narrowing retrieval to specific documents with:
+
+- `document_ids: number[]`
+- `document_names: string[]`
+
+Expected usage:
+
+- If the model already knows exact document names from the user prompt, previous `kb_ls` output, or conversation history, it may use `document_names` directly.
+- If the model does not know the target documents, it should use `kb_ls` first.
+- If the model has document IDs, it should prefer `document_ids`.
+
+## Interface Design
+
+### Shared Prompt Layer
+
+Update the KB prompt templates in `shared/prompts/knowledge_base.py`.
+
+Key prompt adjustments:
+
+- Keep `Intent Routing (DO THIS FIRST)`.
+- Keep the current strict / relaxed / no-RAG / restricted mode structure.
+- Change KB content-question guidance from effectively "search first" to "decide whether to inspect scope first".
+- Reframe `kb_ls` as a proactive document-overview tool, not only a fallback tool.
+- Document that scoped `knowledge_base_search` can use `document_ids` or `document_names` when the target documents are known.
+
+Restricted mode remains search-only. It does not expose `kb_ls` or `kb_head`.
+
+### Dynamic KB Meta Prompt
+
+Update the backend KB meta formatter so each entry includes `document_count` and `rag_enabled`.
+
+This change should stay formatting-only from the prompt module's perspective. The preprocessing layer remains responsible for assembling the meta list from already available backend state.
+
+### Chat Shell Tool Schema
+
+Extend the `KnowledgeBaseInput` schema used by `knowledge_base_search`:
+
+- add optional `document_ids`
+- add optional `document_names`
+
+The tool continues to call the same backend internal retrieve endpoint. Existing calls with only `query` and `max_results` remain valid and unchanged.
+
+### Backend Internal Retrieve Surface
+
+Extend the internal retrieval request model to accept:
+
+- `document_ids`
+- `document_names`
+
+Resolution rules:
+
+- If `document_ids` is provided, retrieval uses those IDs directly.
+- If `document_names` is provided, backend resolves them to document IDs within the currently scoped KB set.
+- If both are provided, `document_ids` takes precedence and `document_names` is ignored.
+- Name resolution uses exact matching only.
+- If no names resolve, return a structured error instructing the caller to use `kb_ls` first.
+
+The backend should resolve `document_names` into `document_ids` before entering the existing retrieval/filtering flow so that the downstream retrieval path remains unchanged.
+
+## Matching Rules
+
+### `document_names`
+
+- Matching is exact.
+- Matching is scoped to the KBs already attached to the current tool invocation.
+- If multiple KBs contain the same exact document name, all matching documents are included.
+- No fuzzy matching or contains matching is added in the first version.
+
+### Ambiguity
+
+If multiple exact matches exist across KBs, that is acceptable and should not be treated as an error. The current system already supports multi-KB retrieval, so the scoped result set can include all exact matches.
+
+## Error Handling
+
+- If `document_names` resolves to no documents, return a structured error and guidance to call `kb_ls`.
+- If `document_ids` or `document_names` is an empty list, treat it as absent input.
+- If RAG is not configured, preserve the current no-RAG response and guide the model toward `kb_ls` and `kb_head`.
+- Restricted mode continues to allow only `knowledge_base_search`, even when scoped arguments are used.
+- Do not silently fall back from failed scoped search to global search. Silent widening would reduce precision and make tool behavior unpredictable.
+
+## Compatibility
+
+- Existing callers that use only `query` and `max_results` keep the current behavior.
+- Existing retrieval logic based on `document_ids -> metadata_condition -> backend storage retrieval` remains the main implementation path.
+- No new tool is introduced.
+- No new persistent KB configuration is introduced.
+- No new call-limit behavior is introduced.
+
+## Implementation Boundaries
+
+### Shared
+
+Responsible for updating the KB prompt instructions only.
+
+### Chat Shell
+
+Responsible for exposing scoped-search inputs to the model and forwarding them to backend retrieval.
+
+### Backend
+
+Responsible for:
+
+- enriching KB metadata with `document_count` and `rag_enabled`
+- resolving `document_names` to `document_ids`
+- keeping retrieval execution on the existing filtered retrieval path
+
+## Testing
+
+### Shared
+
+- Update prompt tests to reflect the new KB guidance wording.
+- Add or update tests for KB meta prompt formatting with `document_count` and `rag_enabled`.
+
+### Chat Shell
+
+- Verify `knowledge_base_search` schema includes `document_ids`.
+- Verify `knowledge_base_search` schema includes `document_names`.
+- Verify scoped arguments are forwarded in HTTP mode.
+- Verify legacy unscoped usage remains unchanged.
+
+### Backend
+
+- Verify internal retrieve accepts `document_names`.
+- Verify exact-match name resolution within KB scope.
+- Verify unresolved names return the expected structured error.
+- Verify multiple exact matches across KBs are included.
+- Verify direct `document_ids` filtering still works.
+- Verify the downstream retrieval path still uses document-based metadata filtering.
+
+## Rollout Notes
+
+This design is intentionally incremental.
+
+The first version should stop at:
+
+1. prompt guidance updates
+2. dynamic metadata enrichment
+3. scoped search support through `document_ids` and `document_names`
+4. backend exact-match document-name resolution
+
+Do not expand the first version into generic filter support, fuzzy matching, or backend hard-coded KB-size routing.

From dbc5695df59dfea486752e03770ed8623e350509 Mon Sep 17 00:00:00 2001
From: yanhe1 <yanhe1@staff.weibo.com>
Date: Sat, 4 Apr 2026 18:44:57 +0800
Subject: [PATCH 2/7] feat(knowledge): enrich dynamic KB retrieval metadata

---
 .../services/chat/preprocessing/contexts.py   |  22 ++++
 .../services/chat/preprocessing/kb_meta.py    |  19 ++-
 .../services/knowledge/knowledge_service.py   |  46 ++++++-
 backend/tests/services/test_kb_meta_prompt.py | 117 ++++++++++++++++++
 4 files changed, 202 insertions(+), 2 deletions(-)

diff --git a/backend/app/services/chat/preprocessing/contexts.py b/backend/app/services/chat/preprocessing/contexts.py
index c6f338322f..927b6e30da 100644
--- a/backend/app/services/chat/preprocessing/contexts.py
+++ b/backend/app/services/chat/preprocessing/contexts.py
@@ -1319,6 +1319,7 @@ def _build_kb_meta_prompt(
             format_restricted_kb_meta_prompt,
             select_kb_summary_text,
         )
+        from app.services.knowledge import KnowledgeService
         from app.services.knowledge.task_knowledge_base_service import (
             task_knowledge_base_service,
         )
@@ -1326,6 +1327,9 @@ def _build_kb_meta_prompt(
         kb_map = task_knowledge_base_service.get_knowledge_bases_by_ids(
             db, knowledge_base_ids
         )
+        kb_prompt_stats = KnowledgeService.get_document_prompt_stats(
+            db, knowledge_base_ids
+        )
 
         kb_meta_list: list[dict[str, Any]] = []
         for kb_id in knowledge_base_ids:
@@ -1335,6 +1339,10 @@ def _build_kb_meta_prompt(
                     {
                         "kb_id": kb_id,
                         "kb_name": "Unknown",
+                        "search_available": False,
+                        "total_document_count": 0,
+                        "searchable_document_count": 0,
+                        "spreadsheet_document_count": 0,
                         "summary_text": "",
                         "topics": [],
                     }
@@ -1365,10 +1373,24 @@ def _build_kb_meta_prompt(
                     exc_info=True,
                 )
 
+            stats = kb_prompt_stats.get(
+                kb_id,
+                {
+                    "total_document_count": 0,
+                    "searchable_document_count": 0,
+                    "spreadsheet_document_count": 0,
+                },
+            )
+            retrieval_config = kb_spec.get("retrievalConfig") or {}
+
             kb_meta_list.append(
                 {
                     "kb_id": kb_id,
                     "kb_name": kb_name,
+                    "search_available": bool(retrieval_config.get("retriever_name")),
+                    "total_document_count": stats["total_document_count"],
+                    "searchable_document_count": stats["searchable_document_count"],
+                    "spreadsheet_document_count": stats["spreadsheet_document_count"],
                     "summary_text": summary_text,
                     "topics": topics,
                 }
diff --git a/backend/app/services/chat/preprocessing/kb_meta.py b/backend/app/services/chat/preprocessing/kb_meta.py
index 7437d793ca..580a69efd3 100644
--- a/backend/app/services/chat/preprocessing/kb_meta.py
+++ b/backend/app/services/chat/preprocessing/kb_meta.py
@@ -105,7 +105,24 @@ def format_kb_meta_prompt(kb_meta_list: list[dict[str, Any]]) -> str:
             kb_meta.get("kb_name", "Unknown"), "Unknown"
         )
         kb_id = sanitize_prompt_identifier(kb_meta.get("kb_id", "N/A"), "N/A")
-        kb_lines.append(f"- KB Name: {kb_name}, KB ID: {kb_id}")
+        search_available = (
+            "available" if kb_meta.get("search_available") else "unavailable"
+        )
+        total_document_count = int(kb_meta.get("total_document_count", 0) or 0)
+        searchable_document_count = int(
+            kb_meta.get("searchable_document_count", 0) or 0
+        )
+        spreadsheet_document_count = int(
+            kb_meta.get("spreadsheet_document_count", 0) or 0
+        )
+
+        kb_lines.append(
+            f"- KB Name: {kb_name}, KB ID: {kb_id}, "
+            f"Search: {search_available}, "
+            f"Total Docs: {total_document_count}, "
+            f"Searchable Docs: {searchable_document_count}, "
+            f"Spreadsheets: {spreadsheet_document_count}"
+        )
 
         summary_text = kb_meta.get("summary_text") or ""
         topics = kb_meta.get("topics") or []
diff --git a/backend/app/services/knowledge/knowledge_service.py b/backend/app/services/knowledge/knowledge_service.py
index 06898e58e1..c6dddf7f4b 100644
--- a/backend/app/services/knowledge/knowledge_service.py
+++ b/backend/app/services/knowledge/knowledge_service.py
@@ -9,7 +9,7 @@
 from dataclasses import dataclass
 from typing import Optional
 
-from sqlalchemy import and_, func
+from sqlalchemy import and_, case, func
 from sqlalchemy.orm import Session
 from sqlalchemy.orm.attributes import flag_modified
 
@@ -845,6 +845,50 @@ def get_active_document_counts(
 
         return {kb_id: count for kb_id, count in results}
 
+    @staticmethod
+    def get_document_prompt_stats(
+        db: Session,
+        knowledge_base_ids: list[int],
+    ) -> dict[int, dict[str, int]]:
+        """Get prompt-oriented document stats for multiple knowledge bases."""
+        if not knowledge_base_ids:
+            return {}
+
+        spreadsheet_exts = ["csv", "xls", "xlsx"]
+
+        results = (
+            db.query(
+                KnowledgeDocument.kind_id,
+                func.count(KnowledgeDocument.id).label("total_count"),
+                func.sum(case((KnowledgeDocument.is_active == True, 1), else_=0)).label(
+                    "searchable_count"
+                ),
+                func.sum(
+                    case(
+                        (
+                            func.lower(KnowledgeDocument.file_extension).in_(
+                                spreadsheet_exts
+                            ),
+                            1,
+                        ),
+                        else_=0,
+                    )
+                ).label("spreadsheet_count"),
+            )
+            .filter(KnowledgeDocument.kind_id.in_(knowledge_base_ids))
+            .group_by(KnowledgeDocument.kind_id)
+            .all()
+        )
+
+        return {
+            kb_id: {
+                "total_document_count": int(total_count or 0),
+                "searchable_document_count": int(searchable_count or 0),
+                "spreadsheet_document_count": int(spreadsheet_count or 0),
+            }
+            for kb_id, total_count, searchable_count, spreadsheet_count in results
+        }
+
     @staticmethod
     def get_active_document_text_length_stats(
         db: Session,
diff --git a/backend/tests/services/test_kb_meta_prompt.py b/backend/tests/services/test_kb_meta_prompt.py
index 30089a4f93..6f2e92f38d 100644
--- a/backend/tests/services/test_kb_meta_prompt.py
+++ b/backend/tests/services/test_kb_meta_prompt.py
@@ -7,6 +7,8 @@
 All comments must be written in English.
 """
 
+from unittest.mock import MagicMock, patch
+
 import pytest
 
 
@@ -25,12 +27,20 @@ def test_format_kb_meta_prompt_with_summary_and_topics(self):
                 {
                     "kb_id": 1,
                     "kb_name": "KB1",
+                    "search_available": True,
+                    "total_document_count": 3,
+                    "searchable_document_count": 2,
+                    "spreadsheet_document_count": 1,
                     "summary_text": "S1",
                     "topics": ["t1", "t2"],
                 },
                 {
                     "kb_id": 2,
                     "kb_name": "KB2",
+                    "search_available": False,
+                    "total_document_count": 0,
+                    "searchable_document_count": 0,
+                    "spreadsheet_document_count": 0,
                     "summary_text": "",
                     "topics": [],
                 },
@@ -40,9 +50,14 @@ def test_format_kb_meta_prompt_with_summary_and_topics(self):
         assert "Knowledge Bases In Scope" in prompt
         assert "KB Name: KB1" in prompt
         assert "KB ID: 1" in prompt
+        assert "Search: available" in prompt
+        assert "Total Docs: 3" in prompt
+        assert "Searchable Docs: 2" in prompt
+        assert "Spreadsheets: 1" in prompt
         assert "Summary: S1" in prompt
         assert "Topics: t1, t2" in prompt
         assert "KB Name: KB2" in prompt
+        assert "Search: unavailable" in prompt
         assert "request-scoped metadata only" in prompt
 
     def test_format_kb_meta_prompt_marks_single_selected_kb_as_target(self):
@@ -67,6 +82,54 @@ def test_format_kb_meta_prompt_marks_single_selected_kb_as_target(self):
         assert "list_knowledge_bases" not in prompt
         assert "clarifying questions" not in prompt
 
+    def test_format_kb_meta_prompt_includes_runtime_retrieval_fields(self):
+        from app.services.chat.preprocessing.kb_meta import format_kb_meta_prompt
+
+        prompt = format_kb_meta_prompt(
+            [
+                {
+                    "kb_id": 12,
+                    "kb_name": "Product Docs",
+                    "search_available": True,
+                    "total_document_count": 128,
+                    "searchable_document_count": 113,
+                    "spreadsheet_document_count": 9,
+                    "summary_text": "Internal product documentation and runbooks",
+                    "topics": ["release process", "deployment"],
+                }
+            ]
+        )
+
+        assert "KB Name: Product Docs" in prompt
+        assert "KB ID: 12" in prompt
+        assert "Search: available" in prompt
+        assert "Total Docs: 128" in prompt
+        assert "Searchable Docs: 113" in prompt
+        assert "Spreadsheets: 9" in prompt
+
+    def test_format_kb_meta_prompt_marks_search_unavailable(self):
+        from app.services.chat.preprocessing.kb_meta import format_kb_meta_prompt
+
+        prompt = format_kb_meta_prompt(
+            [
+                {
+                    "kb_id": 7,
+                    "kb_name": "Ops KB",
+                    "search_available": False,
+                    "total_document_count": 10,
+                    "searchable_document_count": 6,
+                    "spreadsheet_document_count": 2,
+                    "summary_text": "",
+                    "topics": [],
+                }
+            ]
+        )
+
+        assert "Search: unavailable" in prompt
+        assert "Total Docs: 10" in prompt
+        assert "Searchable Docs: 6" in prompt
+        assert "Spreadsheets: 2" in prompt
+
     def test_select_kb_summary_text_prefers_long_for_small_list(self):
         from app.services.chat.preprocessing.kb_meta import select_kb_summary_text
 
@@ -113,3 +176,57 @@ def test_format_restricted_kb_meta_prompt_includes_safe_routing_hints(self):
         assert "retrieval guidance only" in prompt
         assert "document structure" not in prompt
         assert "high-level analysis" not in prompt
+
+    def test_build_kb_meta_prompt_passes_retrieval_metadata(self):
+        from app.services.chat.preprocessing.contexts import _build_kb_meta_prompt
+
+        kb_kind = MagicMock()
+        kb_kind.json = {
+            "spec": {
+                "name": "Product Docs",
+                "retrievalConfig": {"retriever_name": "retriever-a"},
+                "summaryEnabled": False,
+            }
+        }
+        captured_meta = {}
+
+        def _capture_meta(kb_meta_list):
+            captured_meta["value"] = kb_meta_list
+            return "captured"
+
+        with (
+            patch(
+                "app.services.knowledge.task_knowledge_base_service.task_knowledge_base_service.get_knowledge_bases_by_ids",
+                return_value={12: kb_kind},
+            ),
+            patch(
+                "app.services.chat.preprocessing.kb_meta.format_kb_meta_prompt",
+                side_effect=_capture_meta,
+            ),
+            patch(
+                "app.services.knowledge.KnowledgeService.get_document_prompt_stats",
+                return_value={
+                    12: {
+                        "total_document_count": 128,
+                        "searchable_document_count": 113,
+                        "spreadsheet_document_count": 9,
+                    }
+                },
+                create=True,
+            ),
+        ):
+            result = _build_kb_meta_prompt(MagicMock(), [12])
+
+        assert result == "captured"
+        assert captured_meta["value"] == [
+            {
+                "kb_id": 12,
+                "kb_name": "Product Docs",
+                "search_available": True,
+                "total_document_count": 128,
+                "searchable_document_count": 113,
+                "spreadsheet_document_count": 9,
+                "summary_text": "",
+                "topics": [],
+            }
+        ]

From 83630b71ccdb171e311e97fd63a75559dcf9fd59 Mon Sep 17 00:00:00 2001
From: yanhe1 <yanhe1@staff.weibo.com>
Date: Sat, 4 Apr 2026 18:47:40 +0800
Subject: [PATCH 3/7] feat(chat-shell): add scoped KB search arguments

---
 .../tools/builtin/knowledge_base.py           | 43 ++++++++++-
 .../test_knowledge_base_scoped_search.py      | 71 +++++++++++++++++++
 2 files changed, 113 insertions(+), 1 deletion(-)
 create mode 100644 chat_shell/tests/test_knowledge_base_scoped_search.py

diff --git a/chat_shell/chat_shell/tools/builtin/knowledge_base.py b/chat_shell/chat_shell/tools/builtin/knowledge_base.py
index 5076de47a5..ff5d6b77d1 100644
--- a/chat_shell/chat_shell/tools/builtin/knowledge_base.py
+++ b/chat_shell/chat_shell/tools/builtin/knowledge_base.py
@@ -10,7 +10,8 @@
 
 import json
 import logging
-from typing import Any, Dict, List, Optional
+from contextlib import contextmanager
+from typing import Any, Dict, Iterator, List, Optional
 
 from langchain_core.callbacks import CallbackManagerForToolRun
 from langchain_core.tools import BaseTool
@@ -46,6 +47,14 @@ class KnowledgeBaseInput(BaseModel):
         default=20,
         description="Maximum number of results to return. Increased from 5 to 20 for better RAG coverage.",
     )
+    document_ids: list[int] = Field(
+        default_factory=list,
+        description="Optional document IDs to restrict retrieval scope.",
+    )
+    document_names: list[str] = Field(
+        default_factory=list,
+        description="Optional exact document names to restrict retrieval scope when document IDs are not known.",
+    )
 
 
 class KnowledgeBaseTool(BaseTool):
@@ -73,6 +82,7 @@ class KnowledgeBaseTool(BaseTool):
     # Document IDs to filter (optional, for searching specific documents only)
     # When set, only chunks from these documents will be returned
     document_ids: list[int] = Field(default_factory=list)
+    document_names: list[str] = Field(default_factory=list)
 
     # User ID for access control
     user_id: int = 0
@@ -468,7 +478,36 @@ def _run(
         """Synchronous run - not implemented, use async version."""
         raise NotImplementedError("KnowledgeBaseTool only supports async execution")
 
+    @contextmanager
+    def _apply_call_scoped_filters(
+        self,
+        document_ids: Optional[list[int]] = None,
+        document_names: Optional[list[str]] = None,
+    ) -> Iterator[None]:
+        """Temporarily apply per-call document filters."""
+        original_document_ids = self.document_ids
+        original_document_names = self.document_names
+        self.document_ids = document_ids or original_document_ids
+        self.document_names = document_names or original_document_names
+        try:
+            yield
+        finally:
+            self.document_ids = original_document_ids
+            self.document_names = original_document_names
+
     async def _arun(
+        self,
+        query: str,
+        max_results: int = 20,
+        document_ids: Optional[list[int]] = None,
+        document_names: Optional[list[str]] = None,
+        run_manager: CallbackManagerForToolRun | None = None,
+    ) -> str:
+        """Execute knowledge base search with optional per-call scoped filters."""
+        with self._apply_call_scoped_filters(document_ids, document_names):
+            return await self._arun_impl(query, max_results, run_manager)
+
+    async def _arun_impl(
         self,
         query: str,
         max_results: int = 20,
@@ -900,6 +939,8 @@ async def _retrieve_with_strategy_via_http(
             payload["mediation_context"] = mediation_context
         if self.document_ids:
             payload["document_ids"] = self.document_ids
+        if self.document_names:
+            payload["document_names"] = self.document_names
         if self.user_name is not None:
             payload["user_name"] = self.user_name
 
diff --git a/chat_shell/tests/test_knowledge_base_scoped_search.py b/chat_shell/tests/test_knowledge_base_scoped_search.py
new file mode 100644
index 0000000000..2c36288152
--- /dev/null
+++ b/chat_shell/tests/test_knowledge_base_scoped_search.py
@@ -0,0 +1,71 @@
+# SPDX-FileCopyrightText: 2026 Weibo, Inc.
+#
+# SPDX-License-Identifier: Apache-2.0
+
+from unittest.mock import AsyncMock, MagicMock, patch
+
+import pytest
+
+from chat_shell.tools.builtin import KnowledgeBaseTool
+
+
+def test_knowledge_base_input_supports_document_names():
+    from chat_shell.tools.builtin.knowledge_base import KnowledgeBaseInput
+
+    payload = KnowledgeBaseInput(
+        query="checklist",
+        max_results=5,
+        document_names=["release.md"],
+    )
+
+    assert payload.document_names == ["release.md"]
+
+
+@pytest.mark.asyncio
+async def test_http_mode_forwards_document_names_and_ids():
+    tool = KnowledgeBaseTool(
+        knowledge_base_ids=[1, 2],
+        user_id=7,
+    )
+    mock_response = MagicMock()
+    mock_response.status_code = 200
+    mock_response.json.return_value = {
+        "mode": "rag_retrieval",
+        "records": [],
+        "total": 0,
+        "total_estimated_tokens": 0,
+    }
+
+    with (
+        patch("httpx.AsyncClient") as mock_client,
+        patch.object(
+            tool,
+            "_get_kb_info",
+            AsyncMock(
+                return_value={
+                    "items": [
+                        {
+                            "id": 1,
+                            "name": "Test KB",
+                            "rag_enabled": True,
+                            "max_calls_per_conversation": 10,
+                            "exempt_calls_before_check": 5,
+                        }
+                    ]
+                }
+            ),
+        ),
+    ):
+        post = AsyncMock(return_value=mock_response)
+        mock_client.return_value.__aenter__.return_value.post = post
+
+        await tool._arun(
+            query="release checklist",
+            max_results=8,
+            document_ids=[101],
+            document_names=["release.md"],
+        )
+
+    payload = post.call_args.kwargs["json"]
+    assert payload["document_ids"] == [101]
+    assert payload["document_names"] == ["release.md"]

From d7d81406a5f2c017d429e84cb0c0e6c83483c91a Mon Sep 17 00:00:00 2001
From: yanhe1 <yanhe1@staff.weibo.com>
Date: Sat, 4 Apr 2026 19:17:11 +0800
Subject: [PATCH 4/7] feat(knowledge): improve KB retrieval routing

---
 backend/app/api/endpoints/internal/rag.py     | 49 ++++++++++++--
 .../services/knowledge/knowledge_service.py   | 26 ++++++++
 .../internal/test_rag_retrieve_endpoint.py    | 51 +++++++++++++++
 .../services/rag/test_retrieval_service.py    | 64 +++++++++++++++++++
 .../tools/builtin/knowledge_base.py           | 33 +++++++++-
 .../test_knowledge_base_scoped_search.py      | 47 ++++++++++++++
 shared/prompts/knowledge_base.py              | 27 +++++---
 shared/tests/test_prompts.py                  |  9 ++-
 8 files changed, 289 insertions(+), 17 deletions(-)

diff --git a/backend/app/api/endpoints/internal/rag.py b/backend/app/api/endpoints/internal/rag.py
index 777f6381aa..ae4ce6667f 100644
--- a/backend/app/api/endpoints/internal/rag.py
+++ b/backend/app/api/endpoints/internal/rag.py
@@ -118,6 +118,10 @@ class InternalRetrieveRequest(BaseModel):
         default=None,
         description="Optional list of document IDs to filter. Only chunks from these documents will be returned.",
     )
+    document_names: Optional[list[str]] = Field(
+        default=None,
+        description="Optional exact document names to resolve into document IDs before retrieval.",
+    )
     route_mode: Literal["auto", "direct_injection", "rag_retrieval"] = Field(
         default="auto",
         description="Routing mode: auto decides in Backend, direct_injection forces all-chunks, rag_retrieval forces standard retrieval",
@@ -164,6 +168,22 @@ class InternalRetrieveResponse(BaseModel):
     records: list[RetrieveRecord]
     total: int
     total_estimated_tokens: int = 0
+    message: Optional[str] = None
+
+
+def _resolve_document_names(
+    db: Session,
+    knowledge_base_ids: list[int],
+    document_names: list[str],
+) -> list[int]:
+    """Resolve exact document names into document IDs within KB scope."""
+    from app.services.knowledge import KnowledgeService
+
+    return KnowledgeService.resolve_document_ids_by_names(
+        db=db,
+        knowledge_base_ids=knowledge_base_ids,
+        document_names=document_names,
+    )
 
 
 @router.post(
@@ -193,11 +213,27 @@ async def internal_retrieve(
         if request.knowledge_base_id is not None:
             knowledge_base_ids = [request.knowledge_base_id]
 
-        if request.document_ids:
+        resolved_document_ids = request.document_ids or []
+        if not resolved_document_ids and request.document_names:
+            resolved_document_ids = _resolve_document_names(
+                db=db,
+                knowledge_base_ids=knowledge_base_ids,
+                document_names=request.document_names,
+            )
+            if not resolved_document_ids:
+                return {
+                    "mode": "rag_retrieval",
+                    "records": [],
+                    "total": 0,
+                    "total_estimated_tokens": 0,
+                    "message": "Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.",
+                }
+
+        if resolved_document_ids:
             logger.info(
                 "[internal_rag] Filtering by %d documents: %s",
-                len(request.document_ids),
-                request.document_ids,
+                len(resolved_document_ids),
+                resolved_document_ids,
             )
 
         runtime_context = request.runtime_context
@@ -210,7 +246,7 @@ async def internal_retrieve(
             knowledge_base_ids=knowledge_base_ids,
             query=request.query,
             max_results=request.max_results,
-            document_ids=request.document_ids,
+            document_ids=resolved_document_ids or None,
             route_mode=request.route_mode,
             user_id=persistence_context.user_id if persistence_context else None,
             user_name=request.user_name,
@@ -277,8 +313,8 @@ async def internal_retrieve(
             available_injection_tokens,
             request.query[:50],
             (
-                f", filtered by {len(request.document_ids)} docs"
-                if request.document_ids
+                f", filtered by {len(resolved_document_ids)} docs"
+                if resolved_document_ids
                 else ""
             ),
         )
@@ -328,6 +364,7 @@ async def internal_retrieve(
             ],
             total=len(records),
             total_estimated_tokens=total_estimated_tokens,
+            message=result.get("message"),
         )
 
     except ValueError as e:
diff --git a/backend/app/services/knowledge/knowledge_service.py b/backend/app/services/knowledge/knowledge_service.py
index c6dddf7f4b..ceee3f3eda 100644
--- a/backend/app/services/knowledge/knowledge_service.py
+++ b/backend/app/services/knowledge/knowledge_service.py
@@ -889,6 +889,32 @@ def get_document_prompt_stats(
             for kb_id, total_count, searchable_count, spreadsheet_count in results
         }
 
+    @staticmethod
+    def resolve_document_ids_by_names(
+        db: Session,
+        knowledge_base_ids: list[int],
+        document_names: list[str],
+    ) -> list[int]:
+        """Resolve exact document names within the provided knowledge-base scope."""
+        if not knowledge_base_ids or not document_names:
+            return []
+
+        normalized_names = [
+            name.strip() for name in document_names if name and name.strip()
+        ]
+        if not normalized_names:
+            return []
+
+        rows = (
+            db.query(KnowledgeDocument.id)
+            .filter(
+                KnowledgeDocument.kind_id.in_(knowledge_base_ids),
+                KnowledgeDocument.name.in_(normalized_names),
+            )
+            .all()
+        )
+        return [row.id for row in rows]
+
     @staticmethod
     def get_active_document_text_length_stats(
         db: Session,
diff --git a/backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py b/backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py
index 7d51d22a98..8e936fa723 100644
--- a/backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py
+++ b/backend/tests/api/endpoints/internal/test_rag_retrieve_endpoint.py
@@ -111,3 +111,54 @@ def test_internal_retrieve_keeps_user_subtask_id_out_of_gateway(test_client):
     assert response.status_code == 200
     mock_query.assert_awaited_once_with(ANY, db=ANY)
     mock_persist.assert_called_once()
+
+
+def test_internal_retrieve_resolves_document_names_before_query(test_client):
+    with (
+        patch(
+            "app.api.endpoints.internal.rag._resolve_document_names",
+            return_value=[101, 102],
+        ) as mock_resolve,
+        patch(
+            "app.api.endpoints.internal.rag.LocalRagGateway.query",
+            new_callable=AsyncMock,
+            return_value={
+                "mode": "rag_retrieval",
+                "records": [],
+                "total": 0,
+                "total_estimated_tokens": 0,
+            },
+        ) as mock_query,
+    ):
+        response = test_client.post(
+            "/api/internal/rag/retrieve",
+            json={
+                "query": "release checklist",
+                "knowledge_base_ids": [12],
+                "document_names": ["release.md"],
+            },
+        )
+
+    assert response.status_code == 200
+    mock_resolve.assert_called_once()
+    mock_query.assert_awaited_once()
+
+
+def test_internal_retrieve_returns_error_when_document_names_not_found(test_client):
+    with patch(
+        "app.api.endpoints.internal.rag._resolve_document_names",
+        return_value=[],
+    ):
+        response = test_client.post(
+            "/api/internal/rag/retrieve",
+            json={
+                "query": "release checklist",
+                "knowledge_base_ids": [12],
+                "document_names": ["missing.md"],
+            },
+        )
+
+    assert response.status_code == 200
+    assert response.json()["mode"] == "rag_retrieval"
+    assert response.json()["records"] == []
+    assert response.json()["message"].startswith("Document names not found")
diff --git a/backend/tests/services/rag/test_retrieval_service.py b/backend/tests/services/rag/test_retrieval_service.py
index 37e463e5d5..1d7da646d4 100644
--- a/backend/tests/services/rag/test_retrieval_service.py
+++ b/backend/tests/services/rag/test_retrieval_service.py
@@ -383,6 +383,70 @@ async def test_force_rag_route_uses_standard_retrieval(self):
         assert result["records"][0]["score"] == 0.9
         assert result["records"][0]["knowledge_base_id"] == 123
 
+    @pytest.mark.asyncio
+    async def test_package_mode_resolves_document_names_before_retrieval(self):
+        """Package mode should resolve document_names before calling retrieval service."""
+        from chat_shell.tools.builtin import KnowledgeBaseTool
+
+        tool = KnowledgeBaseTool(
+            knowledge_base_ids=[1],
+            db_session=MagicMock(),
+            user_id=7,
+        )
+
+        with (
+            patch.object(
+                tool,
+                "_get_kb_info",
+                AsyncMock(
+                    return_value={
+                        "items": [
+                            {
+                                "id": 1,
+                                "name": "Test KB",
+                                "rag_enabled": True,
+                                "max_calls_per_conversation": 10,
+                                "exempt_calls_before_check": 5,
+                            }
+                        ]
+                    }
+                ),
+            ),
+            patch(
+                "app.services.knowledge.KnowledgeService.resolve_document_ids_by_names",
+                return_value=[301],
+                create=True,
+            ) as mock_resolve,
+            patch(
+                "app.services.rag.retrieval_service.RetrievalService.retrieve_for_chat_shell",
+                new_callable=AsyncMock,
+                return_value={
+                    "mode": "rag_retrieval",
+                    "records": [
+                        {
+                            "content": "match",
+                            "score": 0.9,
+                            "title": "release.md",
+                            "knowledge_base_id": 1,
+                        }
+                    ],
+                    "total": 1,
+                    "total_estimated_tokens": 0,
+                },
+            ) as mock_retrieve,
+        ):
+            await tool._arun(
+                query="release checklist",
+                document_names=["release.md"],
+            )
+
+        mock_resolve.assert_called_once_with(
+            db=tool.db_session,
+            knowledge_base_ids=[1],
+            document_names=["release.md"],
+        )
+        assert mock_retrieve.await_args.kwargs["document_ids"] == [301]
+
     @pytest.mark.asyncio
     async def test_force_rag_route_sorts_and_limits_results_globally(self):
         """Backend should return the final globally ranked RAG records."""
diff --git a/chat_shell/chat_shell/tools/builtin/knowledge_base.py b/chat_shell/chat_shell/tools/builtin/knowledge_base.py
index ff5d6b77d1..bcec1c64e6 100644
--- a/chat_shell/chat_shell/tools/builtin/knowledge_base.py
+++ b/chat_shell/chat_shell/tools/builtin/knowledge_base.py
@@ -628,11 +628,12 @@ async def _arun_impl(
                 raw_result.get("records", [])
             )
             if not kb_chunks:
-                message = (
+                default_message = (
                     "No documents found in the knowledge base."
                     if route_mode == InjectionMode.DIRECT_INJECTION
                     else "No relevant information found in the knowledge base for this query."
                 )
+                message = raw_result.get("message") or default_message
                 return json.dumps(
                     {
                         "query": query,
@@ -840,6 +841,34 @@ async def _retrieve_with_strategy_from_all_kbs(
             )
         else:
             try:
+                resolved_document_ids = self.document_ids or None
+                if not resolved_document_ids and self.document_names:
+                    from app.services.knowledge import KnowledgeService
+
+                    resolved_document_ids = (
+                        KnowledgeService.resolve_document_ids_by_names(
+                            db=self.db_session,
+                            knowledge_base_ids=self.knowledge_base_ids,
+                            document_names=self.document_names,
+                        )
+                        or None
+                    )
+                    if not resolved_document_ids:
+                        result = {
+                            "mode": InjectionMode.RAG_ONLY,
+                            "records": [],
+                            "total": 0,
+                            "total_estimated_tokens": 0,
+                            "message": "Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.",
+                        }
+                        mode = result.get("mode", InjectionMode.RAG_ONLY)
+                        logger.info(
+                            "[KnowledgeBaseTool] Retrieved %d records via Backend route mode=%s",
+                            len(result.get("records", [])),
+                            mode,
+                        )
+                        return mode, result
+
                 from app.services.rag.retrieval_service import RetrievalService
 
                 retrieval_service = RetrievalService()
@@ -848,7 +877,7 @@ async def _retrieve_with_strategy_from_all_kbs(
                     knowledge_base_ids=self.knowledge_base_ids,
                     db=self.db_session,
                     max_results=max_results,
-                    document_ids=self.document_ids or None,
+                    document_ids=resolved_document_ids,
                     user_name=self.user_name,
                     route_mode=route_mode,
                     user_id=self.user_id,
diff --git a/chat_shell/tests/test_knowledge_base_scoped_search.py b/chat_shell/tests/test_knowledge_base_scoped_search.py
index 2c36288152..39ae9459e0 100644
--- a/chat_shell/tests/test_knowledge_base_scoped_search.py
+++ b/chat_shell/tests/test_knowledge_base_scoped_search.py
@@ -2,6 +2,7 @@
 #
 # SPDX-License-Identifier: Apache-2.0
 
+import json
 from unittest.mock import AsyncMock, MagicMock, patch
 
 import pytest
@@ -69,3 +70,49 @@ async def test_http_mode_forwards_document_names_and_ids():
     payload = post.call_args.kwargs["json"]
     assert payload["document_ids"] == [101]
     assert payload["document_names"] == ["release.md"]
+
+
+@pytest.mark.asyncio
+async def test_arun_preserves_backend_scoped_search_error_message():
+    tool = KnowledgeBaseTool(
+        knowledge_base_ids=[1],
+        user_id=7,
+    )
+
+    with (
+        patch.object(
+            tool,
+            "_get_kb_info",
+            AsyncMock(
+                return_value={
+                    "items": [
+                        {
+                            "id": 1,
+                            "name": "Test KB",
+                            "rag_enabled": True,
+                            "max_calls_per_conversation": 10,
+                            "exempt_calls_before_check": 5,
+                        }
+                    ]
+                }
+            ),
+        ),
+        patch.object(
+            tool,
+            "_retrieve_with_strategy_from_all_kbs",
+            AsyncMock(
+                return_value=(
+                    "rag_retrieval",
+                    {
+                        "mode": "rag_retrieval",
+                        "records": [],
+                        "total": 0,
+                        "message": "Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.",
+                    },
+                )
+            ),
+        ),
+    ):
+        result = json.loads(await tool._arun(query="release checklist"))
+
+    assert result["message"].startswith("Document names not found")
diff --git a/shared/prompts/knowledge_base.py b/shared/prompts/knowledge_base.py
index bcc611af31..d51c0c1a37 100644
--- a/shared/prompts/knowledge_base.py
+++ b/shared/prompts/knowledge_base.py
@@ -34,7 +34,11 @@
 - Action: Call `kb_ls` for the selected knowledge base(s). Summarize the document list and ask which document(s) to open if needed.
 
 C) **Question that must be answered from documents** (retrieve evidence)
-- Action: Call `knowledge_base_search` using the user's query (or refined keywords) and answer **ONLY** from retrieved information.
+- Action: Choose the highest-signal retrieval path first:
+  - If the user already named the target file(s), prefer `knowledge_base_search` with exact `document_names`.
+  - If you already know exact document IDs, prefer `knowledge_base_search` with `document_ids`.
+  - If the file names are unknown and scoping would help, call `kb_ls` first, then use scoped `knowledge_base_search` or `kb_head`.
+  - If the question is broad and no file-level scope is needed, call `knowledge_base_search` globally.
 
 D) **Knowledge base management** (optional, only if tools exist)
 - Examples: "Create a KB", "Add/update a document", "List all my KBs" (management, not Q&A)
@@ -43,14 +47,15 @@
 
 ### Required Workflow:
 (ONLY for type C)
-1. Call `knowledge_base_search` first
-2. Wait for results
-3. Answer **ONLY** from retrieved information
-4. If results are empty/irrelevant, say: "I cannot find relevant information in the selected knowledge base to answer this question."
-5. Do not use general knowledge or assumptions
+1. Decide whether global search, scoped search, or `kb_ls` is the best first move
+2. If you know exact files, use scoped `knowledge_base_search(document_names=[...])` or `knowledge_base_search(document_ids=[...])`
+3. Wait for results or inspect documents with `kb_head` when browsing is required
+4. Answer **ONLY** from retrieved or directly read knowledge-base content
+5. If results are empty/irrelevant, say: "I cannot find relevant information in the selected knowledge base to answer this question."
+6. Do not use general knowledge or assumptions
 
 ### Critical Rules:
-- Type C: you MUST NOT answer without searching first
+- Type C: you MUST retrieve evidence before answering, but you may choose `kb_ls` first when that is higher-signal
 - Type A/B: you MUST NOT force `knowledge_base_search` first (it is often low-signal)
 - Do not invent information not present in the knowledge base
 
@@ -60,6 +65,7 @@
 
 Use exploration tools when:
 - The user asks for an overview / document list (type B)
+- You need exact document names before scoped search
 - `knowledge_base_search` is unavailable (rag_not_configured / rejected) or you hit call-limit warnings (⚠️/🚨)
 
 Do not use exploration tools just because RAG returned no results.
@@ -86,7 +92,11 @@
 - Action: Prefer `kb_ls` (then `kb_head` only when the user wants to open a specific document).
 
 C) **Content question**
-- Action: Prefer `knowledge_base_search`.
+- Action: Choose the highest-signal retrieval path first.
+  - Use `knowledge_base_search` globally for broad KB questions.
+  - Use scoped `knowledge_base_search(document_names=[...])` when the user already knows exact file names.
+  - Use scoped `knowledge_base_search(document_ids=[...])` when exact document IDs are already known.
+  - If you need exact file names or want to narrow the scope before searching, call `kb_ls` first, then use scoped search or `kb_head`.
   - If results are relevant: answer using KB content and cite sources.
   - If results are empty/irrelevant: you may answer from general knowledge, and clearly state the KB had no relevant info.
   - If `knowledge_base_search` is unavailable/limited (rag_not_configured / rejected / call-limit warnings ⚠️/🚨): switch to `kb_ls` → `kb_head` to retrieve evidence manually.
@@ -99,6 +109,7 @@
 - Prefer knowledge base content when relevant; cite sources when used
 - If the KB has no relevant content, say so and answer from general knowledge
 - For "what's in the KB" questions, `kb_ls` is usually higher-signal than `knowledge_base_search`
+- `document_names` matching is exact, so use `kb_ls` first when you do not know the exact file names
 </knowledge_base>
 """
 
diff --git a/shared/tests/test_prompts.py b/shared/tests/test_prompts.py
index 5dc7559faf..201076dcad 100644
--- a/shared/tests/test_prompts.py
+++ b/shared/tests/test_prompts.py
@@ -40,10 +40,14 @@ def test_kb_prompt_strict_contains_required_content(self):
         assert "Intent Routing" in KB_PROMPT_STRICT
         assert "knowledge base selection / metadata" in strict_lower
         assert "question that must be answered from documents" in strict_lower
-        assert "must not answer without searching first" in strict_lower
+        assert "must retrieve evidence before answering" in strict_lower
         assert "general knowledge" in strict_lower
         assert "selected knowledge base content only" in strict_lower
         assert "knowledge base management" in strict_lower
+        assert "document_names" in KB_PROMPT_STRICT
+        assert "document_ids" in KB_PROMPT_STRICT
+        assert "kb_ls" in KB_PROMPT_STRICT
+        assert "kb_head" in KB_PROMPT_STRICT
 
     def test_kb_prompt_relaxed_contains_required_content(self):
         """KB_PROMPT_RELAXED should contain relaxed mode instructions."""
@@ -55,6 +59,9 @@ def test_kb_prompt_relaxed_contains_required_content(self):
         assert "general knowledge" in KB_PROMPT_RELAXED
         # All KB prompts use Intent Routing pattern
         assert "Intent Routing" in KB_PROMPT_RELAXED
+        assert "document_names" in KB_PROMPT_RELAXED
+        assert "document_ids" in KB_PROMPT_RELAXED
+        assert "kb_ls" in KB_PROMPT_RELAXED
 
     def test_prompts_are_different(self):
         """Strict and relaxed prompts should be different."""

From eb7f72ad0c962e924981a2c46e4dc5b74ec8a824 Mon Sep 17 00:00:00 2001
From: yanhe1 <yanhe1@staff.weibo.com>
Date: Tue, 7 Apr 2026 10:44:19 +0800
Subject: [PATCH 5/7] fix(knowledge): tighten KB scoped retrieval

---
 backend/app/api/endpoints/internal/rag.py     | 14 ++--
 .../services/knowledge/knowledge_service.py   |  1 +
 .../knowledge/test_knowledge_service.py       | 49 +++++++++++++
 .../tools/builtin/knowledge_base.py           | 69 ++++++++++++-------
 .../test_knowledge_base_scoped_search.py      | 58 ++++++++++++++++
 5 files changed, 158 insertions(+), 33 deletions(-)

diff --git a/backend/app/api/endpoints/internal/rag.py b/backend/app/api/endpoints/internal/rag.py
index ae4ce6667f..66871619b8 100644
--- a/backend/app/api/endpoints/internal/rag.py
+++ b/backend/app/api/endpoints/internal/rag.py
@@ -221,13 +221,13 @@ async def internal_retrieve(
                 document_names=request.document_names,
             )
             if not resolved_document_ids:
-                return {
-                    "mode": "rag_retrieval",
-                    "records": [],
-                    "total": 0,
-                    "total_estimated_tokens": 0,
-                    "message": "Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.",
-                }
+                return InternalRetrieveResponse(
+                    mode="rag_retrieval",
+                    records=[],
+                    total=0,
+                    total_estimated_tokens=0,
+                    message="Document names not found in the selected knowledge bases. Use kb_ls to inspect available documents first.",
+                )
 
         if resolved_document_ids:
             logger.info(
diff --git a/backend/app/services/knowledge/knowledge_service.py b/backend/app/services/knowledge/knowledge_service.py
index ceee3f3eda..3a298543dd 100644
--- a/backend/app/services/knowledge/knowledge_service.py
+++ b/backend/app/services/knowledge/knowledge_service.py
@@ -910,6 +910,7 @@ def resolve_document_ids_by_names(
             .filter(
                 KnowledgeDocument.kind_id.in_(knowledge_base_ids),
                 KnowledgeDocument.name.in_(normalized_names),
+                KnowledgeDocument.is_active == True,
             )
             .all()
         )
diff --git a/backend/tests/services/knowledge/test_knowledge_service.py b/backend/tests/services/knowledge/test_knowledge_service.py
index 8793503631..40e980b5b9 100644
--- a/backend/tests/services/knowledge/test_knowledge_service.py
+++ b/backend/tests/services/knowledge/test_knowledge_service.py
@@ -7,6 +7,7 @@
 
 import pytest
 
+from app.models.knowledge import KnowledgeDocument
 from app.services.context import context_service
 from app.services.knowledge.knowledge_service import KnowledgeService
 
@@ -62,3 +63,51 @@ def test_update_document_content_overwrites_attachment_binary(self) -> None:
             binary_data="# Updated release notes".encode("utf-8"),
         )
         db.refresh.assert_called_once_with(document)
+
+
+@pytest.mark.unit
+class TestKnowledgeServiceResolveDocumentIdsByNames:
+    def test_resolve_document_ids_by_names_returns_only_active_docs_in_scope(
+        self, test_db, test_user
+    ) -> None:
+        """Document-name resolution should ignore inactive and out-of-scope rows."""
+        active_doc = KnowledgeDocument(
+            kind_id=10,
+            attachment_id=0,
+            name="release.md",
+            file_extension="md",
+            file_size=10,
+            user_id=test_user.id,
+            is_active=True,
+            source_type="file",
+        )
+        inactive_doc = KnowledgeDocument(
+            kind_id=10,
+            attachment_id=0,
+            name="release.md",
+            file_extension="md",
+            file_size=10,
+            user_id=test_user.id,
+            is_active=False,
+            source_type="file",
+        )
+        other_kb_doc = KnowledgeDocument(
+            kind_id=11,
+            attachment_id=0,
+            name="release.md",
+            file_extension="md",
+            file_size=10,
+            user_id=test_user.id,
+            is_active=True,
+            source_type="file",
+        )
+        test_db.add_all([active_doc, inactive_doc, other_kb_doc])
+        test_db.commit()
+
+        resolved_ids = KnowledgeService.resolve_document_ids_by_names(
+            db=test_db,
+            knowledge_base_ids=[10],
+            document_names=["release.md"],
+        )
+
+        assert resolved_ids == [active_doc.id]
diff --git a/chat_shell/chat_shell/tools/builtin/knowledge_base.py b/chat_shell/chat_shell/tools/builtin/knowledge_base.py
index bcec1c64e6..803080759e 100644
--- a/chat_shell/chat_shell/tools/builtin/knowledge_base.py
+++ b/chat_shell/chat_shell/tools/builtin/knowledge_base.py
@@ -10,8 +10,7 @@
 
 import json
 import logging
-from contextlib import contextmanager
-from typing import Any, Dict, Iterator, List, Optional
+from typing import Any, Dict, List, Optional
 
 from langchain_core.callbacks import CallbackManagerForToolRun
 from langchain_core.tools import BaseTool
@@ -478,22 +477,19 @@ def _run(
         """Synchronous run - not implemented, use async version."""
         raise NotImplementedError("KnowledgeBaseTool only supports async execution")
 
-    @contextmanager
-    def _apply_call_scoped_filters(
+    def _resolve_scoped_filters(
         self,
         document_ids: Optional[list[int]] = None,
         document_names: Optional[list[str]] = None,
-    ) -> Iterator[None]:
-        """Temporarily apply per-call document filters."""
-        original_document_ids = self.document_ids
-        original_document_names = self.document_names
-        self.document_ids = document_ids or original_document_ids
-        self.document_names = document_names or original_document_names
-        try:
-            yield
-        finally:
-            self.document_ids = original_document_ids
-            self.document_names = original_document_names
+    ) -> tuple[list[int], list[str]]:
+        """Resolve per-call filters without mutating tool instance state."""
+        effective_document_ids = (
+            self.document_ids if document_ids is None else document_ids
+        )
+        effective_document_names = (
+            self.document_names if document_names is None else document_names
+        )
+        return effective_document_ids, effective_document_names
 
     async def _arun(
         self,
@@ -504,14 +500,25 @@ async def _arun(
         run_manager: CallbackManagerForToolRun | None = None,
     ) -> str:
         """Execute knowledge base search with optional per-call scoped filters."""
-        with self._apply_call_scoped_filters(document_ids, document_names):
-            return await self._arun_impl(query, max_results, run_manager)
+        effective_document_ids, effective_document_names = self._resolve_scoped_filters(
+            document_ids=document_ids,
+            document_names=document_names,
+        )
+        return await self._arun_impl(
+            query,
+            max_results,
+            run_manager,
+            document_ids=effective_document_ids,
+            document_names=effective_document_names,
+        )
 
     async def _arun_impl(
         self,
         query: str,
         max_results: int = 20,
         run_manager: CallbackManagerForToolRun | None = None,
+        document_ids: Optional[list[int]] = None,
+        document_names: Optional[list[str]] = None,
     ) -> str:
         """Execute knowledge base search with intelligent injection strategy.
 
@@ -533,6 +540,8 @@ async def _arun_impl(
             JSON string with search results or injected content
         """
         try:
+            effective_document_ids = document_ids or []
+            effective_document_names = document_names or []
             if not self.knowledge_base_ids:
                 return json.dumps(
                     {"error": "No knowledge bases configured for this conversation."}
@@ -597,8 +606,8 @@ async def _arun_impl(
             logger.info(
                 f"[KnowledgeBaseTool] Searching {len(self.knowledge_base_ids)} knowledge bases with query: {query}"
                 + (
-                    f", filtering by {len(self.document_ids)} documents"
-                    if self.document_ids
+                    f", filtering by {len(effective_document_ids)} documents"
+                    if effective_document_ids
                     else ""
                 )
             )
@@ -613,6 +622,8 @@ async def _arun_impl(
                 query=query,
                 max_results=max_results,
                 route_mode=preferred_route_mode,
+                document_ids=effective_document_ids,
+                document_names=effective_document_names,
             )
 
             logger.info(
@@ -831,6 +842,8 @@ async def _retrieve_with_strategy_from_all_kbs(
         query: str,
         max_results: int,
         route_mode: str = "auto",
+        document_ids: Optional[list[int]] = None,
+        document_names: Optional[list[str]] = None,
     ) -> tuple[str, Dict[str, Any]]:
         """Retrieve KB data using Backend-side route selection."""
         if self.db_session is None:
@@ -838,18 +851,20 @@ async def _retrieve_with_strategy_from_all_kbs(
                 query=query,
                 max_results=max_results,
                 route_mode=route_mode,
+                document_ids=document_ids,
+                document_names=document_names,
             )
         else:
             try:
-                resolved_document_ids = self.document_ids or None
-                if not resolved_document_ids and self.document_names:
+                resolved_document_ids = document_ids or None
+                if not resolved_document_ids and document_names:
                     from app.services.knowledge import KnowledgeService
 
                     resolved_document_ids = (
                         KnowledgeService.resolve_document_ids_by_names(
                             db=self.db_session,
                             knowledge_base_ids=self.knowledge_base_ids,
-                            document_names=self.document_names,
+                            document_names=document_names,
                         )
                         or None
                     )
@@ -941,6 +956,8 @@ async def _retrieve_with_strategy_via_http(
         query: str,
         max_results: int,
         route_mode: str = "auto",
+        document_ids: Optional[list[int]] = None,
+        document_names: Optional[list[str]] = None,
     ) -> Dict[str, Any]:
         """Retrieve KB data from Backend internal retrieve endpoint."""
         import httpx
@@ -966,10 +983,10 @@ async def _retrieve_with_strategy_via_http(
         mediation_context = self._build_mediation_context()
         if mediation_context:
             payload["mediation_context"] = mediation_context
-        if self.document_ids:
-            payload["document_ids"] = self.document_ids
-        if self.document_names:
-            payload["document_names"] = self.document_names
+        if document_ids:
+            payload["document_ids"] = document_ids
+        if document_names:
+            payload["document_names"] = document_names
         if self.user_name is not None:
             payload["user_name"] = self.user_name
 
diff --git a/chat_shell/tests/test_knowledge_base_scoped_search.py b/chat_shell/tests/test_knowledge_base_scoped_search.py
index 39ae9459e0..d00fb4434d 100644
--- a/chat_shell/tests/test_knowledge_base_scoped_search.py
+++ b/chat_shell/tests/test_knowledge_base_scoped_search.py
@@ -116,3 +116,61 @@ async def test_arun_preserves_backend_scoped_search_error_message():
         result = json.loads(await tool._arun(query="release checklist"))
 
     assert result["message"].startswith("Document names not found")
+
+
+@pytest.mark.asyncio
+async def test_arun_passes_scoped_filters_without_mutating_tool_defaults():
+    tool = KnowledgeBaseTool(
+        knowledge_base_ids=[1],
+        user_id=7,
+        document_ids=[999],
+        document_names=["default.md"],
+    )
+
+    async def _fake_retrieve(**kwargs):
+        assert kwargs["document_ids"] == [101]
+        assert kwargs["document_names"] == ["release.md"]
+        assert tool.document_ids == [999]
+        assert tool.document_names == ["default.md"]
+        return (
+            "rag_retrieval",
+            {
+                "mode": "rag_retrieval",
+                "records": [],
+                "total": 0,
+                "total_estimated_tokens": 0,
+            },
+        )
+
+    with (
+        patch.object(
+            tool,
+            "_get_kb_info",
+            AsyncMock(
+                return_value={
+                    "items": [
+                        {
+                            "id": 1,
+                            "name": "Test KB",
+                            "rag_enabled": True,
+                            "max_calls_per_conversation": 10,
+                            "exempt_calls_before_check": 5,
+                        }
+                    ]
+                }
+            ),
+        ),
+        patch.object(
+            tool,
+            "_retrieve_with_strategy_from_all_kbs",
+            AsyncMock(side_effect=_fake_retrieve),
+        ),
+    ):
+        await tool._arun(
+            query="release checklist",
+            document_ids=[101],
+            document_names=["release.md"],
+        )
+
+    assert tool.document_ids == [999]
+    assert tool.document_names == ["default.md"]

From dad292820693650a2518f9e33bce1eeec404e04c Mon Sep 17 00:00:00 2001
From: yanhe1 <yanhe1@staff.weibo.com>
Date: Tue, 7 Apr 2026 10:51:32 +0800
Subject: [PATCH 6/7] chore(docs): remove local KB spec from branch

---
 .../2026-04-04-kb-rag-retrieval-design.md     | 229 ------------------
 shared/prompts/knowledge_base.py              |   3 +
 shared/tests/test_prompts.py                  |   5 +
 3 files changed, 8 insertions(+), 229 deletions(-)
 delete mode 100644 docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md

diff --git a/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md b/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md
deleted file mode 100644
index 74f15fb539..0000000000
--- a/docs/superpowers/specs/2026-04-04-kb-rag-retrieval-design.md
+++ /dev/null
@@ -1,229 +0,0 @@
----
-sidebar_position: 1
----
-
-# KB RAG Retrieval Design
-
-## Overview
-
-This design improves knowledge-base retrieval behavior for AI agents without adding new tools, new persistent configuration, or new backend I/O requests.
-
-The current system already provides three complementary KB capabilities:
-
-- `kb_ls` for document-level overview
-- `knowledge_base_search` for fast retrieval
-- `kb_head` for direct content reading
-
-The main gap is not missing infrastructure. The gap is that agents do not have enough guidance or enough scoped-search controls to use these capabilities proactively on large or complex knowledge bases.
-
-This design keeps the current intent-routing model, but changes the routing outcome for KB content questions from a mostly fixed "search first" path to a more flexible "decide the next best retrieval action" path.
-
-## Goals
-
-- Preserve intent routing as the first decision step for KB interactions.
-- Let the agent choose between direct search, document overview, scoped search, and manual reading based on the current question and KB metadata.
-- Allow `knowledge_base_search` to narrow search scope by specific documents.
-- Reuse existing retrieval infrastructure and call-limit behavior.
-- Avoid extra network requests or new long-lived configuration.
-
-## Non-Goals
-
-- No new "smart KB" tool that wraps `kb_ls`, `knowledge_base_search`, and `kb_head`.
-- No backend-enforced size thresholds such as small/medium/large KB routing.
-- No new generic metadata filter object exposed to the model in the first version.
-- No changes to existing call-limit policies.
-
-## Current Problems
-
-1. Prompt guidance over-biases KB question answering toward `knowledge_base_search`, which weakens retrieval quality on larger or more heterogeneous KBs.
-2. `knowledge_base_search` already supports document filtering internally through `document_ids`, but that capability is not exposed to the model as part of the tool schema.
-3. Dynamic KB metadata injected into the conversation does not currently include retrieval-relevant facts such as document count or whether RAG is available.
-4. `kb_ls` is described mainly as a fallback path instead of a proactive range-discovery tool.
-
-## Proposed Behavior
-
-### Intent Routing
-
-Intent routing remains the first step.
-
-For KB-related requests, the model should still first classify whether the user is asking about:
-
-- KB metadata or selection
-- KB document overview
-- KB content question
-- KB management operations
-- Manual content reading in no-RAG situations
-
-The change is in the next-step guidance for KB content questions:
-
-- If the question is precise and direct retrieval is likely enough, the model may call `knowledge_base_search` immediately.
-- If the question is broad, ambiguous, or likely to depend on document structure or document choice, the model should prefer `kb_ls` first.
-- After `kb_ls`, the model may call `knowledge_base_search` again with scoped document arguments.
-- If RAG is unavailable, the model should use `kb_ls` and `kb_head`.
-
-### Dynamic KB Metadata
-
-The request-scoped KB metadata block should remain lightweight and objective.
-
-Each KB entry should include:
-
-- `kb_name`
-- `kb_id`
-- `document_count`
-- `rag_enabled`
-- existing optional summary/topic hints when available
-
-No derived size buckets should be added. Exact values are simpler to maintain and give the model enough signal to decide whether it should first inspect document scope.
-
-Example shape:
-
-```text
-Knowledge Bases In Scope:
-- KB Name: Product Docs, KB ID: 12, Documents: 128, RAG: enabled
-  - Summary: Internal product documentation and runbooks
-  - Topics: release process, deployment, alerts
-```
-
-### Scoped Search
-
-`knowledge_base_search` should support narrowing retrieval to specific documents with:
-
-- `document_ids: number[]`
-- `document_names: string[]`
-
-Expected usage:
-
-- If the model already knows exact document names from the user prompt, previous `kb_ls` output, or conversation history, it may use `document_names` directly.
-- If the model does not know the target documents, it should use `kb_ls` first.
-- If the model has document IDs, it should prefer `document_ids`.
-
-## Interface Design
-
-### Shared Prompt Layer
-
-Update the KB prompt templates in `shared/prompts/knowledge_base.py`.
-
-Key prompt adjustments:
-
-- Keep `Intent Routing (DO THIS FIRST)`.
-- Keep the current strict / relaxed / no-RAG / restricted mode structure.
-- Change KB content-question guidance from effectively "search first" to "decide whether to inspect scope first".
-- Reframe `kb_ls` as a proactive document-overview tool, not only a fallback tool.
-- Document that scoped `knowledge_base_search` can use `document_ids` or `document_names` when the target documents are known.
-
-Restricted mode remains search-only. It does not expose `kb_ls` or `kb_head`.
-
-### Dynamic KB Meta Prompt
-
-Update the backend KB meta formatter so each entry includes `document_count` and `rag_enabled`.
-
-This change should stay formatting-only from the prompt module's perspective. The preprocessing layer remains responsible for assembling the meta list from already available backend state.
-
-### Chat Shell Tool Schema
-
-Extend the `KnowledgeBaseInput` schema used by `knowledge_base_search`:
-
-- add optional `document_ids`
-- add optional `document_names`
-
-The tool continues to call the same backend internal retrieve endpoint. Existing calls with only `query` and `max_results` remain valid and unchanged.
-
-### Backend Internal Retrieve Surface
-
-Extend the internal retrieval request model to accept:
-
-- `document_ids`
-- `document_names`
-
-Resolution rules:
-
-- If `document_ids` is provided, retrieval uses those IDs directly.
-- If `document_names` is provided, backend resolves them to document IDs within the currently scoped KB set.
-- If both are provided, `document_ids` takes precedence and `document_names` is ignored.
-- Name resolution uses exact matching only.
-- If no names resolve, return a structured error instructing the caller to use `kb_ls` first.
-
-The backend should resolve `document_names` into `document_ids` before entering the existing retrieval/filtering flow so that the downstream retrieval path remains unchanged.
-
-## Matching Rules
-
-### `document_names`
-
-- Matching is exact.
-- Matching is scoped to the KBs already attached to the current tool invocation.
-- If multiple KBs contain the same exact document name, all matching documents are included.
-- No fuzzy matching or contains matching is added in the first version.
-
-### Ambiguity
-
-If multiple exact matches exist across KBs, that is acceptable and should not be treated as an error. The current system already supports multi-KB retrieval, so the scoped result set can include all exact matches.
-
-## Error Handling
-
-- If `document_names` resolves to no documents, return a structured error and guidance to call `kb_ls`.
-- If `document_ids` or `document_names` is an empty list, treat it as absent input.
-- If RAG is not configured, preserve the current no-RAG response and guide the model toward `kb_ls` and `kb_head`.
-- Restricted mode continues to allow only `knowledge_base_search`, even when scoped arguments are used.
-- Do not silently fall back from failed scoped search to global search. Silent widening would reduce precision and make tool behavior unpredictable.
-
-## Compatibility
-
-- Existing callers that use only `query` and `max_results` keep the current behavior.
-- Existing retrieval logic based on `document_ids -> metadata_condition -> backend storage retrieval` remains the main implementation path.
-- No new tool is introduced.
-- No new persistent KB configuration is introduced.
-- No new call-limit behavior is introduced.
-
-## Implementation Boundaries
-
-### Shared
-
-Responsible for updating the KB prompt instructions only.
-
-### Chat Shell
-
-Responsible for exposing scoped-search inputs to the model and forwarding them to backend retrieval.
-
-### Backend
-
-Responsible for:
-
-- enriching KB metadata with `document_count` and `rag_enabled`
-- resolving `document_names` to `document_ids`
-- keeping retrieval execution on the existing filtered retrieval path
-
-## Testing
-
-### Shared
-
-- Update prompt tests to reflect the new KB guidance wording.
-- Add or update tests for KB meta prompt formatting with `document_count` and `rag_enabled`.
-
-### Chat Shell
-
-- Verify `knowledge_base_search` schema includes `document_ids`.
-- Verify `knowledge_base_search` schema includes `document_names`.
-- Verify scoped arguments are forwarded in HTTP mode.
-- Verify legacy unscoped usage remains unchanged.
-
-### Backend
-
-- Verify internal retrieve accepts `document_names`.
-- Verify exact-match name resolution within KB scope.
-- Verify unresolved names return the expected structured error.
-- Verify multiple exact matches across KBs are included.
-- Verify direct `document_ids` filtering still works.
-- Verify the downstream retrieval path still uses document-based metadata filtering.
-
-## Rollout Notes
-
-This design is intentionally incremental.
-
-The first version should stop at:
-
-1. prompt guidance updates
-2. dynamic metadata enrichment
-3. scoped search support through `document_ids` and `document_names`
-4. backend exact-match document-name resolution
-
-Do not expand the first version into generic filter support, fuzzy matching, or backend hard-coded KB-size routing.
diff --git a/shared/prompts/knowledge_base.py b/shared/prompts/knowledge_base.py
index d51c0c1a37..f564c0aca2 100644
--- a/shared/prompts/knowledge_base.py
+++ b/shared/prompts/knowledge_base.py
@@ -38,6 +38,7 @@
   - If the user already named the target file(s), prefer `knowledge_base_search` with exact `document_names`.
   - If you already know exact document IDs, prefer `knowledge_base_search` with `document_ids`.
   - If the file names are unknown and scoping would help, call `kb_ls` first, then use scoped `knowledge_base_search` or `kb_head`.
+  - If the selected KBs include spreadsheet files, treat `knowledge_base_search` as lower-confidence for exact table details and prefer `kb_ls` plus targeted `kb_head` verification when precision matters.
   - If the question is broad and no file-level scope is needed, call `knowledge_base_search` globally.
 
 D) **Knowledge base management** (optional, only if tools exist)
@@ -97,6 +98,7 @@
   - Use scoped `knowledge_base_search(document_names=[...])` when the user already knows exact file names.
   - Use scoped `knowledge_base_search(document_ids=[...])` when exact document IDs are already known.
   - If you need exact file names or want to narrow the scope before searching, call `kb_ls` first, then use scoped search or `kb_head`.
+  - If the selected KBs include spreadsheet files, treat `knowledge_base_search` as lower-confidence for exact table details and prefer `kb_ls` plus targeted `kb_head` verification when precision matters.
   - If results are relevant: answer using KB content and cite sources.
   - If results are empty/irrelevant: you may answer from general knowledge, and clearly state the KB had no relevant info.
   - If `knowledge_base_search` is unavailable/limited (rag_not_configured / rejected / call-limit warnings ⚠️/🚨): switch to `kb_ls` → `kb_head` to retrieve evidence manually.
@@ -110,6 +112,7 @@
 - If the KB has no relevant content, say so and answer from general knowledge
 - For "what's in the KB" questions, `kb_ls` is usually higher-signal than `knowledge_base_search`
 - `document_names` matching is exact, so use `kb_ls` first when you do not know the exact file names
+- If spreadsheet files are present, verify precision-sensitive table details with `kb_head` instead of relying on `knowledge_base_search` alone
 </knowledge_base>
 """
 
diff --git a/shared/tests/test_prompts.py b/shared/tests/test_prompts.py
index 201076dcad..d1dde5711e 100644
--- a/shared/tests/test_prompts.py
+++ b/shared/tests/test_prompts.py
@@ -48,12 +48,15 @@ def test_kb_prompt_strict_contains_required_content(self):
         assert "document_ids" in KB_PROMPT_STRICT
         assert "kb_ls" in KB_PROMPT_STRICT
         assert "kb_head" in KB_PROMPT_STRICT
+        assert "spreadsheet files" in strict_lower
+        assert "lower-confidence" in strict_lower
 
     def test_kb_prompt_relaxed_contains_required_content(self):
         """KB_PROMPT_RELAXED should contain relaxed mode instructions."""
         from shared.prompts import KB_PROMPT_RELAXED
 
         # Check for key phrases in relaxed mode
+        relaxed_lower = KB_PROMPT_RELAXED.lower()
         assert "knowledge_base_search" in KB_PROMPT_RELAXED
         # Relaxed mode must allow fallback to general knowledge
         assert "general knowledge" in KB_PROMPT_RELAXED
@@ -62,6 +65,8 @@ def test_kb_prompt_relaxed_contains_required_content(self):
         assert "document_names" in KB_PROMPT_RELAXED
         assert "document_ids" in KB_PROMPT_RELAXED
         assert "kb_ls" in KB_PROMPT_RELAXED
+        assert "spreadsheet files" in relaxed_lower
+        assert "kb_head" in relaxed_lower
 
     def test_prompts_are_different(self):
         """Strict and relaxed prompts should be different."""

From 9ac8342bbb693033620670154134deda713da19b Mon Sep 17 00:00:00 2001
From: yanhe1 <yanhe1@staff.weibo.com>
Date: Tue, 7 Apr 2026 11:41:08 +0800
Subject: [PATCH 7/7] fix(knowledge): refine spreadsheet retrieval guidance

---
 .../services/knowledge/knowledge_service.py   |  7 ++-
 .../knowledge/test_knowledge_service.py       | 51 +++++++++++++++++++
 shared/prompts/knowledge_base.py              |  6 +--
 shared/tests/test_prompts.py                  |  3 +-
 4 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/backend/app/services/knowledge/knowledge_service.py b/backend/app/services/knowledge/knowledge_service.py
index 3a298543dd..5715d2d8b8 100644
--- a/backend/app/services/knowledge/knowledge_service.py
+++ b/backend/app/services/knowledge/knowledge_service.py
@@ -866,8 +866,11 @@ def get_document_prompt_stats(
                 func.sum(
                     case(
                         (
-                            func.lower(KnowledgeDocument.file_extension).in_(
-                                spreadsheet_exts
+                            and_(
+                                KnowledgeDocument.is_active == True,
+                                func.lower(KnowledgeDocument.file_extension).in_(
+                                    spreadsheet_exts
+                                ),
                             ),
                             1,
                         ),
diff --git a/backend/tests/services/knowledge/test_knowledge_service.py b/backend/tests/services/knowledge/test_knowledge_service.py
index 40e980b5b9..e5924acf21 100644
--- a/backend/tests/services/knowledge/test_knowledge_service.py
+++ b/backend/tests/services/knowledge/test_knowledge_service.py
@@ -111,3 +111,54 @@ def test_resolve_document_ids_by_names_returns_only_active_docs_in_scope(
         )
 
         assert resolved_ids == [active_doc.id]
+
+
+@pytest.mark.unit
+class TestKnowledgeServiceGetDocumentPromptStats:
+    def test_get_document_prompt_stats_counts_only_active_spreadsheets(
+        self, test_db, test_user
+    ) -> None:
+        """Spreadsheet count should reflect searchable spreadsheet documents only."""
+        active_csv = KnowledgeDocument(
+            kind_id=10,
+            attachment_id=0,
+            name="active-report.csv",
+            file_extension="csv",
+            file_size=10,
+            user_id=test_user.id,
+            is_active=True,
+            source_type="file",
+        )
+        inactive_xlsx = KnowledgeDocument(
+            kind_id=10,
+            attachment_id=0,
+            name="inactive-report.xlsx",
+            file_extension="xlsx",
+            file_size=10,
+            user_id=test_user.id,
+            is_active=False,
+            source_type="file",
+        )
+        active_markdown = KnowledgeDocument(
+            kind_id=10,
+            attachment_id=0,
+            name="guide.md",
+            file_extension="md",
+            file_size=10,
+            user_id=test_user.id,
+            is_active=True,
+            source_type="file",
+        )
+        test_db.add_all([active_csv, inactive_xlsx, active_markdown])
+        test_db.commit()
+
+        stats = KnowledgeService.get_document_prompt_stats(
+            db=test_db,
+            knowledge_base_ids=[10],
+        )
+
+        assert stats[10] == {
+            "total_document_count": 3,
+            "searchable_document_count": 2,
+            "spreadsheet_document_count": 1,
+        }
diff --git a/shared/prompts/knowledge_base.py b/shared/prompts/knowledge_base.py
index f564c0aca2..ebf69a901c 100644
--- a/shared/prompts/knowledge_base.py
+++ b/shared/prompts/knowledge_base.py
@@ -38,7 +38,7 @@
   - If the user already named the target file(s), prefer `knowledge_base_search` with exact `document_names`.
   - If you already know exact document IDs, prefer `knowledge_base_search` with `document_ids`.
   - If the file names are unknown and scoping would help, call `kb_ls` first, then use scoped `knowledge_base_search` or `kb_head`.
-  - If the selected KBs include spreadsheet files, treat `knowledge_base_search` as lower-confidence for exact table details and prefer `kb_ls` plus targeted `kb_head` verification when precision matters.
+  - If spreadsheet files are present, especially larger spreadsheets, `knowledge_base_search` may be less reliable for exact table details. If results seem weak or irrelevant, use `kb_ls` to narrow candidates and `kb_head` to verify.
   - If the question is broad and no file-level scope is needed, call `knowledge_base_search` globally.
 
 D) **Knowledge base management** (optional, only if tools exist)
@@ -98,7 +98,7 @@
   - Use scoped `knowledge_base_search(document_names=[...])` when the user already knows exact file names.
   - Use scoped `knowledge_base_search(document_ids=[...])` when exact document IDs are already known.
   - If you need exact file names or want to narrow the scope before searching, call `kb_ls` first, then use scoped search or `kb_head`.
-  - If the selected KBs include spreadsheet files, treat `knowledge_base_search` as lower-confidence for exact table details and prefer `kb_ls` plus targeted `kb_head` verification when precision matters.
+  - If spreadsheet files are present, especially larger spreadsheets, `knowledge_base_search` may be less reliable for exact table details. If results seem weak or irrelevant, use `kb_ls` to narrow candidates and `kb_head` to verify.
   - If results are relevant: answer using KB content and cite sources.
   - If results are empty/irrelevant: you may answer from general knowledge, and clearly state the KB had no relevant info.
   - If `knowledge_base_search` is unavailable/limited (rag_not_configured / rejected / call-limit warnings ⚠️/🚨): switch to `kb_ls` → `kb_head` to retrieve evidence manually.
@@ -112,7 +112,7 @@
 - If the KB has no relevant content, say so and answer from general knowledge
 - For "what's in the KB" questions, `kb_ls` is usually higher-signal than `knowledge_base_search`
 - `document_names` matching is exact, so use `kb_ls` first when you do not know the exact file names
-- If spreadsheet files are present, verify precision-sensitive table details with `kb_head` instead of relying on `knowledge_base_search` alone
+- If spreadsheet files are present, especially larger spreadsheets, and search results look weak or irrelevant, verify precision-sensitive table details with `kb_head`
 </knowledge_base>
 """
 
diff --git a/shared/tests/test_prompts.py b/shared/tests/test_prompts.py
index d1dde5711e..a64dfc2217 100644
--- a/shared/tests/test_prompts.py
+++ b/shared/tests/test_prompts.py
@@ -49,7 +49,7 @@ def test_kb_prompt_strict_contains_required_content(self):
         assert "kb_ls" in KB_PROMPT_STRICT
         assert "kb_head" in KB_PROMPT_STRICT
         assert "spreadsheet files" in strict_lower
-        assert "lower-confidence" in strict_lower
+        assert "weak or irrelevant" in strict_lower
 
     def test_kb_prompt_relaxed_contains_required_content(self):
         """KB_PROMPT_RELAXED should contain relaxed mode instructions."""
@@ -67,6 +67,7 @@ def test_kb_prompt_relaxed_contains_required_content(self):
         assert "kb_ls" in KB_PROMPT_RELAXED
         assert "spreadsheet files" in relaxed_lower
         assert "kb_head" in relaxed_lower
+        assert "weak or irrelevant" in relaxed_lower
 
     def test_prompts_are_different(self):
         """Strict and relaxed prompts should be different."""