Commit 3067cd8

fix(agent-workspace): recover title hits across scopes
Add document-only planner scope recovery when a scoped query returns no evidence but a title-like knowledge document exists outside the active corpus. Surface the active scope and recovered source in the Knowledge Workspace API status strip, and cover the behavior with backend and frontend regressions.
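
The recovery rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `RequestedScope`, `TitleHit`, `resolveRetrievalScope`, and the `'explicit_request'` value are hypothetical names. Only `planner_scope_recovery` and the scope fields (`workspaceId`, `corpusId`, `sourcePathPrefixes`) come from the commit itself.

```typescript
// Hypothetical shapes for illustration; the real planner types are not shown in this commit.
interface RequestedScope {
  workspaceId?: string;
  corpusId?: string;
  sourcePathPrefixes?: string[];
}

interface TitleHit {
  documentId: string;
  sourcePath: string;
  workspaceId: string;
  corpusId: string;
}

interface ResolvedScope {
  // 'explicit_request' is a placeholder; only 'planner_scope_recovery' appears in the commit.
  scopeSource: 'explicit_request' | 'planner_scope_recovery';
  documentIds: string[];
  recoveredSourcePaths: string[];
}

// True when sourcePath equals the prefix or lies below it as a directory.
function hasPathPrefix(sourcePath: string, prefix: string): boolean {
  const dir = prefix.charAt(prefix.length - 1) === '/' ? prefix : prefix + '/';
  return sourcePath === prefix || sourcePath.slice(0, dir.length) === dir;
}

function isInsideScope(hit: TitleHit, scope: RequestedScope): boolean {
  if (scope.workspaceId && hit.workspaceId !== scope.workspaceId) return false;
  if (scope.corpusId && hit.corpusId !== scope.corpusId) return false;
  const prefixes = scope.sourcePathPrefixes || [];
  return prefixes.length === 0 || prefixes.some((prefix) => hasPathPrefix(hit.sourcePath, prefix));
}

function resolveRetrievalScope(scope: RequestedScope, titleHits: TitleHit[]): ResolvedScope {
  const inside = titleHits.filter((hit) => isInsideScope(hit, scope));
  const outside = titleHits.filter((hit) => !isInsideScope(hit, scope));
  if (inside.length === 0 && outside.length > 0) {
    // No compatible title hit in the requested scope: switch to a document-only scope
    // instead of intersecting the hit with incompatible corpus constraints.
    return {
      scopeSource: 'planner_scope_recovery',
      documentIds: outside.map((hit) => hit.documentId),
      recoveredSourcePaths: outside.map((hit) => hit.sourcePath),
    };
  }
  return {
    scopeSource: 'explicit_request',
    documentIds: inside.map((hit) => hit.documentId),
    recoveredSourcePaths: [],
  };
}
```

In this sketch, a `financial` scope combined with a title hit under `Knowledge_Base/waterglass` yields a document-only scope, while a title hit inside the requested scope leaves the request untouched.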
1 parent c95e6ac commit 3067cd8

10 files changed

Lines changed: 335 additions & 55 deletions

docs/diataxis/en/explanation/development-progress-dashboard.md

Lines changed: 24 additions & 0 deletions
@@ -3,6 +3,30 @@
 This page is the implementation-facing dashboard for the Knowledge Mastery evolution plan.
 It tracks what is already implemented, where the hard gaps remain, and how to verify progress from code and runtime behavior.
 
+## 2026-06-06 Active-Scope Miss Recovery and Document-Augmented RAG Patch
+
+This patch resolves the live "what is water glass?" failure that reproduced while the WebView was already running on `npm run tauri:dev:mini:gpu`.
+
+Runtime probes showed that the current sidecar could answer correctly when called with an explicit `waterglass` scope: it returned one grouped knowledge point, eight citations, and `matchedSpans`.
+The WebView, however, had `folder-select=financial`, `localStorage.nc_last_target=financial`, and `window.__NC_ACTIVE_SOURCE_TARGET.scope.sourcePathPrefixes=["Knowledge_Base/financial"]`.
+The user question was therefore sent as a scoped financial query. The existing planner found the global title-like `water glass` document, but then intersected that document id with the explicit financial workspace/corpus/prefix scope, reducing the retrieval candidate set to zero indexed atoms.
+
+Code-vs-plan reconciliation for this patch:
+
+| Requirement | Current implementation evidence | Progress call |
+|---|---|---|
+| Positive answer when the selected scope misses but the query clearly names another knowledge point | `buildQueryBackendContext()` now distinguishes title hits inside the requested scope from title hits outside it. If an explicit scope has no compatible title hit but a document title/alias hit exists elsewhere, retrieval switches to a document-only `planner_scope_recovery` scope instead of intersecting incompatible corpus constraints. | Implemented |
+| Return results by knowledge point, not duplicated sections | The prior document-level conversation grouping remains intact. The recovery query still returns segment-level evidence internally, then `mergeAgentConversationKnowledgePoints()` groups hits by `documentId` and exposes `matchedSpans` inside the single knowledge-point card. | Implemented |
+| RSE + document augmentation direction | The implementation keeps Relevant Segment Extraction behavior at retrieval time while adding document augmentation at planning time: title-like queries can recover the target document, and section hits inside that document become marked evidence spans rather than duplicated cards. | Operational baseline |
+| User-visible diagnosis of scope behavior | The Knowledge Workspace API status strip now includes the active scope label and, when recovery is used, the recovered source path. This directly exposes cases such as "Scope: financial" plus "Recovered: Knowledge_Base/waterglass/water glass.md". | Implemented |
+| Backward compatibility | Public response fields remain additive. Existing `assistantMessage`, `answer`, `assistantBlocks`, citations, and legacy sync/SSE flows remain supported. `scopeSource` gains a new optional value, `planner_scope_recovery`, without removing existing values. | Preserved |
+
+Verification for this patch:
+
+- Red/green backend regression: `KnowledgeLearningPlatform.test.ts` now covers `financial` active scope plus a `water glass` title-like query recovering the `waterglass` document and returning one grouped knowledge point with multiple matched spans.
+- Red/green frontend regression: `agent_workspace.frontend.test.ts` now covers status-strip scope and recovered-source visibility.
+- Live root-cause evidence: CDP showed the running WebView was scoped to `financial`; direct sidecar probing with `waterglass` scope returned grouped evidence correctly.
+
 ## 2026-06-06 Knowledge Workspace RAG Answering and API Observability Slice
 
 This update closes a practical Knowledge Workspace gap observed while `npm run tauri:dev:mini:gpu` was already running: the live sidecar could retrieve scoped `waterglass` evidence after hydration, but the user-facing answer still used the old "strongest scoped match" template and returned repeated section-level cards from the same knowledge point.
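
The "one knowledge point per document" behavior in the table above can be sketched like this. It is a minimal illustration under stated assumptions, not the body of `mergeAgentConversationKnowledgePoints()`, which this commit does not show. `SegmentHit`, `KnowledgePoint`, and `groupHitsByDocument` are hypothetical names; the fields `documentId`, `sourcePath`, `matchCount`, and `matchedSpans` mirror the ones asserted in the backend regression test.

```typescript
// Hypothetical shapes for illustration only.
interface SegmentHit {
  documentId: string;
  sourcePath: string;
  title: string; // section heading of the matched segment
  score: number;
}

interface KnowledgePoint {
  documentId: string;
  sourcePath: string;
  matchCount: number;
  matchedSpans: { title: string; score: number }[];
}

// Collapse segment-level hits into one card per document, keeping every
// section hit as a matched span inside that card. Cards keep first-seen order.
function groupHitsByDocument(hits: SegmentHit[]): KnowledgePoint[] {
  const byDocument: { [documentId: string]: KnowledgePoint } = Object.create(null);
  const ordered: KnowledgePoint[] = [];
  for (const hit of hits) {
    let point = byDocument[hit.documentId];
    if (!point) {
      point = { documentId: hit.documentId, sourcePath: hit.sourcePath, matchCount: 0, matchedSpans: [] };
      byDocument[hit.documentId] = point;
      ordered.push(point);
    }
    point.matchedSpans.push({ title: hit.title, score: hit.score });
    point.matchCount = point.matchedSpans.length;
  }
  return ordered;
}
```

With two section hits from `Knowledge_Base/waterglass/water glass.md`, this sketch returns a single card whose `matchedSpans` holds both sections, which is the shape the regression test checks for.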

docs/diataxis/zh/explanation/development-progress-dashboard.md

Lines changed: 24 additions & 0 deletions
@@ -3,6 +3,30 @@
 本页是“知识彻底掌握演进方案”的实现侧进度看板。
 它用于回答三件事:哪些能力已落地、哪些关键缺口仍在、如何用代码与运行时证据验证推进结果。
 
+## 2026-06-06 active scope miss recovery 与 document-augmented RAG 修复
+
+本次补丁修复了 WebView 已在 `npm run tauri:dev:mini:gpu` 中运行时复现的 “what is water glass?” 失败。
+
+运行时探针显示:当前 sidecar 如果显式使用 `waterglass` scope 调用,会正确返回 1 个按知识点合并后的结果、8 条引用以及 `matchedSpans`。
+但 WebView 当前状态是 `folder-select=financial`、`localStorage.nc_last_target=financial`,并且 `window.__NC_ACTIVE_SOURCE_TARGET.scope.sourcePathPrefixes=["Knowledge_Base/financial"]`。
+因此用户问题实际上被发送成了 financial 限定范围内的 scoped query。旧 planner 虽然能在全局找到 `water glass` 的 title-like 文档命中,但随后把该 document id 与显式 financial workspace/corpus/prefix scope 做交集,最终把候选集压成 0 个 indexed atoms。
+
+本补丁的代码 / 方案对齐结果:
+
+| 要求 | 当前实现证据 | 进度判断 |
+|---|---|---|
+| 当前 scope 未命中但问题明确指向另一个知识点时仍能正面回答 | `buildQueryBackendContext()` 现在会区分 title hit 是否落在请求 scope 内。如果显式 scope 内没有兼容 title hit,但其他位置存在明确文档标题 / 别名命中,检索会切换到 document-only 的 `planner_scope_recovery` scope,而不是继续相交不兼容的 corpus 约束。 | 已实现 |
+| 按知识点返回,而不是重复返回 section | 之前的 document-level conversation grouping 继续保留。recovery query 内部仍保留 segment-level evidence,然后由 `mergeAgentConversationKnowledgePoints()` 按 `documentId` 合并,并把命中的 section 作为单一知识点卡片内的 `matchedSpans` 展示。 | 已实现 |
+| RSE + document augmentation 推进方向 | 当前实现把 Relevant Segment Extraction 留在检索阶段,同时在 planning 阶段加入 document augmentation:title-like query 可以恢复目标文档,文档内 section 命中会成为标注证据片段,而不是重复卡片。 | Operational baseline |
+| 用户可见 scope 诊断 | Knowledge Workspace API 状态条现在会显示 active scope;如果触发 recovery,还会显示恢复到的 source path。用户可以直接看到类似 “Scope: financial” 与 “Recovered: Knowledge_Base/waterglass/water glass.md” 的状态。 | 已实现 |
+| 向前兼容 | 公共响应字段只做加法。既有 `assistantMessage`、`answer`、`assistantBlocks`、citations、legacy sync/SSE 流程都继续保留。`scopeSource` 仅新增可选值 `planner_scope_recovery`,不删除旧值。 | 已保留 |
+
+本补丁验证:
+
+- Red/green 后端回归:`KnowledgeLearningPlatform.test.ts` 现在覆盖 active scope 为 `financial`、title-like query 为 `water glass` 时恢复到 `waterglass` 文档,并返回 1 个包含多个 matched spans 的合并知识点。
+- Red/green 前端回归:`agent_workspace.frontend.test.ts` 现在覆盖状态条中的 active scope 与 recovered source 可见性。
+- 运行时根因证据:CDP 显示当前 WebView scope 是 `financial`;直接用 `waterglass` scope 探针调用 sidecar 时,后端已能正确返回分组证据。
+
 ## 2026-06-06 知识工作区 RAG 回答与 API 可观测性切片
 
 本次更新修复的是 `npm run tauri:dev:mini:gpu` 已经运行时暴露出的实际知识工作区问题:运行中的 sidecar 在完成 hydration 后已经可以召回 `waterglass` 作用域证据,但用户可见回答仍使用旧的 “strongest scoped match” 模板,并且会把同一知识点文档内的多个 section 命中渲染成重复知识点卡片。
Lines changed: 2 additions & 2 deletions
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:714f60fd700e5df870a16d0bb28c2e5d90a3bc429523c013fba49a69e5aadcb4
-size 77472205
+oid sha256:bd24633a79b1ae330d2e0c3384f75f622d57dcb9f761316791534523fc7d9ea6
+size 77471155

src/agent_workspace.frontend.test.ts

Lines changed: 32 additions & 0 deletions
@@ -3001,6 +3001,15 @@ describe('agent workspace learning-path integration', () => {
     if (!fetchMock) {
       throw new Error('expected fetch mock');
     }
+    (window as any).__NC_ACTIVE_SOURCE_TARGET = {
+      target: 'financial',
+      source: 'test',
+      scope: {
+        workspaceId: 'financial',
+        corpusId: 'financial',
+        sourcePathPrefixes: ['Knowledge_Base/financial'],
+      },
+    };
 
     fetchMock.mockImplementationOnce(async () => createSseResponse([
       {
@@ -3041,6 +3050,27 @@ describe('agent workspace learning-path integration', () => {
               recalledMemoryCount: 0,
               queryEvidenceCoverageRatioPct: 100,
             },
+            trace: {
+              usedScope: {
+                source: 'scoped',
+                workspaceId: null,
+                corpusId: null,
+                documentIds: ['doc_status'],
+                atomIds: [],
+                sourcePathPrefixes: [],
+                languages: [],
+                matchedAtomCount: 1,
+                scopeSource: 'planner_scope_recovery',
+              },
+              retrieval: {
+                retrievalModes: ['keyword', 'planner_scope_recovery'],
+                scopeRecovery: {
+                  reason: 'title_like_document_hit_outside_requested_scope',
+                  recoveredDocumentIds: ['doc_status'],
+                  recoveredSourcePaths: ['Knowledge_Base/waterglass/water glass.md'],
+                },
+              },
+            },
           },
         },
       },
@@ -3058,6 +3088,8 @@ describe('agent workspace learning-path integration', () => {
     expect(statusText).toContain('SSE');
     expect(statusText).toContain('1 knowledge point');
     expect(statusText).toContain('1 citation');
+    expect(statusText).toContain('Scope: financial');
+    expect(statusText).toContain('Recovered: Knowledge_Base/waterglass/water glass.md');
     expect(statusText).toMatch(/\d+ ms/);
   });

src/frontend/agent_workspace.js

Lines changed: 24 additions & 0 deletions
@@ -634,8 +634,19 @@
       ? Math.max(0, Math.round(Number(status.latencyMs)))
       : null;
     const error = String(status && status.error || '').trim();
+    const activeTarget = String(status && status.activeTarget || '').trim();
     const result = status && typeof status.result === 'object' ? status.result : null;
     const summary = result && typeof result.summary === 'object' ? result.summary : {};
+    const trace = result && typeof result.trace === 'object' ? result.trace : {};
+    const retrievalTrace = trace && typeof trace.retrieval === 'object' ? trace.retrieval : {};
+    const scopeRecovery = retrievalTrace && typeof retrievalTrace.scopeRecovery === 'object'
+      ? retrievalTrace.scopeRecovery
+      : null;
+    const recoveredSourcePaths = Array.isArray(scopeRecovery && scopeRecovery.recoveredSourcePaths)
+      ? scopeRecovery.recoveredSourcePaths
+        .map((sourcePath) => String(sourcePath || '').trim())
+        .filter(Boolean)
+      : [];
     const knowledgePointCount = Number.isFinite(Number(summary.returnedKnowledgePoints))
       ? Number(summary.returnedKnowledgePoints)
       : (Array.isArray(result && result.knowledgePoints) ? result.knowledgePoints.length : 0);
@@ -656,9 +667,17 @@
       endpoint,
       transport,
       latencyMs !== null ? `${latencyMs} ms` : '',
+      activeTarget
+        ? translate('agentWorkspace.apiStatus.scope', 'Scope: {scope}', { scope: activeTarget })
+        : '',
       state === 'ok' ? pluralizeApiStatusCount(knowledgePointCount, 'knowledge point', 'knowledge points') : '',
       state === 'ok' ? pluralizeApiStatusCount(citationCount, 'citation', 'citations') : '',
       state === 'ok' ? pluralizeApiStatusCount(memoryCount, 'memory', 'memories') : '',
+      state === 'ok' && recoveredSourcePaths.length > 0
+        ? translate('agentWorkspace.apiStatus.recovered', 'Recovered: {sources}', {
+          sources: recoveredSourcePaths.slice(0, 2).join(', '),
+        })
+        : '',
       error,
     ].filter(Boolean);
     node.setAttribute('data-api-state', state);
@@ -3255,9 +3274,11 @@
     input.value = '';
     appendUserMessage(message);
     const sendStartedAt = Date.now();
+    let requestActiveTarget = '';
     try {
       const userId = getUserId();
       const requestContext = resolveKnowledgeWorkspaceRequestContext();
+      requestActiveTarget = requestContext.activeTarget;
       const requestPayload = {
         userId,
         sessionId: getOrCreateConversationSessionId(userId),
@@ -3271,6 +3292,7 @@
         state: 'pending',
         endpoint: AGENT_CONVERSATION_ENDPOINT,
         transport: 'SSE',
+        activeTarget: requestContext.activeTarget,
       });
       const conversationCall = await requestConversationWithStreamingFallback(requestPayload);
       const result = conversationCall && typeof conversationCall === 'object' && conversationCall.result
@@ -3281,6 +3303,7 @@
         endpoint: AGENT_CONVERSATION_ENDPOINT,
         transport: String(conversationCall && conversationCall.transport || 'SSE'),
         latencyMs: Number(conversationCall && conversationCall.latencyMs),
+        activeTarget: requestContext.activeTarget,
         result,
       });
       const appendedAssistant = await appendAssistantConversationResult(result);
@@ -3307,6 +3330,7 @@
         state: 'error',
         endpoint: AGENT_CONVERSATION_ENDPOINT,
         latencyMs: Date.now() - sendStartedAt,
+        activeTarget: requestActiveTarget,
         error: String(error && error.message || error || 'unknown_error'),
       });
       appendLocalizedAssistantMessage(
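
The defensive trace reading in the status-strip hunks above can be reduced to a small typed sketch. It is an illustration under stated assumptions: `ResultWithTrace`, `readRecoveredSourcePaths`, and `buildScopeStatusParts` are hypothetical names, the labels are hard-coded rather than passed through `translate`, and the `state === 'ok'` guard from the real code is left out for brevity.

```typescript
// Hypothetical narrowing type; only the field names come from the diff above.
interface ResultWithTrace {
  trace?: {
    retrieval?: {
      scopeRecovery?: { recoveredSourcePaths?: unknown } | null;
    } | null;
  } | null;
}

// Read recovered source paths without trusting the payload shape.
function readRecoveredSourcePaths(result: unknown): string[] {
  if (typeof result !== 'object' || result === null) return [];
  const paths = (result as ResultWithTrace).trace?.retrieval?.scopeRecovery?.recoveredSourcePaths;
  if (!Array.isArray(paths)) return [];
  return paths
    .map((sourcePath: unknown) => String(sourcePath || '').trim())
    .filter((sourcePath) => sourcePath.length > 0);
}

// Build only the scope-related status parts; at most two recovered paths are shown,
// matching the slice(0, 2) in the diff.
function buildScopeStatusParts(activeTarget: string, result: unknown): string[] {
  const parts: string[] = [];
  const target = activeTarget.trim();
  if (target) parts.push(`Scope: ${target}`);
  const recovered = readRecoveredSourcePaths(result);
  if (recovered.length > 0) parts.push(`Recovered: ${recovered.slice(0, 2).join(', ')}`);
  return parts;
}
```

Keeping the trace reader separate from the label builder is one possible design; it lets a malformed or missing `trace` degrade to "no recovery shown" rather than an exception in the status strip.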

src/frontend/locales/en.json

Lines changed: 3 additions & 1 deletion
@@ -407,7 +407,9 @@
       "idle": "Idle",
       "pending": "Checking",
       "ok": "Available",
-      "error": "Failed"
+      "error": "Failed",
+      "scope": "Scope: {scope}",
+      "recovered": "Recovered: {sources}"
     },
     "graphFocus": {
       "title": "Knowledge Focus",

src/frontend/locales/zh.json

Lines changed: 3 additions & 1 deletion
@@ -407,7 +407,9 @@
       "idle": "空闲",
       "pending": "检测中",
       "ok": "可用",
-      "error": "失败"
+      "error": "失败",
+      "scope": "范围:{scope}",
+      "recovered": "已扩展:{sources}"
     },
     "graphFocus": {
       "title": "知识聚焦",

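The `{scope}` and `{sources}` placeholders in the two locale files suggest a simple named-placeholder substitution. The project's actual `translate` helper is not part of this commit, so the following is only a sketch under that assumption: the flat `Messages` map and the `interpolateMessage` function are hypothetical, and the real helper may resolve nested keys and differ in signature.

```typescript
// Hypothetical flat message catalog keyed by the dotted path used in the diff.
type Messages = { [key: string]: string };

// Look up a message, fall back to the provided default, and substitute {name} placeholders.
// Placeholders with no matching parameter are left untouched.
function interpolateMessage(
  messages: Messages,
  key: string,
  fallback: string,
  params: { [name: string]: string } = {}
): string {
  const found = Object.prototype.hasOwnProperty.call(messages, key) ? messages[key] : undefined;
  const template = found !== undefined ? found : fallback;
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(params, name) ? params[name] : undefined;
    return value !== undefined ? value : match;
  });
}
```

A call such as `interpolateMessage({}, 'agentWorkspace.apiStatus.scope', 'Scope: {scope}', { scope: 'financial' })` uses the English fallback because the catalog has no entry for the key.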
src/learning/KnowledgeLearningPlatform.test.ts

Lines changed: 59 additions & 0 deletions
@@ -1553,6 +1553,65 @@ describe('KnowledgeLearningPlatform', () => {
     );
   });
 
+  test('agent conversation recovers a title-like knowledge point when the active scope misses another corpus', async () => {
+    await platform.ingestKnowledge({
+      incremental: true,
+      documents: [
+        {
+          documentId: 'doc_financial_scope',
+          sourcePath: 'Knowledge_Base/financial/liquidity.md',
+          language: 'en',
+          workspaceId: 'financial',
+          corpusId: 'financial',
+          content: '# Liquidity\nLiquidity analysis explains cash conversion and working capital timing.',
+        },
+        {
+          documentId: 'doc_water_glass_scope_recovery',
+          sourcePath: 'Knowledge_Base/waterglass/water glass.md',
+          language: 'en',
+          workspaceId: 'waterglass',
+          corpusId: 'waterglass',
+          content: [
+            '# Water Glass',
+            'A water glass is a transparent drinking vessel that contains water for use.',
+            '',
+            '## Material role',
+            'The water glass body provides a boundary between the liquid and the environment.',
+          ].join('\n'),
+        },
+      ],
+    });
+
+    const response = await platform.agentConversation({
+      userId: 'agent_scope_recovery_user',
+      sessionId: 'session_scope_recovery',
+      message: 'what is water glass?',
+      scope: {
+        workspaceId: 'financial',
+        corpusId: 'financial',
+        sourcePathPrefixes: ['Knowledge_Base/financial'],
+      },
+      topK: 8,
+      persistMemory: false,
+    });
+
+    expect(response.answer).toMatch(/^A water glass is/i);
+    expect(response.knowledgePoints).toHaveLength(1);
+    expect(response.summary.returnedKnowledgePoints).toBe(1);
+    expect(response.summary.returnedCitations).toBeGreaterThanOrEqual(2);
+    expect(response.trace.usedScope.scopeSource).toBe('planner_scope_recovery');
+    expect(response.trace.retrieval.retrievalModes).toContain('planner_scope_recovery');
+    expect(response.trace.planner?.titleHitDocumentIds).toContain('doc_water_glass_scope_recovery');
+
+    const recoveredPoint = response.knowledgePoints[0] as any;
+    expect(recoveredPoint.documentId).toBe('doc_water_glass_scope_recovery');
+    expect(recoveredPoint.sourcePath).toBe('Knowledge_Base/waterglass/water glass.md');
+    expect(recoveredPoint.matchCount).toBeGreaterThanOrEqual(2);
+    expect(recoveredPoint.matchedSpans.map((span: any) => span.title)).toEqual(
+      expect.arrayContaining(['Water Glass', 'Material role'])
+    );
+  });
+
   test('agent conversation explanation and next actions adapt to comparison-style queries', async () => {
     await platform.ingestKnowledge({
       incremental: true,
