docs(roadmap): solidify knowledge mastery evolution plan and docs navigation

Jacobinwwey · Jacobinwwey · commit 1e6d8f9e9543 · 2026-03-31T21:16:55.000+08:00
diff --git a/docs/diataxis-map.json b/docs/diataxis-map.json
@@ -145,6 +145,18 @@
         "canonical": ["docs/zh/Interface Document.md", "docs/zh/User_Manual.md"],
         "diataxis": "docs/diataxis/zh/explanation/startup-node-update-acceleration-plan.md"
       }
+    },
+    {
+      "id": "knowledge-mastery-evolution-roadmap",
+      "category": "explanation",
+      "en": {
+        "canonical": ["docs/en/knowledge_mastery_evolution_plan.md"],
+        "diataxis": "docs/diataxis/en/explanation/knowledge-mastery-evolution-roadmap.md"
+      },
+      "zh": {
+        "canonical": ["docs/zh/knowledge_mastery_evolution_plan.md"],
+        "diataxis": "docs/diataxis/zh/explanation/knowledge-mastery-evolution-roadmap.md"
+      }
     }
   ]
 }
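A minimal sketch of a consistency check for a map entry like the one added above. The function name `entry_problems` is hypothetical, and the rule that a `diataxis` path starts with `docs/diataxis/<locale>/<category>/` is an assumption inferred from the visible entry, not a documented schema.

```python
def entry_problems(entry: dict, locales=("en", "zh")) -> list[str]:
    """Return human-readable problems found in one diataxis-map entry."""
    problems = []
    for key in ("id", "category"):
        if not entry.get(key):
            problems.append(f"missing {key}")
    for locale in locales:
        section = entry.get(locale)
        if not isinstance(section, dict):
            problems.append(f"missing locale {locale}")
            continue
        if not section.get("canonical"):
            problems.append(f"{locale}: canonical list is empty")
        # Assumed convention: docs/diataxis/<locale>/<category>/<page>.md
        prefix = f"docs/diataxis/{locale}/{entry.get('category')}/"
        if not str(section.get("diataxis", "")).startswith(prefix):
            problems.append(f"{locale}: diataxis path should start with {prefix}")
    return problems


# The entry added in this commit, copied from the diff.
entry = {
    "id": "knowledge-mastery-evolution-roadmap",
    "category": "explanation",
    "en": {
        "canonical": ["docs/en/knowledge_mastery_evolution_plan.md"],
        "diataxis": "docs/diataxis/en/explanation/knowledge-mastery-evolution-roadmap.md",
    },
    "zh": {
        "canonical": ["docs/zh/knowledge_mastery_evolution_plan.md"],
        "diataxis": "docs/diataxis/zh/explanation/knowledge-mastery-evolution-roadmap.md",
    },
}
print(entry_problems(entry))
```

A check of this kind could run in CI so that a new page cannot be registered for one locale only.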
diff --git a/docs/diataxis/en/explanation/knowledge-mastery-evolution-roadmap.md b/docs/diataxis/en/explanation/knowledge-mastery-evolution-roadmap.md
@@ -0,0 +1,27 @@
+# Explanation: Knowledge Mastery Evolution Roadmap
+
+This page explains the strategic shift from a pure knowledge-visualization product to a local-first, verifiable learning system.
+
+## Why This Roadmap Exists
+
+- Visualization alone cannot guarantee user mastery.
+- LLM assistance without evidence and memory governance can produce low-trust guidance.
+- Long-term learning outcomes require structured atoms, explainable retrieval, and mastery-state updates.
+
+## Strategic Direction
+
+1. Keep local-first as the default architecture.
+2. Introduce graph-backed explainable retrieval with temporal validity.
+3. Build a dual-core learning loop:
+   - mastery closure loop
+   - divergence exploration loop
+4. Add pluggable local/cloud LLM tutor actions under evidence-first guardrails.
+
+## Canonical Plan Source
+
+- [docs/en/knowledge_mastery_evolution_plan.md](../../../en/knowledge_mastery_evolution_plan.md)
+
+## Related Explanation Sources
+
+- [Architecture and Migration](./architecture-and-migration.md)
+- [Startup Node Update Acceleration Plan](./startup-node-update-acceleration-plan.md)
diff --git a/docs/diataxis/zh/explanation/knowledge-mastery-evolution-roadmap.md b/docs/diataxis/zh/explanation/knowledge-mastery-evolution-roadmap.md
@@ -0,0 +1,27 @@
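+# Explanation: Thorough Knowledge Mastery Evolution Roadmap
+
+This page explains why the project is evolving from "knowledge visualization" into a "local-first system with verifiable learning outcomes".
+
+## Why This Roadmap Is Needed
+
+- Visualization alone cannot directly guarantee user mastery.
+- LLM assistance that lacks an evidence chain and memory governance struggles to produce high-trust learning feedback.
+- Long-term learning outcomes require atomized knowledge modeling, explainable retrieval, and a closed loop of mastery-state updates.
+
+## Strategic Direction
+
+1. Continue to uphold the local-first architecture.
+2. Introduce explainable retrieval and temporal validity backed by a graph database.
+3. Build a dual-core learning loop:
+   - mastery closure loop
+   - divergence exploration loop
+4. Provide LLM tutor actions with pluggable local/cloud models under evidence-first guardrails.
+
+## Canonical Plan Source
+
+- [docs/zh/knowledge_mastery_evolution_plan.md](../../../zh/knowledge_mastery_evolution_plan.md)
+
+## Related Explanation Documents
+
+- [Architecture and Migration](./architecture-and-migration.md)
+- [Startup Node Update Acceleration Plan](./startup-node-update-acceleration-plan.md)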
diff --git a/docs/en/knowledge_mastery_evolution_plan.md b/docs/en/knowledge_mastery_evolution_plan.md
@@ -0,0 +1,200 @@
+# 2026-03-31 v1.7.0 - Knowledge Mastery Evolution Plan
+
+## Purpose
+
+This plan solidifies the next-stage evolution of NoteConnection from a "knowledge visualization system" into a local-first "knowledge parsing + mastery loop + divergence thinking + pluggable LLM tutor" platform.
+
+This document is implementation-facing and decision-complete for the next 6-9 months.
+
+## Locked Product Decisions
+
+1. Deployment priority: local-first and privacy-preserving by default.
+2. Learning objective: a dual-core strategy of mastery closure plus divergence thinking.
+3. LLM strategy: pluggable adapter for both local and cloud models.
+4. Graph backbone: introduce a local graph database as the advanced engine.
+5. Delivery cadence: three phases in 6-9 months.
+6. Primary success metric: mastery improvement.
+
+## Core Terms
+
+1. Knowledge Atom: smallest independently assessable unit of knowledge.
+2. Evidence Span: traceable source segment backing a knowledge atom.
+3. Relation Edge: prerequisite/analogy/contrast/causal/application relationship between atoms.
+4. Temporal Evolution: versioned state change and validity window of atoms and relations.
+5. Mastery State: estimated probability, inferred from observable learner evidence, that the user has mastered a given atom.
+6. Divergence Graph: graph of cross-domain expansion around a current topic.
+7. Learning Action: executable next step (quiz, explanation, analysis, reflection, transfer task).
+
+## Layered Architecture and Contracts
+
+### L0 Representation Layer
+
+- Parse Markdown, code blocks, formulas, and Mermaid blocks into `KnowledgeAtom + EvidenceSpan`.
+- Every atom must keep source provenance for explainable retrieval.
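A minimal sketch of the provenance requirement, assuming a deliberately naive parser that treats each blank-line-separated paragraph as one atom. The field names and `parse_markdown` are hypothetical; the plan names the types `KnowledgeAtom` and `EvidenceSpan` but does not define their fields, and a real parser would also need to handle code fences, formulas, and Mermaid blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSpan:
    source_path: str  # file the text came from
    start_line: int   # 1-indexed, inclusive
    end_line: int     # 1-indexed, inclusive


@dataclass(frozen=True)
class KnowledgeAtom:
    atom_id: str
    text: str
    evidence: EvidenceSpan


def parse_markdown(source_path: str, markdown: str) -> list[KnowledgeAtom]:
    """Turn each paragraph into one atom that remembers its source lines."""
    atoms = []
    block, start = [], 0
    # The trailing "" guarantees the last paragraph is flushed.
    for number, line in enumerate(markdown.splitlines() + [""], start=1):
        if line.strip():
            if not block:
                start = number
            block.append(line)
        elif block:
            span = EvidenceSpan(source_path, start, number - 1)
            atoms.append(KnowledgeAtom(f"{source_path}#L{start}", "\n".join(block), span))
            block = []
    return atoms


atoms = parse_markdown("notes/graph.md", "# Graphs\n\nA graph has nodes.\nAnd edges.\n")
for atom in atoms:
    print(atom.atom_id, atom.evidence.start_line, atom.evidence.end_line)
```

Because every atom carries its span, a later retrieval step can cite the exact source lines instead of only the file.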
+
+### L1 Structure Layer
+
+- Build static graph, process graph, and temporal graph from atoms and relation edges.
+- Distinguish fact edges from inferred edges to prevent path hallucination.
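One way to keep inferred edges from leaking into explanations is to exclude them from path search unless the caller opts in. The sketch below assumes a boolean `inferred` flag on each edge; that field and `find_path` are illustrative names, not part of the plan.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationEdge:
    source: str
    target: str
    kind: str       # e.g. "prerequisite", "analogy", "application"
    inferred: bool  # True when proposed by a model rather than stated in a source


def find_path(edges, start, goal, allow_inferred=False):
    """Breadth-first search; returns the edges on a shortest path, or None."""
    adjacency = {}
    for edge in edges:
        if edge.inferred and not allow_inferred:
            continue  # fact-only mode ignores inferred edges entirely
        adjacency.setdefault(edge.source, []).append(edge)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for edge in adjacency.get(node, []):
            if edge.target not in seen:
                seen.add(edge.target)
                queue.append((edge.target, path + [edge]))
    return None


edges = [
    RelationEdge("limits", "derivatives", "prerequisite", inferred=False),
    RelationEdge("derivatives", "gradient descent", "application", inferred=True),
]
print(find_path(edges, "limits", "gradient descent"))
print(find_path(edges, "limits", "gradient descent", allow_inferred=True))
```

When inferred edges are allowed, the returned path still carries the flag on each edge, so the interface can mark which hops are not backed by a source.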
+
+### L2 Retrieval Layer
+
+- Hybrid retrieval: keyword + vector + graph traversal + temporal filtering.
+- Every answer must include evidence, relation path, and temporal validity.
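A sketch of the evidence-first contract with temporal filtering, assuming the keyword, vector, and graph scores have already been fused into a single `score` elsewhere. The `Candidate` type, its fields, and the half-open validity window are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    atom_id: str
    score: float                    # fused retrieval score, computed elsewhere
    evidence: tuple[str, ...]       # source references backing the atom
    relation_path: tuple[str, ...]  # atoms traversed to reach this one
    valid_from: date
    valid_to: Optional[date]        # None means still valid


def is_valid(candidate: Candidate, as_of: date) -> bool:
    """Validity window is [valid_from, valid_to)."""
    ended = candidate.valid_to is not None and as_of >= candidate.valid_to
    return candidate.valid_from <= as_of and not ended


def answer(candidates, as_of: date, limit: int = 3):
    """Drop candidates without evidence or outside their window, then rank."""
    kept = [c for c in candidates if c.evidence and is_valid(c, as_of)]
    return sorted(kept, key=lambda c: c.score, reverse=True)[:limit]


candidates = [
    Candidate("a", 0.90, ("notes.md#L3",), ("root", "a"), date(2025, 1, 1), None),
    Candidate("b", 0.95, (), ("root", "b"), date(2025, 1, 1), None),
    Candidate("c", 0.80, ("old.md#L1",), ("root", "c"), date(2024, 1, 1), date(2025, 6, 1)),
]
print([c.atom_id for c in answer(candidates, date(2026, 1, 1))])
```

Note that candidate `b` is dropped despite having the highest score, because it has no evidence attached.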
+
+### L3 Learning Layer
+
+- Decide next learning action using `MasteryState + DivergenceGraph`.
+- Do not consume raw black-box LLM output without evidence binding.
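A minimal sketch of an action policy driven by mastery estimates and divergence links. The 0.8 threshold, the action labels, and the rule "quiz the weakest atom whose prerequisites are mastered" are all illustrative assumptions; the plan does not specify a policy.

```python
def next_action(mastery, prerequisites, divergence, threshold=0.8):
    """Pick one learning action.

    mastery:       atom -> estimated mastery probability
    prerequisites: atom -> atoms that should be mastered first
    divergence:    atom -> related topics outside the current set
    """
    ready = [
        atom
        for atom, p in mastery.items()
        if p < threshold
        and all(mastery.get(pre, 0.0) >= threshold for pre in prerequisites.get(atom, []))
    ]
    if ready:
        # Mastery closure: work on the weakest atom the learner is ready for.
        return ("quiz", min(ready, key=lambda atom: mastery[atom]))
    for atom, p in mastery.items():
        if p >= threshold:
            for topic in divergence.get(atom, []):
                if topic not in mastery:
                    # Divergence: branch out from something already mastered.
                    return ("transfer task", topic)
    return ("reflection", None)


mastery = {"limits": 0.9, "derivatives": 0.4, "integrals": 0.2}
prerequisites = {"derivatives": ["limits"], "integrals": ["derivatives"]}
print(next_action(mastery, prerequisites, divergence={}))
```

Here `integrals` has the lowest mastery but is skipped because its prerequisite `derivatives` is not yet mastered.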
+
+### L4 Interaction Layer
+
+- Provide learning workspace + tutor action APIs + evaluation feedback loop.
+- Support local model and cloud model through one adapter contract.
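A sketch of what a single adapter contract could look like, using a structural `Protocol` so local and cloud back ends need no shared base class. `TutorModel`, `ask_tutor`, and both stub classes are hypothetical; neither stub talks to a real model or network.

```python
from typing import Protocol


class TutorModel(Protocol):
    name: str

    def complete(self, prompt: str) -> str: ...


class StubLocalModel:
    """Stand-in for a local model; a real adapter would call a local runtime."""

    name = "local-stub"

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


class StubCloudModel:
    """Stand-in for a cloud model; a real adapter would call a remote API."""

    name = "cloud-stub"

    def complete(self, prompt: str) -> str:
        return f"[cloud] {prompt}"


def ask_tutor(model: TutorModel, question: str, evidence: list[str]) -> str:
    """Evidence-first guardrail: never call a model without source evidence."""
    if not evidence:
        raise ValueError("refusing to call the model without evidence")
    prompt = question + "\nEvidence:\n" + "\n".join(evidence)
    return model.complete(prompt)


for model in (StubLocalModel(), StubCloudModel()):
    print(model.name, "->", ask_tutor(model, "What is a limit?", ["calculus.md#L10"]))
```

The caller code is identical for both back ends, which is the property the single-contract requirement is after.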
+
+### L5 Governance Layer
+
+- Enforce freshness checks, API contracts, rollback switches, quality gates, and privacy boundaries.
+- Gate the full chain from L0 through L4.
+
+## External Strategy Absorption
+
+1. Fast-GraphRAG: absorb stateful insertion/query and high-speed local retrieval pipeline.
+2. LightRAG: absorb dual-level retrieval and incremental update orientation.
+3. Graphiti: absorb temporal knowledge graph concepts for evolving context.
+4. Neo4j GraphRAG: absorb graph-driven explainable retrieval patterns and tool contract discipline.
+5. MemOS: absorb layered memory policy (session/unit/long-term) and memory scheduling.
+6. GitNexus: absorb process-context, staleness discipline, and agent-consumable interface patterns.
+
+### Explicit Non-goals for v1
+
+1. No cloud-first multi-tenant architecture.
+2. No deep distributed complexity at v1.
+3. No direct one-to-one reuse of code intelligence patterns as learning intelligence.
+
+## 3-Phase Delivery Blueprint (6-9 Months)
+
+### Phase 1 (Weeks 1-8): Deep Parsing + Graph Backbone
+
+1. Build unified parser pipeline for `KnowledgeAtom + EvidenceSpan`.
+2. Introduce a local graph database as the advanced engine while keeping the lightweight path for compatibility.
+3. Implement temporal model for atom/relation versioning and validity.
+4. Add staleness detection by source hash binding.
+5. Deliverables:
+   - Incremental rebuildable knowledge graph service.
+   - Evidence-traceable query interface.
+   - Temporal validity annotations.
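Staleness detection by source hash binding (step 4) can be sketched as follows, assuming each atom records the SHA-256 of its source text at ingest time. The function names and the shape of the bookkeeping dictionaries are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_atoms(atom_bindings: dict, current_sources: dict) -> list[str]:
    """Return ids of atoms whose source changed or disappeared.

    atom_bindings:   atom_id -> (source_path, hash recorded at ingest)
    current_sources: source_path -> current text
    """
    stale = []
    for atom_id, (path, recorded_hash) in atom_bindings.items():
        text = current_sources.get(path)
        if text is None or content_hash(text) != recorded_hash:
            stale.append(atom_id)
    return sorted(stale)


bindings = {
    "atom-1": ("a.md", content_hash("Limits come first.")),
    "atom-2": ("b.md", content_hash("Derivatives follow.")),
    "atom-3": ("c.md", content_hash("Removed later.")),
}
sources_now = {"a.md": "Limits come first.", "b.md": "Derivatives follow limits."}
print(stale_atoms(bindings, sources_now))
```

Only the stale atoms need re-parsing, which is what makes the knowledge graph incrementally rebuildable.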
+
+### Phase 2 (Weeks 9-16): Mastery Loop + Divergence Engine
+
+1. Introduce `LearnerConceptState` per atom (mastery, error tags, retest outcomes).
+2. Build mastery closure loop: diagnose -> classify errors -> personalized practice -> retest update.
+3. Build divergence engine for same-level expansion, cross-level transfer, and counter-example exploration.
+4. Support dual output paths: `MasteryPath[]` and `DivergencePath[]`.
+5. Deliverables:
+   - Learning path orchestrator.
+   - Error taxonomy knowledge base.
+   - Dual-core learning panel.
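The retest-update step of the mastery closure loop could be modeled in many ways. The sketch below swaps in Bayesian Knowledge Tracing, which the plan does not name, purely as an illustration. The `LearnerConceptState` fields follow the plan's description (mastery, error tags, retest outcomes), while the slip, guess, and learn parameters are assumed values.

```python
from dataclasses import dataclass, field


@dataclass
class LearnerConceptState:
    atom_id: str
    mastery: float  # estimated probability of mastery
    error_tags: list = field(default_factory=list)
    retest_outcomes: list = field(default_factory=list)


def bkt_update(p_mastery, correct, slip=0.1, guess=0.2, learn=0.1):
    """One Bayesian Knowledge Tracing step: posterior given the answer, then learning."""
    if correct:
        evidence = p_mastery * (1 - slip)
        posterior = evidence / (evidence + (1 - p_mastery) * guess)
    else:
        evidence = p_mastery * slip
        posterior = evidence / (evidence + (1 - p_mastery) * (1 - guess))
    return posterior + (1 - posterior) * learn


def record_retest(state, correct, error_tag=None):
    """Diagnose -> classify error -> update mastery after a retest."""
    state.retest_outcomes.append(correct)
    if not correct and error_tag:
        state.error_tags.append(error_tag)
    state.mastery = bkt_update(state.mastery, correct)
    return state


state = LearnerConceptState("derivatives", mastery=0.5)
record_retest(state, correct=False, error_tag="sign error")
print(round(state.mastery, 3))
record_retest(state, correct=True)
print(state.mastery > 0.2)
```

A wrong answer lowers the estimate and records an error tag that the personalized-practice step can target; a correct retest raises it again.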
+
+### Phase 3 (Weeks 17-36): Pluggable LLM Tutor + Memory OS
+
+1. Build unified LLM adapter for local and cloud providers.
+2. Implement tutor actions: quiz generation, probing questions, answer analysis, misconception diagnosis, transfer-task generation, recap synthesis.
+3. Implement layered memory: session memory, unit memory, long-term mastery memory.
+4. Add safety guardrails: evidence-first responses, source traceability, confidence-based downgrade.
+5. Deliverables:
+   - LLM tutor orchestration layer.
+   - Memory policy engine.
+   - Learning quality dashboard.
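A minimal sketch of the three memory layers from step 3, assuming a bounded session buffer that evicts its oldest entry, per-unit summaries, and a long-term mastery map. `LayeredMemory` and its methods are hypothetical names, and the capacity of 3 is arbitrary.

```python
from collections import deque


class LayeredMemory:
    def __init__(self, session_capacity: int = 3):
        # deque(maxlen=...) silently drops the oldest item when full.
        self.session = deque(maxlen=session_capacity)
        self.unit = {}       # unit_id -> list of summaries
        self.long_term = {}  # atom_id -> latest mastery estimate

    def record_turn(self, text: str) -> None:
        self.session.append(text)

    def close_unit(self, unit_id: str, summary: str) -> None:
        """Promote a summary to unit memory and clear the session buffer."""
        self.unit.setdefault(unit_id, []).append(summary)
        self.session.clear()

    def update_mastery(self, atom_id: str, mastery: float) -> None:
        self.long_term[atom_id] = mastery


memory = LayeredMemory(session_capacity=3)
for turn in ["q1", "a1", "q2", "a2"]:
    memory.record_turn(turn)
print(list(memory.session))
memory.close_unit("calculus-1", "Learner confuses slope with value.")
memory.update_mastery("derivatives", 0.62)
print(memory.unit, memory.long_term)
```

Keeping the layers separate means session chatter can be discarded freely without touching the long-term mastery record.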
+
+## Public Interfaces and Types (Must Implement)
+
+### Public APIs
+
+1. `KnowledgeIngestAPI`
+   - Input: document payload + incremental change metadata.
+   - Output: atom/evidence/relation/temporal metadata.
+2. `KnowledgeQueryAPI`
+   - Unified retrieval entry with evidence-first response contract.
+3. `MasteryDiagnosticsAPI`
+   - Input: learner answer/behavior events.
+   - Output: mastery updates + error labels.
+4. `LearningPathAPI`
+   - Output: prioritized `MasteryPath[]` and `DivergencePath[]`.
+5. `TutorActionAPI`
+   - Unified tutor action contract (ask/analyze/feedback/recap).
+6. `MemoryPolicyAPI`
+   - Session/unit/long-term memory write and eviction policy management.
+
+### New Core Types
+
+- `KnowledgeAtom`
+- `EvidenceSpan`
+- `RelationEdge`
+- `TemporalEdge`
+- `LearnerConceptState`
+- `LearningAction`
+- `TutorTrace`
+
+## Quality Gates and Acceptance
+
+### Core Test Areas
+
+1. Parsing correctness: atom extraction, evidence alignment, relation consistency.
+2. Retrieval trust: evidence traceability, path explainability, temporal validity hit rate.
+3. Learning effectiveness: mastery uplift, misconception recurrence decline, retest pass-rate uplift.
+4. Divergence quality: cross-topic linkage quality, counter-example quality, transfer-task quality.
+5. Performance: p95 query latency and rebuild duration at 10k atom scale.
+6. Privacy/security: no external leakage by default, model-call auditability, boundary enforcement.
+
+### v1.5 Acceptance Thresholds
+
+1. Retest pass-rate uplift >= 20%.
+2. High-frequency misconception recurrence reduction >= 25%.
+3. Evidence-backed learning suggestion ratio >= 90%.
+4. Recommended learning paths are significantly more effective than a random-path baseline.
+5. p95 latency for key interactions stays within the interactive range, and all quality gates stay green.
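The first three thresholds are numeric and can be checked mechanically. The sketch below assumes that "uplift" and "reduction" mean relative change against a baseline measurement; the plan does not say whether the percentages are relative or absolute, and the metric names are hypothetical.

```python
def relative_change(before: float, after: float) -> float:
    """Change relative to the baseline, e.g. 0.50 -> 0.62 gives about 0.24."""
    return (after - before) / before


def check_thresholds(metrics: dict) -> dict:
    """Evaluate acceptance thresholds 1-3; returns threshold name -> pass/fail."""
    uplift = relative_change(metrics["pass_rate_before"], metrics["pass_rate_after"])
    reduction = -relative_change(metrics["recurrence_before"], metrics["recurrence_after"])
    evidence_ratio = metrics["evidence_backed"] / metrics["suggestions"]
    return {
        "retest_pass_rate_uplift": uplift >= 0.20,
        "misconception_recurrence_reduction": reduction >= 0.25,
        "evidence_backed_ratio": evidence_ratio >= 0.90,
    }


metrics = {
    "pass_rate_before": 0.50,
    "pass_rate_after": 0.62,
    "recurrence_before": 0.40,
    "recurrence_after": 0.28,
    "evidence_backed": 93,
    "suggestions": 100,
}
print(check_thresholds(metrics))
```

Thresholds 4 and 5 are stated qualitatively, so they would need a chosen statistical test and a concrete latency budget before they can be automated.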
+
+## First-Principles Explanation
+
+1. Learning is a state-estimation + intervention-control problem, not only a content presentation problem.
+2. Without atomization, mastery cannot be measured or improved robustly.
+3. Without explainable retrieval, feedback loops are not trustworthy.
+4. Without temporal and memory layers, forgetting and transfer cannot be modeled correctly.
+5. Without governance gates, quality drifts and model hallucinations will erode learning reliability.
+
+## Mental Models and Common Pitfalls
+
+### Mental Models
+
+1. State-space loop: `Knowledge State -> Observation -> Update -> Policy`.
+2. Dual-objective optimization: mastery gain and divergence quality under explicit constraints.
+3. Evidence-first orchestration: every tutor action must map to source evidence and relation path.
+4. Layered memory: separate short-term interaction memory from long-term mastery memory.
+5. Controlled evolution: each capability expansion is contract-tested and gate-verified.
+
+### Common Pitfalls
+
+1. Optimizing vector recall only without graph and evidence chains.
+2. Treating raw LLM output as ground truth.
+3. Recommending paths without updating mastery state and retest loops.
+4. Pursuing full-modality scope too early, which increases architecture risk.
+5. Ignoring local privacy and auditability until late in development.
+
+## 5-Point Summary
+
+1. The shift toward a verifiable learning system is feasible and strategically sound.
+2. A local-first architecture with a graph-database backbone is foundational to the long-term capability ceiling.
+3. Dual-core value requires mastery loop and divergence engine to be implemented together.
+4. Pluggable LLM works only when built on evidence-first retrieval and layered memory.
+5. A 6-9 month phased roadmap can generate measurable value while controlling risk.
+
+## References
+
+1. GitNexus: <https://github.com/abhigyanpatwari/GitNexus>
+2. Fast-GraphRAG: <https://github.com/circlemind-ai/fast-graphrag>
+3. LightRAG: <https://github.com/HKUDS/LightRAG>
+4. Graphiti: <https://github.com/getzep/graphiti>
+5. Neo4j GraphRAG Python: <https://github.com/neo4j/neo4j-graphrag-python>
+6. MemOS: <https://github.com/MemTensor/MemOS>
+7. Neo4j GraphRAG Docs: <https://neo4j.com/docs/neo4j-graphrag-python/current/>
diff --git a/docs/index.md b/docs/index.md
@@ -15,6 +15,7 @@ This site adopts the Diataxis framework to make documentation easier to navigate
 - Use [app_config.toml Schema](diataxis/en/reference/app-config-schema.md) for exact config keys/defaults/effects.
 - Use [Explanation](diataxis/en/explanation/architecture-and-migration.md) for architecture decisions.
 - Use [Startup Node Update Acceleration Plan](diataxis/en/explanation/startup-node-update-acceleration-plan.md) for phased performance rollout.
+- Use [Knowledge Mastery Evolution Roadmap](diataxis/en/explanation/knowledge-mastery-evolution-roadmap.md) for the next-stage learning-system strategy.
 
 ## Chinese (中文)
 
@@ -29,3 +30,4 @@ This site adopts the Diataxis framework to make documentation easier to navigate
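 - For config keys, defaults, and effects, see [app_config.toml Schema](diataxis/zh/reference/app-config-schema.md).
 - To understand architecture decisions, see [Explanation](diataxis/zh/explanation/architecture-and-migration.md).
 - For the phased startup-performance rollout, see [Startup Node Update Acceleration Plan](diataxis/zh/explanation/startup-node-update-acceleration-plan.md).
+- For the next-stage learning-system strategy, see [Thorough Knowledge Mastery Evolution Roadmap](diataxis/zh/explanation/knowledge-mastery-evolution-roadmap.md).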
diff --git a/docs/zh/knowledge_mastery_evolution_plan.md b/docs/zh/knowledge_mastery_evolution_plan.md
diff --git a/mkdocs.yml b/mkdocs.yml
