| 1 | +# 2026-03-31 v1.7.0 - Knowledge Mastery Evolution Plan |
| 2 | + |
| 3 | +## Purpose |
| 4 | + |
| 5 | +This plan solidifies the next-stage evolution of NoteConnection from a "knowledge visualization system" into a local-first "knowledge parsing + mastery loop + divergence thinking + pluggable LLM tutor" platform. |
| 6 | + |
| 7 | +This document is implementation-facing and decision-complete for the next 6-9 months. |
| 8 | + |
| 9 | +## Locked Product Decisions |
| 10 | + |
| 11 | +1. Deployment priority: local-first and privacy-preserving by default. |
| 12 | +2. Learning objective: dual-core strategy, mastery closure + divergence thinking. |
| 13 | +3. LLM strategy: pluggable adapter for both local and cloud models. |
| 14 | +4. Graph backbone: introduce a local graph database as the advanced engine. |
| 15 | +5. Delivery cadence: three phases in 6-9 months. |
| 16 | +6. Primary success metric: mastery improvement. |
| 17 | + |
| 18 | +## Core Terms |
| 19 | + |
| 20 | +1. Knowledge Atom: smallest independently assessable unit of knowledge. |
| 21 | +2. Evidence Span: traceable source segment backing a knowledge atom. |
| 22 | +3. Relation Edge: prerequisite/analogy/contrast/causal/application relationship between atoms. |
| 23 | +4. Temporal Evolution: versioned state change and validity window of atoms and relations. |
| 24 | +5. Mastery State: estimated probability, inferred from observed learner behavior, that the user has mastered an atom. |
| 25 | +6. Divergence Graph: graph of cross-domain expansion around a current topic. |
| 26 | +7. Learning Action: executable next step (quiz, explanation, analysis, reflection, transfer task). |
| 27 | + |
| 28 | +## Layered Architecture and Contracts |
| 29 | + |
| 30 | +### L0 Representation Layer |
| 31 | + |
| 32 | +- Parse Markdown, code blocks, formulas, and Mermaid blocks into `KnowledgeAtom + EvidenceSpan`. |
| 33 | +- Every atom must keep source provenance for explainable retrieval. |
| 34 | + |
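A minimal sketch of the L0 contract above, assuming one atom per Markdown heading section. The field names (`source_path`, `start_line`, `content_hash`) and the `parse_markdown` helper are hypothetical and not part of the plan; formulas and Mermaid blocks would need their own extractors.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSpan:
    source_path: str   # hypothetical field: file the span came from
    start_line: int    # 1-based, inclusive
    end_line: int      # 1-based, inclusive
    content_hash: str  # SHA-256 of the span text, usable for staleness checks


@dataclass(frozen=True)
class KnowledgeAtom:
    atom_id: str
    title: str
    body: str
    evidence: EvidenceSpan


HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # fenced-code delimiter


def parse_markdown(source_path, text):
    """Split a Markdown document into one atom per heading section."""
    lines = text.splitlines()
    headings = []
    in_fence = False
    for number, line in enumerate(lines, start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # '#' inside code blocks is not a heading
            continue
        match = HEADING.match(line)
        if match and not in_fence:
            headings.append((number, match.group(2)))

    atoms = []
    for index, (start, title) in enumerate(headings):
        # A section ends right before the next heading, or at end of file.
        end = headings[index + 1][0] - 1 if index + 1 < len(headings) else len(lines)
        span_text = "\n".join(lines[start - 1:end])
        body = "\n".join(lines[start:end]).strip()
        digest = hashlib.sha256(span_text.encode("utf-8")).hexdigest()
        evidence = EvidenceSpan(source_path, start, end, digest)
        atoms.append(KnowledgeAtom(f"{source_path}#L{start}", title, body, evidence))
    return atoms
```

Because every atom carries its line range and hash, a retrieval result can point back to the exact source segment it came from.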
| 35 | +### L1 Structure Layer |
| 36 | + |
| 37 | +- Build static graph, process graph, and temporal graph from atoms and relation edges. |
| 38 | +- Distinguish fact edges from inferred edges to prevent path hallucination. |
| 39 | + |
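One way to keep inferred edges from producing hallucinated paths is to exclude them from traversal unless the caller opts in. The sketch below assumes a hypothetical `inferred` flag on each edge and uses plain breadth-first search; the plan does not prescribe either.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationEdge:
    source: str
    target: str
    kind: str       # e.g. "prerequisite", "analogy", "contrast"
    inferred: bool  # assumption: True when proposed by a model, not stated in a source


def find_path(edges, start, goal, allow_inferred=False):
    """Breadth-first search that skips inferred edges unless explicitly allowed."""
    adjacency = {}
    for edge in edges:
        if edge.inferred and not allow_inferred:
            continue
        adjacency.setdefault(edge.source, []).append(edge.target)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no path over the permitted edges
```

With the default settings, a path is returned only if every hop is backed by a fact edge.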
| 40 | +### L2 Retrieval Layer |
| 41 | + |
| 42 | +- Hybrid retrieval: keyword + vector + graph traversal + temporal filtering. |
| 43 | +- Every answer must include evidence, relation path, and temporal validity. |
| 44 | + |
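The hybrid retrieval step could be sketched as a temporal filter followed by a weighted sum of the three signals. The weights, the `Candidate` fields, and the assumption that each score is already normalized to [0, 1] are illustrative choices, not values from the plan.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Candidate:
    atom_id: str
    keyword_score: float  # assumed normalized to [0, 1]
    vector_score: float   # assumed normalized to [0, 1]
    graph_score: float    # assumed normalized to [0, 1]
    valid_from: date
    valid_to: Optional[date] = None  # None means still valid


def hybrid_rank(candidates, as_of, weights=(0.3, 0.4, 0.3)):
    """Drop candidates not valid at `as_of`, then rank by weighted score."""
    w_keyword, w_vector, w_graph = weights
    ranked = []
    for c in candidates:
        expired = c.valid_to is not None and c.valid_to < as_of
        if c.valid_from > as_of or expired:
            continue  # temporal filtering happens before scoring
        score = (w_keyword * c.keyword_score
                 + w_vector * c.vector_score
                 + w_graph * c.graph_score)
        ranked.append((score, c.atom_id))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked
```

Filtering before scoring means an outdated atom can never outrank a valid one, regardless of how well it matches.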
| 45 | +### L3 Learning Layer |
| 46 | + |
| 47 | +- Decide next learning action using `MasteryState + DivergenceGraph`. |
| 48 | +- Do not consume raw black-box LLM output without evidence binding. |
| 49 | + |
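A rough sketch of the L3 decision rule, under the assumption that mastery is a per-atom probability, the divergence graph is an adjacency map, and an atom may drive an action only if it has bound evidence. The threshold and the action kinds are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class LearningAction:
    kind: str     # e.g. "quiz", "transfer_task"
    atom_id: str
    evidence_ids: list = field(default_factory=list)


def next_action(mastery, divergence, evidence, threshold=0.8):
    """Pick the next action.

    mastery:    atom_id -> estimated mastery probability
    divergence: atom_id -> list of related atom ids to branch into
    evidence:   atom_id -> list of evidence span ids
    """
    # Evidence binding: atoms without evidence never produce an action.
    grounded = {a: p for a, p in mastery.items() if evidence.get(a)}

    # Mastery closure first: drill the weakest grounded atom.
    weak = [a for a, p in grounded.items() if p < threshold]
    if weak:
        target = min(weak, key=lambda a: grounded[a])
        return LearningAction("quiz", target, list(evidence[target]))

    # Everything mastered: diverge into an unseen, grounded neighbor.
    for atom in sorted(grounded):
        for neighbor in divergence.get(atom, []):
            if neighbor not in mastery and evidence.get(neighbor):
                return LearningAction("transfer_task", neighbor, list(evidence[neighbor]))
    return None
```

Ordering mastery before divergence is a design choice made here for simplicity; a real policy might interleave the two.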
| 50 | +### L4 Interaction Layer |
| 51 | + |
| 52 | +- Provide learning workspace + tutor action APIs + evaluation feedback loop. |
| 53 | +- Support local model and cloud model through one adapter contract. |
| 54 | + |
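The single adapter contract might look like the following structural interface. `TutorModel`, `LocalModel`, and `CloudModel` are hypothetical names, and both implementations are stubs that return canned text rather than calling a real model; the cloud stub is disabled by default to mirror the local-first decision.

```python
from typing import Protocol


class TutorModel(Protocol):
    """Contract every provider must satisfy (hypothetical)."""

    def complete(self, prompt: str) -> str: ...


class LocalModel:
    """Stand-in for an on-device model."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


class CloudModel:
    """Stand-in for a remote provider; off unless explicitly enabled."""

    def __init__(self, allow_network: bool = False):
        self.allow_network = allow_network

    def complete(self, prompt: str) -> str:
        if not self.allow_network:
            raise PermissionError("cloud calls are disabled by default")
        return f"[cloud] {prompt}"


def ask_tutor(model: TutorModel, question: str) -> str:
    """Caller code depends only on the contract, not on the provider."""
    return model.complete(question)
```

Swapping providers then requires no change in the calling code, only a different object passed to `ask_tutor`.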
| 55 | +### L5 Governance Layer |
| 56 | + |
| 57 | +- Enforce freshness checks, API contracts, rollback switches, quality gates, and privacy boundaries. |
| 58 | +- Gate the full chain from L0 through L4. |
| 59 | + |
| 60 | +## External Strategy Absorption |
| 61 | + |
| 62 | +1. Fast-GraphRAG: absorb stateful insertion/query and high-speed local retrieval pipeline. |
| 63 | +2. LightRAG: absorb dual-level retrieval and incremental update orientation. |
| 64 | +3. Graphiti: absorb temporal knowledge graph concepts for evolving context. |
| 65 | +4. Neo4j GraphRAG: absorb graph-driven explainable retrieval patterns and tool contract discipline. |
| 66 | +5. MemOS: absorb layered memory policy (session/unit/long-term) and memory scheduling. |
| 67 | +6. GitNexus: absorb process-context, staleness discipline, and agent-consumable interface patterns. |
| 68 | + |
| 69 | +### Explicit Non-goals for v1 |
| 70 | + |
| 71 | +1. No cloud-first multi-tenant architecture. |
| 72 | +2. No deeply distributed architecture in v1. |
| 73 | +3. No direct one-to-one reuse of code intelligence patterns as learning intelligence. |
| 74 | + |
| 75 | +## 3-Phase Delivery Blueprint (6-9 Months) |
| 76 | + |
| 77 | +### Phase 1 (Weeks 1-8): Deep Parsing + Graph Backbone |
| 78 | + |
| 79 | +1. Build unified parser pipeline for `KnowledgeAtom + EvidenceSpan`. |
| 80 | +2. Introduce a local graph database as the advanced engine; keep the lightweight path for compatibility. |
| 81 | +3. Implement temporal model for atom/relation versioning and validity. |
| 82 | +4. Add staleness detection by source hash binding. |
| 83 | +5. Deliverables: |
| 84 | + - Incremental rebuildable knowledge graph service. |
| 85 | + - Evidence-traceable query interface. |
| 86 | + - Temporal validity annotations. |
| 87 | + |
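Staleness detection by source hash binding can be illustrated as follows: each atom records the hash of its source at ingest time, and an atom is stale when the current source no longer hashes to that value. The choice of SHA-256 and the shape of the two dictionaries are assumptions.

```python
import hashlib


def source_hash(text: str) -> str:
    """Hash of the source text an atom was extracted from."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale_atoms(atom_hashes, current_sources):
    """Return ids of atoms whose source changed or disappeared.

    atom_hashes:     atom_id -> (source_path, hash recorded at ingest)
    current_sources: source_path -> current text
    """
    stale = []
    for atom_id, (path, recorded) in atom_hashes.items():
        text = current_sources.get(path)
        if text is None or source_hash(text) != recorded:
            stale.append(atom_id)
    return sorted(stale)
```

Only the atoms returned here need re-parsing, which is what makes an incremental rebuild possible.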
| 88 | +### Phase 2 (Weeks 9-16): Mastery Loop + Divergence Engine |
| 89 | + |
| 90 | +1. Introduce `LearnerConceptState` per atom (mastery, error tags, retest outcomes). |
| 91 | +2. Build mastery closure loop: diagnose -> classify errors -> personalized practice -> retest update. |
| 92 | +3. Build divergence engine for same-level expansion, cross-level transfer, and counter-example exploration. |
| 93 | +4. Support dual output paths: `MasteryPath[]` and `DivergencePath[]`. |
| 94 | +5. Deliverables: |
| 95 | + - Learning path orchestrator. |
| 96 | + - Error taxonomy knowledge base. |
| 97 | + - Dual-core learning panel. |
| 98 | + |
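The retest-update step of the mastery loop needs some rule for revising the mastery estimate. The plan does not name one; the sketch below substitutes standard Bayesian Knowledge Tracing, with illustrative `slip`, `guess`, and `learn` parameters that would have to be fitted in practice.

```python
def update_mastery(p_mastery, correct, slip=0.1, guess=0.2, learn=0.15):
    """One Bayesian Knowledge Tracing step.

    p_mastery: prior probability the learner has mastered the atom
    correct:   whether the retest answer was right
    slip:      P(wrong answer | mastered)
    guess:     P(right answer | not mastered)
    learn:     P(becoming mastered during this practice step)
    """
    if correct:
        mastered = p_mastery * (1 - slip)
        posterior = mastered / (mastered + (1 - p_mastery) * guess)
    else:
        mastered = p_mastery * slip
        posterior = mastered / (mastered + (1 - p_mastery) * (1 - guess))
    # Account for learning that happened while practicing.
    return posterior + (1 - posterior) * learn


# Example: a learner at 0.5 answers a retest correctly, then incorrectly.
after_correct = update_mastery(0.5, correct=True)
after_wrong = update_mastery(after_correct, correct=False)
```

A correct answer raises the estimate and a wrong one lowers it, so the same function can drive both the diagnose and the retest-update ends of the loop.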
| 99 | +### Phase 3 (Weeks 17-36): Pluggable LLM Tutor + Memory OS |
| 100 | + |
| 101 | +1. Build unified LLM adapter for local and cloud providers. |
| 102 | +2. Implement tutor actions: quiz generation, probing questions, answer analysis, misconception diagnosis, transfer-task generation, recap synthesis. |
| 103 | +3. Implement layered memory: session memory, unit memory, long-term mastery memory. |
| 104 | +4. Add safety guardrails: evidence-first responses, source traceability, confidence-based downgrade. |
| 105 | +5. Deliverables: |
| 106 | + - LLM tutor orchestration layer. |
| 107 | + - Memory policy engine. |
| 108 | + - Learning quality dashboard. |
| 109 | + |
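The safety guardrails in item 4 could be composed as a small gate applied to every tutor draft. The mode names, the 0.7 cutoff, and the `TutorDraft` shape are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class TutorDraft:
    text: str
    confidence: float  # assumed to be in [0, 1]
    evidence_ids: list = field(default_factory=list)


def apply_guardrails(draft, min_confidence=0.7):
    """Gate a tutor draft before it reaches the learner."""
    # Evidence-first: nothing is shown without a traceable source.
    if not draft.evidence_ids:
        return {"mode": "refuse",
                "text": "No source evidence found for this answer.",
                "evidence": []}
    # Confidence-based downgrade: point at sources instead of asserting.
    if draft.confidence < min_confidence:
        return {"mode": "cite_only",
                "text": "Low confidence; please review the cited sources.",
                "evidence": list(draft.evidence_ids)}
    return {"mode": "answer",
            "text": draft.text,
            "evidence": list(draft.evidence_ids)}
```

Checking evidence before confidence means a highly confident but unsourced draft is still rejected.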
| 110 | +## Public Interfaces and Types (Must Implement) |
| 111 | + |
| 112 | +### Public APIs |
| 113 | + |
| 114 | +1. `KnowledgeIngestAPI` |
| 115 | + - Input: document payload + incremental change metadata. |
| 116 | + - Output: atom/evidence/relation/temporal metadata. |
| 117 | +2. `KnowledgeQueryAPI` |
| 118 | + - Unified retrieval entry with evidence-first response contract. |
| 119 | +3. `MasteryDiagnosticsAPI` |
| 120 | + - Input: learner answer/behavior events. |
| 121 | + - Output: mastery updates + error labels. |
| 122 | +4. `LearningPathAPI` |
| 123 | + - Output: prioritized `MasteryPath[]` and `DivergencePath[]`. |
| 124 | +5. `TutorActionAPI` |
| 125 | + - Unified tutor action contract (ask/analyze/feedback/recap). |
| 126 | +6. `MemoryPolicyAPI` |
| 127 | + - Session/unit/long-term memory write and eviction policy management. |
| 128 | + |
| 129 | +### New Core Types |
| 130 | + |
| 131 | +- `KnowledgeAtom` |
| 132 | +- `EvidenceSpan` |
| 133 | +- `RelationEdge` |
| 134 | +- `TemporalEdge` |
| 135 | +- `LearnerConceptState` |
| 136 | +- `LearningAction` |
| 137 | +- `TutorTrace` |
| 138 | + |
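As a starting point for discussion, two of the types above might be shaped like this. All field names are hypothetical; only the per-atom mastery, error tags, and retest outcomes come from the Phase 2 description, and the validity window from the Temporal Evolution term.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalEdge:
    source: str
    target: str
    kind: str
    version: int
    valid_from: date
    valid_to: Optional[date] = None  # None means open-ended validity

    def is_valid_at(self, day: date) -> bool:
        """True when `day` falls inside the validity window (inclusive)."""
        return self.valid_from <= day and (self.valid_to is None or day <= self.valid_to)


@dataclass
class LearnerConceptState:
    atom_id: str
    mastery: float = 0.0
    error_tags: list = field(default_factory=list)
    retest_outcomes: list = field(default_factory=list)

    def record_retest(self, passed: bool, error_tag: Optional[str] = None) -> None:
        """Append a retest outcome and remember a new error tag on failure."""
        self.retest_outcomes.append(passed)
        if not passed and error_tag and error_tag not in self.error_tags:
            self.error_tags.append(error_tag)
```

Making `TemporalEdge` immutable fits the versioning model: a change produces a new version rather than mutating an old one.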
| 139 | +## Quality Gates and Acceptance |
| 140 | + |
| 141 | +### Core Test Areas |
| 142 | + |
| 143 | +1. Parsing correctness: atom extraction, evidence alignment, relation consistency. |
| 144 | +2. Retrieval trust: evidence traceability, path explainability, temporal validity hit rate. |
| 145 | +3. Learning effectiveness: mastery uplift, misconception recurrence decline, retest pass-rate uplift. |
| 146 | +4. Divergence quality: cross-topic linkage quality, counter-example quality, transfer-task quality. |
| 147 | +5. Performance: p95 query latency and rebuild duration at 10k atom scale. |
| 148 | +6. Privacy/security: no external leakage by default, model-call auditability, boundary enforcement. |
| 149 | + |
| 150 | +### v1.5 Acceptance Thresholds |
| 151 | + |
| 152 | +1. Retest pass-rate uplift >= 20%. |
| 153 | +2. High-frequency misconception recurrence reduction >= 25%. |
| 154 | +3. Evidence-backed learning suggestion ratio >= 90%. |
| 155 | +4. Path effectiveness statistically significantly better than a random baseline. |
| 156 | +5. p95 latency of key interactions stays within the interactive range, and all quality gates stay green. |
| 157 | + |
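The first three thresholds are mechanical enough to check in code. The sketch assumes the uplift and reduction percentages are relative to a baseline (the plan does not say whether they are relative or absolute), and the metric key names are invented for illustration.

```python
def relative_change(baseline, current):
    """Fractional change of `current` with respect to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline


def check_thresholds(metrics):
    """Return a pass/fail flag per acceptance threshold."""
    retest_uplift = relative_change(
        metrics["retest_pass_before"], metrics["retest_pass_after"])
    misconception_drop = -relative_change(
        metrics["misconception_before"], metrics["misconception_after"])
    evidence_ratio = metrics["evidence_backed"] / metrics["suggestions"]
    return {
        "retest_uplift": retest_uplift >= 0.20,
        "misconception_reduction": misconception_drop >= 0.25,
        "evidence_ratio": evidence_ratio >= 0.90,
    }
```

Returning one flag per threshold, rather than a single boolean, makes it clear which gate failed.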
| 158 | +## First-Principles Explanation |
| 159 | + |
| 160 | +1. Learning is a state-estimation + intervention-control problem, not only a content presentation problem. |
| 161 | +2. Without atomization, mastery cannot be measured or improved robustly. |
| 162 | +3. Without explainable retrieval, feedback loops are not trustworthy. |
| 163 | +4. Without temporal and memory layers, forgetting and transfer cannot be modeled correctly. |
| 164 | +5. Without governance gates, quality drift and model hallucinations will erode learning reliability. |
| 165 | + |
| 166 | +## Mental Models and Common Pitfalls |
| 167 | + |
| 168 | +### Mental Models |
| 169 | + |
| 170 | +1. State-space loop: `Knowledge State -> Observation -> Update -> Policy`. |
| 171 | +2. Dual-objective optimization: mastery gain and divergence quality under explicit constraints. |
| 172 | +3. Evidence-first orchestration: every tutor action must map to source evidence and relation path. |
| 173 | +4. Layered memory: separate short-term interaction memory from long-term mastery memory. |
| 174 | +5. Controlled evolution: each capability expansion is contract-tested and gate-verified. |
| 175 | + |
| 176 | +### Common Pitfalls |
| 177 | + |
| 178 | +1. Optimizing only vector recall while neglecting graph and evidence chains. |
| 179 | +2. Treating raw LLM output as ground truth. |
| 180 | +3. Recommending paths without updating mastery state and retest loops. |
| 181 | +4. Pursuing full-modality scope too early, which increases architectural risk. |
| 182 | +5. Deferring local privacy and auditability until late in development. |
| 183 | + |
| 184 | +## 5-Point Summary |
| 185 | + |
| 186 | +1. The shift in direction toward a verifiable learning system is feasible and strategically sound. |
| 187 | +2. A local-first design with a graph-database backbone is foundational to the long-term capability ceiling. |
| 188 | +3. Dual-core value requires mastery loop and divergence engine to be implemented together. |
| 189 | +4. Pluggable LLM works only when built on evidence-first retrieval and layered memory. |
| 190 | +5. A 6-9 month phased roadmap can generate measurable value while controlling risk. |
| 191 | + |
| 192 | +## References |
| 193 | + |
| 194 | +1. GitNexus: <https://github.com/abhigyanpatwari/GitNexus> |
| 195 | +2. Fast-GraphRAG: <https://github.com/circlemind-ai/fast-graphrag> |
| 196 | +3. LightRAG: <https://github.com/HKUDS/LightRAG> |
| 197 | +4. Graphiti: <https://github.com/getzep/graphiti> |
| 198 | +5. Neo4j GraphRAG Python: <https://github.com/neo4j/neo4j-graphrag-python> |
| 199 | +6. MemOS: <https://github.com/MemTensor/MemOS> |
| 200 | +7. Neo4j GraphRAG Docs: <https://neo4j.com/docs/neo4j-graphrag-python/current/> |