docs: enrich centroid as Kansas-space vantage, ARCH-REVIEW zero-drift caveat, README must-language

Copilot · devlux76 · Copilot · commit 622570b261ad · 2026-03-13T19:53:55.000Z
DESIGN.md:
- c definition: synthetic center of mass, typically empty "Kansas space", neutral vantage
  point for unbiased scoring; properly formatted continuation lines
- Step 4: Kansas space explanation in short focused sentences
- Step 5: rewritten as bulleted zone-classification list (thesis/antithesis/synthesis/
  third-region); cross-ref to Dialectical Search; scoring-from-c rationale kept here
- Dialectical Search: removed duplicate "key property" paragraph; merged anchoring-bias
  explanation into single scoring paragraph; zone table uses closer-to-c framing
- Terminology Centroid: multiline, concise Kansas-space explanation

ARCHITECTURE-REVIEW.md:
- Query.ts / QueryResult.ts explicitly flagged "must be substantially rewritten"
- Zero-drift caveat restructured as 4 bulleted points for clarity: what it means,
  what it does not mean, per-file impact, authoritative status pointer (PLAN.md)
- Recommended Fix Order step 4: "Rewrite" not "Upgrade"

README.md:
- Cortex section: "Required behavior (v0.5+ engineering target)" with "must" language
  on every bullet; Kansas space note as sub-bullet of Metroid bullet
- Current behavior relabeled "(v0.1 — placeholder)"

Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>
diff --git a/ARCHITECTURE-REVIEW.md b/ARCHITECTURE-REVIEW.md
@@ -255,11 +255,16 @@ The following components are correctly implemented (or partially implemented in
 - `hippocampus/Chunker.ts` — Text chunking; **implemented and correct**
 - `hippocampus/PageBuilder.ts` — Page entity construction; **implemented and correct**
 - `hippocampus/Ingest.ts` — Minimal ingest path; **partially implemented** (chunk→embed→persist→Book→hotpath); correct direction, hierarchy and neighbor insertion deferred
-- `cortex/Query.ts` — Minimal query path; **partially implemented** (hotpath-first flat scoring); correct direction, MetroidBuilder deferred
-- `cortex/QueryResult.ts` — Minimal result DTO; **partially implemented**; correct direction, provenance fields deferred
+- `cortex/Query.ts` — Minimal query path; **partially implemented** (hotpath-first flat scoring); **must be substantially rewritten** for the dialectical pipeline (P1-E)
+- `cortex/QueryResult.ts` — Minimal result DTO; **partially implemented**; **must be rewritten** to add coherencePath, metroid, knowledgeGap, provenance fields (P1-E2)
 - All `VectorBackend` implementations — correct
 
-> **Note:** PLAN.md v1.2 has been updated to reflect the actual implementation status of all Hippocampus and Cortex modules. The initial v1.1 plan incorrectly marked `Chunker.ts`, `PageBuilder.ts`, `Ingest.ts`, `Query.ts`, and `QueryResult.ts` as missing; this has been corrected.
+> **Important caveat on "zero drift":**
+>
+> - **What it means:** No architectural logic in these files conflicts with the corrected design. They do not need to be deleted or redesigned from scratch.
+> - **What it does not mean:** Unaffected by future work. The "roughed in" implementations (`Ingest.ts`, `Query.ts`, `QueryResult.ts`) were scaffolded before the MetroidBuilder design was fully specified.
+> - **Impact:** `Query.ts` and `QueryResult.ts` must be substantially rewritten (P1-E); `Ingest.ts` must gain hierarchy building and neighbor insertion (P1-B, P1-C). Each is a correct stub in the right direction, but not a complete implementation.
+> - **Authoritative status:** Refer to **PLAN.md**, not this section, when assessing whether a file needs additional work.
 
 ---
 
@@ -268,5 +273,5 @@ The following components are correctly implemented (or partially implemented in
 1. **P0-X1–X7** — Fix naming drift in `core/types.ts`, `storage/IndexedDbMetadataStore.ts`, `cortex/Query.ts`, and planned file names. This unblocks MetroidBuilder without risking collision.
 2. **P1-M1–M3** — Add `Metroid` and `KnowledgeGap` types; implement `MetroidBuilder`.
 3. **P1-N1–N4** — Implement `KnowledgeGapDetector`.
-4. **P1-E1–E3** — Upgrade `cortex/Query.ts` to full dialectical orchestrator.
+4. **P1-E1–E3** — Rewrite `cortex/Query.ts` to full dialectical orchestrator (not backward-compatible with existing flat top-K code).
 5. **P1-C1–C3** — Implement `FastNeighborInsert` (correctly named after P0-X).
diff --git a/DESIGN.md b/DESIGN.md
@@ -104,7 +104,10 @@ Metroid = { m1, m2, c }
 Where:
 - **m1** — thesis medoid: the cluster representative most relevant to the query topic
 - **m2** — antithesis medoid: a cluster representative discovered through constrained Matryoshka search to represent semantic opposition to m1
-- **c** — centroid: the geometric midpoint between m1 and m2, used as the balanced search origin
+- **c** — centroid: the synthetic center of mass between m1 and m2.
+  `c` is a "Kansas space" position — typically empty; no real node lives at the centroid.
+  Its value is as a neutral vantage point: from `c`, distances to both poles and all
+  candidates can be measured without anchoring bias toward either m1 or m2.
 
 The Metroid is constructed at query time by the `MetroidBuilder`. It is **not** a persistent graph structure. It is a transient epistemological instrument.
 
@@ -115,11 +118,23 @@ The Metroid is constructed at query time by the `MetroidBuilder`. It is **not**
 1. **Select m1** — Identify the topic medoid most relevant to the query embedding.
 2. **Freeze protected dimensions** — Lock the lower Matryoshka embedding dimensions that encode invariant semantic context (domain, language register, topic class). These dimensions are never searched for antithesis.
 3. **Search for m2** — Within the remaining (unfrozen) upper dimensions, search for the nearest medoid that represents semantic opposition to m1.
-4. **Compute centroid** — Compute `c` as follows:
+4. **Compute centroid** — Compute `c` as a center of mass between m1 and m2:
    - Protected dimensions (index < `matryoshkaProtectedDim`): copy directly from m1. These dimensions are invariant; averaging them would dilute the domain anchor that makes the antithesis search meaningful.
    - Unfrozen dimensions (index >= `matryoshkaProtectedDim`): compute the element-wise average of m1 and m2 — `c[i] = (m1[i] + m2[i]) / 2`.
    - The result is a full-dimensional vector that can be used directly as a scoring anchor.
-5. **Prefer centroid as search origin** — Use `c` as the primary starting point for subgraph expansion. This prevents semantic drift toward either pole.
+
+   **Important:** `c` is a synthetic position — a "Kansas space". In most cases nothing actually
+   exists at the centroid; it is an empty field in embedding space, equidistant from both poles.
+   Its value is as a neutral vantage point. Standing at `c`, you can immediately measure whether
+   any candidate is closer to m1 (thesis), closer to m2 (antithesis), or equidistant from both
+   (genuinely synthetic). Scoring by proximity to `c` produces unbiased, balanced retrieval.
+   Scoring from m1 or m2 would pull all results toward one pole.
+5. **Use centroid as scoring vantage point** — Weight candidates by their distance to `c`, not to m1 or m2.
+   - Near `c`: synthesis territory — balanced between both poles.
+   - Much closer to m1 than to `c`: thesis-supporting.
+   - Much closer to m2 than to `c`: antithesis-supporting.
+   - Far from `c`, m1, and m2 simultaneously: a third conceptual region not captured by either pole — signal for further Matryoshka unwinding or a knowledge gap.
+   Scoring from `c` avoids anchoring bias; see the Dialectical Search section for the full zone model.
 6. **Unwind Matryoshka layers** — Progressively free deeper embedding dimensions and repeat from step 3. Each unwinding broadens the antithesis search.
 7. **Stop at the protected dimension** — The protected lower dimensions are never unwound. This preserves semantic invariants throughout all levels of search.
 
@@ -148,13 +163,15 @@ This produces progressively wider dialectical exploration while maintaining sema
 
 ### Dialectical Search
 
-Every Metroid-driven query explores three zones:
+Every Metroid-driven query explores three zones, with all scoring anchored at the centroid `c`:
 
 | Zone | Pole | Meaning |
 |------|------|---------|
-| Thesis zone | around m1 | Supporting ideas, corroborating evidence |
-| Antithesis zone | around m2 | Opposing ideas, counterevidence, alternative perspectives |
-| Synthesis zone | around c | Conceptually balanced territory between both poles |
+| Thesis zone | closer to m1 than to c | Supporting ideas, corroborating evidence |
+| Antithesis zone | closer to m2 than to c | Opposing ideas, counterevidence, alternative perspectives |
+| Synthesis zone | near c, equidistant from m1 and m2 | Conceptually balanced territory between both poles |
+
+**Scoring from the centroid vantage point:** candidates are ranked by their distance to `c`. A candidate significantly closer to m1 than to `c` is thesis-supporting; significantly closer to m2 is antithesis-supporting; near `c` is synthesis-zone content. Candidates far from all three (`c`, m1, m2) indicate a third conceptual region — either an undiscovered knowledge area or a signal to unwind another Matryoshka layer. Scoring from m1 or m2 instead of `c` would anchor all results toward one pole, introducing confirmation bias.
 
 This three-zone exploration prevents **confirmation bias**: a system that only retrieves nearest neighbors to m1 returns documents that confirm the query's premise. By also exploring m2 and c, CORTEX surfaces contradictions, alternatives, and knowledge gaps.
 
@@ -705,7 +722,12 @@ Smart sharing is a core capability, not a post-v1 extra. The v1 exchange path mu
 
 **Medoid** (mathematical term): The existing memory node selected as the statistical representative of a cluster. Selected by minimising the sum of distances to all other nodes in the cluster. Used throughout algorithmic descriptions and internal implementation comments.
 
-**Centroid** (mathematical term): In MetroidBuilder, the centroid `c` is a full-dimensional vector where protected dimensions are copied from m1 (domain invariant) and unfrozen dimensions are the element-wise average of m1 and m2. Used as the balanced search origin in dialectical scoring.
+**Centroid** (mathematical term): In MetroidBuilder, the centroid `c` is a full-dimensional vector
+where protected dimensions are copied from m1 (domain invariant) and unfrozen dimensions are the
+element-wise average of m1 and m2. `c` is a synthetic "Kansas space" position — a center of mass
+where nothing in the memory graph typically exists. Its value is as a neutral vantage point:
+scoring candidates by distance to `c` gives equal weight to both poles. A candidate closer to m1
+is thesis-supporting; closer to m2 is antithesis-supporting; near `c` is genuinely balanced.
 
 **Metroid** (CORTEX architectural term): A structured dialectical search probe constructed at query time: `{ m1, m2, c }`, where m1 is the thesis medoid, m2 is the antithesis medoid, and c is the centroid (protected dims from m1; unfrozen dims averaged). **A Metroid is never stored as a persistent graph structure.** It is an ephemeral instrument used by the CORTEX retrieval subsystem.
 
diff --git a/README.md b/README.md
@@ -62,19 +62,20 @@ When new observations arrive, Hippocampus immediately:
 This is the rapid, multi-path "write" system that turns raw experience into structured memory scaffolding.
 
 ### 🧩 Cortex — Intelligent Routing & Coherence
-When you ask a question, Cortex does **not** return a bag of similar vectors.
-
-**Planned target behavior (v0.5+):**
-- Constructs a **Metroid** `{ m1, m2, c }` for the query — a structured dialectical search probe pairing the thesis medoid (m1) with an antithesis medoid (m2) and a balanced centroid (c)
-- Performs Matryoshka dimensional unwinding to discover semantically opposing knowledge
-- Performs parallel WebGPU "scoops" across the entire active universe (sub-millisecond)
-- Pulls relevant sub-graphs from IndexedDB
-- Traces closed-loop paths through Hebbian connections
-- Returns only self-consistent, coherent context chains
-- Detects **knowledge gaps** when no antithesis medoid exists within dimensional constraints
-- Broadcasts P2P curiosity probes (with `mimeType` + `modelUrn` for commensurability) to discover missing knowledge from peers
-
-**Current behavior (v0.1):**
+Cortex does **not** return a bag of similar vectors.
+
+**Required behavior (v0.5+ engineering target):**
+- Must construct a **Metroid** `{ m1, m2, c }` for every query — a structured dialectical search probe pairing the thesis medoid (m1) with an antithesis medoid (m2) and a balanced centroid (c)
+  - The centroid `c` is a synthetic "Kansas space" vantage point (no real node lives there); scoring from `c` must give equal weight to both poles
+- Must perform Matryoshka dimensional unwinding to discover semantically opposing knowledge
+- Must perform parallel WebGPU "scoops" across the entire active universe (sub-millisecond)
+- Must pull relevant sub-graphs from IndexedDB
+- Must trace closed-loop paths through Hebbian connections
+- Must return only self-consistent, coherent context chains
+- Must detect **knowledge gaps** when no antithesis medoid exists within dimensional constraints
+- Must broadcast P2P curiosity probes (with `mimeType` + `modelUrn` for commensurability) to discover missing knowledge from peers
+
+**Current behavior (v0.1 — placeholder):**
 - Flat top-K similarity scoring against the hotpath resident index with warm/cold spill
 - No MetroidBuilder, no dialectical pipeline, no knowledge gap detection yet
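
Reviewer sketch (not part of the patch): the centroid rule from DESIGN.md step 4 and the zone list from step 5 can be illustrated in a few lines of TypeScript. Only `m1`, `m2`, `c`, and `matryoshkaProtectedDim` come from the diff. The helper names (`computeCentroid`, `euclidean`, `classifyZone`), the Euclidean metric, the plain "closer than" comparisons standing in for "much closer", and the `farThreshold` parameter are all assumptions made for illustration, not the MetroidBuilder implementation.

```typescript
// Illustrative sketch only. Helper names, the Euclidean metric, and
// `farThreshold` are assumptions; DESIGN.md does not specify them.
type Vec = number[];

// Step 4: protected dimensions are copied from m1, unfrozen dimensions
// are the element-wise average of m1 and m2.
function computeCentroid(m1: Vec, m2: Vec, matryoshkaProtectedDim: number): Vec {
  if (m1.length !== m2.length) {
    throw new Error("m1 and m2 must have the same dimensionality");
  }
  return m1.map((v, i) => (i < matryoshkaProtectedDim ? v : (v + m2[i]) / 2));
}

// Assumed distance metric (the design text does not name one).
function euclidean(a: Vec, b: Vec): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

type Zone = "thesis" | "antithesis" | "synthesis" | "third-region";

// Step 5: classify a candidate relative to c, m1, and m2.
// `farThreshold` is a hypothetical cutoff for "far from all three".
function classifyZone(x: Vec, m1: Vec, m2: Vec, c: Vec, farThreshold: number): Zone {
  const dC = euclidean(x, c);
  const d1 = euclidean(x, m1);
  const d2 = euclidean(x, m2);
  if (dC > farThreshold && d1 > farThreshold && d2 > farThreshold) {
    return "third-region";
  }
  if (d1 < dC && d1 <= d2) return "thesis";
  if (d2 < dC && d2 < d1) return "antithesis";
  return "synthesis";
}

// Usage with toy 4-dimensional vectors and 2 protected dimensions.
const m1: Vec = [1, 0, 0, 0];
const m2: Vec = [3, 5, 4, 2];
const c = computeCentroid(m1, m2, 2); // [1, 0, 2, 1]
const zones = [
  classifyZone([1, 0, 0.1, 0], m1, m2, c, 10),
  classifyZone(m2, m1, m2, c, 10),
  classifyZone(c, m1, m2, c, 10),
  classifyZone([50, 50, 50, 50], m1, m2, c, 10),
];
console.log(c, zones);
```

One property worth checking against the "equidistant from both poles" wording in the patch: under this rule `c` is equidistant from m1 and m2 only when the two medoids agree on the protected dimensions. In the toy vectors above they differ, so `c` sits closer to m1 than to m2.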