Commit 622570b

Copilot and devlux76 committed
docs: enrich centroid as Kansas-space vantage, ARCH-REVIEW zero-drift caveat, README must-language
DESIGN.md:
- c definition: synthetic center of mass, typically empty "Kansas space", neutral vantage point for unbiased scoring; properly formatted continuation lines
- Step 4: Kansas space explanation in short focused sentences
- Step 5: rewritten as bulleted zone-classification list (thesis/antithesis/synthesis/third-region); cross-ref to Dialectical Search; scoring-from-c rationale kept here
- Dialectical Search: removed duplicate "key property" paragraph; merged anchoring-bias explanation into single scoring paragraph; zone table uses closer-to-c framing
- Terminology Centroid: multiline, concise Kansas-space explanation

ARCHITECTURE-REVIEW.md:
- Query.ts / QueryResult.ts explicitly flagged "must be substantially rewritten"
- Zero-drift caveat restructured as 4 bulleted points for clarity: what it means, what it does not mean, per-file impact, authoritative status pointer (PLAN.md)
- Recommended Fix Order step 4: "Rewrite" not "Upgrade"

README.md:
- Cortex section: "Required behavior (v0.5+ engineering target)" with "must" language on every bullet; Kansas space note as sub-bullet of Metroid bullet
- Current behavior relabeled "(v0.1 — placeholder)"

Co-authored-by: devlux76 <86517969+devlux76@users.noreply.github.com>
1 parent be34986 commit 622570b

3 files changed

Lines changed: 53 additions & 25 deletions

File tree

ARCHITECTURE-REVIEW.md

Lines changed: 9 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -255,11 +255,16 @@ The following components are correctly implemented (or partially implemented in
255255
- `hippocampus/Chunker.ts` — Text chunking; **implemented and correct**
256256
- `hippocampus/PageBuilder.ts` — Page entity construction; **implemented and correct**
257257
- `hippocampus/Ingest.ts` — Minimal ingest path; **partially implemented** (chunk→embed→persist→Book→hotpath); correct direction, hierarchy and neighbor insertion deferred
258-
- `cortex/Query.ts` — Minimal query path; **partially implemented** (hotpath-first flat scoring); correct direction, MetroidBuilder deferred
259-
- `cortex/QueryResult.ts` — Minimal result DTO; **partially implemented**; correct direction, provenance fields deferred
258+
- `cortex/Query.ts` — Minimal query path; **partially implemented** (hotpath-first flat scoring); **must be substantially rewritten** for the dialectical pipeline (P1-E)
259+
- `cortex/QueryResult.ts` — Minimal result DTO; **partially implemented**; **must be rewritten** to add coherencePath, metroid, knowledgeGap, provenance fields (P1-E2)
260260
- All `VectorBackend` implementations — correct
261261

262-
> **Note:** PLAN.md v1.2 has been updated to reflect the actual implementation status of all Hippocampus and Cortex modules. The initial v1.1 plan incorrectly marked `Chunker.ts`, `PageBuilder.ts`, `Ingest.ts`, `Query.ts`, and `QueryResult.ts` as missing; this has been corrected.
262+
> **Important caveat on "zero drift":**
263+
>
264+
> - **What it means:** No architectural logic in these files conflicts with the corrected design. They do not need to be deleted or redesigned from scratch.
265+
> - **What it does not mean:** Unaffected by future work. The "roughed in" implementations (`Ingest.ts`, `Query.ts`, `QueryResult.ts`) were scaffolded before the MetroidBuilder design was fully specified.
266+
> - **Impact:** `Query.ts` and `QueryResult.ts` must be substantially rewritten (P1-E); `Ingest.ts` must gain hierarchy building and neighbor insertion (P1-B, P1-C). Each is a correct stub in the right direction, but not a complete implementation.
267+
> - **Authoritative status:** Refer to **PLAN.md**, not this section, when assessing whether a file needs additional work.
263268
264269
---
265270

@@ -268,5 +273,5 @@ The following components are correctly implemented (or partially implemented in
268273
1. **P0-X1–X7** — Fix naming drift in `core/types.ts`, `storage/IndexedDbMetadataStore.ts`, `cortex/Query.ts`, and planned file names. This unblocks MetroidBuilder without risking collision.
269274
2. **P1-M1–M3** — Add `Metroid` and `KnowledgeGap` types; implement `MetroidBuilder`.
270275
3. **P1-N1–N4** — Implement `KnowledgeGapDetector`.
271-
4. **P1-E1–E3** — Upgrade `cortex/Query.ts` to full dialectical orchestrator.
276+
4. **P1-E1–E3** — Rewrite `cortex/Query.ts` to full dialectical orchestrator (not backward-compatible with existing flat top-K code).
272277
5. **P1-C1–C3** — Implement `FastNeighborInsert` (correctly named after P0-X).

DESIGN.md

Lines changed: 30 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -104,7 +104,10 @@ Metroid = { m1, m2, c }
104104
Where:
105105
- **m1** — thesis medoid: the cluster representative most relevant to the query topic
106106
- **m2** — antithesis medoid: a cluster representative discovered through constrained Matryoshka search to represent semantic opposition to m1
107-
- **c** — centroid: the geometric midpoint between m1 and m2, used as the balanced search origin
107+
- **c** — centroid: the synthetic center of mass between m1 and m2.
108+
`c` is a "Kansas space" position — typically empty; no real node lives at the centroid.
109+
Its value is as a neutral vantage point: from `c`, distances to both poles and all
110+
candidates can be measured without anchoring bias toward either m1 or m2.
108111

109112
The Metroid is constructed at query time by the `MetroidBuilder`. It is **not** a persistent graph structure. It is a transient epistemological instrument.
110113

@@ -115,11 +118,23 @@ The Metroid is constructed at query time by the `MetroidBuilder`. It is **not**
115118
1. **Select m1** — Identify the topic medoid most relevant to the query embedding.
116119
2. **Freeze protected dimensions** — Lock the lower Matryoshka embedding dimensions that encode invariant semantic context (domain, language register, topic class). These dimensions are never searched for antithesis.
117120
3. **Search for m2** — Within the remaining (unfrozen) upper dimensions, search for the nearest medoid that represents semantic opposition to m1.
118-
4. **Compute centroid** — Compute `c` as follows:
121+
4. **Compute centroid** — Compute `c` as a center of mass between m1 and m2:
119122
- Protected dimensions (index < `matryoshkaProtectedDim`): copy directly from m1. These dimensions are invariant; averaging them would dilute the domain anchor that makes the antithesis search meaningful.
120123
- Unfrozen dimensions (index >= `matryoshkaProtectedDim`): compute the element-wise average of m1 and m2 — `c[i] = (m1[i] + m2[i]) / 2`.
121124
- The result is a full-dimensional vector that can be used directly as a scoring anchor.
122-
5. **Prefer centroid as search origin** — Use `c` as the primary starting point for subgraph expansion. This prevents semantic drift toward either pole.
125+
126+
**Important:** `c` is a synthetic position — a "Kansas space". In most cases nothing actually
127+
exists at the centroid; it is an empty field in embedding space, equidistant from both poles.
128+
Its value is as a neutral vantage point. Standing at `c`, you can immediately measure whether
129+
any candidate is closer to m1 (thesis), closer to m2 (antithesis), or equidistant from both
130+
(genuinely synthetic). Scoring by proximity to `c` produces unbiased, balanced retrieval.
131+
Scoring from m1 or m2 would pull all results toward one pole.
132+
5. **Use centroid as scoring vantage point** — Weight candidates by their distance to `c`, not to m1 or m2.
133+
- Near `c`: synthesis territory — balanced between both poles.
134+
- Much closer to m1 than to `c`: thesis-supporting.
135+
- Much closer to m2 than to `c`: antithesis-supporting.
136+
- Far from `c`, m1, and m2 simultaneously: a third conceptual region not captured by either pole — signal for further Matryoshka unwinding or a knowledge gap.
137+
Scoring from `c` avoids anchoring bias; see the Dialectical Search section for the full zone model.
123138
6. **Unwind Matryoshka layers** — Progressively free deeper embedding dimensions and repeat from step 3. Each unwinding broadens the antithesis search.
124139
7. **Stop at the protected dimension** — The protected lower dimensions are never unwound. This preserves semantic invariants throughout all levels of search.
125140

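The centroid rule in step 4 of the hunk above can be sketched as follows. This is a minimal illustration, not the `MetroidBuilder` implementation: the function name `computeCentroid` and the use of `Float32Array` are assumptions, and only `matryoshkaProtectedDim` and the formula `c[i] = (m1[i] + m2[i]) / 2` come from DESIGN.md.

```typescript
// Hypothetical helper: the name and signature are illustrative only.
// Protected dimensions (index < matryoshkaProtectedDim) are copied from m1;
// unfrozen dimensions are the element-wise average of m1 and m2.
function computeCentroid(
  m1: Float32Array,
  m2: Float32Array,
  matryoshkaProtectedDim: number,
): Float32Array {
  if (m1.length !== m2.length) {
    throw new Error("m1 and m2 must have the same dimensionality");
  }
  if (matryoshkaProtectedDim < 0 || matryoshkaProtectedDim > m1.length) {
    throw new RangeError("matryoshkaProtectedDim is out of range");
  }
  const c = new Float32Array(m1.length);
  for (let i = 0; i < m1.length; i++) {
    c[i] = i < matryoshkaProtectedDim ? m1[i] : (m1[i] + m2[i]) / 2;
  }
  return c;
}
```

Because the protected dimensions are copied rather than averaged, `c` is only a true midpoint in the unfrozen dimensions; in the protected dimensions it coincides with m1.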
@@ -148,13 +163,15 @@ This produces progressively wider dialectical exploration while maintaining sema
148163

149164
### Dialectical Search
150165

151-
Every Metroid-driven query explores three zones:
166+
Every Metroid-driven query explores three zones, with all scoring anchored at the centroid `c`:
152167

153168
| Zone | Pole | Meaning |
154169
|------|------|---------|
155-
| Thesis zone | around m1 | Supporting ideas, corroborating evidence |
156-
| Antithesis zone | around m2 | Opposing ideas, counterevidence, alternative perspectives |
157-
| Synthesis zone | around c | Conceptually balanced territory between both poles |
170+
| Thesis zone | closer to m1 than to c | Supporting ideas, corroborating evidence |
171+
| Antithesis zone | closer to m2 than to c | Opposing ideas, counterevidence, alternative perspectives |
172+
| Synthesis zone | near c, equidistant from m1 and m2 | Conceptually balanced territory between both poles |
173+
174+
**Scoring from the centroid vantage point:** candidates are ranked by their distance to `c`. A candidate significantly closer to m1 than to `c` is thesis-supporting; significantly closer to m2 is antithesis-supporting; near `c` is synthesis-zone content. Candidates far from all three (`c`, m1, m2) indicate a third conceptual region — either an undiscovered knowledge area or a signal to unwind another Matryoshka layer. Scoring from m1 or m2 instead of `c` would anchor all results toward one pole, introducing confirmation bias.
158175

159176
This three-zone exploration prevents **confirmation bias**: a system that only retrieves nearest neighbors to m1 returns documents that confirm the query's premise. By also exploring m2 and c, CORTEX surfaces contradictions, alternatives, and knowledge gaps.
160177

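The zone table in the hunk above can be read as a small classifier. The sketch below is one possible reading under stated assumptions: Euclidean distance is assumed (DESIGN.md does not name the metric), and `classifyZone`, `Zone`, and the `farThreshold` parameter are hypothetical names introduced for illustration. It applies the table's plain "closer to" comparison; the "much closer" wording in step 5 would need an extra margin that is not modeled here.

```typescript
// Hypothetical types and names; only m1, m2, and c come from DESIGN.md.
type Zone = "thesis" | "antithesis" | "synthesis" | "third-region";

interface Metroid {
  m1: Float32Array; // thesis medoid
  m2: Float32Array; // antithesis medoid
  c: Float32Array;  // centroid (synthetic vantage point)
}

// Euclidean distance is an assumption made for this sketch.
function euclidean(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// farThreshold is a hypothetical tuning parameter: a candidate farther than
// this from c, m1, and m2 alike is treated as a third conceptual region.
function classifyZone(
  candidate: Float32Array,
  metroid: Metroid,
  farThreshold: number,
): Zone {
  const d1 = euclidean(candidate, metroid.m1);
  const d2 = euclidean(candidate, metroid.m2);
  const dc = euclidean(candidate, metroid.c);
  if (Math.min(d1, d2, dc) > farThreshold) return "third-region";
  if (d1 < dc && d1 <= d2) return "thesis";     // closer to m1 than to c
  if (d2 < dc) return "antithesis";             // closer to m2 than to c
  return "synthesis";                           // c is the nearest anchor
}
```

Ranking within a zone would still use the distance to `c`, as the scoring paragraph describes; the classifier only labels which pole, if any, a candidate leans toward.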
@@ -705,7 +722,12 @@ Smart sharing is a core capability, not a post-v1 extra. The v1 exchange path mu
705722

706723
**Medoid** (mathematical term): The existing memory node selected as the statistical representative of a cluster. Selected by minimising the sum of distances to all other nodes in the cluster. Used throughout algorithmic descriptions and internal implementation comments.
707724

708-
**Centroid** (mathematical term): In MetroidBuilder, the centroid `c` is a full-dimensional vector where protected dimensions are copied from m1 (domain invariant) and unfrozen dimensions are the element-wise average of m1 and m2. Used as the balanced search origin in dialectical scoring.
725+
**Centroid** (mathematical term): In MetroidBuilder, the centroid `c` is a full-dimensional vector
726+
where protected dimensions are copied from m1 (domain invariant) and unfrozen dimensions are the
727+
element-wise average of m1 and m2. `c` is a synthetic "Kansas space" position — a center of mass
728+
where nothing in the memory graph typically exists. Its value is as a neutral vantage point:
729+
scoring candidates by distance to `c` gives equal weight to both poles. A candidate closer to m1
730+
is thesis-supporting; closer to m2 is antithesis-supporting; near `c` is genuinely balanced.
709731

710732
**Metroid** (CORTEX architectural term): A structured dialectical search probe constructed at query time: `{ m1, m2, c }`, where m1 is the thesis medoid, m2 is the antithesis medoid, and c is the centroid (protected dims from m1; unfrozen dims averaged). **A Metroid is never stored as a persistent graph structure.** It is an ephemeral instrument used by the CORTEX retrieval subsystem.
711733

README.md

Lines changed: 14 additions & 13 deletions
Original file line number | Diff line number | Diff line change
@@ -62,19 +62,20 @@ When new observations arrive, Hippocampus immediately:
6262
This is the rapid, multi-path "write" system that turns raw experience into structured memory scaffolding.
6363

6464
### 🧩 Cortex — Intelligent Routing & Coherence
65-
When you ask a question, Cortex does **not** return a bag of similar vectors.
66-
67-
**Planned target behavior (v0.5+):**
68-
- Constructs a **Metroid** `{ m1, m2, c }` for the query — a structured dialectical search probe pairing the thesis medoid (m1) with an antithesis medoid (m2) and a balanced centroid (c)
69-
- Performs Matryoshka dimensional unwinding to discover semantically opposing knowledge
70-
- Performs parallel WebGPU "scoops" across the entire active universe (sub-millisecond)
71-
- Pulls relevant sub-graphs from IndexedDB
72-
- Traces closed-loop paths through Hebbian connections
73-
- Returns only self-consistent, coherent context chains
74-
- Detects **knowledge gaps** when no antithesis medoid exists within dimensional constraints
75-
- Broadcasts P2P curiosity probes (with `mimeType` + `modelUrn` for commensurability) to discover missing knowledge from peers
76-
77-
**Current behavior (v0.1):**
65+
Cortex does **not** return a bag of similar vectors.
66+
67+
**Required behavior (v0.5+ engineering target):**
68+
- Must construct a **Metroid** `{ m1, m2, c }` for every query — a structured dialectical search probe pairing the thesis medoid (m1) with an antithesis medoid (m2) and a balanced centroid (c)
69+
- The centroid `c` is a synthetic "Kansas space" vantage point (no real node lives there); scoring from `c` must give equal weight to both poles
70+
- Must perform Matryoshka dimensional unwinding to discover semantically opposing knowledge
71+
- Must perform parallel WebGPU "scoops" across the entire active universe (sub-millisecond)
72+
- Must pull relevant sub-graphs from IndexedDB
73+
- Must trace closed-loop paths through Hebbian connections
74+
- Must return only self-consistent, coherent context chains
75+
- Must detect **knowledge gaps** when no antithesis medoid exists within dimensional constraints
76+
- Must broadcast P2P curiosity probes (with `mimeType` + `modelUrn` for commensurability) to discover missing knowledge from peers
77+
78+
**Current behavior (v0.1 — placeholder):**
7879
- Flat top-K similarity scoring against the hotpath resident index with warm/cold spill
7980
- No MetroidBuilder, no dialectical pipeline, no knowledge gap detection yet
8081
