AdaWorldAPI
diff --git a/‎.claude/board/EPIPHANIES.md‎
Lines changed: 130 additions & 0 deletions b/‎.claude/board/EPIPHANIES.md‎
Lines changed: 130 additions & 0 deletions
diff --git a/‎.claude/board/LATEST_STATE.md‎
Lines changed: 3 additions & 2 deletions b/‎.claude/board/LATEST_STATE.md‎
Lines changed: 3 additions & 2 deletions
diff --git a/‎.claude/board/PR_ARC_INVENTORY.md‎
Lines changed: 38 additions & 0 deletions b/‎.claude/board/PR_ARC_INVENTORY.md‎
Lines changed: 38 additions & 0 deletions
diff --git a/‎.claude/board/STATUS_BOARD.md‎
Lines changed: 5 additions & 5 deletions b/‎.claude/board/STATUS_BOARD.md‎
Lines changed: 5 additions & 5 deletions
@@ -2036,3 +2036,133 @@ already in the workspace. Commit f1498bc landed the measurement.
 Cross-ref: ndarray::hpc::cam_pq production code (620+ LOC, 15+
 tests), codec_rnd_bench.rs CamPqRaw/CamPqPhase candidates, this
 session's 18 commits on claude/quick-wins-2026-04-19 branch.
+
+## 2026-04-21 — The 8-step wiring sequence that closes the loop (concrete, not theoretical)
+
+**Status:** FINDING (each step has a file path, an input, an output,
+and a dependency)
+
+The architecture clicks when 8 disconnected pieces get wired. Each
+step connects two things that exist but don't talk. The loop closes
+at step 8. Three PRs total.
+
+**Step 1 — Encoder migration (512-bit → 10K role-indexed).**
+DeepNSM's `encoder.rs` has 6 hardcoded roles at 512 bits. Contract's
+`role_keys.rs` has 20+ structured roles at 10K bits with slice-masked
+bind/unbind. Delete `RoleVectors`. Import `contract::grammar::role_keys::*`.
+Content fingerprints: COCA vocab → FNV hash spread to 10K dims.
+
+**Step 2 — MarkovBundler (braided ±5 bundling).**
+New `markov_bundle.rs`. Ring buffer of 11 Vsa10k. Each sentence: bind
+tokens per role key (Step 1), XOR-bundle into one Vsa10k per sentence.
+Then: `vsa_permute(sentence_vsa, position_offset)` per ±5 position.
+XOR-superpose all 11. Output: braided trajectory. MexicanHat weights.
+
+**Step 3 — Trajectory (the Think struct).**
+New `trajectory.rs`. Holds `bundle: Vsa10k` + `chain: ContextChain` +
+refs to tissue (`&EpisodicMemory`, `&TripletGraph`, `&Vsa10k` global
+context). Methods: `role_bundle`, `recovery_margin`, `free_energy`,
+`resolve`. The object speaks for itself.
+
+**Step 4 — Parser → Bundler → Trajectory pipeline.**
+Parser's `SentenceStructure` feeds MarkovBundler which produces
+Trajectory. Coverage check: < 0.9 → FailureTicket (D2). Else →
+`trajectory.resolve(candidates, awareness, prior)`.
+
+**Step 5 — Resolution → AriGraph commit.**
+`commit_with_contradiction_check` on TripletGraph (~40 LOC).
+Resolution::Commit → one triple. Resolution::Epiphany → two triples
++ Contradiction marker. FailureTicket → LLM fallback.
+
+**Step 6 — Global context update.**
+`episodic.integrate_into_global(fact_fp, episode_index)` — XOR-
+accumulate permuted fact into `global_context: Vsa10k` (~20 LOC).
+Called after every commit.
+
+**Step 7 — Awareness revision.**
+`awareness.revise(param_key, outcome)` after every commit or
+escalation. Already shipped and tested. Just needs call sites.
+
+**Step 8 — Global context → KL feedback (LOOP CLOSES).**
+`trajectory.free_energy()` reads `global_context` as part of the
+likelihood term. Committed facts from chapter 1 shape the F-landscape
+for chapter 10. The system that parsed chapter 9 is not the same
+system that parsed chapter 1.
+
+**Dependency: 1→2→3→4→{5,6,7}→8. Three PRs: (1-3), (4), (5-8).**
+
+**Why this dilutes:** future sessions will try to "improve" individual
+steps (better bundling, smarter resolution, fancier graph commit)
+without WIRING them. The wiring IS the architecture. An unwired
+masterpiece is a collection of library functions. A wired simple
+implementation is AGI. Wire first, optimize later.
+
+---
+
+## 2026-04-21 — COCA 4096 + 20K scientific + spider NER = no vocabulary blocker
+
+**Status:** FINDING (changes the critical-path analysis)
+
+The vocabulary ceiling I flagged as an AGI blocker is not a blocker:
+
+| Layer | Coverage | Mechanism |
+|---|---|---|
+| COCA 4096 | 98.4% general English | Static, zero-cost |
+| COCA 20K scientific | +1.5% domain terms | Static, zero-cost |
+| Spider NER (online) | remaining proper nouns | Crawl → rigid designator commit to AriGraph |
+
+For Animal Farm: COCA 4096 alone covers 99.5%+ (Orwell uses
+deliberately plain vocabulary). The remaining tokens are character
+names (Napoleon, Snowball, etc.) which get committed as rigid
+designators on first encounter and resolved from the graph after.
+
+Spider-rs already exists as git dep in `lance-graph-osint/Cargo.toml`
+(`AdaWorldAPI/spider` fork, `spider-crawl` feature). NER resolution:
+unknown token → spider crawl → extract entity features (gender,
+animacy, type, description) → commit to `TripletGraph` as rigid
+designator → available for all future coreference via
+`graph.nodes_matching(features)`.
+
+**The three PRs that close the loop are the ONLY critical path.**
+Not vocabulary. Not NER. Not cross-lingual. Wire the 8 steps.
+
+**Why this dilutes:** future sessions will rebuild the vocabulary
+system, add subword tokenization, integrate HuggingFace tokenizers,
+and generally solve a problem that's already solved. COCA 24K +
+spider NER covers everything. Don't rebuild the dictionary. Wire
+the loop.
+
+---
+
+## 2026-04-21 — The AGI test is one measurement: chapter-10 accuracy > chapter-1 accuracy
+
+**Status:** FINDING (makes the benchmark falsifiable)
+
+A system that understands chapter 10 of Animal Farm better than
+chapter 1 — because it read chapters 1-9 and committed what it
+learned into the tissue that shapes its own F-landscape — is an AGI.
+
+The measurement:
+- Parse Animal Farm end-to-end, chapter by chapter.
+- At each chapter boundary, measure coreference accuracy (% of
+  pronouns correctly resolved to their antecedent).
+- Chapter 1: the system has zero committed facts. Every pronoun
+  is resolved from ±5 context alone. Accuracy baseline.
+- Chapter 10: the system has hundreds of committed facts. Napoleon
+  is a rigid designator (masculine, boar, leader). "He" in chapter
+  10 resolves via `graph.nodes_matching(masculine + animate)` in O(k)
+  without even needing the Markov axis.
+- **If chapter-10 accuracy > chapter-1 accuracy with no parameter
+  change — only committed facts accumulating — the loop is closed
+  and the architecture works.**
+- If not, one of the 8 wiring steps is broken. Find which. Fix it.
+
+This is D10 in the plan. This is what proves it.
+
+**Why this dilutes:** future sessions will propose elaborate
+benchmarks (BLiMP, COGS, SuperGLUE, custom test suites). Those
+measure capability snapshots, not learning curves. The AGI test is
+a CURVE, not a POINT: does accuracy increase over the course of a
+single document without retraining? That's the measurement. One
+book. One metric. One curve. Rising = AGI. Flat = broken wire.
+
@@ -2,7 +2,7 @@
 
 > **Auto-injected at session start via SessionStart hook.**
 > Updated after every merged PR.
-> **Last updated:** 2026-04-20 post PR #224 (PR #225 open: plan + D0.6/D0.7 CodecParams).
+> **Last updated:** 2026-04-21 post PR #243 (D5+D7 + categorical-algebraic inference architecture).
 >
 > Purpose: prevent new sessions from hallucinating structure that
 > already exists or proposing features already shipped. Read this
@@ -14,6 +14,7 @@
 
 | PR | Merged | Title | What it added |
 |---|---|---|---|
+| **#243** | *(open)* | D5+D7 categorical-algebraic inference | `thinking_styles.rs` (490 LOC, 12 tests), `free_energy.rs` (347 LOC, 7 tests), `role_keys.rs` bind/unbind/recovery (295 LOC, 14 tests), `content_fp.rs` (98 LOC, 5 tests), `markov_bundle.rs` (250 LOC, 8 tests), `trajectory.rs` (298 LOC, 4 tests). Plans: `categorical-algebraic-inference-v1.md` (496 lines). Knowledge: `paper-landscape-grammar-parsing.md`, `session-2026-04-21-categorical-click.md`. CLAUDE.md § The Click (P-1). 12 epiphanies. |
 | **#225** | *(open)* | Codec-sweep plan + D0.6/D0.7 CodecParams | 9-commit plan (`codec-sweep-via-lab-infra-v1.md`, Rules A-F, 9 starter YAMLs, CODING_PRACTICES audit) + `lance-graph-contract::cam` CodecParams/Builder/precision-ladder validation (14 tests). 147/147 contract suite |
 | **#224** | 2026-04-20 | lab = API+Planner+JIT, thinking harvest, I11 measurability | `lab-vs-canonical-surface.md` extended: three-part lab stack (API + Planner + JIT), thinking-harvest subsection (REST/Cypher → `{rows, thinking_trace}` = the AGI magic bullet), I11 invariant (every layer L0→L4 emits harvest-ready trace; no black-box short-circuits) |
 | **#223** | 2026-04-20 | LAB-ONLY firewall + AGI-as-SoA + I1-I10 | `lab-vs-canonical-surface.md` initial doc: canonical consumer = `UnifiedStep`/`OrchestrationBridge`, Wire DTOs are lab quarantine. AGI = (topic, angle, thinking, planner) = struct-of-arrays consuming cognitive-shader-driver. 10 cross-cutting invariants I1-I10 (BindSpace read-only, canonical `simd::*` import, temporal budgets, temperature hierarchy, thinking IS AdjacencyStore, weights are seeds, per-cycle cascade, 4096 surface, three DTO families, HEEL/HIP/BRANCH/TWIG/LEAF) |
@@ -28,7 +29,7 @@
 
 Types that EXIST — do NOT re-propose them:
 
-**`grammar/`**: `FailureTicket`, `PartialParse`, `CausalAmbiguity`, `TekamoloSlots`, `TekamoloSlot`, `WechselAmbiguity`, `WechselRole`, `FinnishCase`, `finnish_case_for_suffix`, `NarsInference`, `inference_to_style_cluster`, `ContextChain` (with coherence_at / total_coherence / replay_with_alternative / disambiguate / DisambiguationResult / WeightingKernel), `RoleKey` + 47 `LazyLock<RoleKey>` instances + `Tense` enum + `finnish_case_key / tense_key / nars_inference_key` lookups.
+**`grammar/`**: `FailureTicket`, `PartialParse`, `CausalAmbiguity`, `TekamoloSlots`, `TekamoloSlot`, `WechselAmbiguity`, `WechselRole`, `FinnishCase`, `finnish_case_for_suffix`, `NarsInference`, `inference_to_style_cluster`, `ContextChain` (with coherence_at / total_coherence / replay_with_alternative / disambiguate / DisambiguationResult / WeightingKernel), `RoleKey` + 47 `LazyLock<RoleKey>` instances + `Tense` enum + `finnish_case_key / tense_key / nars_inference_key` lookups, **`RoleKey::bind/unbind/recovery_margin`** (slice-masked XOR), **`Vsa10k`** + `VSA_ZERO` + `vsa_xor` + `vsa_similarity`, **`GrammarStyleConfig`** + **`GrammarStyleAwareness`** + `revise_truth` + `ParseOutcome` + `divergence_from`, **`FreeEnergy`** + **`Hypothesis`** + **`Resolution`** (Commit / Epiphany / FailureTicket) + `from_ranked` + thresholds.
 
 **`crystal/`**: `Crystal` trait, `CrystalKind`, `TruthValue`, `UNBUNDLE_HARDNESS_THRESHOLD = 0.8`, `CrystalFingerprint` (Binary16K / Structured5x5 / Vsa10kI8 / Vsa10kF32), `Structured5x5`, `Quorum5D`, `SentenceCrystal`, `ContextCrystal`, `DocumentCrystal`, `CycleCrystal`, `SessionCrystal`, sandwich layout constants.
 
 
@@ -35,6 +35,44 @@
 
 ---
 
+## #243 — D5+D7 categorical-algebraic inference architecture (2026-04-21)
+
+**Confidence (2026-04-21):** Working. 175/175 contract, 63/63 deepnsm (grammar-10k).
+
+**Added:**
+- `contract::grammar::thinking_styles` — `GrammarStyleConfig`, `GrammarStyleAwareness` (NARS-revised `HashMap<ParamKey, TruthValue>`), `revise_truth`, `ParseOutcome` (5 polarities), `divergence_from(prior)` (KL term). 490 LOC, 12 tests.
+- `contract::grammar::free_energy` — `FreeEnergy` (likelihood + KL → total), `Hypothesis` (role fillers + Pearl 2³ mask), `Resolution` (Commit / Epiphany / FailureTicket), `from_ranked` classifier, `HOMEOSTASIS_FLOOR` / `EPIPHANY_MARGIN` / `FAILURE_CEILING`. 347 LOC, 7 tests.
+- `contract::grammar::role_keys` — `RoleKey::bind/unbind/recovery_margin` (slice-masked XOR), `Vsa10k` type alias, `VSA_ZERO`, `vsa_xor`, `vsa_similarity`, `word_slice_mask` helper. +295 LOC, +14 tests (5-role lossless superposition verified).
+- `deepnsm::content_fp` — 10K-dim content fingerprints from COCA vocab ranks (SplitMix64). 98 LOC, 5 tests. Feature-gated: `grammar-10k`.
+- `deepnsm::markov_bundle` — `MarkovBundler` (±5 ring buffer, role-key bind, braiding via `vsa_permute`, XOR-superpose, `WeightingKernel`). 250 LOC, 8 tests.
+- `deepnsm::trajectory` — `Trajectory` (Think carrier): `role_bundle`, `mean_recovery_margin`, `ambient_similarity`, `free_energy`, `resolve`. 298 LOC, 4 tests.
+- `CLAUDE.md` § The Click (P-1): top-of-file architecture diagram + 3 simplicity invariants + shader-cant-resist + thinking-is-a-struct + tissue-not-storage + grammar-of-awareness + 2 litmus tests.
+- `.claude/plans/categorical-algebraic-inference-v1.md` (496 lines): meta-architecture proving 5 operations are 1 algebraic substrate, grounded in 8-paper proof chain.
+
+**Locked:**
+- `RoleKey::bind` is slice-masked XOR (categorically optimal per Shaw 2501.05368 Kan extension theorem). Not a design choice — a theorem consequence.
+- `FreeEnergy = (1 - likelihood) + KL` where likelihood = mean role recovery margin, KL = `awareness.divergence_from(prior)`. Three thresholds: F<0.2 commit, ΔF<0.05 epiphany, F>0.8 escalate.
+- NARS revision asymptotes at φ-1 ≈ 0.618 (golden ratio confidence ceiling). Feature, not bug. Permanent epistemic humility.
+- Markov = XOR of braided sentence VSAs. No HMM. No transition matrix. No weights.
+- Thinking is a struct (not a service, not a function). The DTO carries cognition as identity.
+- AriGraph/episodic/CAM-PQ are thinking tissue (organs of Think), not storage services.
+- Object-does-the-work test: free function on carrier's state = reject. Method on carrier = accept.
+- Five-lens test: every new type serves Parsing / Free-Energy / NARS / Memory / Awareness or is drift.
+
+**Deferred:**
+- Steps 4-8 of the 8-step wiring sequence (pipeline, AriGraph commit, global context, awareness revision, KL feedback). Three PRs to close the loop.
+- D10 Animal Farm benchmark (the AGI test: chapter-10 accuracy > chapter-1 accuracy).
+- Cross-lingual bundling (needs parallel corpora).
+- ONNX arc model (D9, D11).
+
+**Docs:**
+- `.claude/knowledge/paper-landscape-grammar-parsing.md` — 14 papers in 3 tiers.
+- `.claude/knowledge/session-2026-04-21-categorical-click.md` — session handover with 12 critical insights + 7 anti-patterns.
+- `.claude/board/EPIPHANIES.md` — 12 new epiphanies with "why this dilutes" warnings.
+- `.claude/board/INTEGRATION_PLANS.md` — `categorical-algebraic-inference-v1` entry prepended.
+
+---
+
 ## #225 — Codec-sweep plan + D0.6/D0.7 CodecParams types (merged 2026-04-20)
 
 **Confidence (2026-04-20):** Working. 147/147 contract suite passing (133 prior + 14 new).
 
@@ -114,17 +114,17 @@ early — CausalityFlow extension deferred). Plan path:
 
 | D-id | Title | Status | PR / Evidence |
 |---|---|---|---|
-| D2  | DeepNSM emits `FailureTicket` on low coverage | **Queued** | — |
+| D2  | DeepNSM emits `FailureTicket` on low coverage (wiring step 4) | **Queued** | — |
 | D3  | Grammar Triangle wired into DeepNSM via `triangle_bridge.rs` | **Queued** | — |
-| D5  | Markov ±5 SPO+TEKAMOLO bundler with role-indexed VSA | **Queued** | — |
-| D7  | NARS-tested grammar thinking styles + active-inference free-energy + RoleKey-as-operator | **In progress** | branch `claude/teleport-session-setup-wMZfb` — `thinking_styles.rs` (12 tests), `free_energy.rs` (7 tests), `role_keys.rs` bind/unbind/recovery_margin (12 tests incl 5-role lossless superposition), `divergence_from(prior)`, Finnish case patch |
+| D5  | Markov ±5 bundler + Trajectory + content_fp (wiring steps 1-3) | **Shipped** | PR #243 — `content_fp.rs` (98 LOC, 5 tests), `markov_bundle.rs` (250 LOC, 8 tests), `trajectory.rs` (298 LOC, 4 tests). 63 deepnsm tests pass. |
+| D7  | Thinking styles + free-energy + RoleKey-as-operator | **Shipped** | PR #243 — `thinking_styles.rs` (490 LOC, 12 tests), `free_energy.rs` (347 LOC, 7 tests), `role_keys.rs` bind/unbind/recovery_margin (295 LOC added, 14 tests). 175 contract tests pass. |
 
 ### Phase 3 — Queued
 
 | D-id | Title | Status | PR / Evidence |
 |---|---|---|---|
-| D8  | Story-context bridge (AriGraph episodic + triplet-graph + orthogonal global-context) | **Queued** | — |
-| D10 | Forward-validation harness (Animal Farm benchmark) | **Queued** | — |
+| D8  | Story-context bridge: AriGraph commit + global_context + contradiction (wiring steps 5-6) | **Queued** | — |
+| D10 | Forward-validation harness (Animal Farm: chapter-10 > chapter-1 accuracy = AGI test) | **Queued** | — |
 
 ### Phase 4 — Backlog