From eb4477566deee55bef816c17be37ad436a29ce2c Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 11:45:39 +0000 Subject: [PATCH 01/15] probe(stoic-turing): Q3 standing-wave falsification + Q4 HHTL audit MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Q3 verdict: (B) Damped relaxation + (D) wave vocabulary. The executable substrate is feed-forward bind/bundle/cosine with saturating gates and a hard-capped contraction loop (fe *= 0.5, depth < 9). No closed feedback, no vsa16k_permute on the f32 carrier, no phase decode, no self-sustaining field. global_context reshaping next cycle: 0 source hits. Key gap: vsa_permute exists only on the binary VsaVector in ndarray (correct, norm-preserving). The Vsa16kF32 "Click" carrier has no permutation — the braid that the architecture rests on is missing from the carrier it is claimed on. Latent bug identified: kv_bundle::unbundle_from uses wrapping_sub (raw subtraction) as the inverse of a weighted-average bundle — not an inverse. Q4 verdict: HHTL is addressing — but only in hhtl.rs. high_heel.rs is 0% address algebra. The two files share a name but are different concepts. hhtl.rs: NiblePath { path: u64, depth: u8 }, fan-out 16, nibble-shift prefix arithmetic. is_ancestor_of is the kernel — 4 lines, const, O(1). Every other relation (descendant, sibling, common_ancestor) derives from it. Clean. high_heel.rs breakdown (~900 lines): - ~290 LOC: LensProfile/LensConfig/LENS_REGISTRY — encoding calibration, zero addressing. Should move out of lance-graph-contract entirely. - ~170 LOC: BasinAccumulator + calibrate — online metric clustering (L1 distance, EMA centroids). Not derivable from prefix axioms. - scent() — the advertised 95%-rejection HEEL pre-filter — is dead code in the merge path. BasinAccumulator::ingest never calls it. - The test ships its own failure diagnosis: eprintln!("Low merge ratio... threshold may be too tight") and "texts not clustering meaningfully". 
Refactor prescription: - Keep hhtl.rs as-is (the real HHTL kernel). - Split high_heel.rs into high_heel_container.rs + basin_accumulator.rs + lens_profile.rs. - Wire scent() into ingest() or remove the cascade claim. https://claude.ai/code/session_0147hSzjmWZDuy2MSQNrhEK5 --- docs/probes/q3-standing-wave-falsification.md | 257 ++++++++++++++++++ docs/probes/q4-hhtl-audit.md | 250 +++++++++++++++++ 2 files changed, 507 insertions(+) create mode 100644 docs/probes/q3-standing-wave-falsification.md create mode 100644 docs/probes/q4-hhtl-audit.md diff --git a/docs/probes/q3-standing-wave-falsification.md b/docs/probes/q3-standing-wave-falsification.md new file mode 100644 index 00000000..1a0adf9d --- /dev/null +++ b/docs/probes/q3-standing-wave-falsification.md @@ -0,0 +1,257 @@ +# Q3 Probe — Standing-Wave Falsification + +> **Branch:** `claude/stoic-turing-M0Eiq` +> **Date:** 2026-06-06 +> **Files read:** `crystal/fingerprint.rs`, `cognitive_shader.rs`, `collapse_gate.rs`, +> `cycle_accumulator.rs`, `crystal/cycle.rs`, `recipe_kernels.rs`, `atoms.rs`, +> `planner/src/cache/kv_bundle.rs`, `ndarray/src/hpc/vsa.rs` +> **Method:** Read all VSA, braid, permutation, bundle, and cognitive-shader source in +> lance-graph-contract + lance-graph-planner. Answer 7 questions with source citations. +> No narrative — executable code only. + +--- + +## Classification + +### **(B) Damped relaxation, with (D) graph propagation with wave vocabulary as the framing layer** + +There is no standing wave. The only state-evolution code that exists is either +(i) a bounded geometric decay to a fixed point, or +(ii) a running weighted-average accumulator with distance readout. +The "wave / braid / Markov trajectory / standing field" vocabulary in CLAUDE.md is almost +entirely doc-comment and architecture prose. The executable substrate is feed-forward +bind/bundle/cosine plus saturating quantizers. Every loop has a hard iteration cap or runs +once per input with no re-injection. 
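The two shapes named in (i) and (ii) can be sketched in isolation. This is a minimal illustration on scalars, not the crate's code: the `NOISE_FLOOR` value, the `depth += 1` increment, and the `RunningMean` type are assumptions made for the example.

```rust
/// Hypothetical noise floor; the real `NOISE_FLOOR` value is not quoted in this probe.
const NOISE_FLOOR: f32 = 0.01;

/// (i) Bounded geometric decay: halve `fe` until it reaches the floor
/// or the depth cap, whichever comes first.
fn descend(mut fe: f32) -> (f32, u32) {
    let mut depth = 0u32;
    while fe > NOISE_FLOOR && depth < 9 {
        fe *= 0.5;
        depth += 1; // assumed: the cited loop must advance depth somewhere
    }
    (fe, depth)
}

/// (ii) Running mean: the stored value moves only when a caller writes a sample.
struct RunningMean {
    value: f32,
    epoch: u32,
}

impl RunningMean {
    /// value_{n+1} = (epoch * value_n + head) / (epoch + 1)
    fn set(&mut self, head: f32) {
        let n = self.epoch as f32;
        self.value = (n * self.value + head) / (n + 1.0);
        self.epoch += 1;
    }
}
```

Under these assumptions `descend` can never take more than 9 steps, and `RunningMean` changes only when `set` is called. Neither recirculates state on its own.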
+ +--- + +## 7-Question Answers (source-cited) + +### Q1 — What is the evolution operator? + +**There is no single VSA state → next-state operator in the contract or planner crates.** + +What exists: + +- **Feed-forward encode primitives** (`crystal/fingerprint.rs:367–505`): + `vsa_bind` / `vsa16k_bind` (elementwise multiply), + `vsa_bundle` / `vsa16k_bundle` (elementwise add), + `vsa_superpose` (weighted add), + `vsa_cosine` / `vsa16k_cosine` (normalized readout). + All pure functions, `(state, state) → state` except cosine, which returns a scalar. None iterate. None feed back. + +- **The only accumulating "state machine"** is `AttentionMatrix::set` in + `planner/src/cache/kv_bundle.rs:75–84`: + `gestalt_{n+1} = (epoch·gestalt_n + head) / (epoch+1)` — a running mean, + updated only by external writes. Remove the writer; it is static. + +- **The "F descent"** — the advertised dispatch engine — is + `recipe_kernels.rs:255`: `while fe > NOISE_FLOOR && depth < 9 { fe *= 0.5 }`. + +Summary: **a pipeline of pure maps with saturating gates, not a recurrence.** + +--- + +### Q2 — Linear, piecewise-linear, or nonlinear? + +All three coexist, segregated by stage: + +| Stage | Linearity | Source | +|-------|-----------|--------| +| `vsa16k_bundle` | **Linear** — elementwise `+=`, no normalization, grows unboundedly | `fingerprint.rs:479–487` | +| `vsa16k_cosine` | **Nonlinear** — divides by `norm_a.sqrt()*norm_b.sqrt()`; `<1e-12 → 0.0` | `fingerprint.rs:490–505` | +| `I4x32::pack` | **Nonlinear** — `clamp(-8, 7)` saturating quantizer | `atoms.rs:97, :153` | +| `Vsa10kI8` | **Nonlinear** — `.clamp(-1.0, 1.0)` | `fingerprint.rs:244` | +| `GateDecision` | **Piecewise-linear** — Flow/Block/Hold threshold | `collapse_gate.rs:59` | +| F-threshold commit | **Piecewise-linear** — `if free_energy < 0.2 → Commit` | `recipe_kernels.rs:680` | +| `fe *= 0.5` loop | **Linear per step** (`fe ↦ 0.5·fe`, a contraction); the decay is exponential in depth and is cut off by the loop guard — the central "dynamic" | `recipe_kernels.rs:255–261` | + +--- + +### Q3 — Which part is unitary/permutation-like (norm-preserving)? 
+ +**`vsa_permute` does not exist on the f32 carrier.** + +The cyclic bit rotation lives in `ndarray/src/hpc/vsa.rs:326–349`, operating on the +**binary `VsaVector`** (`words: [u64]`, 16384 *bits*). It is a genuine cyclic permutation +— `dst_bit = (src_bit + shift) % 16384` — norm-preserving in Hamming space, with a +correct round-trip test at `vsa.rs:322–324`. + +The `Vsa16kF32` f32 carrier — the one CLAUDE.md calls the "Click carrier" — has +**no permutation primitive**. `fingerprint.rs:163` and `fingerprint.rs:295` have +doc-comments pointing at the ndarray binary primitive; the f32 sandwich never calls it. + +The `vsa_sequence` binary path (`vsa.rs:371–378`) permutes-by-index then bundles — +but there is **no inverse-permute readout anywhere**: `vsa_clean` (`vsa.rs:394`) does a +flat Hamming scan with no de-braiding. Position is written once and never decoded back. + +**The ρ^d braid is norm-preserving on the binary carrier it actually lives on. +It is structurally absent from the f32 carrier the "Click" architecture rests on.** + +--- + +### Q4 — Which part is dissipative? + +Five independent sinks: + +1. **Cosine normalization** (`fingerprint.rs:490–505`) — discards magnitude entirely. +2. **Saturating quantization** — `I4x32::pack` `clamp(-8,7)` (`atoms.rs:97`); + `Vsa10kI8` `.clamp(-1.0,1.0)` (`fingerprint.rs:244`); + `structured_from_vsa10k` `.round().clamp(0,255)` (`fingerprint.rs:274`). +3. **Threshold gates** — `vsa16k_to_binary16k_threshold` sign-collapse to 1 bit + (`fingerprint.rs:451`); `GateDecision::Block/Hold`. +4. **α-saturation early termination** — `MergeMode::AlphaFrontToBack` documents + `α_acc += α_i*(1-α_acc); if α_acc > 0.99 break` (`collapse_gate.rs:36–46`). + The formula is doc-only; the enum variant carries no executed math. +5. **The F-descent** — `fe *= 0.5` (`recipe_kernels.rs:259`) is a contraction map. + The system's central "dynamic" is literally exponential damping to zero, hard-capped + at 9 iterations. 
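Why sinks 1 and 2 count as dissipative can be shown with two small stand-ins. This is a hedged sketch: `cosine` and `quantize_i4` are illustrative helpers written for this note, in the spirit of the cited `vsa16k_cosine` and `clamp(-8, 7)` code, not copies of it.

```rust
/// Cosine similarity on plain slices. Returns 0.0 when the norm product is
/// tiny, in the spirit of the `<1e-12 -> 0.0` guard cited above (assumption:
/// the real guard may be applied to a different quantity).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum();
    let nb: f32 = b.iter().map(|x| x * x).sum();
    let denom = na.sqrt() * nb.sqrt();
    if denom < 1e-12 {
        0.0
    } else {
        dot / denom
    }
}

/// Saturating 4-bit quantizer: everything outside [-8, 7] collapses to the edge.
fn quantize_i4(x: f32) -> i8 {
    x.round().clamp(-8.0, 7.0) as i8
}
```

Scaling a vector by 10 leaves its cosine against the original at 1.0, so magnitude is not recoverable from the readout. Likewise `quantize_i4(7.0)` and `quantize_i4(100.0)` return the same value, so the map has no inverse.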
+ +--- + +### Q5 — Does a non-trivial field persist after input removal? + +**No. There is no closed feedback loop in executed code.** + +- `CycleCrystal` (`crystal/cycle.rs:10–18`) — a frozen data record (fields + getters), + no `step`/`evolve` method. +- `CycleAccumulator` (`cycle_accumulator.rs`) — a `Vec` batch buffer; `drain()` + empties it, no carry-over. +- `CollapseGateEmission` (`collapse_gate.rs:177`) — a baton-list DTO. CLAUDE.md itself + states `Vsa16kF32` does NOT cross mailbox boundaries; the bundle is "an ephemeral + computation, never persisted." +- `gestalt` in `kv_bundle.rs` — persists across `set()` calls but is a running mean + updated by external writes only; remove the writer and it is static. +- `global_context += fact → reshapes NEXT cycle` — prose in CLAUDE.md. + `grep global_context src/`: **zero hits** in executed source. + +--- + +### Q6 — Is phase preserved, quantized, or merely implied? + +**Merely implied, and on the f32 carrier, absent.** + +Real VSA phase-coding requires permute-at-encode AND permute-aware-unbind. +In the binary path (`ndarray/vsa.rs:371–378`), `vsa_sequence` permutes-by-index then +bundles. But there is no inverse-permute at readout — `vsa_clean` does a flat Hamming +scan with no de-braiding. **Phase is written once and never distinguished at readout.** + +On the `Vsa16kF32` carrier — the actual "Click" substrate — no permute is applied at all. +Phase is structurally absent, not quantized. + +--- + +### Q7 — Minimum test to falsify + +**This is the standing-wave falsification test. The system fails it as written.** + +```rust +#[test] +fn standing_wave_or_damped_relaxation() { + // Build initial state + let role_key = vsa16k_role_key(RoleSlice::SUBJECT); + let content = vsa16k_content_fp(b"hello"); + let s0 = vsa16k_bind(&role_key, &content); + + // Remove all external input. 
Attempt self-recurrence: + // s_{n+1} = normalize(vsa16k_bundle([s_n])) -- the strongest possible recurrence + let mut s = s0.clone(); + let mut energies = Vec::new(); + for _ in 0..100 { + // vsa16k_bundle([s]) is identity (sum of one vector), so cosine stays 1.0. + // With normalization each step: s approaches unit-norm fixed point. + let norm: f32 = s.0.iter().map(|x| x*x).sum::<f32>().sqrt(); + energies.push(norm); + s.0.iter_mut().for_each(|x| *x /= norm.max(1e-12)); // normalize + } + + // Standing-wave criterion (would pass iff true wave): + // norm stays bounded AND bounded-away-from-zero AND energy recirculates. + // Actual result: norm monotonically → 1.0 (fixed point), never oscillates. + // That is damped relaxation, not a standing wave. + + let last = *energies.last().unwrap(); + // Should oscillate if wave; instead it's a fixed point: + assert!((last - 1.0_f32).abs() < 0.01, "fixed point, not wave: norm = {last}"); + + // What would prove a standing wave: inserting vsa16k_permute(n) in the loop + // so s_{n+1} = permute(normalize(s_n)) would produce a periodic orbit. + // vsa16k_permute does not exist. That is the gap. +} +``` + +**What would falsify the (B) verdict:** A function `f(s: Vsa16kF32) -> Vsa16kF32` that +(a) applies a norm-preserving rotation on the f32 carrier, and (b) is iterated without +a hard depth cap and without per-step external input, producing bounded non-zero energy +at n=∞. No such `f` exists in the codebase. 
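For orientation, here is what condition (a) could look like on a plain `f32` slice. This is a sketch only: `permute_f32` and `l2_norm` are hypothetical helpers named for this note, they operate on `Vec<f32>` rather than `Vsa16kF32`, and nothing like them exists in the audited crates.

```rust
/// Hypothetical cyclic permutation on an f32 carrier:
/// the element at index i moves to (i + shift) % len.
fn permute_f32(v: &[f32], shift: usize) -> Vec<f32> {
    let mut out = v.to_vec();
    if !out.is_empty() {
        let k = shift % out.len();
        out.rotate_right(k);
    }
    out
}

/// Euclidean norm, used to check that the permutation moves no energy in or out.
fn l2_norm(v: &[f32]) -> f32 {
    v.iter().map(|x| x * x).sum::<f32>().sqrt()
}
```

A permutation only reorders components, so the norm is unchanged, and iterating a shift of 1 on a length-`n` vector returns to the start after `n` steps. That is the periodic orbit the test comment describes. Permuting by `len - shift` undoes a permutation by `shift`, which is the readout-side inverse that Q3 and Q6 report as missing.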
+ +--- + +## Load-Bearing Stones vs Cathedral Fog + +### Stones (correct and confirmed): + +| Claim | Status | Source | +|-------|--------|--------| +| Binary `vsa_permute` is norm-preserving cyclic rotation | ✅ Correct | `ndarray/vsa.rs:326–349` | +| Role-key orthogonality via disjoint slices | ✅ Correct-by-construction | `vsa/roles.rs`, `grammar/role_keys.rs` | +| Baton carries `(u16 target, CausalEdge64)` across mailbox boundaries | ✅ Correct | `collapse_gate.rs:177` | +| CAM-PQ codec is separate from VSA (I-VSA-IDENTITIES) | ✅ Correct | enforced architecturally | +| `vsa16k_bind` = elementwise multiply (Hadamard product) | ✅ Correct | `fingerprint.rs:468` | +| `vsa16k_bundle` = elementwise sum, no normalization | ✅ Correct | `fingerprint.rs:479` | + +### Fog (vocabulary in prose, absent in executable code): + +| Claim | Status | Gap | +|-------|--------|-----| +| ρ^d braiding on `Vsa16kF32` | ❌ Missing | `vsa_permute` exists only on binary carrier | +| Phase decode / unbind recovers position | ❌ Missing | No inverse-permute at readout | +| `global_context += fact` reshapes next cycle | ❌ Missing | `grep global_context`: 0 hits in src | +| "Shader can't resist thinking" active-inference loop | ❌ Not a loop | `fe *= 0.5`, depth < 9, then stops | +| Standing-wave / self-sustaining field | ❌ Not present | No feedback loop in executed code | + +--- + +## Latent Bug (separate from the wave question) + +`unbundle_from` in `kv_bundle.rs:29–33` uses `wrapping_sub` on i16: + +```rust +pub fn unbundle_from(&mut self, head: &HeadPrint) { + for (g, h) in self.gestalt.iter_mut().zip(head.0.iter()) { + *g = g.wrapping_sub(*h); // NOT the inverse of weighted-average bundle + } +} +``` + +`bundle_into` divides by total weight (a weighted average). `unbundle_from` does raw +subtraction. **These are not inverses.** After a few epochs, `gestalt.unbundle_from()` +produces a vector that has no predictable relationship to what was bundled. 
+Additionally, `wrapping_sub` silently flips sign on overflow. +This is a real bug independent of the wave question. + +--- + +## Prescription + +**Do not add `vsa16k_permute` to plug the standing-wave gap unless you first prove the +standing-wave architecture provides measurable benefit over the current feed-forward +system.** The feed-forward bind/bundle/cosine + threshold pipeline is: +- Correct +- Well-tested +- Delivers real value (VSA encoding, role-indexed readout, SPO triple commit) + +The standing-wave framing is an architectural aspiration that requires: +1. `vsa16k_permute` on the f32 carrier (the rotation) +2. Inverse-permute at readout (the decoding) +3. A closed-loop recurrence without a hard iteration cap +4. Evidence that the resulting system does something the feed-forward version does not + +Before building stained glass, prove the foundation: write the falsification test above +and make it pass. Until it passes, CLAUDE.md §"The Click" should be marked +`[ASPIRATIONAL — not yet executable]` rather than presented as current implementation. + +**The honest architecture description:** a feed-forward VSA encode/readout pipeline with +threshold-gated commit, a binary permutation in ndarray (correctly norm-preserving but +with no inverse-decode), and a baton-based causal handoff. That is already interesting +and already shipped. The standing-wave framing adds nothing to it today. diff --git a/docs/probes/q4-hhtl-audit.md b/docs/probes/q4-hhtl-audit.md new file mode 100644 index 00000000..8203a22b --- /dev/null +++ b/docs/probes/q4-hhtl-audit.md @@ -0,0 +1,250 @@ +# Q4 Probe — HHTL Audit: Address Algebra vs Accretion + +> **Branch:** `claude/stoic-turing-M0Eiq` +> **Date:** 2026-06-06 +> **Files audited:** `crates/lance-graph-contract/src/hhtl.rs` (19 KB, ~470 lines), +> `crates/lance-graph-contract/src/high_heel.rs` (42 KB, ~900 lines) +> **Method:** Read both files completely. Classify every struct/enum/function +> into exactly one of five categories. 
No narrative — code citations. + +--- + +## Headline finding + +**The address algebra lives almost entirely in `hhtl.rs`. `high_heel.rs` contains +essentially zero address algebra.** + +The two files share the "HHTL" brand but are different concepts. +`hhtl.rs` is a clean nibble-addressed prefix tree. `high_heel.rs` is a binary container +format + an online metric clustering accumulator + a lens-calibration subsystem. +They communicate in module docs, not in code. + +--- + +## Formal definition of an HHTL address (what the code actually says) + +**`hhtl.rs` — the real address: `NiblePath { path: u64, depth: u8 }`** + +- Fan-out: 16 (`FAN_OUT = 16`). One nibble (4 bits) per level, root-first. +- Max depth: 16 (`MAX_DEPTH = 16`). Max 16 nibbles = 64 bits. +- **Truncation is explicit and exact:** + - `parent()` → `path >> 4, depth - 1` + - `child(n)` → `(path << 4) | n, depth + 1` + - `parent()` exactly undoes `child(n)`. No ambiguity. +- **Containment is a single prefix kernel** — `is_ancestor_of` shifts `other.path` + right by `4 * (other.depth - self.depth)` bits (one nibble per level of depth difference) and compares. One function, + called by all relation operations. No ad-hoc re-implementation anywhere. + +**`high_heel.rs` — NOT an address.** + +`Heel::dn_address: u64` is an opaque identity key. It is never truncated, +never prefix-compared, never tested for containment. The field name contains +"address" but the semantics are identity, not addressing. The module doc +claims "HHTL cascade mapping (HEEL=scent / HIP=palette / TWIG=SpoBase17 / LEAF=full +planes)" — this is a layout legend in a comment. **No code routes by prefix.** + +--- + +## Classification table + +### `hhtl.rs` (~470 lines incl. 
tests) + +| Symbol | Category | +|--------|----------| +| `NiblePath { path, depth }` | **1 — Address algebra** | +| `FAN_OUT = 16`, `MAX_DEPTH = 16`, `EMPTY` | **1 — Address algebra** (the axioms) | +| `root()`, `child(n)`, `try_child(n)`, `parent()` | **1 — Address algebra** (extension / truncation) | +| `basin()`, `leaf()`, `depth()`, `packed()` | **1 — Address algebra** (prefix readout) | +| `is_ancestor_of` | **1 — Address algebra** (the containment kernel — see §6 below) | +| `is_descendant_of` | **3 — Topology consequence** (`other.is_ancestor_of(self)` delegation) | +| `is_sibling_of` | **3 — Topology consequence** (same parent, different child) | +| `common_ancestor` | **3 — Topology consequence** (LCA via depth alignment) | +| `is_full` | **2 — LOD consequence** (depth == MAX\_DEPTH = u64 capacity limit) | +| `FieldMask` usage in test impls | **3 — Topology consequence** | + +### `high_heel.rs` (~900 lines incl. tests) + +| Symbol | Category | +|--------|----------| +| `CONTAINER_WORDS/BYTES`, `HEEL_WORDS`, `MAX_EDGES` | **5 — Unrelated utility** (container layout constants) | +| `SpoBase17` + `l1_distance`, `l1_subject`, `scent()` | **5 — Unrelated utility** (distance metrics on i16 planes) | +| `Heel { dn_address, frequency, confidence, scent, plasticity, temporal }` | **5 — Unrelated utility** (NARS bitfield packing) | +| `pack_truth_meta`, `unpack_truth_meta` | **5 — Unrelated utility** (NARS bit serialization) | +| `HighHeelBGZ { buf: [u64; CONTAINER_WORDS] }` | **5 — Unrelated utility** (wire container) | +| `HighHeelBGZ::new`, `add_edge`, `edge_count`, `wire_size` | **5 — Unrelated utility** | +| `pack`/`unpack`, `spo_to_bytes`/`bytes_to_spo` | **5 — Unrelated utility** (serialization) | +| `is_crystallized` | **4 — Special-case accretion** (NARS confidence threshold state machine) | +| `revise_truth` | **4 — Special-case accretion** (4-branch plasticity threshold magic numbers) | +| `BasinAccumulator`, `ingest`, `calibrate`, `stats` | **4 — 
Special-case accretion** (online metric clustering + EMA centroid) | +| `BasinStats` | **5 — Unrelated utility** (monitoring DTO) | +| `EncodingPath`, `LensProfile`, `LensProfile::build` | **5 — Unrelated utility / should move out** (encoding calibration — zero addressing) | +| `LensConfig`, `LensFamily`, `TokenizerFamily`, `LENS_REGISTRY` | **5 — Unrelated utility / should move out** (hardcoded model registry) | +| experiment / test harness | test infrastructure | + +--- + +## LOC estimate per category (both files combined, ~1370 lines) + +| Category | LOC | Where | +|----------|-----|-------| +| **1 — Address algebra** | ~110 | `hhtl.rs` only | +| **2 — LOD consequence** | ~8 | `hhtl.rs` only | +| **3 — Topology consequence** | ~70 | `hhtl.rs` only | +| **4 — Special-case accretion** | ~170 | `high_heel.rs` only | +| **5 — Unrelated utility / move out** | ~620 | `high_heel.rs` only | +| tests / experiments | ~390 | both | + +**Address algebra + direct consequences: ~190 LOC, ~14% of total. +All 14% lives in `hhtl.rs`. `high_heel.rs` is 0% address algebra.** + +--- + +## Verdict: Is HHTL fundamentally addressing? + +**Partial — and split along file lines.** + +`hhtl.rs` **is** fundamentally addressing. Every interesting behavior derives from two +axioms: `child = (path << 4) | n` and prefix-compare via right-shift. +`is_ancestor_of`, `is_descendant_of`, `is_sibling_of`, `common_ancestor`, `is_full`, +`parent` — all follow. No special cases. Claim vindicated *here*. + +`high_heel.rs` is **not** addressing. Belonging is decided by `l1_distance < threshold` +(metric clustering), not prefix containment. "HHTL" in this file is a branding label, +not a shared abstraction. The file has become a bucket: three independent subsystems +(wire container, basin clustering, lens calibration) filed under one name. 
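The extension and truncation axioms behind the `hhtl.rs` half of this verdict fit in a few lines. This is a standalone sketch, not the crate's type: `PathSketch` is a hypothetical stand-in for `NiblePath`, and returning `Option` from `parent()` at the root is an assumption made so the example cannot underflow.

```rust
/// Hypothetical stand-in for the nibble-addressed path described in this audit.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PathSketch {
    path: u64,
    depth: u8,
}

impl PathSketch {
    const fn root() -> Self {
        Self { path: 0, depth: 0 }
    }

    /// Extension: append one nibble. The sketch assumes depth < 16 and masks n to 4 bits.
    const fn child(self, n: u8) -> Self {
        Self {
            path: (self.path << 4) | (n as u64 & 0xF),
            depth: self.depth + 1,
        }
    }

    /// Truncation: drop the last nibble. None at the root (assumption for this sketch).
    const fn parent(self) -> Option<Self> {
        if self.depth == 0 {
            None
        } else {
            Some(Self {
                path: self.path >> 4,
                depth: self.depth - 1,
            })
        }
    }
}
```

With this shape, `root().child(3).child(7)` packs to `0x37` at depth 2, and `parent()` recovers `root().child(3)`. Membership in `high_heel.rs`, by contrast, is decided by a distance threshold and cannot be read off the bits of a path.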
+ +--- + +## The kernel that proves the concept + +`NiblePath::is_ancestor_of` in `hhtl.rs`: + +```rust +pub const fn is_ancestor_of(self, other: Self) -> bool { + if self.depth == 0 || self.depth > other.depth { + false + } else { + (other.path >> (4 * (other.depth as u32 - self.depth as u32))) == self.path + } +} +``` + +Four lines. Const, O(1). Every other relation reduces to it. This single function proves +"HHTL is addressing" — truncate `other` to `self`'s depth by shifting off trailing +nibbles, compare prefix. This is the load-bearing stone. + +--- + +## Three most egregious accretions in `high_heel.rs` + +### A1 — `LensProfile` / `LensConfig` / `LENS_REGISTRY` (~290 LOC) + +Encoding-distortion profiling with hardcoded `cos_range`, `gamma_offset`, +`TokenizerFamily` (jina-v3/bge-m3/reranker-v3), model-specific lens configs. +**Zero addressing, zero clustering, zero container relationship** to the rest of the file. +It is a transplanted calibration module. + +**Action:** Extract to `lens_profile.rs`. Long-term: move out of `lance-graph-contract` +entirely (the zero-deps invariant is not violated by its presence, but it has no business +in the contract surface — it is an encoding implementation detail). + +### A2 — `BasinAccumulator::calibrate` + EMA centroid drift (~120 LOC combined) + +Online metric clustering with exponential-moving-average centroid updates and +pairwise-distance percentile auto-threshold. Sophisticated but entirely metric, +not derivable from any prefix axiom. + +**Additional smell:** The test prints its own clustering failure: +`eprintln!("Low merge ratio … threshold may be too tight")`. The file ships +embedded evidence that the algorithm does not work well. + +**Action:** Extract to `basin_accumulator.rs` or a dedicated clustering module. 
+ +### A3 — `revise_truth` plasticity state machine + +```rust +let new_plasticity = if merged_c > 0.8 { 0 } else if merged_c > 0.6 { 1 } + else if merged_c > 0.3 { 2 } else { 3 }; +``` + +Four branches, three magic-number thresholds, zero citation. Not derivable from +address structure. Per `I-LEGACY-API-FEATURE-GATED`: magic thresholds must say +"hand-tuned" and cite the calibration experiment. These don't. + +**Action:** Extract to a `nars_truth` helper or document thresholds with rationale. + +--- + +## The dead cascade + +**`scent()` — the advertised 95%-rejection HEEL pre-filter — is dead code in the merge path.** + +`scent()` is computed on `SpoBase17` and tested in isolation. `BasinAccumulator::ingest` +does a flat linear scan over all basins computing full `l1_distance` and never calls +`scent()` as a pre-filter. The headline "cascade" performance claim has no implementation. + +If the pre-filter were wired: `ingest` would first check each basin's scent against the candidate's, +reject mismatches without computing `l1_distance`, and pay the full distance for only the k +scent-compatible basins (k << n) instead of all n. The scan itself would still visit every basin, but each rejection would cost one cheap scent check. Currently every basin pays the full distance. To fix: + +```rust +// In BasinAccumulator::ingest — proposed, not yet implemented: +for basin in &mut self.basins { + if !basin.scent.is_compatible(candidate.scent) { continue; } // HEEL gate + if basin.centroid.l1_distance(&candidate) < self.threshold { // HIP merge + // merge + } +} +``` + +--- + +## Refactor prescription + +1. **Keep `hhtl.rs` as-is.** It is clean, minimal, provably correct. Do not merge it into anything. +2. **Split `high_heel.rs` into three files:** + - `high_heel_container.rs` — `HighHeelBGZ`, `Heel`, `SpoBase17`, pack/unpack (~200 LOC) + - `basin_accumulator.rs` — `BasinAccumulator`, `BasinStats`, calibrate (~170 LOC) + - `lens_profile.rs` — `LensProfile`, `LensConfig`, `LensFamily`, `TokenizerFamily`, `LENS_REGISTRY` (~290 LOC). **Move out of contract surface eventually.** +3. 
**Wire `scent()` into `BasinAccumulator::ingest`** or delete the pre-filter claim from all docs. +4. **Cite or remove the plasticity thresholds** in `revise_truth`. + +--- + +## Proposed test that proves the kernel is correct + +```rust +#[test] +fn prefix_kernel_completeness() { + // Every relation must reduce to is_ancestor_of or the shift idiom. + let root = NiblePath::root(); + let a = root.child(3).child(7); + let b = root.child(3).child(7).child(2); + let c = root.child(3).child(9); + + // Containment + assert!(a.is_ancestor_of(b)); + assert!(!b.is_ancestor_of(a)); + assert!(!a.is_ancestor_of(c)); // different last nibble at depth 2 + + // Topology consequences follow + assert!(b.is_descendant_of(a)); + assert!(a.is_sibling_of(c)); // same parent (root.child(3)), different child + assert_eq!(a.common_ancestor(c), root.child(3)); + + // LOD consequence + assert!(!a.is_full()); // depth 2 < MAX_DEPTH + let full = (0..NiblePath::MAX_DEPTH).fold(NiblePath::root(), |p, i| p.child(i as u8 % 16)); + assert!(full.is_full()); + + // Cascade routing: truncating to any non-root depth gives a valid ancestor. + // Depth 0 is skipped: the kernel quoted above returns false when self.depth == 0, + // so the root would fail this assertion. + for depth in 1..=b.depth() { + let ancestor = NiblePath { path: b.path >> (4 * (b.depth() - depth) as u32), depth }; + assert!(ancestor.is_ancestor_of(b) || ancestor == b); + } +} +``` + +This test would fail if `is_ancestor_of` had an off-by-one, if `parent()` didn't invert +`child()`, or if the nibble-shift arithmetic were wrong. It does not test the claimed cascade +performance (O(k) scent pre-filter) because that code does not exist yet. 
From 674cdc281882bd90a77431ee052302fdeeb88e00 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 17:33:26 +0000 Subject: [PATCH 02/15] docs/probes: source-grounded particle/SoA envelope audit Audits the current implementation against the intended particle model (one mailbox-owned thought, one SoA envelope, immutable address identity, OGAR class inheritance, Lance versioning, CausalEdge64 as payload, explicit references, local pragmatics). Findings tagged Confirmed/Inferred/Absent/Contradiction per phase, all file-path grounded. Key results: two competing SoA envelopes (BindSpace global + MailboxSoA per-mailbox); deprecated Vsa16kF32 still live as the BindSpace cycle plane (65 KB/row); two incompatible CausalEdge64 types; OGAR::classes::from(address) reverse resolver absent (forward-only in OGAR, linear scan in lance-graph); references are explicit Copy handles with no ownership cycles; Lance versioning clean. Seven minimal corrections proposed, no theory-driven renames. --- docs/probes/particle-soa-envelope-audit.md | 493 +++++++++++++++++++++ 1 file changed, 493 insertions(+) create mode 100644 docs/probes/particle-soa-envelope-audit.md diff --git a/docs/probes/particle-soa-envelope-audit.md b/docs/probes/particle-soa-envelope-audit.md new file mode 100644 index 00000000..c52b04f8 --- /dev/null +++ b/docs/probes/particle-soa-envelope-audit.md @@ -0,0 +1,493 @@ +# Particle / SoA Envelope Audit — Does the Struct Geometry Measure What It Claims? + +> **Branch:** `claude/stoic-turing-M0Eiq` +> **Date:** 2026-06-06 +> **Scope:** Current code only. No standing-wave, no emergent-cognition, no +> philosophy. No assumption of a global carrier, wave field, f32 carrier, +> singleton thought, or distributed truth ownership. +> **Method:** Whole-file reads of the envelope, edge, reference, scheduler, +> bridge, and ontology surfaces in `lance-graph` + `OGAR`. Every claim is +> file-path grounded and tagged Confirmed / Inferred / Absent / Contradiction. 
+ +**Intended model under test:** +1. One mailbox-owned logical thought. +2. One SoA envelope represents that thought. +3. Identity from immutable semantic address space. +4. Schema inheritance via `OGAR::classes::from(address)`. +5. Lance versioning = self-through-time. +6. `CausalEdge64` = compact local causal/NARS claim **payload**. +7. References = explicit pointers, not copied truth. +8. Pragmatics = local state in the envelope. +9. AST adapter selection inherited from OGAR classes. +10. The question: does the geometry measure what it claims? + +--- + +## 1. File-path grounded findings + +### Phase 1 — Locate the actual thought envelope + +**There are TWO competing SoA envelopes in the tree. This is the single +biggest structural finding.** + +| Envelope | File | Role | Status | +|---|---|---|---| +| `MailboxSoA` | `crates/cognitive-shader-driver/src/mailbox_soa.rs` | The **intended** per-mailbox thought envelope (one owner, N rows) | **Confirmed** — matches intended model | +| `BindSpace` | `crates/cognitive-shader-driver/src/bindspace.rs` | The **older** global/process-wide SoA, still carrying the deprecated 65 KB/row `Vsa16kF32` `cycle` plane | **Contradiction** — co-exists with MailboxSoA, no migration seam | +| `MailboxSoaView` / `MailboxSoaOwner` (traits) | `crates/lance-graph-contract/src/soa_view.rs` | The **canonical zero-copy read/write contract** the envelope is supposed to satisfy | **Confirmed** — clean abstraction | + +- `MailboxSoA<1024>` columns (`mailbox_soa.rs`): `energy [f32;N]`, + `plasticity_counter [u8;N]`, `last_emission_cycle [u32;N]`, + `edges [CausalEdge64;N]`, `qualia [QualiaI4_16D;N]`, `meta [MetaWord;N]`, + `entity_type [u16;N]`; scalars `mailbox_id`, `current_cycle`, `w_slot`, + `threshold`. It emits a baton via `emit(MailboxId) -> CollapseGateEmission`. 
+ **This is the particle envelope the intended model describes.** +- `BindSpace` columns (`bindspace.rs`): `FingerprintColumns { content [u64×256], + cycle [f32×16_384], topic, angle, sigma }`, `EdgeColumn`, `QualiaI4Column` + (+ deprecated `QualiaColumn` f32×18), `MetaColumn`, `temporal`, `expert`, + `entity_type`, `ontology: Option>`. Per-row footprint + **71,713 bytes**, of which **65,536 bytes is the `cycle` `Vsa16kF32` plane**. +- The `ShaderDriver` (`driver.rs`) holds **both**: `bindspace: Arc` + **and** `mailboxes: HashMap>` (commented + "transitional per-mailbox routing (slice A2)"). The driver is mid-migration + from the global BindSpace to per-mailbox SoAs and **both surfaces are live**. + +**Phase 1 output:** + +| Tag | Finding | +|---|---| +| **Confirmed** | `MailboxSoA` is the per-thought envelope; `soa_view.rs` traits are the canonical contract; identity scalar is `mailbox_id: MailboxId (u32)`. | +| **Confirmed** | State for one thought is *intended* to be one `MailboxSoA`. | +| **Inferred** | The driver's `mailboxes: HashMap` is the active routing path; BindSpace is legacy being drained. | +| **Absent** | No single canonical envelope is *declared* as THE envelope — both compile, both are reachable. | +| **Contradiction** | Two SoA representations co-exist (`BindSpace` global + `MailboxSoA` per-mailbox). The deprecated `Vsa16kF32` carrier is **not** absent — it is the live `cycle` column of `BindSpace` at 65 KB/row, contradicting "no f32 carrier." | + +--- + +### Phase 2 — Identity audit + +**Identity = `mailbox_id: u32` (envelope identity) + `entity_type: u16` +(class identity pointer). The intended `OGAR::classes::from(address)` reverse +resolver does NOT exist in OGAR — its equivalent lives in lance-graph as a +linear scan.** + +- Envelope identity: `MailboxSoA.mailbox_id: MailboxId` (`= u32`), the corpus + root handle. 
`WitnessEntry.mailbox_ref: u32` preserves *full* identity across + cohort rotation (wider than the 6-bit W-slot index — good design). +- Class identity: `entity_type: u16` per row. `bindspace.rs:244` doc: + *"0 = untyped. Non-zero = 1-based index into Ontology.schemas."* +- Resolution (`driver.rs:331-338`): `etid = entity_type[row]` → if 0 no context, + else `OntologyRegistry::enumerate_first_with_entity_type_id(etid)` → + `MappingRow.ontology_context_id()` → `MulThresholdProfile::for_context(ctx_id)`. +- **The resolver is a linear `.find()` scan** (`lance-graph-ontology/src/registry.rs:327`) + over `SchemaPtr::entity_type_id()`, **not** an O(1) array index, despite the + "1-based index" doc wording. `entity_type_id` is packed in `SchemaPtr` bits 23..8. +- OGAR side: only the **forward** builder exists — + `ogar_ontology::class_identity(prefix, class_name) -> String` (e.g. + `"ogit-op/WorkPackage"`). **No `classes::from(address)`, no `class_for`, + no reverse resolver** anywhere in OGAR crates. The address string *is* the + class key (Confirmed); the reverse map is delegated to lance-graph's + `NiblePath` router + `OntologyRegistry`. +- Schema is stored **once per class** (`OntologyRegistry` `MappingRow`, idempotent + on checksum). **No schema is duplicated into SoA rows** — rows carry only the + `u16` pointer. (Confirmed — clean.) 
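The gap between the documented access model ("1-based index") and the implemented one (linear `.find()`) can be made concrete. This is a hedged sketch with hypothetical names: `Row`, `resolve_scan`, and `resolve_index` are stand-ins written for this note and do not mirror the real `MappingRow` or `OntologyRegistry` signatures.

```rust
/// Hypothetical stand-in for a registry row; field names are illustrative only.
struct Row {
    entity_type_id: u16,
    context_id: u32,
}

/// Linear scan keyed by id: the shape this audit reports for the current resolver.
fn resolve_scan(rows: &[Row], etid: u16) -> Option<&Row> {
    if etid == 0 {
        return None; // 0 = untyped
    }
    rows.iter().find(|r| r.entity_type_id == etid)
}

/// Direct 1-based index: the shape the doc comment describes.
/// Only equivalent to the scan if ids are dense and assigned in append order (assumption).
fn resolve_index(rows: &[Row], etid: u16) -> Option<&Row> {
    (etid as usize).checked_sub(1).and_then(|i| rows.get(i))
}
```

The two agree only while `rows[i].entity_type_id == i + 1` holds for every row. If that invariant is intended, the index form gives O(1) lookup; if it is not, the "1-based index" wording in the doc comment is what needs to change.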
+ +**Identity diagram:** + +``` + envelope identity class identity + ──────────────── ────────────── + MailboxSoA.mailbox_id : u32 ◄── owner key entity_type[row] : u16 (0 = untyped) + │ │ 1-based, dense in append order + │ ▼ + │ OntologyRegistry::enumerate_first_with_entity_type_id(u16) + │ │ ⚠ LINEAR SCAN, not O(1) index + │ ▼ + │ MappingRow (schema stored ONCE per class) + │ ├─ ontology_context_id() → MulThresholdProfile + │ ├─ schema_ptr (SchemaPtr: ns|entity_type_id|marking) + │ └─ thinking_style: Option ⚠ stored, UNUSED at dispatch + │ + ▼ + OGAR forward only: class_identity(prefix, name) -> "ogit-op/WorkPackage" + OGAR reverse (classes::from(address)) : ABSENT — lives in lance-graph as the scan above +``` + +| Tag | Finding | +|---|---| +| **Confirmed** | Schema meaning inherited (not duplicated); address string is the class key; schema stored once per class. | +| **Confirmed** | Identity is two-level: `mailbox_id` (envelope) + `entity_type` (class). | +| **Inferred** | "Immutable address space" holds as an *intended* downstream property (const prefixes, `@vN` append, VART append-only trie) — **not enforced in code**. | +| **Absent** | `OGAR::classes::from(address)` reverse resolver. The intended inheritance call site does not exist as named. | +| **Contradiction** | The realized resolver is a **linear scan keyed by `entity_type_id`**, while the doc claims "1-based index into schemas." Two different access models documented vs implemented. | + +--- + +### Phase 3 — `CausalEdge64` audit + +**Two incompatible `CausalEdge64` types exist. The canonical one is a payload, +but the v2 layout smuggles a *reference* (W-slot) into the payload word.** + +Canonical: `crates/causal-edge/src/edge.rs` — `struct CausalEdge64(pub u64)`, +`Copy`, 8 bytes, no lifetime. v1 / v2 (feature `causal-edge-v2-layout`) layouts: + +| Bits | v1 field | v2 field | Class | Authoritative? 
| +|---|---|---|---|---| +| 0..7 | S palette index | S palette index | payload (SPO) | **authoritative** | +| 8..15 | P palette index | P palette index | payload (SPO) | **authoritative** | +| 16..23 | O palette index | O palette index | payload (SPO) | **authoritative** | +| 24..31 | NARS frequency u8 | NARS frequency u8 | payload (truth) | **authoritative** | +| 32..39 | NARS confidence u8 | NARS confidence u8 | payload (truth) | **authoritative** | +| 40..42 | CausalMask (Pearl 2³) | CausalMask (Pearl 2³) | payload (causal) | **authoritative** | +| 43..45 | direction triad | direction triad | payload (causal) | **authoritative** | +| 46..48 / 46..49 | inference type (3-bit unsigned) | inference mantissa (4-bit **signed**) | payload (derived tag) | derived (enum) | +| 49..51 / 50..52 | plasticity | plasticity | pragmatic (local) | **authoritative (local)** | +| 52..63 | temporal (12-bit) | **RECLAIMED** | — | v1: payload; v2: gone | +| 53..58 | — | **W-slot (6-bit witness ref)** | **REFERENCE** | reference (→ WitnessTable) | +| 59..60 | — | truth-band TrustTexture | payload (derived) | derived | +| 61..63 | — | spare | — | — | + +- **Thinking style is NOT a field of `CausalEdge64`.** Style lives in `MetaWord` + (`MetaColumn`) and is selected by qualia auto-detection / `UnifiedStyle`, not + carried on the edge. (Answers "which fields are thinking style": none.) +- Local/incompatible duplicate: `crates/thinking-engine/src/layered.rs` — + a *different* `struct CausalEdge64(pub u64)` whose 8 bytes are **8 named + channels** (BECOMES/CAUSES/SUPPORTS/REFINES/GROUNDS/ABSTRACTS/RELATES/ + CONTRADICTS), with `to_spo()/from_spo()` transcoders to the canonical type at + the L3 commit boundary. Same name, different geometry. (Contradiction.) + +**Field table verdict:** + +| Question | Answer | +|---|---| +| Which fields authoritative? | S, P, O, frequency, confidence, CausalMask, direction, plasticity (local). | +| Which derived? 
| inference mantissa (enum tag), truth-band (from confidence). | +| Acts as payload? | **Yes** — Confirmed. It is a compact local causal/NARS claim. | +| Accidentally acts as identity? | **No** for v1. **Partially for v2** — the W-slot (bits 53-58) is a *reference index* into `WitnessTable`, mixing a pointer into the payload word. Not identity, but reference-in-payload. (Contradiction with "references should be explicit pointers, not packed into truth.") | + +--- + +### Phase 4 — Reference audit + +**References are explicit `Copy` handles (good). `EpisodicWitness64` as a SoA +*column* is absent (queued). Stale-copy risk is bounded; no circular ownership +in the Rust sense.** + +- `EpisodicEdges64(pub u64)` (`lance-graph-contract/src/episodic_edges.rs`): + 4 × 16-bit slots, each an `EdgeRef { family: u8 (nibble), local: u16 (1-based) }`. + MRU hot tier, slot 0 = strongest, slot 3 = eviction candidate. All `Copy`, + functional updates (`promote`, `push` return new words). **Explicit pointers, + not copied truth.** (Confirmed.) +- `DemotionSink::demote(&mut self, EdgeRef)` — the only mutating receiver, the + seam to the cold connectome. +- `WitnessTable<64>` + `WitnessEntry { mailbox_ref: u32, spo_fact_ref: Option }` + (`witness_table.rs`): resolves the v2 W-slot. `mailbox_ref` is the **full** u32 + identity (not the 6-bit slot) → rotation-safe. `spo_fact_ref` is `None` + (ephemeral) or `Some` (crystallised AriGraph triple). `WitnessEntry` is `Copy`; + the table is `Clone`-only (1.5 KiB). +- **`EpisodicWitness64` as a SoA column is ABSENT** — `soa_view.rs:75-95` + explicitly comments the episodic/witness column as *queued, not yet landed*. + The envelope does **not** yet carry the reference column the intended model + names. 
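
The W-slot resolution described in the bullets above can be sketched as follows. This is a hedged illustration, not the `witness_table.rs` code: the struct and function names mirror the audit's vocabulary but are written from scratch here, `with_w_slot`, `set`, and `resolve` are hypothetical helpers, and the `spo_fact_ref` payload is assumed to be a `u32` because the audit does not show its real type. Only the bit range (53..58, six bits) and the table size (64) come from the audit.

```rust
/// Hypothetical mirror of the audited `WitnessEntry`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct WitnessEntry {
    mailbox_ref: u32,          // full envelope identity, survives cohort rotation
    spo_fact_ref: Option<u32>, // None = ephemeral, Some = crystallised triple (type assumed)
}

/// Fixed 64-slot table addressed by the 6-bit W-slot.
#[derive(Clone)]
struct WitnessTable {
    slots: [Option<WitnessEntry>; 64],
}

const W_SLOT_SHIFT: u32 = 53; // v2 layout: W-slot occupies bits 53..58
const W_SLOT_MASK: u64 = 0x3F; // six bits

/// Extract the 6-bit W-slot from a raw v2 edge word.
fn w_slot(edge_raw: u64) -> usize {
    ((edge_raw >> W_SLOT_SHIFT) & W_SLOT_MASK) as usize
}

/// Return a copy of the word with the W-slot replaced (functional update).
fn with_w_slot(edge_raw: u64, slot: u8) -> u64 {
    let cleared = edge_raw & !(W_SLOT_MASK << W_SLOT_SHIFT);
    cleared | ((u64::from(slot) & W_SLOT_MASK) << W_SLOT_SHIFT)
}

impl WitnessTable {
    fn new() -> Self {
        WitnessTable { slots: [None; 64] }
    }

    fn set(&mut self, slot: u8, entry: WitnessEntry) {
        self.slots[usize::from(slot & 0x3F)] = Some(entry);
    }

    /// Resolve an edge's W-slot to the wide identity it stands for.
    fn resolve(&self, edge_raw: u64) -> Option<WitnessEntry> {
        self.slots[w_slot(edge_raw)]
    }
}
```

The sketch also makes the stale-index concern concrete: `resolve` cannot tell whether the table was rebuilt since the edge was written, so a mismatched table returns a valid entry for the wrong witness instead of failing.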
+ +**Reference topology diagram:** + +``` + MailboxSoA (owner of truth: edges[], energy[], qualia[], meta[]) + │ owns + ▼ + CausalEdge64 ──(v2 W-slot, 6-bit)──► WitnessTable<64> + │ │ entry.mailbox_ref : u32 (REFERENCE, full id) + │ └ entry.spo_fact_ref: Option (REFERENCE → AriGraph triple, or None) + │ + EpisodicEdges64 (hot MRU word) + └ slot[0..4] : EdgeRef{family,local} (REFERENCE, explicit, Copy) + │ evict slot 3 + ▼ + DemotionSink (→ cold connectome, the only &mut receiver) +``` + +| Classification | Members | +|---|---| +| **owned truth** | `MailboxSoA` columns (`edges`, `energy`, `qualia`, `meta`, `plasticity`). The mailbox is sole owner. | +| **reference** | `EdgeRef`, `EpisodicEdges64` slots, `CausalEdge64` v2 W-slot, `WitnessEntry.{mailbox_ref, spo_fact_ref}`. All explicit, all `Copy`. | +| **copied state** | `WitnessEntry.mailbox_ref` is a *copy of a u32 id* (not the row) — cheap, rotation-safe; not stale because it's the canonical wide id, not a slot index. | +| **derived state** | truth-band, inference enum from mantissa, `class_id` alias of `entity_type`. | + +| Tag | Finding | +|---|---| +| **Confirmed** | References are explicit `Copy` pointers, not copied truth. No circular ownership (all handles, no `Rc`-cycles; mailbox is single owner). | +| **Inferred** | Stale-copy risk is bounded: the only copied identity is the *wide* `mailbox_ref`, which survives cohort rotation by design. The 6-bit W-slot *would* go stale on rotation, which is exactly why `WitnessTable` stores the full u32. | +| **Absent** | `EpisodicWitness64` SoA column (queued in `soa_view.rs`). | +| **Contradiction** | none for ownership; the only smell is the v2 W-slot reference packed into the `CausalEdge64` payload word (see Phase 3). | + +--- + +### Phase 5 — Lance versioning audit + +**Self-through-time is via `DatasetVersion(u64)` + `VersionScheduler` → +`KanbanMove`. History is NOT duplicated in rows. 
Clean, with one placeholder.** + +- `scheduler.rs`: `DatasetVersion(pub u64)`; trait + `VersionScheduler::on_version(&self, view: &V, at: DatasetVersion, + exec: ExecTarget) -> Option`. The scheduler takes `&V` (shared, + read-only) — **"propose, don't dispose": the scheduler never mutates; only + `MailboxSoaOwner::advance_phase` mutates.** (Confirmed — clean ownership split.) +- `NextPhaseScheduler` advances the 6-phase Rubicon Kanban lifecycle on each + Lance version tick. Planning→CognitiveWork stamps `libet_offset_us = -550_000`. +- Per-row time stamps in the envelope are `current_cycle: u32` and + `last_emission_cycle [u32;N]` — these are **same-cycle idempotency guards**, + not history. No previous-self snapshot is copied into rows. + +**Version lineage diagram:** + +``` + Lance dataset tick: DatasetVersion(v) ── increments ──► DatasetVersion(v+1) + │ on_version(&view, at, exec) + ▼ + VersionScheduler (READ-ONLY &V) ──proposes──► KanbanMove { mailbox, from→to phase, + │ witness_chain_position, libet_offset_us } + │ (caller applies) + ▼ + MailboxSoaOwner::advance_phase(to) ← SOLE mutator + │ + ▼ + self-through-time = the sequence of Lance versions of the SAME mailbox dataset + (previous-self is implicit in Lance history; NOT duplicated in rows) +``` + +| Classification | Verdict | +|---|---| +| **clean** | Previous-self is implicit via Lance versioning; no history duplicated in rows; read-only scheduler vs single mutating owner is a correct ownership split. | +| **redundant** | none material. `current_cycle` + `last_emission_cycle` are guards, not history. | +| **unclear** | `KanbanMove.witness_chain_position` is currently set to `view.current_cycle()` — a structural **placeholder** until the witness-arc column (A3) lands. | +| **contradictory** | none. 
| + +--- + +### Phase 6 — SPO rung audit + +**The 2³ rung is explicit as `CausalMask` (Pearl) bits 40-42 of `CausalEdge64`; +SPO decomposition is explicit as three palette indices + the +`markov_soa::SpoRanks` triple. Partial resolution is supported.** + +- SPO indices: `CausalEdge64` bits 0-23 = S/P/O palette indices (each 8-bit). +- 2³ causal rung: `CausalMask` at bits 40-42 (Pearl's 2³ = 8 causal mask + states). Explicit, not derived. (Confirmed.) +- Vocabulary-agnostic SPO: `markov_soa::SpoRanks { s: u16, p: u16, o: u16 }` — + three opaque ranks; distance injected as a closure (`Fn(u16,u16)->u8`), so + language (DeepNSM/COCA) stays upstream and never reaches into the rung. +- Partial resolution: `markov_soa::SoaWavePrimer::project` folds a ±radius + window into a `WaveProjection` of `SpoRanks` with `best_guess_match` returning + a fuzzy score — i.e. partial / probabilistic SPO resolution is supported. + +**SPO rung table:** + +| Property | Status | Evidence | +|---|---|---| +| 2³ rung explicit? | **Yes (Confirmed)** | `CausalMask` bits 40-42, Pearl 2³ | +| Derivable? | partly | direction triad (43-45) derivable from mask context | +| Redundant? | **No** | SPO palette indices and `SpoRanks` serve different layers (edge payload vs Markov projector); not duplication | +| Partial resolution? | **Yes (Confirmed)** | `SoaWavePrimer::project` → `best_guess_match` fuzzy score | + +--- + +### Phase 7 — Little-endian contract audit + +**There IS a single canonical LE baton contract — but it is fragmented by the +two `CausalEdge64` types and by a `&[u64]` reinterpret seam in the SoA view.** + +- Canonical baton: `CollapseGateEmission` = `(u16 target, CausalEdge64)`, wire + cost `13 + 10·baton_count` bytes (per CLAUDE.md E-BATON-1). `CausalEdge64` and + `EpisodicEdges64` both expose `to_le_bytes / from_le_bytes / write_le / read_le` + — a shared LE convention. (Confirmed — one envelope contract intent.) 
+- p64-bridge (`p64-bridge/src/lib.rs`): `edges_to_layered_rows(&[CausalEdge64]) + -> [[u64;64];8]`, `edge_to_block(&CausalEdge64) -> (usize,usize)`. **This is a + projection / derivation, not a layout-preserving transport.** It reads SPO + + mask bits and *computes* palette addresses; it does not round-trip the edge. + So p64 does **not** "preserve layout" — by design it derives a different + geometry (palette planes) from the edge. (Inferred — acceptable, but it is a + one-way lens, not a serializer.) +- SoA view reinterpret seam: `MailboxSoaView::edges_raw() -> &[u64]` (NOT + `&[CausalEdge64]`) — the contract crate stays zero-dep on `causal-edge` by + handing back raw `u64` that callers reconstruct via `CausalEdge64(raw)`. This + is a **hidden reinterpret**: correctness depends on every caller agreeing on + the v1-vs-v2 layout. Under the `causal-edge-v2-layout` feature, a caller that + reconstructs with v1 semantics silently mis-reads bits 46-63 + (the exact `I-LEGACY-API-FEATURE-GATED` hazard). + +**Contract map:** + +``` + ENVELOPE LE CONTRACT (canonical) + ├─ Baton: CollapseGateEmission (u16 target, CausalEdge64) wire = 13 + 10·n + ├─ CausalEdge64::{to,from}_le_bytes (causal-edge crate, v1/v2 feature-gated) + ├─ EpisodicEdges64::{to,from}_le_bytes (matches CE64 convention) + │ + FRAGMENTATION RISKS + ├─ ⚠ TWO CausalEdge64 layouts (causal-edge SPO-palette vs thinking-engine 8-channel) + │ bridged only by layered.rs::to_spo()/from_spo() — name collision, lossy + ├─ ⚠ soa_view::edges_raw() -> &[u64] (reinterpret seam; layout agreement is implicit) + └─ ⚠ p64-bridge = one-way projection CE64 → [[u64;64];8] (derives, does NOT preserve/round-trip) +``` + +| Question | Answer | +|---|---| +| Single canonical LE contract? | **Yes** for the baton + CE64/EpisodicEdges64 byte methods (Confirmed). | +| Hidden conversion? | **Yes** — `edges_raw() -> &[u64]` reinterpret; v1/v2 layout agreement is implicit (Contradiction risk). | +| Serialization tax? 
| Low on the hot path (baton is 8-byte words). p64 projection recomputes per dispatch — derive cost, not serialize cost. | +| Contract fragmentation? | **Yes** — two `CausalEdge64` types is the principal fragmentation. | + +--- + +### Phase 8 — OGAR inheritance audit + +**OGAR is a vocabulary/IR producer. It mints class-identity strings and emits +SPO triples; it owns NO runtime registry, NO reverse resolver, NO thinking +style, and NO class-driven adapter dispatch. The coupling is one trait.** + +| Semantic | Source | Status | +|---|---|---| +| Class identity string (`prefix/Class`) | OGAR `ogar_ontology::class_identity` (forward only) | **Confirmed (from OGAR)** | +| Schema (fields/assoc/enums/attributes) | OGAR `Class` IR (once per class) | **Confirmed (from OGAR)** | +| Inheritance edges (`parent`, `mixins`, STI) | OGAR `Class.parent` / `Class.mixins` (representational; **resolution deferred to consumer**) | **Confirmed (representational only)** | +| Class → version (`knowable_from`) | trait `KnowableFromStore` (OGAR declares; **lance-graph implements**) | **Confirmed (seam)** | +| Reverse `classes::from(address)` | — | **Absent** | +| AST adapter selection from class | OGAR `Adapter` trait exists but **no class-driven dispatch**; emitter takes no adapter | **Absent (manual)** | +| Thinking style from class | docs-only refs; not on `Class`; stored on lance-graph `MappingRow` but **unused at dispatch** | **Absent (inheritance), stored-but-dead (lance side)** | +| Pragmatics (energy, plasticity, qualia, cycle) | **local** to `MailboxSoA` | **Confirmed (local)** | + +**Inheritance map:** + +``` + OGAR (producer, Lance-free) lance-graph (consumer/runtime) + ─────────────────────────── ────────────────────────────── + class_identity(prefix,name) ──► "ogit-op/WorkPackage" ──► NiblePath router / OntologyRegistry key + Class { parent, mixins, attributes, ... 
} ──emit──► SPO Triples ──► triple loader + trait KnowableFromStore ◄──implemented by── OntologyRegistry / ClassRegistryWriter + register(id, ddl_hint)->u64 enumerate_first_with_entity_type_id (scan) + knowable_from(id)->u64 MappingRow.thinking_style (STORED, UNUSED) + MulThresholdProfile::for_context (the ONLY + thing entity_type actually drives today) + AST adapter: Adapter::map (static identity xlate) ── NO class→adapter dispatch wired ── + Pragmatics: MailboxSoA.{energy,plasticity,qualia,meta} (LOCAL) +``` + +--- + +## 8. Redundancy list + +1. **Two SoA envelopes** — `BindSpace` (global, 65 KB/row `Vsa16kF32` cycle plane) + vs `MailboxSoA` (per-mailbox). The driver holds both. (highest priority) +2. **Two `CausalEdge64` types** — `causal-edge` (SPO-palette) vs + `thinking-engine::layered` (8-channel). Same name, different geometry. +3. **Deprecated `Vsa16kF32` cycle plane** still allocated in `BindSpace` + (65 KB/row) despite being scoped out as a cross-mailbox carrier. +4. **`QualiaColumn` (f32×18, deprecated)** co-exists with `QualiaI4Column` + (canonical, 8 B/row). +5. **`MappingRow.thinking_style`** is stored per class but never read on the + `entity_type` dispatch path — dead field. +6. **`entity_type` resolution** is a linear scan despite a "1-based index" doc + contract — the index model is documented but not implemented. + +## 9. Circular ownership risks + +- **None in the Rust ownership sense.** All cross-structure links are `Copy` + handles (`EdgeRef`, W-slot index, `mailbox_ref`, `spo_fact_ref`), not `Rc`/`Arc` + cycles. The mailbox is the single owner of its truth columns. +- **Dependency-cycle avoidance is correct:** planner ↔ shader-driver cycle is + broken by the closure seam in `convergence.rs::run_convergence(..., impl FnOnce(...))`; + AriGraph→planner cycle is broken via the p64 convergence point. (Confirmed.) 
+- **Latent hazard, not a cycle:** the v2 W-slot reference lives *inside* the + `CausalEdge64` payload word, coupling payload and reference layers. If the + witness table is rebuilt out of step with the edges, the 6-bit index resolves + to a wrong (but valid) entry — silent, not a panic. + +## 10. Recommended minimal corrections + +Ordered by leverage, each is a *classification-respecting* minimal change — no +theory-driven rename, no refactor beyond making ownership/reference/version/ +execution geometry explicit. + +1. **Pick one envelope. Declare it.** Mark `BindSpace` `#[deprecated]` with a + migration pointer to `MailboxSoA`, or invert: state in one doc-comment which + is canonical and gate the other behind a `legacy-bindspace` feature. Today + both compile and the driver holds both — the reader cannot tell which is the + particle. (Resolves redundancy 1, 3.) +2. **Rename the local edge.** `thinking_engine::layered::CausalEdge64` → + `ChannelEdge64` (or similar). Keep `to_spo()/from_spo()`. The name collision + is the single biggest correctness trap in the LE contract. (Resolves + redundancy 2, contract fragmentation.) +3. **Make `entity_type` resolution match its contract.** Either (a) change the + `OntologyRegistry` to a real O(1) `Vec` index keyed by the 1-based + `entity_type` (matching the doc), or (b) fix the doc to say "scan by + `entity_type_id`." Pick one; don't ship both stories. (Resolves redundancy 6.) +4. **Decide the W-slot home.** If references must be explicit (intended model + #7), the v2 W-slot is a reference packed into payload. Either document it as a + deliberate exception with a version gate (it already has one), or move the + witness reference out of `CausalEdge64` into the (queued) `EpisodicWitness64` + SoA column. Land that column — `soa_view.rs` already reserves the slot. + (Resolves Phase 3/4 contradiction + Phase 4 Absent.) +5. 
**Wire or delete the dead inheritance.** `MappingRow.thinking_style` is stored + per class but unused at dispatch. Either consume it in `driver.rs:331-339` + (true class→style inheritance, matching intended model #9/#4) or remove the + field. (Resolves redundancy 5; closes the "inheritance from OGAR class" + intended-vs-actual gap.) +6. **Type the `edges_raw()` seam.** The `&[u64]` reinterpret in `soa_view.rs` + relies on implicit v1/v2 agreement. Add a `const EDGE_LAYOUT_VERSION: u8` + to the view trait and assert it at reconstruction, per + `I-LEGACY-API-FEATURE-GATED`. (Resolves Phase 7 hidden-conversion risk.) +7. **Retire the deprecated qualia/cycle columns** once (1) lands — + `QualiaColumn` f32×18 and the `Vsa16kF32` cycle plane are pure footprint + (65.5 KB/row) on the legacy envelope. (Resolves redundancy 3, 4.) + +--- + +## Geometry verdict — per major field + +| Field / structure | Verdict | Rationale | +|---|---|---| +| `MailboxSoA.mailbox_id` | **KEEP** | Envelope identity, correct. | +| `MailboxSoA.entity_type` | **KEEP (fix resolver)** | Correct as class pointer; resolver should match its O(1) doc. | +| `MailboxSoA.energy / plasticity / last_emission_cycle` | **KEEP** | Local pragmatics, correctly owned. | +| `MailboxSoA.edges (CausalEdge64)` | **KEEP** | Payload, correctly owned by the mailbox. | +| `MailboxSoA.qualia (QualiaI4_16D)` | **KEEP** | Canonical local pragmatic vector. | +| `MailboxSoA.meta (MetaWord)` | **KEEP** | Thinking-style/awareness bits belong here, not on the edge. | +| `CausalEdge64` S/P/O/freq/conf/mask | **KEEP** | Authoritative payload. | +| `CausalEdge64` v2 W-slot | **REFERENCE ONLY** | Move to `EpisodicWitness64` column or gate as explicit exception. | +| `CausalEdge64` truth-band / inference enum | **DERIVE** | Derivable from confidence / mantissa. | +| `thinking_engine::layered::CausalEdge64` | **KEEP + RENAME** | Distinct concept; collision is the hazard. 
| +| `BindSpace` (whole) | **MOVE TO LANCE / DEPRECATE** | Legacy global SoA; persist via Lance, drain into `MailboxSoA`. | +| `BindSpace.cycle (Vsa16kF32)` | **REMOVE** | Deprecated carrier; 65 KB/row footprint. | +| `QualiaColumn (f32×18)` | **REMOVE** | Superseded by `QualiaI4Column`. | +| `entity_type → OntologyRegistry` schema | **MOVE TO OGAR** | Schema is OGAR's `Class` IR; lance should hold only the registry/version impl of `KnowableFromStore`. | +| `MappingRow.thinking_style` | **UNCLEAR** | Wire it (inheritance) or remove it (dead). Decide. | +| `DatasetVersion` + `VersionScheduler` | **KEEP** | Clean self-through-time; read-only propose vs single-owner dispose. | +| `KanbanMove.witness_chain_position` | **UNCLEAR** | Placeholder (= current_cycle) until A3 witness-arc lands. | +| `EpisodicEdges64 / EdgeRef` | **KEEP (REFERENCE ONLY)** | Correct explicit pointers. | +| `WitnessTable / WitnessEntry` | **KEEP (REFERENCE ONLY)** | Wide `mailbox_ref` is rotation-safe by design. | +| `EpisodicWitness64` SoA column | **LAND IT** | Currently Absent; reserved in `soa_view.rs`. | +| `p64-bridge` projections | **KEEP (DERIVE)** | One-way lens CE64→palette; not a transport, document as such. | + +--- + +## Bottom line + +**Does the struct geometry measure what it claims?** + +- **Payload, references, versioning, execution split: YES.** `CausalEdge64` is + a clean payload, references are explicit `Copy` handles with no ownership + cycles, Lance versioning gives self-through-time without row-level history + duplication, and the read-only-scheduler / single-owner-mutator split is + correct. +- **Identity, inheritance, single-envelope: NOT YET.** The intended + `OGAR::classes::from(address)` does not exist (forward-only in OGAR; a linear + scan in lance-graph); two SoA envelopes co-exist with a live deprecated f32 + carrier; two `CausalEdge64` types share a name; and class→{thinking-style, + AST adapter} inheritance is either stored-but-dead or absent. 
+ +The geometry is **load-bearing where it concerns one mailbox's owned payload and +its versioned lifecycle**, and **under-specified where it concerns identity +resolution and OGAR class inheritance**. The seven minimal corrections above +close the gap without renaming on theory — they make ownership, reference, +versioning, and execution geometry explicit, which is exactly the stated goal. From 792503c113cd4fffb5df2b64dbc7fb446bb248fc Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 19:32:20 +0000 Subject: [PATCH 03/15] docs/probes/q3: remove all singleton/Vsa16kF32 carrier assumptions MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reframes every section that treated the real-valued VSA algebra as a singleton cross-mailbox carrier. The algebra findings (no permute on the real-valued path, damped relaxation, no closed feedback loop) are unchanged — they now describe the per-mailbox computation correctly. Removed the stale correction block; the body is now self-consistent. --- docs/probes/q3-standing-wave-falsification.md | 59 ++++++++++--------- 1 file changed, 31 insertions(+), 28 deletions(-) diff --git a/docs/probes/q3-standing-wave-falsification.md b/docs/probes/q3-standing-wave-falsification.md index 1a0adf9d..ae91997d 100644 --- a/docs/probes/q3-standing-wave-falsification.md +++ b/docs/probes/q3-standing-wave-falsification.md @@ -70,23 +70,24 @@ All three coexist, segregated by stage: ### Q3 — Which part is unitary/permutation-like (norm-preserving)? -**`vsa_permute` does not exist on the f32 carrier.** +**`vsa_permute` does not exist on the real-valued (f32) VSA algebra path.** The cyclic bit rotation lives in `ndarray/src/hpc/vsa.rs:326–349`, operating on the **binary `VsaVector`** (`words: [u64]`, 16384 *bits*). It is a genuine cyclic permutation — `dst_bit = (src_bit + shift) % 16384` — norm-preserving in Hamming space, with a correct round-trip test at `vsa.rs:322–324`. 
-The `Vsa16kF32` f32 carrier — the one CLAUDE.md calls the "Click carrier" — has -**no permutation primitive**. `fingerprint.rs:163` and `fingerprint.rs:295` have -doc-comments pointing at the ndarray binary primitive; the f32 sandwich never calls it. +The real-valued path (`vsa16k_bind` / `vsa16k_bundle` / `vsa16k_cosine` in +`fingerprint.rs`) has **no permutation primitive**. `fingerprint.rs:163` and +`fingerprint.rs:295` have doc-comments pointing at the ndarray binary primitive; the +real-valued algebra never calls it. The `vsa_sequence` binary path (`vsa.rs:371–378`) permutes-by-index then bundles — but there is **no inverse-permute readout anywhere**: `vsa_clean` (`vsa.rs:394`) does a flat Hamming scan with no de-braiding. Position is written once and never decoded back. **The ρ^d braid is norm-preserving on the binary carrier it actually lives on. -It is structurally absent from the f32 carrier the "Click" architecture rests on.** +It is absent from the real-valued algebra path.** --- @@ -117,9 +118,9 @@ Five independent sinks: no `step`/`evolve` method. - `CycleAccumulator` (`cycle_accumulator.rs`) — a `Vec` batch buffer; `drain()` empties it, no carry-over. -- `CollapseGateEmission` (`collapse_gate.rs:177`) — a baton-list DTO. CLAUDE.md itself - states `Vsa16kF32` does NOT cross mailbox boundaries; the bundle is "an ephemeral - computation, never persisted." +- `CollapseGateEmission` (`collapse_gate.rs:177`) — a baton-list DTO. The bundle is + an ephemeral per-mailbox computation, never persisted or transmitted across boundaries + (CLAUDE.md E-BATON-1). - `gestalt` in `kv_bundle.rs` — persists across `set()` calls but is a running mean updated by external writes only; remove the writer and it is static. - `global_context += fact → reshapes NEXT cycle` — prose in CLAUDE.md. @@ -129,15 +130,15 @@ Five independent sinks: ### Q6 — Is phase preserved, quantized, or merely implied? 
-**Merely implied, and on the f32 carrier, absent.** +**Merely implied in the binary path; absent in the real-valued algebra path.** Real VSA phase-coding requires permute-at-encode AND permute-aware-unbind. In the binary path (`ndarray/vsa.rs:371–378`), `vsa_sequence` permutes-by-index then bundles. But there is no inverse-permute at readout — `vsa_clean` does a flat Hamming scan with no de-braiding. **Phase is written once and never distinguished at readout.** -On the `Vsa16kF32` carrier — the actual "Click" substrate — no permute is applied at all. -Phase is structurally absent, not quantized. +In the real-valued algebra path (`vsa16k_bind` / `vsa16k_bundle`), no permute is applied +at any stage. Phase is structurally absent, not quantized. --- @@ -148,8 +149,8 @@ Phase is structurally absent, not quantized. ```rust #[test] fn standing_wave_or_damped_relaxation() { - // Build initial state - let role_key = vsa16k_role_key(RoleSlice::SUBJECT); + // Build initial state using the real-valued VSA algebra primitives + let role_key = vsa16k_role_key(RoleSlice::SUBJECT); // fingerprint.rs let content = vsa16k_content_fp(b"hello"); let s0 = vsa16k_bind(&role_key, &content); @@ -174,16 +175,16 @@ fn standing_wave_or_damped_relaxation() { // Should oscillate if wave; instead it's a fixed point: assert!((last - 1.0_f32).abs() < 0.01, "fixed point, not wave: norm = {last}"); - // What would prove a standing wave: inserting vsa16k_permute(n) in the loop + // What would prove a standing wave: a norm-preserving rotation applied each step // so s_{n+1} = permute(normalize(s_n)) would produce a periodic orbit. - // vsa16k_permute does not exist. That is the gap. + // No such rotation exists on the real-valued algebra path. That is the gap. 
} ``` -**What would falsify the (B) verdict:** A function `f(s: Vsa16kF32) -> Vsa16kF32` that -(a) applies a norm-preserving rotation on the f32 carrier, and (b) is iterated without -a hard depth cap and without per-step external input, producing bounded non-zero energy -at n=∞. No such `f` exists in the codebase. +**What would falsify the (B) verdict:** A function `f(s) -> s` that (a) applies a +norm-preserving rotation on the real-valued algebra, and (b) is iterated without a hard +depth cap and without per-step external input, producing bounded non-zero energy at n=∞. +No such `f` exists in the codebase. --- @@ -204,7 +205,7 @@ at n=∞. No such `f` exists in the codebase. | Claim | Status | Gap | |-------|--------|-----| -| ρ^d braiding on `Vsa16kF32` | ❌ Missing | `vsa_permute` exists only on binary carrier | +| ρ^d braiding on real-valued algebra path | ❌ Missing | `vsa_permute` exists only on binary carrier (`ndarray/vsa.rs`) | | Phase decode / unbind recovers position | ❌ Missing | No inverse-permute at readout | | `global_context += fact` reshapes next cycle | ❌ Missing | `grep global_context`: 0 hits in src | | "Shader can't resist thinking" active-inference loop | ❌ Not a loop | `fe *= 0.5`, depth < 9, then stops | @@ -234,15 +235,16 @@ This is a real bug independent of the wave question. 
## Prescription -**Do not add `vsa16k_permute` to plug the standing-wave gap unless you first prove the -standing-wave architecture provides measurable benefit over the current feed-forward -system.** The feed-forward bind/bundle/cosine + threshold pipeline is: +**Do not add a real-valued permutation primitive to plug the standing-wave gap unless +you first prove the standing-wave architecture provides measurable benefit over the +current feed-forward system.** The feed-forward bind/bundle/cosine + threshold pipeline +is: - Correct - Well-tested - Delivers real value (VSA encoding, role-indexed readout, SPO triple commit) The standing-wave framing is an architectural aspiration that requires: -1. `vsa16k_permute` on the f32 carrier (the rotation) +1. A norm-preserving rotation on the real-valued algebra path (not yet implemented) 2. Inverse-permute at readout (the decoding) 3. A closed-loop recurrence without a hard iteration cap 4. Evidence that the resulting system does something the feed-forward version does not @@ -251,7 +253,8 @@ Before building stained glass, prove the foundation: write the falsification tes and make it pass. Until it passes, CLAUDE.md §"The Click" should be marked `[ASPIRATIONAL — not yet executable]` rather than presented as current implementation. -**The honest architecture description:** a feed-forward VSA encode/readout pipeline with -threshold-gated commit, a binary permutation in ndarray (correctly norm-preserving but -with no inverse-decode), and a baton-based causal handoff. That is already interesting -and already shipped. The standing-wave framing adds nothing to it today. +**The honest architecture description:** a per-mailbox feed-forward VSA encode/readout +pipeline with threshold-gated commit, a binary permutation in ndarray (correctly +norm-preserving but with no inverse-decode), and a baton-based causal handoff between +mailboxes. That is already interesting and already shipped. 
The standing-wave framing +adds nothing to it today. From 00d5bde7948889f2a31321946dce343344b46aac Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 19:33:17 +0000 Subject: [PATCH 04/15] =?UTF-8?q?docs/probes/q3:=20standing=20wave=20is=20?= =?UTF-8?q?vacuous=20=E2=80=94=20Lance=20versioning=20is=20the=20O(1)=2090?= =?UTF-8?q?deg=20lookup?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Self-through-time is provided by the LanceDB table geometry: a prior version is a 90-degree lookup (orthogonal to the current write), O(1) by version tag, not a recurrence. The probe's standing-wave question assumed temporal persistence had to live in the compute path. It doesn't. Any recurrence mechanism would be a redundant reimplementation of Lance versioning. Prescription updated accordingly. --- docs/probes/q3-standing-wave-falsification.md | 59 +++++++++++++------ 1 file changed, 42 insertions(+), 17 deletions(-) diff --git a/docs/probes/q3-standing-wave-falsification.md b/docs/probes/q3-standing-wave-falsification.md index ae91997d..70e8709f 100644 --- a/docs/probes/q3-standing-wave-falsification.md +++ b/docs/probes/q3-standing-wave-falsification.md @@ -233,28 +233,53 @@ This is a real bug independent of the wave question. --- +## Structural correction — why the standing-wave question is the wrong question + +**The probe asked whether a standing wave exists in the dynamic execution path. +The answer is no — but the more important answer is: it doesn't need to.** + +Self-through-time is already provided by the LanceDB table, not by any recurrence +in the compute path. Each committed state is a Lance version of the mailbox dataset. +Querying a prior self is a **90° lookup**: the prior version's row is orthogonal to +the current cycle's write — it is a read against a version tag, not a traversal +of a recurrence orbit. 
That lookup is O(1) by LanceDB's versioned columnar geometry, +not a sweep, not a recurrence, not a dynamic system at all. + +The standing-wave framing incorrectly assumed that temporal persistence had to be +implemented as a recurrence within the compute graph. It does not. Lance provides it +structurally: + +``` +current compute (per-mailbox, feed-forward, ephemeral) + │ commit + ▼ +Lance version N ← O(1) read of any prior version by version tag +Lance version N+1 ← current write target + │ 90° lookup (orthogonal to current cycle's write direction) + ▼ +prior self = Lance version k, k < N — not a recurrence, a table read +``` + +The consequence: **do not implement a standing wave**. The question was vacuous +because the persistence it was meant to provide is already in the storage geometry. +Any recurrence mechanism would be a redundant, expensive reimplementation of +Lance versioning in the compute path. + +--- + ## Prescription -**Do not add a real-valued permutation primitive to plug the standing-wave gap unless -you first prove the standing-wave architecture provides measurable benefit over the -current feed-forward system.** The feed-forward bind/bundle/cosine + threshold pipeline -is: +The feed-forward bind/bundle/cosine + threshold pipeline is: - Correct - Well-tested - Delivers real value (VSA encoding, role-indexed readout, SPO triple commit) -The standing-wave framing is an architectural aspiration that requires: -1. A norm-preserving rotation on the real-valued algebra path (not yet implemented) -2. Inverse-permute at readout (the decoding) -3. A closed-loop recurrence without a hard iteration cap -4. Evidence that the resulting system does something the feed-forward version does not - -Before building stained glass, prove the foundation: write the falsification test above -and make it pass. Until it passes, CLAUDE.md §"The Click" should be marked -`[ASPIRATIONAL — not yet executable]` rather than presented as current implementation. 
+The binary permutation in ndarray (`vsa_permute`, `vsa_sequence`) is norm-preserving +and correctly wired for position-sensitive bundling on the binary carrier. It needs +an inverse-permute at readout to be fully useful — that is the one real gap. **The honest architecture description:** a per-mailbox feed-forward VSA encode/readout -pipeline with threshold-gated commit, a binary permutation in ndarray (correctly -norm-preserving but with no inverse-decode), and a baton-based causal handoff between -mailboxes. That is already interesting and already shipped. The standing-wave framing -adds nothing to it today. +pipeline with threshold-gated commit, baton-based causal handoff between mailboxes, +and self-through-time provided by Lance versioning (O(1) versioned column read, not +a recurrence). The standing-wave framing adds nothing to this and should not be +implemented. From 4d0774b6e9cbbe1e072c14b27b27b80a4d824ec2 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 19:52:44 +0000 Subject: [PATCH 05/15] contract: add SoaEnvelope LE contract; audit Phase 7 follow-up The SoA envelope must know the little-endian contract, not just its columns. ndarray::simd::MultiLaneColumn already provides the per-column LE contract (standalone, usable by any pure-SIMD consumer). What was missing: an envelope-level contract describing how columns assemble into one row-strided packet with a cycle stamp. New zero-dep module lance_graph_contract::soa_envelope: - ColumnKind / ColumnDescriptor (LE element width, offset, elems/row) - SoaEnvelope trait: columns(), row_stride(), n_rows(), cycle(), LAYOUT_VERSION, as_le_bytes(), zero-copy row_le()/column_le(), verify_layout() gate (stride/overlap/packet-size/version skew) - 7 unit tests, all passing Deliberately NOT pulling ndarray into the contract: it would force the heavy HPC build onto crewai-rust/n8n-rs (zero-dep consumers) and force pure-SIMD ndarray consumers to pull a graph contract crate. 
Two-level split keeps both crates clean. Iron rule: ndarray owns the column contract, lance-graph owns the envelope contract, neither restates the other, lance-graph binds them. Also flags CLAUDE.md ndarray-hpc fallback wording for demotion (no shipped consumer runs without ndarray; the fallback is CI-only). --- crates/lance-graph-contract/src/lib.rs | 1 + .../lance-graph-contract/src/soa_envelope.rs | 355 ++++++++++++++++++ docs/probes/particle-soa-envelope-audit.md | 53 +++ 3 files changed, 409 insertions(+) create mode 100644 crates/lance-graph-contract/src/soa_envelope.rs diff --git a/crates/lance-graph-contract/src/lib.rs b/crates/lance-graph-contract/src/lib.rs index e2a7e2c7..3c463882 100644 --- a/crates/lance-graph-contract/src/lib.rs +++ b/crates/lance-graph-contract/src/lib.rs @@ -93,6 +93,7 @@ pub mod scheduler; pub mod sensorium; pub mod sigma_propagation; pub mod sla; +pub mod soa_envelope; pub mod soa_view; pub mod splat; pub mod tax; diff --git a/crates/lance-graph-contract/src/soa_envelope.rs b/crates/lance-graph-contract/src/soa_envelope.rs new file mode 100644 index 00000000..c7bf1eea --- /dev/null +++ b/crates/lance-graph-contract/src/soa_envelope.rs @@ -0,0 +1,355 @@ +//! SoA envelope little-endian contract. +//! +//! # Why this module exists +//! +//! Column-level LE knowledge is not enough. ndarray's `MultiLaneColumn` +//! (the column carrier) already decodes its own bytes little-endian, and +//! `CausalEdge64` / `EpisodicEdges64` each know their own `to_le_bytes` / +//! `from_le_bytes`. But the **SoA envelope as a whole** — the thing a Lance +//! version snapshots, the thing `simd_soa` sweeps, the thing a future reader +//! decodes — has no contract describing how those columns *assemble* into one +//! row-strided packet. The parts know the LE contract; the envelope did not. +//! +//! [`SoaEnvelope`] is that missing contract. It makes one SoA snapshot a +//! **self-describing little-endian packet per cycle**: a stable column +//! 
ordering, a fixed row byte stride, a `cycle` version stamp, and a +//! [`ENVELOPE_LAYOUT_VERSION`]. With it, a Lance version IS a coherent LE +//! packet at cycle N — not a loose collection of independently-correct +//! columns. +//! +//! # Layering (read before adding an ndarray dependency here) +//! +//! This module is **zero-dep, byte-geometry only**. It describes *where* +//! columns sit in a row packet and *what* LE element each holds — as data +//! ([`ColumnDescriptor`]), never as ndarray generic bounds. That keeps +//! `lance-graph-contract` featherweight for its non-HPC consumers +//! (`crewai-rust`, `n8n-rs`), and it keeps ndarray usable standalone by any +//! pure-SIMD consumer. +//! +//! The split is deliberate and complementary, not duplicated: +//! +//! | Level | Home | Answers | +//! |-------|------|---------| +//! | Column LE contract | `ndarray::simd::MultiLaneColumn` | "how do I sweep one typed column" | +//! | Envelope LE contract | this module | "where do columns sit in the row packet, what cycle is this" | +//! | Composition | `lance-graph` (always has both deps) | carve envelope columns → wrap each in `MultiLaneColumn` | +//! +//! ndarray never learns the envelope exists; this crate never learns ndarray +//! exists; `lance-graph` binds them. + +/// Layout version of the envelope byte geometry. +/// +/// Bumped whenever the meaning of [`ColumnDescriptor`] offsets/strides +/// changes. A reader MUST refuse to decode a packet whose stamped version it +/// does not understand (per `I-LEGACY-API-FEATURE-GATED`: layout reclaim is +/// paired with a version gate on the serialization path). +pub const ENVELOPE_LAYOUT_VERSION: u8 = 1; + +/// The little-endian element type of one column. +/// +/// Width only — no distance semantics, no domain meaning (cf. ndarray's +/// no-umbrella rule). The actual decode (`from_le_bytes`) happens in the +/// consumer's `MultiLaneColumn` lane iterator. 
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] +#[repr(u8)] +pub enum ColumnKind { + U8 = 0, + I8 = 1, + U16 = 2, + I16 = 3, + U32 = 4, + F32 = 5, + U64 = 6, + F64 = 7, +} + +impl ColumnKind { + /// Bytes per element of this LE column kind. + pub const fn elem_bytes(self) -> usize { + match self { + ColumnKind::U8 | ColumnKind::I8 => 1, + ColumnKind::U16 | ColumnKind::I16 => 2, + ColumnKind::U32 | ColumnKind::F32 => 4, + ColumnKind::U64 | ColumnKind::F64 => 8, + } + } +} + +/// One column's placement within a single row packet. +/// +/// `Copy` and `repr(C)` so a descriptor table is itself a stable LE artifact. +/// `name_id` is a stable column ordinal (an enum discriminant on the consumer +/// side), NOT a string — keeping this crate alloc-free and the descriptor +/// `Copy`. +#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] +#[repr(C)] +pub struct ColumnDescriptor { + /// Stable column identity (consumer-side enum ordinal). + pub name_id: u16, + /// LE element kind. + pub kind: ColumnKind, + /// Elements of `kind` per row for this column (e.g. content = 256 × u64, + /// energy = 1 × f32). + pub elems_per_row: u16, + /// Byte offset of this column within one row packet. + pub row_offset: u32, +} + +impl ColumnDescriptor { + /// Bytes this column occupies in one row. + pub const fn col_bytes_per_row(&self) -> usize { + self.kind.elem_bytes() * self.elems_per_row as usize + } + + /// Byte range `[start, end)` of this column within a row packet. + pub const fn row_byte_range(&self) -> (usize, usize) { + let start = self.row_offset as usize; + (start, start + self.col_bytes_per_row()) + } +} + +/// What can go wrong validating an envelope's byte geometry. +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum EnvelopeError { + /// The stamped layout version is not the one this build understands. + LayoutVersionMismatch { expected: u8, found: u8 }, + /// Sum of column byte-widths does not equal the declared row stride. 
+ StrideMismatch { declared: usize, summed: usize }, + /// Two columns overlap, or a gap/ordering violation was found. + ColumnOverlap { col_a: u16, col_b: u16 }, + /// `as_le_bytes().len()` is not `row_stride * n_rows`. + PacketSizeMismatch { expected: usize, found: usize }, + /// A requested row or column index is out of bounds. + OutOfBounds, +} + +/// A self-describing little-endian SoA packet for one cycle. +/// +/// Implemented by the owner of the backing store (e.g. the mailbox SoA). The +/// envelope is read-only here; mutation lives on the owner type, never on this +/// view (mirrors `MailboxSoaView` vs `MailboxSoaOwner`). +pub trait SoaEnvelope { + /// Layout version this implementor's geometry conforms to. + const LAYOUT_VERSION: u8 = ENVELOPE_LAYOUT_VERSION; + + /// Stable, ordered column placement table. Ordering is part of the + /// contract: a reader walks columns in this order. + fn columns(&self) -> &[ColumnDescriptor]; + + /// Total bytes per row across all columns. + fn row_stride(&self) -> usize; + + /// Number of rows in this snapshot. + fn n_rows(&self) -> usize; + + /// The version stamp this snapshot carries (the cycle whose committed + /// state these bytes are). This is what turns a Lance version into a + /// coherent "packet at cycle N". + fn cycle(&self) -> u32; + + /// The whole packet as contiguous LE bytes, zero-copy. Length MUST be + /// `row_stride() * n_rows()`. + fn as_le_bytes(&self) -> &[u8]; + + /// Zero-copy LE view of one full row. + fn row_le(&self, row: usize) -> Option<&[u8]> { + let stride = self.row_stride(); + let start = row.checked_mul(stride)?; + let end = start.checked_add(stride)?; + self.as_le_bytes().get(start..end) + } + + /// Zero-copy LE view of one column within one row. 
+ fn column_le(&self, row: usize, col: &ColumnDescriptor) -> Option<&[u8]> { + let r = self.row_le(row)?; + let (start, end) = col.row_byte_range(); + r.get(start..end) + } + + /// Validate that the declared geometry is internally consistent and that + /// the backing packet matches. Call this at the Lance read boundary — a + /// v1 packet under a v2 reader (or a torn snapshot) is refused here rather + /// than silently mis-decoded downstream. + fn verify_layout(&self) -> Result<(), EnvelopeError> { + // 1. Version gate. + if Self::LAYOUT_VERSION != ENVELOPE_LAYOUT_VERSION { + return Err(EnvelopeError::LayoutVersionMismatch { + expected: ENVELOPE_LAYOUT_VERSION, + found: Self::LAYOUT_VERSION, + }); + } + // 2. Columns are non-overlapping and their widths sum to the stride. + let cols = self.columns(); + let mut summed = 0usize; + for (i, a) in cols.iter().enumerate() { + let (a_start, a_end) = a.row_byte_range(); + summed += a.col_bytes_per_row(); + for b in &cols[i + 1..] { + let (b_start, b_end) = b.row_byte_range(); + let overlap = a_start < b_end && b_start < a_end; + if overlap { + return Err(EnvelopeError::ColumnOverlap { + col_a: a.name_id, + col_b: b.name_id, + }); + } + } + } + let stride = self.row_stride(); + if summed != stride { + return Err(EnvelopeError::StrideMismatch { + declared: stride, + summed, + }); + } + // 3. Backing packet size matches stride × rows. 
+ let expected = stride.saturating_mul(self.n_rows()); + let found = self.as_le_bytes().len(); + if expected != found { + return Err(EnvelopeError::PacketSizeMismatch { expected, found }); + } + Ok(()) + } +} + +#[cfg(test)] +mod tests { + use super::*; + + struct TestEnvelope { + cols: Vec<ColumnDescriptor>, + stride: usize, + rows: usize, + bytes: Vec<u8>, + cycle: u32, + } + + impl SoaEnvelope for TestEnvelope { + fn columns(&self) -> &[ColumnDescriptor] { + &self.cols + } + fn row_stride(&self) -> usize { + self.stride + } + fn n_rows(&self) -> usize { + self.rows + } + fn cycle(&self) -> u32 { + self.cycle + } + fn as_le_bytes(&self) -> &[u8] { + &self.bytes + } + } + + fn two_col_envelope(rows: usize) -> TestEnvelope { + // col 0: 1 × f32 (4 B) at offset 0 + // col 1: 1 × u64 (8 B) at offset 4 + let cols = vec![ + ColumnDescriptor { + name_id: 0, + kind: ColumnKind::F32, + elems_per_row: 1, + row_offset: 0, + }, + ColumnDescriptor { + name_id: 1, + kind: ColumnKind::U64, + elems_per_row: 1, + row_offset: 4, + }, + ]; + let stride = 12; + TestEnvelope { + cols, + stride, + rows, + bytes: vec![0u8; stride * rows], + cycle: 7, + } + } + + #[test] + fn kind_widths() { + assert_eq!(ColumnKind::U8.elem_bytes(), 1); + assert_eq!(ColumnKind::F32.elem_bytes(), 4); + assert_eq!(ColumnKind::U64.elem_bytes(), 8); + } + + #[test] + fn descriptor_byte_range() { + let d = ColumnDescriptor { + name_id: 0, + kind: ColumnKind::U64, + elems_per_row: 256, + row_offset: 16, + }; + assert_eq!(d.col_bytes_per_row(), 256 * 8); + assert_eq!(d.row_byte_range(), (16, 16 + 256 * 8)); + } + + #[test] + fn valid_envelope_passes() { + let env = two_col_envelope(4); + assert_eq!(env.cycle(), 7); + assert!(env.verify_layout().is_ok()); + } + + #[test] + fn stride_mismatch_caught() { + let mut env = two_col_envelope(4); + env.stride = 16; // columns sum to 12, not 16 + env.bytes = vec![0u8; 16 * 4]; + assert_eq!( + env.verify_layout(), + Err(EnvelopeError::StrideMismatch { + declared: 16, + summed: 12, + }) + ); +
} + + #[test] + fn overlap_caught() { + let mut env = two_col_envelope(1); + env.cols[1].row_offset = 2; // u64 at 2 overlaps f32 at [0,4) + env.stride = 10; + env.bytes = vec![0u8; 10]; + assert!(matches!( + env.verify_layout(), + Err(EnvelopeError::ColumnOverlap { .. }) + )); + } + + #[test] + fn packet_size_mismatch_caught() { + let mut env = two_col_envelope(4); + env.bytes.truncate(12 * 3); // one row short + assert_eq!( + env.verify_layout(), + Err(EnvelopeError::PacketSizeMismatch { + expected: 48, + found: 36, + }) + ); + } + + #[test] + fn row_and_column_views_are_zero_copy_slices() { + let mut env = two_col_envelope(2); + // Write row 1, col 1 (u64) = 0x0102030405060708 LE. + let v: u64 = 0x0102_0304_0506_0708; + let row1_col1_start = 12 + 4; + env.bytes[row1_col1_start..row1_col1_start + 8].copy_from_slice(&v.to_le_bytes()); + + let row = env.row_le(1).unwrap(); + assert_eq!(row.len(), 12); + + let col = env.column_le(1, &env.cols[1]).unwrap(); + assert_eq!(col.len(), 8); + assert_eq!(u64::from_le_bytes(col.try_into().unwrap()), v); + + // Out of bounds. + assert!(env.row_le(2).is_none()); + } +} diff --git a/docs/probes/particle-soa-envelope-audit.md b/docs/probes/particle-soa-envelope-audit.md index c52b04f8..f53f67cc 100644 --- a/docs/probes/particle-soa-envelope-audit.md +++ b/docs/probes/particle-soa-envelope-audit.md @@ -338,6 +338,47 @@ two `CausalEdge64` types and by a `&[u64]` reinterpret seam in the SoA view.** | Serialization tax? | Low on the hot path (baton is 8-byte words). p64 projection recomputes per dispatch — derive cost, not serialize cost. | | Contract fragmentation? | **Yes** — two `CausalEdge64` types is the principal fragmentation. | +#### Phase 7 follow-up — the envelope must know the LE contract, not just the columns (RESOLVED 2026-06-06) + +The original Phase 7 finding ("single canonical LE contract: yes for the +column byte methods") was **incomplete**. 
Column-level LE knowledge is +necessary but not sufficient: `CausalEdge64`, `EpisodicEdges64`, and ndarray's +`MultiLaneColumn` each decode their own bytes correctly, but the **SoA envelope +as a whole** — what a Lance version snapshots and what `simd_soa` sweeps — had +no contract describing how those columns *assemble* into one row-strided packet +with a cycle stamp. The parts knew the LE contract; the envelope did not. + +**Resolution (shipped this branch):** + +- **Column LE contract = `ndarray::simd::MultiLaneColumn`** (`ndarray/src/simd_soa.rs`). + Already exists, already LE-correct per column (`iter_f32x16` / `iter_u64x8` + via `from_le_bytes`), already standalone — any pure-SIMD consumer uses it + with zero lance coupling. **No change needed.** +- **Envelope LE contract = `lance_graph_contract::soa_envelope::SoaEnvelope`** + (new module, this commit). Zero-dep, byte-geometry only: `columns()` + (`ColumnDescriptor[]` — ordering + offset + LE `ColumnKind` + elems/row), + `row_stride()`, `cycle(): u32`, `LAYOUT_VERSION`, `as_le_bytes()`, plus + `row_le` / `column_le` zero-copy views and a `verify_layout()` gate (catches + stride mismatch, column overlap, packet-size tear, and version skew at the + Lance read boundary — closing the `edges_raw() -> &[u64]` implicit-agreement + hazard). +- **Composition = `lance-graph`** (the one crate that always has both deps): + carve each envelope column per its descriptor, wrap in `MultiLaneColumn`. + +**Why NOT a shared `simd-soa-contract` crate, and why NOT pull ndarray into +the contract:** `lance-graph-contract` is consumed by `crewai-rust` and +`n8n-rs` precisely because it is zero-dep. ndarray is the heavy HPC foundation +(BLAS L1/L2/L3, MKL/OpenBLAS FFI). Pulling ndarray into the contract would +force that build onto every contract consumer AND force a pure-SIMD ndarray +consumer to transitively pull a graph contract crate. 
The two-level split +above keeps **both** crates clean: ndarray standalone for SIMD-only consumers, +contract featherweight for crewai/n8n. The levels are complementary +(column = "how to sweep one typed column"; envelope = "where columns sit, what +cycle"), never restated, and neither crate depends on the other. + +**Iron rule that falls out:** *ndarray owns the column contract; lance-graph +owns the envelope contract; neither restates the other; lance-graph binds them.* + --- ### Phase 8 — OGAR inheritance audit @@ -440,6 +481,18 @@ execution geometry explicit. 7. **Retire the deprecated qualia/cycle columns** once (1) lands — `QualiaColumn` f32×18 and the `Vsa16kF32` cycle plane are pure footprint (65.5 KB/row) on the legacy envelope. (Resolves redundancy 3, 4.) +8. **Demote the `ndarray-hpc` fallback wording in CLAUDE.md.** In practice + **no shipped consumer uses lance-graph without ndarray** — every consumer + uses both. The `ndarray-hpc` feature / `blasgraph/ndarray_bridge.rs` + fallback is a **CI-compile-check only** path, not a real deployment mode. + CLAUDE.md currently presents it as a parallel architecture ("Fallback + without ndarray ... for minimal builds / downstream consumers who don't need + HPC"), which is exactly what makes sessions re-derive the wrong dependency + boundary (e.g. "keep ndarray ignorant of the envelope to preserve the + fallback"). Reword to: *ndarray is mandatory for every shipped consumer; the + no-ndarray path is a CI compile gate, nothing ships on it.* This is the + load-bearing correction behind recommendation-set context for the SIMD_SOA + contract split (Phase 7 follow-up). 
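Editorial sketch (not part of the patch): the composition step in the Phase 7 follow-up, "carve each envelope column per its descriptor", could look roughly like the following. It is a minimal standalone illustration under stated assumptions. `Placement`, `carve`, `read_u64_le`, and `read_f32_le` are hypothetical names invented here; they are not the API of `lance-graph-contract` or of ndarray's `MultiLaneColumn`. The layout mirrors the two-column test fixture from `soa_envelope.rs` (one `f32` at offset 0, one `u64` at offset 4, stride 12).

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the contract's `ColumnDescriptor`, reduced to
/// the two facts this sketch needs. Not the crate's actual type.
#[derive(Debug, Clone, Copy)]
struct Placement {
    /// Byte offset of the column inside one row.
    row_offset: usize,
    /// Bytes the column occupies in one row.
    width: usize,
}

/// Borrow one column of one row out of a row-strided backing store.
/// Returns `None` when the row or the column range is out of bounds.
/// No bytes are copied; the result aliases `store`.
fn carve(store: &[u8], stride: usize, row: usize, col: Placement) -> Option<&[u8]> {
    let row_start = row.checked_mul(stride)?;
    let row_end = row_start.checked_add(stride)?;
    let row_bytes = store.get(row_start..row_end)?;
    let col_end = col.row_offset.checked_add(col.width)?;
    row_bytes.get(col.row_offset..col_end)
}

/// Decode a single-element `u64` column from its little-endian bytes.
fn read_u64_le(store: &[u8], stride: usize, row: usize, col: Placement) -> Option<u64> {
    let bytes = <[u8; 8]>::try_from(carve(store, stride, row, col)?).ok()?;
    Some(u64::from_le_bytes(bytes))
}

/// Decode a single-element `f32` column from its little-endian bytes.
fn read_f32_le(store: &[u8], stride: usize, row: usize, col: Placement) -> Option<f32> {
    let bytes = <[u8; 4]>::try_from(carve(store, stride, row, col)?).ok()?;
    Some(f32::from_le_bytes(bytes))
}
```

The point of the sketch is the division of labour the audit argues for: the placement data says where a column sits, and the decode happens only at the consumer. In the real composition layer the carved slice would presumably be handed to ndarray's column carrier instead of being decoded element by element.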
--- From a676121813ee3199805c995dafb422baa445480f Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 20:06:44 +0000 Subject: [PATCH 06/15] =?UTF-8?q?CLAUDE.md:=20P0=20rule=20=E2=80=94=20AdaW?= =?UTF-8?q?orldAPI=20forks=20only,=20never=20crates.io=20upstream?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- CLAUDE.md | 20 ++++++++++++++++++++ python/CLAUDE.md | 8 ++++++++ 2 files changed, 28 insertions(+) diff --git a/CLAUDE.md b/CLAUDE.md index 6a2b3aac..e71b8fdf 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,5 +1,25 @@ # CLAUDE.md — lance-graph +## P0 — AdaWorldAPI forks ONLY, NEVER crates.io upstream + +**Always depend on the AdaWorldAPI fork of any crate that has one. NEVER use the +upstream crates.io version of a forked crate.** Non-negotiable; applies to every +`Cargo.toml` and every dependency decision in this repo. Every repo in this +workspace is local — prefer the local/fork source over the registry, always. + +- Crates with an `AdaWorldAPI/` fork — e.g. `ndarray`, `lance` / + `lancedb` / `lance-index` / `lance-linalg` / `lance-namespace`, `surrealdb`, + and any other — MUST be wired via the fork (`path` / `git` / `[patch.crates-io]`), + never the registry version. +- If a fork's coordinates (git URL, branch/tag, feature flag) are unknown, + **STOP and ask**. Do NOT fall back to crates.io as a convenience or to make a + build pass. +- `"warning: Patch ... was not used in the crate graph"` means the fork + is NOT actually wired — treat it as a build error to fix, never a warning to + ignore. +- crates.io is permitted ONLY for crates that have no AdaWorldAPI fork / no local + source. 
+ > **Updated**: 2026-04-21 (categorical-algebraic inference click) > **Role**: The obligatory spine — query engine, codec stack, semantic transformer, and orchestration contract > **Status**: 22 crates, 7 in workspace, 15 excluded (standalone/DTO), Phases 1-2 DONE, Phases 6-7 DONE (grammar + governance), Phase 3 IN PROGRESS diff --git a/python/CLAUDE.md b/python/CLAUDE.md index 5dc645d6..671c026e 100644 --- a/python/CLAUDE.md +++ b/python/CLAUDE.md @@ -1,3 +1,11 @@ +## P0 — AdaWorldAPI forks ONLY, NEVER crates.io upstream + +**Always depend on the AdaWorldAPI fork of any crate that has one. NEVER use the +upstream crates.io version of a forked crate.** Non-negotiable; applies to every +`Cargo.toml` and every dependency decision in this repo. Every repo in this +workspace is local — prefer the local/fork source over the registry, always. +If a fork's coordinates are unknown, STOP and ask — never fall back to crates.io. + Use the makefile for most actions: * Build: `maturin develop` From edf0a9dfef12d27544dfcf383663bbc2497236a2 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 22:46:35 +0000 Subject: [PATCH 07/15] audit+contract: purge baton/emission; zero-copy SoA model correction MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Every SoA envelope is zero-copy from creation to Lance tombstone. There is no baton, no CollapseGateEmission, no inter-mailbox handoff type. audit doc: - Phase 7 rewritten: LE contract is an in-place backing-store descriptor, not a transmitted packet. Baton/CollapseGateEmission references removed. The {to,from}_le_bytes on CE64/EpisodicEdges64 are Lance I/O seams, not cross-mailbox serialization. - Phase 1: emit()/CollapseGateEmission flagged as code artifact to remove. - Correction 9 added: remove MailboxSoA::emit() and CollapseGateEmission. - Geometry verdict: last_emission_cycle → rename to last_active_cycle. - Bottom line updated: zero-copy lifecycle replaces "execution split". 
- Phase 7 follow-up: crewai/n8n replaced with OGAR classes/ractor actors as the contract's non-HPC consumers. "packet" language removed. soa_envelope.rs: - Module doc: "in-place backing store" replaces "packet/snapshot". "zero-copy from creation to Lance tombstone" stated explicitly. - SoaEnvelope trait doc: "in-place backing store" not "packet". - Layering table: "row stride" not "row packet". - crewai/n8n removed from consumer description. - PacketSizeMismatch clarified: backing store size mismatch. https://claude.ai/code/session_0147hSzjmWZDuy2MSQNrhEK5 --- .../lance-graph-contract/src/soa_envelope.rs | 38 +++++---- docs/probes/particle-soa-envelope-audit.md | 79 +++++++++++-------- 2 files changed, 68 insertions(+), 49 deletions(-) diff --git a/crates/lance-graph-contract/src/soa_envelope.rs b/crates/lance-graph-contract/src/soa_envelope.rs index c7bf1eea..f85823a9 100644 --- a/crates/lance-graph-contract/src/soa_envelope.rs +++ b/crates/lance-graph-contract/src/soa_envelope.rs @@ -10,28 +10,29 @@ //! decodes — has no contract describing how those columns *assemble* into one //! row-strided packet. The parts know the LE contract; the envelope did not. //! -//! [`SoaEnvelope`] is that missing contract. It makes one SoA snapshot a -//! **self-describing little-endian packet per cycle**: a stable column -//! ordering, a fixed row byte stride, a `cycle` version stamp, and a +//! [`SoaEnvelope`] is that missing contract. It makes the in-place SoA +//! backing store **self-describing at each cycle**: a stable column ordering, +//! a fixed row byte stride, a `cycle` version stamp, and a //! [`ENVELOPE_LAYOUT_VERSION`]. With it, a Lance version IS a coherent LE -//! packet at cycle N — not a loose collection of independently-correct -//! columns. +//! in-place layout at cycle N — not a loose collection of independently- +//! correct columns. Nothing is serialized or transmitted; the backing bytes +//! are resident in-place, zero-copy from creation to Lance tombstone. 
//! //! # Layering (read before adding an ndarray dependency here) //! //! This module is **zero-dep, byte-geometry only**. It describes *where* -//! columns sit in a row packet and *what* LE element each holds — as data -//! ([`ColumnDescriptor`]), never as ndarray generic bounds. That keeps -//! `lance-graph-contract` featherweight for its non-HPC consumers -//! (`crewai-rust`, `n8n-rs`), and it keeps ndarray usable standalone by any -//! pure-SIMD consumer. +//! columns sit in the backing store's row stride and *what* LE element each +//! holds — as data ([`ColumnDescriptor`]), never as ndarray generic bounds. +//! That keeps `lance-graph-contract` featherweight for its non-HPC consumers +//! (OGAR classes, ractor actors), and it keeps ndarray usable standalone by +//! any pure-SIMD consumer. //! //! The split is deliberate and complementary, not duplicated: //! //! | Level | Home | Answers | //! |-------|------|---------| //! | Column LE contract | `ndarray::simd::MultiLaneColumn` | "how do I sweep one typed column" | -//! | Envelope LE contract | this module | "where do columns sit in the row packet, what cycle is this" | +//! | Envelope LE contract | this module | "where do columns sit in the row stride, what cycle is this" | //! | Composition | `lance-graph` (always has both deps) | carve envelope columns → wrap each in `MultiLaneColumn` | //! //! ndarray never learns the envelope exists; this crate never learns ndarray @@ -75,7 +76,7 @@ impl ColumnKind { } } -/// One column's placement within a single row packet. +/// One column's placement within a single row of the backing store. /// /// `Copy` and `repr(C)` so a descriptor table is itself a stable LE artifact. /// `name_id` is a stable column ordinal (an enum discriminant on the consumer @@ -117,17 +118,20 @@ pub enum EnvelopeError { StrideMismatch { declared: usize, summed: usize }, /// Two columns overlap, or a gap/ordering violation was found. 
ColumnOverlap { col_a: u16, col_b: u16 }, - /// `as_le_bytes().len()` is not `row_stride * n_rows`. + /// `as_le_bytes().len()` is not `row_stride * n_rows` (backing store size mismatch). PacketSizeMismatch { expected: usize, found: usize }, /// A requested row or column index is out of bounds. OutOfBounds, } -/// A self-describing little-endian SoA packet for one cycle. +/// The little-endian geometry contract for one SoA envelope cycle. /// -/// Implemented by the owner of the backing store (e.g. the mailbox SoA). The -/// envelope is read-only here; mutation lives on the owner type, never on this -/// view (mirrors `MailboxSoaView` vs `MailboxSoaOwner`). +/// Implemented by the owner of the in-place backing store (e.g. the mailbox +/// SoA). The envelope is zero-copy from creation to Lance tombstone — nothing +/// is serialized or transmitted; this trait describes *where columns sit* in +/// the already-resident backing bytes and *what cycle stamp* the store carries. +/// The read-only view here mirrors `MailboxSoaView` vs `MailboxSoaOwner`: +/// mutation lives on the owner type, never on this trait. pub trait SoaEnvelope { /// Layout version this implementor's geometry conforms to. const LAYOUT_VERSION: u8 = ENVELOPE_LAYOUT_VERSION; diff --git a/docs/probes/particle-soa-envelope-audit.md b/docs/probes/particle-soa-envelope-audit.md index f53f67cc..3379af1e 100644 --- a/docs/probes/particle-soa-envelope-audit.md +++ b/docs/probes/particle-soa-envelope-audit.md @@ -40,8 +40,10 @@ biggest structural finding.** `plasticity_counter [u8;N]`, `last_emission_cycle [u32;N]`, `edges [CausalEdge64;N]`, `qualia [QualiaI4_16D;N]`, `meta [MetaWord;N]`, `entity_type [u16;N]`; scalars `mailbox_id`, `current_cycle`, `w_slot`, - `threshold`. It emits a baton via `emit(MailboxId) -> CollapseGateEmission`. - **This is the particle envelope the intended model describes.** + `threshold`. 
**This is the particle envelope the intended model describes.** + ⚠ The source contains `emit(MailboxId) -> CollapseGateEmission` — this method + contradicts the zero-copy model (creation to Lance tombstone, no emission, + no baton). It is a code artifact to be removed, not the intended design. - `BindSpace` columns (`bindspace.rs`): `FingerprintColumns { content [u64×256], cycle [f32×16_384], topic, angle, sigma }`, `EdgeColumn`, `QualiaI4Column` (+ deprecated `QualiaColumn` f32×18), `MetaColumn`, `temporal`, `expert`, @@ -294,20 +296,22 @@ SPO decomposition is explicit as three palette indices + the ### Phase 7 — Little-endian contract audit -**There IS a single canonical LE baton contract — but it is fragmented by the -two `CausalEdge64` types and by a `&[u64]` reinterpret seam in the SoA view.** +**The SoA envelope is zero-copy in-place (creation → Lance tombstone). There is +no baton, no emission, no serialization. The LE contract describes where columns +sit in the in-place backing store, not a transmitted packet. Two fragmentation +risks remain.** -- Canonical baton: `CollapseGateEmission` = `(u16 target, CausalEdge64)`, wire - cost `13 + 10·baton_count` bytes (per CLAUDE.md E-BATON-1). `CausalEdge64` and - `EpisodicEdges64` both expose `to_le_bytes / from_le_bytes / write_le / read_le` - — a shared LE convention. (Confirmed — one envelope contract intent.) +- The SoA is owned in-place by the mailbox. A Lance version IS the backing store + at cycle N — not a serialized snapshot, not a transmitted packet. + `CausalEdge64` and `EpisodicEdges64` both expose `to_le_bytes / from_le_bytes` + for Lance's columnar write path (Lance reads/writes LE bytes from/into the store). + These are Lance I/O seams, not cross-mailbox serialization. (Confirmed — correct design.) - p64-bridge (`p64-bridge/src/lib.rs`): `edges_to_layered_rows(&[CausalEdge64]) -> [[u64;64];8]`, `edge_to_block(&CausalEdge64) -> (usize,usize)`. 
**This is a projection / derivation, not a layout-preserving transport.** It reads SPO + mask bits and *computes* palette addresses; it does not round-trip the edge. - So p64 does **not** "preserve layout" — by design it derives a different - geometry (palette planes) from the edge. (Inferred — acceptable, but it is a - one-way lens, not a serializer.) + p64 derives a different geometry (palette planes) from the edge — one-way lens, + not a serializer. (Inferred — acceptable by design.) - SoA view reinterpret seam: `MailboxSoaView::edges_raw() -> &[u64]` (NOT `&[CausalEdge64]`) — the contract crate stays zero-dep on `causal-edge` by handing back raw `u64` that callers reconstruct via `CausalEdge64(raw)`. This @@ -319,23 +323,25 @@ two `CausalEdge64` types and by a `&[u64]` reinterpret seam in the SoA view.** **Contract map:** ``` - ENVELOPE LE CONTRACT (canonical) - ├─ Baton: CollapseGateEmission (u16 target, CausalEdge64) wire = 13 + 10·n - ├─ CausalEdge64::{to,from}_le_bytes (causal-edge crate, v1/v2 feature-gated) + ENVELOPE LE CONTRACT (in-place backing store) + ├─ CausalEdge64::{to,from}_le_bytes (causal-edge crate, v1/v2 feature-gated — Lance I/O seam) ├─ EpisodicEdges64::{to,from}_le_bytes (matches CE64 convention) │ FRAGMENTATION RISKS ├─ ⚠ TWO CausalEdge64 layouts (causal-edge SPO-palette vs thinking-engine 8-channel) │ bridged only by layered.rs::to_spo()/from_spo() — name collision, lossy - ├─ ⚠ soa_view::edges_raw() -> &[u64] (reinterpret seam; layout agreement is implicit) - └─ ⚠ p64-bridge = one-way projection CE64 → [[u64;64];8] (derives, does NOT preserve/round-trip) + └─ ⚠ soa_view::edges_raw() -> &[u64] (reinterpret seam; v1/v2 layout agreement is implicit) + + NOTE: CollapseGateEmission / "baton" DO NOT exist in the correct design. + MailboxSoA::emit() in source is a code artifact to be removed. Every SoA is + zero-copy from creation to tombstone; there is no cross-mailbox handoff type. 
``` | Question | Answer | |---|---| -| Single canonical LE contract? | **Yes** for the baton + CE64/EpisodicEdges64 byte methods (Confirmed). | +| Single canonical LE contract? | **Yes** — CE64/EpisodicEdges64 byte methods are the Lance I/O seam. (Confirmed.) | | Hidden conversion? | **Yes** — `edges_raw() -> &[u64]` reinterpret; v1/v2 layout agreement is implicit (Contradiction risk). | -| Serialization tax? | Low on the hot path (baton is 8-byte words). p64 projection recomputes per dispatch — derive cost, not serialize cost. | +| Serialization tax? | **None** — the backing store is in-place. Lance writes LE columns directly. p64 derives palette geometry, does not serialize. | | Contract fragmentation? | **Yes** — two `CausalEdge64` types is the principal fragmentation. | #### Phase 7 follow-up — the envelope must know the LE contract, not just the columns (RESOLVED 2026-06-06) @@ -359,22 +365,23 @@ with a cycle stamp. The parts knew the LE contract; the envelope did not. (`ColumnDescriptor[]` — ordering + offset + LE `ColumnKind` + elems/row), `row_stride()`, `cycle(): u32`, `LAYOUT_VERSION`, `as_le_bytes()`, plus `row_le` / `column_le` zero-copy views and a `verify_layout()` gate (catches - stride mismatch, column overlap, packet-size tear, and version skew at the - Lance read boundary — closing the `edges_raw() -> &[u64]` implicit-agreement - hazard). + stride mismatch, column overlap, backing-store size mismatch, and version + skew at the Lance read boundary — closing the `edges_raw() -> &[u64]` + implicit-agreement hazard). The envelope describes the in-place backing + store; nothing is packaged or transmitted. - **Composition = `lance-graph`** (the one crate that always has both deps): carve each envelope column per its descriptor, wrap in `MultiLaneColumn`. **Why NOT a shared `simd-soa-contract` crate, and why NOT pull ndarray into -the contract:** `lance-graph-contract` is consumed by `crewai-rust` and -`n8n-rs` precisely because it is zero-dep. 
ndarray is the heavy HPC foundation +the contract:** `lance-graph-contract` is consumed by OGAR classes and ractor +actors precisely because it is zero-dep. ndarray is the heavy HPC foundation (BLAS L1/L2/L3, MKL/OpenBLAS FFI). Pulling ndarray into the contract would force that build onto every contract consumer AND force a pure-SIMD ndarray consumer to transitively pull a graph contract crate. The two-level split above keeps **both** crates clean: ndarray standalone for SIMD-only consumers, -contract featherweight for crewai/n8n. The levels are complementary -(column = "how to sweep one typed column"; envelope = "where columns sit, what -cycle"), never restated, and neither crate depends on the other. +contract featherweight for class/actor consumers. The levels are complementary +(column = "how to sweep one typed column"; envelope = "where columns sit in the +backing store, what cycle"), never restated, and neither crate depends on the other. **Iron rule that falls out:** *ndarray owns the column contract; lance-graph owns the envelope contract; neither restates the other; lance-graph binds them.* @@ -481,7 +488,14 @@ execution geometry explicit. 7. **Retire the deprecated qualia/cycle columns** once (1) lands — `QualiaColumn` f32×18 and the `Vsa16kF32` cycle plane are pure footprint (65.5 KB/row) on the legacy envelope. (Resolves redundancy 3, 4.) -8. **Demote the `ndarray-hpc` fallback wording in CLAUDE.md.** In practice +9. **Remove `MailboxSoA::emit()` and `CollapseGateEmission`.** The zero-copy + model (creation → Lance tombstone, no emission, no baton) means `emit()` and + the `CollapseGateEmission(u16 target, CausalEdge64)` type are code artifacts + from a superseded design. The KanbanColumn/`VersionScheduler`/ractor + orchestration is the only secondary path; there is no inter-mailbox handoff + type. Remove `emit()` from `MailboxSoA`, remove or gate `CollapseGateEmission` + behind `#[deprecated]`, and update `ShaderDriver` accordingly. +10. 
**Demote the `ndarray-hpc` fallback wording in CLAUDE.md.** In practice **no shipped consumer uses lance-graph without ndarray** — every consumer uses both. The `ndarray-hpc` feature / `blasgraph/ndarray_bridge.rs` fallback is a **CI-compile-check only** path, not a real deployment mode. @@ -502,7 +516,7 @@ execution geometry explicit. |---|---|---| | `MailboxSoA.mailbox_id` | **KEEP** | Envelope identity, correct. | | `MailboxSoA.entity_type` | **KEEP (fix resolver)** | Correct as class pointer; resolver should match its O(1) doc. | -| `MailboxSoA.energy / plasticity / last_emission_cycle` | **KEEP** | Local pragmatics, correctly owned. | +| `MailboxSoA.energy / plasticity / last_emission_cycle` | **KEEP (rename)** | Local pragmatics, correctly owned. `last_emission_cycle` is a same-cycle idempotency guard; rename to `last_active_cycle` to remove the emission framing. | | `MailboxSoA.edges (CausalEdge64)` | **KEEP** | Payload, correctly owned by the mailbox. | | `MailboxSoA.qualia (QualiaI4_16D)` | **KEEP** | Canonical local pragmatic vector. | | `MailboxSoA.meta (MetaWord)` | **KEEP** | Thinking-style/awareness bits belong here, not on the edge. | @@ -528,11 +542,12 @@ execution geometry explicit. **Does the struct geometry measure what it claims?** -- **Payload, references, versioning, execution split: YES.** `CausalEdge64` is - a clean payload, references are explicit `Copy` handles with no ownership +- **Payload, references, versioning, zero-copy lifecycle: YES.** `CausalEdge64` + is a clean payload, references are explicit `Copy` handles with no ownership cycles, Lance versioning gives self-through-time without row-level history - duplication, and the read-only-scheduler / single-owner-mutator split is - correct. + duplication, the read-only-scheduler / single-owner-mutator split is correct, + and the SoA envelope is zero-copy in-place from creation to Lance tombstone + (no emission, no baton, no serialization). 
- **Identity, inheritance, single-envelope: NOT YET.** The intended `OGAR::classes::from(address)` does not exist (forward-only in OGAR; a linear scan in lance-graph); two SoA envelopes co-exist with a live deprecated f32 From 1eed37a1752e1e6744f103f567b032d35a1a1426 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 22:47:41 +0000 Subject: [PATCH 08/15] =?UTF-8?q?docs:=20SoA=20three-tier=20model=20?= =?UTF-8?q?=E2=80=94=20zero-copy=20lifecycle,=20Kanban,=20OGIT/OGAR?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Canonical architecture reference replacing any prior baton/emission framing. Three tiers: 1. MailboxSoA (primary): zero-copy in-place, creation → Lance tombstone. No emit(), no CollapseGateEmission, no inter-mailbox handoff type. last_active_cycle replaces last_emission_cycle (idempotency guard only). 2. KanbanColumn/Rubicon (sole secondary): Lance write → VersionScheduler (read-only) → KanbanMove → MailboxSoaOwner::advance_phase (sole mutator). Ractor (lance-graph-supervisor) provides actor-level meta-orchestration. 3. OGIT ontology + OGAR classes (inherited, O(1)): HHTL/NiblePath prefix → radix-trie codebook → class + schema + label inheritance (compile-time for known classes, JIT via lance-graph-planner for new ones). entity_type per-row may be redundant. surrealdb/kv-lance: BLOCKED(C). Includes full "what does NOT exist" table and six iron rules. 
https://claude.ai/code/session_0147hSzjmWZDuy2MSQNrhEK5 --- docs/architecture/soa-three-tier-model.md | 156 ++++++++++++++++++++++ 1 file changed, 156 insertions(+) create mode 100644 docs/architecture/soa-three-tier-model.md diff --git a/docs/architecture/soa-three-tier-model.md b/docs/architecture/soa-three-tier-model.md new file mode 100644 index 00000000..0b74fc14 --- /dev/null +++ b/docs/architecture/soa-three-tier-model.md @@ -0,0 +1,156 @@ +# SoA Three-Tier Model — Mailbox Lifecycle, Kanban, and Ontology + +> **Branch:** `claude/stoic-turing-M0Eiq` +> **Date:** 2026-06-06 +> **Authority:** Supersedes any prior baton/emission/CollapseGateEmission framing. + +--- + +## The invariant + +**Every SoA envelope is zero-copy from creation to Lance tombstone.** + +There is no baton. There is no emission. There is no inter-mailbox handoff type. +No bytes leave the backing store until Lance's own columnar I/O writes them to +disk — and even then the in-memory store is unchanged, not serialized and +freed. + +--- + +## Tier 1 — MailboxSoA (primary, owned, zero-copy) + +The `MailboxSoA` is the single thought envelope. One mailbox owns one +SoA. The columns (`energy [f32;N]`, `plasticity [u8;N]`, `last_active_cycle [u32;N]`, +`edges [CausalEdge64;N]`, `qualia [QualiaI4_16D;N]`, `meta [MetaWord;N]`, +`entity_type [u16;N]`) are allocated once at mailbox creation and released at +Lance tombstone. + +``` +creation + │ + ▼ +MailboxSoA (backing store in-place; column LE contract = MultiLaneColumn) + │ (envelope LE contract = SoaEnvelope trait) + │ + ▼ Lance write on each version tick (LE bytes → columnar store) +DatasetVersion(v) → DatasetVersion(v+1) → ... 
+ │ + ▼ +Lance soft-delete (tombstone) ← sole lifecycle event that ends the store +``` + +**Access contract:** +- `MailboxSoaView`: read-only, `&[T]` borrows, `edges_raw() -> &[u64]` +- `MailboxSoaOwner`: `advance_phase(&mut self, to: KanbanPhase)` — sole mutator + +**Idempotency guard:** `last_active_cycle [u32;N]` marks the cycle a row was +last written. It is a same-cycle guard, not a history column. (Rename from +`last_emission_cycle` in source — the emission framing is wrong.) + +**`MailboxSoA::emit()` and `CollapseGateEmission` in source are code artifacts +from a superseded design and must be removed.** There is no inter-mailbox +handoff type. + +--- + +## Tier 2 — KanbanColumn / Rubicon lifecycle (sole secondary data) + +The only data that is *secondary* to the SoA backing store is the Kanban +phase. This is triggered by the Lance writer, not by the SoA itself. + +``` +Lance writer → VersionScheduler::on_version(&view, at, exec) + │ read-only &V: never mutates + ▼ + Option { mailbox, from→to, libet_offset_us } + │ caller applies + ▼ + MailboxSoaOwner::advance_phase(to) ← SOLE mutator + +KanbanPhase lifecycle (6 states): + Planning → CognitiveWork → Evaluation → Commit → Plan → Prune +``` + +Above the SoA mailboxes, ractor (`lance-graph-supervisor`, ractor 0.14, +`supervisor` + `supervisor-lifecycle-audit` features) provides actor-level +meta-orchestration. Each mailbox is a ractor actor. The single-owner invariant +(no virtual ownership pointer needed) is enforced by Rust move semantics through +ractor's message-passing model. + +The Kanban column is the only data outside the SoA backing store that reflects +SoA lifecycle. There is no baton, no emission stream, no secondary truth column. + +--- + +## Tier 3 — OGAR classes + OGIT ontology (inherited, O(1)) + +The identity of a mailbox SoA resolves O(1) to its OGAR class and OGIT +ontology schema. This is Tier 3 because it is *inherited*, not stored per-row. 
+ +**Resolution chain:** + +``` +mailbox address (MailboxId + family prefix) + │ + ▼ HHTL / NiblePath prefix lookup + │ NiblePath::is_ancestor_of: + │ (other.path >> (4*(other.depth-self.depth))) == self.path + │ = prefix ancestry = class ancestry (Confirmed) + ▼ +OGIT radix-trie codebook (O(1) for known classes at compile time) + │ + ├─ class identity string ("ogit-op/WorkPackage") + ├─ schema (fields, assoc, enums, attributes) — stored ONCE per class in OntologyRegistry + ├─ label inheritance (parent, mixins, STI) + └─ thinking style (MappingRow.thinking_style — stored, currently unused at dispatch ⚠) + +For new/runtime classes: JIT via lance-graph-planner (JITson / Cranelift) +``` + +**`entity_type: u16` in SoA may be redundant.** If the ontology resolves O(1) +from address, hardcoding a handle in every row violates SoC and defeats +radix-trie cheapness. The current linear scan in `OntologyRegistry:: +enumerate_first_with_entity_type_id` is a defect — it should be an O(1) +`Vec` index keyed by the 1-based ordinal. + +**OGAR active record / DLL AST adapter:** OGAR classes get pragmatic mapping +to inherited tools at compile time. These are cheap inherited registers, not +per-instance data in the SoA. The `Adapter::map` static identity transform in +OGAR + `KnowableFromStore` trait at the lance-graph boundary is the seam. + +**surrealdb / kv-lance:** OGAR's DLL AST → SurrealQL path (`ogar-adapter-surrealql`) +requires surrealdb with the `kv-lance` feature. This is BLOCKED(C) — the +`kv-lance` feature is only in the AdaWorldAPI surrealdb fork, coordinates +(git URL, branch) unknown. `surreal_container/Cargo.toml` dep is commented +out pending resolution. 
**Do not fall back to crates.io surrealdb.** + +--- + +## What does NOT exist (and must not be invented) + +| Concept | Status | +|---|---| +| `CollapseGateEmission` as cross-mailbox carrier | **WRONG** — remove from source | +| `MailboxSoA::emit()` | **WRONG** — remove from source | +| "Baton" as inter-mailbox handoff | **WRONG** — superseded | +| `wire_cost_bytes() = 13 + 10·baton_count` | **WRONG** — from CLAUDE.md E-BATON-1, now superseded | +| `Vsa16kF32` as a cross-mailbox carrier | **WRONG** — deprecated, lives only as legacy `cycle` column in `BindSpace` | +| Secondary data beyond KanbanColumn | **WRONG** — Kanban is the only secondary tier | +| BindSpace as the envelope | **MIGRATION IN PROGRESS** — BindSpace is the global legacy; MailboxSoA is the target | + +--- + +## Iron rules that fall out of this model + +1. `MailboxSoA` backing store is never copied, never serialized, never transmitted. + Lance writes LE bytes from it; the store itself stays in place. +2. `VersionScheduler` is read-only (`&V`). It proposes; `MailboxSoaOwner` disposes. +3. `MailboxSoA::emit()` and `CollapseGateEmission` are removed in the next + pass — they are not part of the correct design. +4. ractor provides the single-owner invariant for mailbox actors — no virtual + ownership pointer is needed. +5. Ontology resolution is O(1) HHTL prefix lookup for known classes. JITson + for new ones. The `entity_type: u16` per-row handle may be eliminated once + the O(1) lookup is the sole path. +6. surrealdb requires the AdaWorldAPI fork with `kv-lance`. Never fall back to + crates.io. BLOCKED(C) until fork coordinates are provided. 
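The prefix-ancestry test quoted in the Tier 3 resolution chain can be sketched as below. This is a minimal illustration, not the code in `hhtl.rs`: the `path`/`depth` field layout is assumed from the probe notes, and the depth guard and the 64-bit shift guard are additions made so the sketch cannot underflow or overflow. The comparison itself follows the formula given in the chain.

```rust
/// Hypothetical stand-in for the `NiblePath` described in the HHTL audit:
/// a prefix path packed four bits per level (fan-out 16).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NiblePath {
    /// Nibbles packed into the low bits, root nibble most significant (assumed layout).
    pub path: u64,
    /// Number of nibbles in use, 0..=16 (assumed range).
    pub depth: u8,
}

impl NiblePath {
    /// Returns true when `self` is a prefix of `other`.
    ///
    /// Formula from the resolution chain:
    /// `(other.path >> (4 * (other.depth - self.depth))) == self.path`.
    /// With this formula a path also counts as its own ancestor; whether the
    /// real kernel excludes equality is not stated in the source.
    pub const fn is_ancestor_of(&self, other: &NiblePath) -> bool {
        // Guard (added here): a deeper path can never be a prefix of a shallower one.
        if self.depth > other.depth {
            return false;
        }
        let shift = 4 * (other.depth - self.depth) as u32;
        // Guard (added here): shifting a u64 by 64 bits is an overflow in Rust,
        // so a full-depth difference is treated as "every nibble shifted out".
        let prefix = if shift >= 64 { 0 } else { other.path >> shift };
        prefix == self.path
    }
}
```

Under these assumptions, a class path such as `NiblePath { path: 0xA, depth: 1 }` is an ancestor of `NiblePath { path: 0xA3, depth: 2 }` but not of `NiblePath { path: 0xB3, depth: 2 }`. The check is a single shift and compare, which is what makes class ancestry resolvable without a per-row handle.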
From c78a55fad165e4266cc7a2c121d98e1bb022345f Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 6 Jun 2026 22:51:21 +0000 Subject: [PATCH 09/15] =?UTF-8?q?docs:=20register-file=20model=20=E2=80=94?= =?UTF-8?q?=20SoA=20as=20LE=20registers,=20OGAR=20class=20as=20ISA=20descr?= =?UTF-8?q?iptor?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit New section added to three-tier model doc. Mental model: SoA columns are CPU-style LE registers (fixed offset, fixed width, no schema in the row). OGAR class is the instruction-set descriptor and DTO store for active record (label + schema + tools + codegen templates). Three sub-sections: - SoA columns = LE registers: byte-offset table, SoaEnvelope as ABI doc, ColumnDescriptor as register descriptor, MultiLaneColumn as load/store unit. - OGAR class = ISA descriptor + active record: label-inheritance via HHTL, schema stored once per class (never in rows), tools inherited from class hierarchy (prefix ancestry = class ancestry), active record = class wrapping a register bank slice. - Askama/Jinja codegen = masked selection from class DTO: Class