| 1 | +# HHTL Canary Inhabitance Plan |
| 2 | + |
| 3 | +Date: 2026-05-19 |
| 4 | +Status: Phase 2 entry condition — names the canary workload for the 6-sprint substrate arc |
| 5 | +Companion docs: |
| 6 | +- `stack-consolidation-bardioc-to-hhtl.md` (architectural frame) |
| 7 | +- `pr-master-consolidation.md` (6-sprint plan) |
| 8 | +- `pr-master-consolidation-savant-verdict.md` (Phase 1 verdict — READY-WITH-DOC-FIXES, all patches applied) |
| 9 | +- `hhtl-substrate-execution-prompt.md` (Phase 2 execution flex prompt — sibling to this doc) |
| 10 | + |
| 11 | +## Why this doc exists |
| 12 | + |
| 13 | +The strategic arc proves the new architecture wins **on paper**. The 6-sprint |
| 14 | +plan moves PR-X4 + PR-X9 from **design to substrate**. Neither artifact answers |
| 15 | +the question the substrate has to answer to count as **inhabited**: when does |
| 16 | +one specific cognitive query path *run end-to-end on the new architecture using |
| 17 | +the new idioms*? |
| 18 | + |
| 19 | +This doc names the canary. The canary is what closes the gap between |
| 20 | +"substrate exists" and "substrate is lived in." |
| 21 | + |
| 22 | +## The canary: NARS revision routed through HHTL cascade |
| 23 | + |
| 24 | +**Workload**: a NARS belief revision triggered by a perceptual surface. It is |
| 25 | +routed through the splat4d cascade to the relevant basin, materializes the basin |
| 26 | +codebook entry on demand, returns a revised `TruthValue` via the Rubicon commit |
| 27 | +gate, and is persisted to SurrealDB through a typed-surface adapter. |
| 28 | + |
| 29 | +**Why this workload**: |
| 30 | +- It is **architecturally pure** — exercises every load-bearing piece of the |
| 31 | + new substrate (cascade, codebook, Rule #3, Rubicon, per-thought bindspace, |
| 32 | + typed surfaces, zone-1↔2 boundary, ndarray::simd kernels) |
| 33 | +- It is **real** — NARS revision is a primary cognitive workload, not a |
| 34 | + synthetic benchmark; the existing Bardioc stack runs it constantly |
| 35 | +- It is **measurable** — has a scalar reference implementation in |
| 36 | + `src/hpc/nars.rs` to compare against for correctness |
| 37 | +- It is **scoped** — one query path, not a system migration; can be |
| 38 | + retracted without affecting parallel sprint work |
| 39 | +- It is **representative** — the result generalizes: if revision-via-HHTL |
| 40 | + works, every other cascade-routed cognitive op works the same way |
| 41 | + |
| 42 | +## What "routed through HHTL" concretely means |
| 43 | + |
| 44 | +Each step exercises a specific substrate primitive. This is the inhabitance |
| 45 | +checklist — not the implementation order: |
| 46 | + |
| 47 | +| Step | Substrate piece | Rule / discipline | |
| 48 | +|---|---|---| |
| 49 | +| 1. Perceptual surface arrives at a Ractor mailbox | Ractor as Rubicon gate (not Erlang) | Per-thought bindspace begins on mailbox entry | |
| 50 | +| 2. Surface → `Base17` typed wrapper | ndarray::hpc::cognitive (PR-X9) | Typed surface, not DTO | |
| 51 | +| 3. `CascadeAddr::from_position` Hilbert-3D encode | PR-X10 A12 hilbert.rs | Deterministic, no shared state | |
| 52 | +| 4. Cascade L1 XOR projection | PR-X4 splat4d cascade | Single XOR + table-addressing, no scan | |
| 53 | +| 5. Cascade L2-L4 hops | PR-X4 splat4d cascade | Each hop = 1 XOR; total ≤ 4 hops | |
| 54 | +| 6. Basin lookup at leaf address | PR-X9 LazyBlockedGrid | Lazy: codebook present → return; absent → materialize | |
| 55 | +| 7. Basin materialization (cold path only) | PR-X12 codec (rANS decode) | Decode under the Rubicon write-back gate, not during cascade | |
| 56 | +| 8. NARS revision over (existing truth, new evidence) | hpc::nars existing | Pure function: returns new `TruthValue`, no `&mut self` | |
| 57 | +| 9. Rubicon commit | Ractor handler `&mut self` is the legitimate gated write | Single committed outcome per mailbox message | |
| 58 | +| 10. Zone-1↔2 boundary crossing | sea-orm at zone 3 (only if egressing); SurrealDB at zone 2 | Typed surface in, ACID-tx out, materialization once | |
| 59 | +| 11. Per-thought bindspace dies | Message lifetime | No global registry retained | |
| 60 | + |
| 61 | +Eleven steps, one query path, at most four hops, sub-microsecond mean on the warm path (claimed). |
| 62 | +The canary either reaches that envelope or the architecture is wrong. |
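Steps 6 through 9 and 11 can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the integration-sprint code: `RevisionHandler` is an ordinary struct standing in for the Ractor handler, the `HashMap` stands in for `LazyBlockedGrid`, `materialize` is a placeholder for the rANS decode, and the revision formula is the textbook NAL rule with horizon k = 1, which may differ from `src/hpc/nars.rs`. `ReviseMsg`, `materialize`, and the field names are hypothetical.

```rust
use std::collections::HashMap;

/// Stand-in for the document's `TruthValue` (frequency, confidence).
#[derive(Debug, Clone, Copy, PartialEq)]
struct TruthValue {
    f: f32,
    c: f32,
}

/// Step 8: pure revision. Values in, value out, no state touched.
/// ASSUMPTION: textbook NAL revision with k = 1; requires c < 1 on both
/// inputs and at least one c > 0.
fn revise(a: TruthValue, b: TruthValue) -> TruthValue {
    let (w1, w2) = (a.c / (1.0 - a.c), b.c / (1.0 - b.c));
    let w = w1 + w2;
    TruthValue { f: (w1 * a.f + w2 * b.f) / w, c: w / (w + 1.0) }
}

/// Step 7 placeholder: the real cold path would decode a codebook entry.
/// The weak prior returned here is made up for the sketch.
fn materialize(_leaf: u64) -> TruthValue {
    TruthValue { f: 0.5, c: 0.1 }
}

/// Hypothetical mailbox message: a leaf address plus new evidence.
struct ReviseMsg {
    leaf: u64,
    evidence: TruthValue,
}

/// Stand-in for the Ractor handler state (no actor runtime involved).
struct RevisionHandler {
    basins: HashMap<u64, TruthValue>,
    materializations: u64,
}

impl RevisionHandler {
    fn new() -> Self {
        Self { basins: HashMap::new(), materializations: 0 }
    }

    /// `&mut self` appears only here, at the gate.
    fn handle(&mut self, msg: ReviseMsg) -> TruthValue {
        // Step 6: lazy lookup. Present: return. Absent: materialize (step 7),
        // which happens under the gate, not during the cascade.
        let existing = match self.basins.get(&msg.leaf) {
            Some(tv) => *tv,
            None => {
                self.materializations += 1;
                materialize(msg.leaf)
            }
        };
        // Step 8: compute through the pure function.
        let revised = revise(existing, msg.evidence);
        // Step 9: exactly one committed outcome per message.
        self.basins.insert(msg.leaf, revised);
        // Step 11: `msg` is dropped on return; nothing per-thought survives.
        revised
    }
}
```

Sending the same `ReviseMsg` twice to one leaf materializes once and raises the confidence on the second call, which is the warm-path behaviour the hit-rate gate relies on.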
| 63 | + |
| 64 | +## Measurement gates |
| 65 | + |
| 66 | +The canary passes Phase 2 when **all** of the following hold on a Zen4 or |
| 67 | +Sapphire Rapids 8-core box, AVX-512 enabled (`target-cpu=x86-64-v4`): |
| 68 | + |
| 69 | +### Correctness gates (binary) |
| 70 | + |
| 71 | +1. **Revision output matches scalar reference**: |
| 72 | + - `Fingerprint` (u64) bit-exact match against `src/hpc/nars.rs::revise` |
| 73 | + - `TruthValue` (f, c): each component within 4 ULP of the scalar reference |
| 74 | + - 10,000 randomly-seeded revisions, zero divergences allowed |
| 75 | +2. **Cascade routing is deterministic**: |
| 76 | + - Same `(Base17, position)` → same `CascadeAddr` across runs |
| 77 | + - Same `CascadeAddr` → same basin entry (warm cache or cold-materialized) |
| 78 | + - Bit-exact reproducibility across 100 runs |
| 79 | +3. **No `&mut self` during compute** (compile-time enforcement): |
| 80 | + - `ndarray::hpc::cognitive::*` engines have `revise(&self, ...) -> Result` |
| 81 | + - Only Ractor handlers carry `&mut self` and only for commit, never compute |
| 82 | + - Clippy lint `clippy::needless_pass_by_ref_mut` clean |
| 83 | +4. **Per-thought bindspace is per-thought**: |
| 84 | + - No `static`/`lazy_static`/`OnceLock` carrying mutable cognitive state |
| 85 | + inside zone 1 — audited by grep + sentinel-qa review |
| 86 | +5. **Typed surfaces at zone boundaries**: |
| 87 | + - Zone 1 → zone 2: `ndarray::hpc::*` types, no `serde_json::Value`, no |
| 88 | + `HashMap<String, Box<dyn Any>>`, no DTO layer |
| 89 | + - Zone 2 → zone 3: `sea-orm` ActiveModel, materialization exactly once |
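Correctness gate 1 needs a concrete ULP metric and a seeded comparison loop. The sketch below shows one way to do both with the standard library only. It assumes `TruthValue` is a pair of `f32` (the doc does not state the width), uses the textbook NAL revision rule with k = 1 as a stand-in for `src/hpc/nars.rs::revise`, and uses a small xorshift generator so the 10,000 cases are reproducible. `ulp_distance`, `within_ulps`, `XorShift64`, and `count_divergences` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct TruthValue {
    f: f32,
    c: f32,
}

/// Map a float onto an integer line where adjacent floats differ by 1
/// and +0.0 and -0.0 coincide.
fn ordered(x: f32) -> i64 {
    let bits = x.to_bits();
    let magnitude = (bits & 0x7FFF_FFFF) as i64;
    if bits >> 31 == 1 { -magnitude } else { magnitude }
}

/// Distance in units in the last place; `None` if either input is NaN.
fn ulp_distance(a: f32, b: f32) -> Option<u64> {
    if a.is_nan() || b.is_nan() {
        return None;
    }
    Some((ordered(a) - ordered(b)).unsigned_abs())
}

fn within_ulps(a: TruthValue, b: TruthValue, max_ulps: u64) -> bool {
    let ok = |x: f32, y: f32| matches!(ulp_distance(x, y), Some(d) if d <= max_ulps);
    ok(a.f, b.f) && ok(a.c, b.c)
}

/// ASSUMPTION: textbook NAL revision, k = 1. Stand-in for the scalar reference.
fn revise(a: TruthValue, b: TruthValue) -> TruthValue {
    let (w1, w2) = (a.c / (1.0 - a.c), b.c / (1.0 - b.c));
    let w = w1 + w2;
    TruthValue { f: (w1 * a.f + w2 * b.f) / w, c: w / (w + 1.0) }
}

/// Tiny deterministic generator so every run sees the same cases.
struct XorShift64(u64);

impl XorShift64 {
    /// Uniform value in [0, 1) built from the top 24 bits.
    fn next_unit(&mut self) -> f32 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 40) as f32 / (1u32 << 24) as f32
    }
}

/// Count cases where `candidate` drifts more than `max_ulps` from `reference`.
fn count_divergences(
    n: usize,
    seed: u64,
    max_ulps: u64,
    candidate: impl Fn(TruthValue, TruthValue) -> TruthValue,
    reference: impl Fn(TruthValue, TruthValue) -> TruthValue,
) -> usize {
    let mut rng = XorShift64(seed.max(1)); // xorshift must not start at 0
    let mut divergences = 0;
    for _ in 0..n {
        // Confidence is kept inside [0.01, 0.99) so the formula stays finite.
        let a = TruthValue { f: rng.next_unit(), c: 0.01 + 0.98 * rng.next_unit() };
        let b = TruthValue { f: rng.next_unit(), c: 0.01 + 0.98 * rng.next_unit() };
        if !within_ulps(candidate(a, b), reference(a, b), max_ulps) {
            divergences += 1;
        }
    }
    divergences
}
```

The gate would then read `count_divergences(10_000, seed, 4, hhtl_path, scalar_reference) == 0`, where both closures are supplied by the harness. Treating NaN as a divergence is a choice made here, not something the gate text specifies.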
| 90 | + |
| 91 | +### Performance gates (numeric) |
| 92 | + |
| 93 | +1. **p99 revision latency** (warm cache, cascade depth ≤ 4): |
| 94 | + ≤ **1.5 µs** (target 700 ns mean per the HHTL claim; allow roughly 2× headroom on p99) |
| 95 | +2. **p99 revision latency** (cold cache, includes basin materialization): |
| 96 | + ≤ **15 µs** (codec decode + cascade + revision; rANS decode dominates) |
| 97 | +3. **Cascade-only latency** (excluding revision math): |
| 98 | + ≤ **400 ns p99** (4 XOR hops + 4 table addressings) |
| 99 | +4. **Codebook hit rate after 1M revisions warmup**: |
| 100 | + ≥ **95%** (sparse basins not pre-materialized; popular cells warm fast) |
| 101 | +5. **Throughput, saturated**: |
| 102 | + ≥ **1M revisions/sec** per core sustained over 10 seconds (~1 µs amortized) |
| 103 | +6. **Working set per worker thread**: |
| 104 | + ≤ **1 MB** (the full per-core L2 on Zen4; SPR has 2 MB per core) |
| 105 | +7. **ndarray::simd primitive coverage**: |
| 106 | + 100% of hot-path SIMD ops route through `ndarray::simd::*` — zero raw |
| 107 | + intrinsics in the cognitive path (enforced by clippy lint and the W1a |
| 108 | + consumer contract gate) |
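The latency gates are p99 figures, so the harness needs an unambiguous percentile definition. A minimal sketch using the nearest-rank method and per-call `std::time::Instant` timing follows. The function names are hypothetical, and the real `benches/nars_canary.rs` may well use a benchmarking crate instead. Per-call timer overhead is not negligible at the sub-microsecond scale, so a production harness would likely need to batch calls or subtract a calibrated baseline.

```rust
use std::time::Instant;

/// Nearest-rank percentile over nanosecond samples. `pct` is in 1..=100.
/// Sorts the slice in place; returns `None` for empty input or bad `pct`.
fn percentile(samples: &mut [u64], pct: usize) -> Option<u64> {
    if samples.is_empty() || pct == 0 || pct > 100 {
        return None;
    }
    samples.sort_unstable();
    // 1-based rank = ceil(pct * n / 100), computed in integers.
    let rank = (pct * samples.len() + 99) / 100;
    Some(samples[rank - 1])
}

/// Time each call of `op` individually and return the durations in ns.
fn measure_ns<F: FnMut()>(iters: usize, mut op: F) -> Vec<u64> {
    let mut out = Vec::with_capacity(iters);
    for _ in 0..iters {
        let start = Instant::now();
        op();
        out.push(start.elapsed().as_nanos() as u64);
    }
    out
}

/// Performance gate 1: warm-cache p99 at or under 1.5 µs.
fn passes_warm_gate(samples: &mut [u64]) -> bool {
    matches!(percentile(samples, 99), Some(p99) if p99 <= 1_500)
}

/// Performance gate 3: cascade-only p99 at or under 400 ns.
fn passes_cascade_gate(samples: &mut [u64]) -> bool {
    matches!(percentile(samples, 99), Some(p99) if p99 <= 400)
}
```

With nearest-rank, 100 samples put p99 at the 99th smallest value, so a run where 10% of calls take 5 µs fails the warm gate even if the mean looks fine.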
| 109 | + |
| 110 | +### Inhabitance gates (qualitative) |
| 111 | + |
| 112 | +1. **The canary path reads like the architecture document.** A new reader |
| 113 | + should be able to trace each of the 11 steps above to a specific function |
| 114 | + in the codebase. If the code is more complex than the architecture |
| 115 | + description, the architecture didn't get inhabited — a translation |
| 116 | + layer got built. |
| 117 | +2. **No "Bardioc-shaped" code in the canary path.** No SQL builders for |
| 118 | + the lookup, no Elasticsearch-shaped query DSL, no JanusGraph-shaped |
| 119 | + traversal, no ClickHouse-shaped aggregation. The cascade is the lookup; |
| 120 | + the codebook is the storage; the Rubicon is the commit. If any |
| 121 | + step reaches for a legacy idiom, the canary has not inhabited the substrate. |
| 122 | +3. **The canary survives a sentinel-qa audit** with zero P0 SAFETY findings |
| 123 | + on the new code (existing scalar reference is grandfathered). |
| 124 | +4. **The integration sprint produces a 30-second screen recording** showing |
| 125 | + the canary running end-to-end, p99 latency on screen, codebook hit |
| 126 | + rate climbing during warmup. Recording is committed to the repo. |
| 127 | + |
| 128 | +## What is NOT the canary |
| 129 | + |
| 130 | +Explicit anti-scope so the canary doesn't drift into a system migration: |
| 131 | + |
| 132 | +- **Not**: a full Bardioc → HHTL stack swap |
| 133 | +- **Not**: a multi-workload benchmark suite |
| 134 | +- **Not**: a SQL or graph-query analog of NARS revision |
| 135 | +- **Not**: production cutover from Bardioc |
| 136 | +- **Not**: a UI demo |
| 137 | +- **Not**: a research artifact about HHTL theory — the canary is the |
| 138 | + *operational* proof, not a paper |
| 139 | + |
| 140 | +If the canary works, Bardioc cutover is a follow-on per-workload migration |
| 141 | +that can take months. The canary just has to demonstrate inhabitability of |
| 142 | +*one* path. |
| 143 | + |
| 144 | +## Where the canary lives |
| 145 | + |
| 146 | +| Component | Crate / path | Sprint | |
| 147 | +|---|---|---| |
| 148 | +| `Base17` + `Fingerprint` + `TruthValue` types | `ndarray::hpc::{nars,fingerprint,base17}` (existing) | — (pre-existing) | |
| 149 | +| `Hilbert3D::{encode,decode}` | `ndarray::hpc::linalg::hilbert` | PR-X10 A12 | |
| 150 | +| `CascadeAddr` + `from_position` + `XorProjection` | `ndarray::hpc::splat4d::cascade` | PR-X4 | |
| 151 | +| `SplatPyramid<T, S: GridStorage<T>, BR, BC>` | `ndarray::hpc::splat4d::pyramid` | PR-X4 + PR-X9 (GridStorage is PR-X9) | |
| 152 | +| `BasinCodebook` + `LazyBlockedGrid` | `ndarray::hpc::cognitive::{codebook,storage}` | PR-X9 | |
| 153 | +| rANS encode/decode + `CellMode` + `rdo_cell` | `ndarray::hpc::codec::*` | PR-X12 | |
| 154 | +| Per-pillar PASS gates (revision math certified) | `ndarray::hpc::pillar::*` | PR-X11 | |
| 155 | +| OGIT cognitive namespace bridge | `ndarray::hpc::ogit_bridge::*` | PR-X13 | |
| 156 | +| Ractor Rubicon gate (`RevisionHandler`) | `lance-graph::cognitive::nars_actor` (new) | Integration sprint | |
| 157 | +| SurrealDB egress (zone 2 typed surface) | `lance-graph::cognitive::nars_persist` (new) | Integration sprint | |
| 158 | +| End-to-end canary binary | `lance-graph/examples/nars_canary.rs` (new) | Integration sprint | |
| 159 | +| Measurement harness | `lance-graph/benches/nars_canary.rs` (new) | Integration sprint | |
| 160 | + |
| 161 | +The integration sprint produces the two `lance-graph::cognitive::*` modules |
| 162 | +that wire the substrate pieces together. The wiring is small (~200 LoC each); |
| 163 | +the substrate pieces are the work. |
| 164 | + |
| 165 | +## Composition with the 4-prompt strategic arc |
| 166 | + |
| 167 | +| Strategic prompt | Role | Canary relationship | |
| 168 | +|---|---|---| |
| 169 | +| `bardioc-weekend-rebuild-prompt.md` | Baseline measurement (legacy) | Produces the **NARS-revision-on-Bardioc** number the canary beats | |
| 170 | +| `ndarray-simd-trojan-horse-prompt.md` | Path A: ClickHouse + Tantivy FFI inject | **Independent** — analytic tier, not cognitive | |
| 171 | +| `databend-ndarray-simd-prompt.md` | Path C: Rust-native ClickHouse successor | **Independent** — analytic tier, not cognitive | |
| 172 | +| **THIS DOC + `hhtl-substrate-execution-prompt.md`** | Cognitive tier — the actual architectural win | Canary measures **revision-on-HHTL** vs the Bardioc baseline | |
| 173 | + |
| 174 | +The other three prompts cover the Bardioc baseline and the **analytic tier** |
| 175 | +(where ClickHouse used to live). This canary handles the **cognitive tier** |
| 176 | +(where HHTL lives). They compose: the analytic tier is Bardioc's escape |
| 177 | +hatch; the cognitive tier is the architecture's reason to exist. |
| 178 | + |
| 179 | +Both must work for the consolidation to be real. The cognitive canary is |
| 180 | +the harder and more important one. |
| 181 | + |
| 182 | +## Pass/fail decision |
| 183 | + |
| 184 | +If the canary passes all gates: HHTL is **inhabited**. Bardioc cognitive-tier |
| 185 | +cutover is a per-workload migration; analytic-tier cutover follows path A |
| 186 | +(buy time) or path C (replace). The consolidation arc is operationally |
| 187 | +proved. |
| 188 | + |
| 189 | +If the canary fails **performance gates** (latency/throughput): the |
| 190 | +architecture's algorithmic regime claim ("two orders of magnitude") is |
| 191 | +wrong. Re-examine the cascade depth, the codebook materialization cost, |
| 192 | +or the SIMD primitive coverage. Patch and re-measure. |
| 193 | + |
| 194 | +If the canary fails **correctness gates** (ULP/bit-exact): a substrate bug |
| 195 | +exists. P0 — block all dependent sprint work until resolved. |
| 196 | + |
| 197 | +If the canary fails **inhabitance gates** (qualitative): the substrate |
| 198 | +exists but isn't being lived in — the integration sprint built a |
| 199 | +translation layer instead of using the substrate primitives. Rewrite |
| 200 | +the wiring, not the substrate. |
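The four outcomes above are independent: a run can fail more than one gate class at once. The sketch below encodes the decision as a function from gate results to follow-up actions. The type and variant names are hypothetical, and listing the correctness action first is an assumption based on its P0 label, not an ordering the plan states.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    /// All gates passed: HHTL is inhabited.
    DeclareInhabited,
    /// Correctness failure: P0, block all dependent sprint work.
    BlockDependentSprints,
    /// Performance failure: re-examine depth, materialization, SIMD coverage.
    PatchAndRemeasure,
    /// Inhabitance failure: rewrite the wiring, not the substrate.
    RewriteWiring,
}

#[derive(Debug, Clone, Copy)]
struct GateResults {
    correctness: bool,
    performance: bool,
    inhabitance: bool,
}

/// Map gate results to the follow-up actions named in this section.
fn decide(r: GateResults) -> Vec<Action> {
    let mut actions = Vec::new();
    if !r.correctness {
        actions.push(Action::BlockDependentSprints);
    }
    if !r.performance {
        actions.push(Action::PatchAndRemeasure);
    }
    if !r.inhabitance {
        actions.push(Action::RewriteWiring);
    }
    if actions.is_empty() {
        actions.push(Action::DeclareInhabited);
    }
    actions
}
```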
| 201 | + |
| 202 | +## Sequencing |
| 203 | + |
| 204 | +The canary cannot be implemented until the 6 substrate sprints land (the |
| 205 | +canary depends on PR-X4 + PR-X9 + PR-X10 A12 + PR-X11 + PR-X12 + PR-X13). |
| 206 | +**The canary is the integration sprint deliverable**, not a parallel track. |
| 207 | + |
| 208 | +The 6 sprints run per the master schedule (W1-W8 in |
| 209 | +`pr-master-consolidation.md`). Integration sprint = W8 = canary build + |
| 210 | +measure + record + write report. |
| 211 | + |
| 212 | +## What changes if the canary passes |
| 213 | + |
| 214 | +Three things become true that aren't true today: |
| 215 | + |
| 216 | +1. **The architecture document stops being a claim and becomes a measurement.** |
| 217 | + The "700ns at depth 4" claim is now a number with confidence intervals. |
| 218 | +2. **Per-workload Bardioc cutover becomes mechanically composable.** Each |
| 219 | + subsequent cognitive workload follows the canary pattern: typed surface |
| 220 | + in, cascade lookup, codebook materialization, Rubicon commit, zone |
| 221 | + boundary crossing. No new architectural decisions per workload. |
| 222 | +3. **The four strategic prompts can be executed with confidence.** Today |
| 223 | + they read as "buy time + measure baseline + adopt successor." After |
| 224 | + the canary passes, they read as "execute the cutover" with the cognitive |
| 225 | + tier already proven. |
| 226 | + |
| 227 | +If the canary doesn't pass, those three things stay false — and the next |
| 228 | +session has to decide whether to debug the substrate or revisit the |
| 229 | +architecture. |