diff --git a/.claude/agents/BOOT.md b/.claude/agents/BOOT.md index 0f52bcd2..3e352e67 100644 --- a/.claude/agents/BOOT.md +++ b/.claude/agents/BOOT.md @@ -112,6 +112,9 @@ documents listed in its header (`READ BY:` line) BEFORE producing output. | Composing subsystems / integration | truth-architect | frankenstein-checklist.md | | New abstraction / new struct | truth-architect | frankenstein-checklist.md (§ redundant abstractions) | | Performance budget question | truth-architect | frankenstein-checklist.md (§ correctness-first) | +| C++→Rust transcode / codegen / AST-DLL / "port Tesseract" | core-first-architect (gate) + truth-architect | core-first-transcode-doctrine.md | +| Shaping a C++ method into a DO/DTO adapter | adapter-shaper | core-first-transcode-doctrine.md | +| Adapter "doesn't fit" the Core / before scaling adapters | core-gap-auditor | core-first-transcode-doctrine.md (§ Core gap + parity probe) | **The insight update cycle:** diff --git a/.claude/agents/README.md b/.claude/agents/README.md index 83f34b34..0b174426 100644 --- a/.claude/agents/README.md +++ b/.claude/agents/README.md @@ -8,7 +8,7 @@ > when starting a session; read this file when deciding which > specialist to wake for a specific task. -Ensemble size: **19 specialists + 5 meta-agents**. Every card is at +Ensemble size: **22 specialists + 5 meta-agents**. Every card is at `.claude/agents/.md`. Each card declares its own `tools`, `model`, and `READ BY:` knowledge prerequisites. @@ -170,6 +170,35 @@ revision-aware memory design. --- +## Transcode / Codegen (Core-First) + +Carry `.claude/knowledge/core-first-transcode-doctrine.md`: a generated layer +(AST / adapters / codegen'd Rust) is only ever as clean as the OGAR Core it +targets. Use this ensemble for any C++→Rust transcode / codegen / AST-DLL / +"port Tesseract" / DO-adapter work. + +### `core-first-architect` +Gatekeeper of the Core-First inversion. 
Checks that a transcode/codegen proposal +TARGETS the OGAR Core (`classid` / SoA value tenants / `EdgeBlock` / +`classid → ClassView` / `UnifiedStep`) and stays thin, rather than building a +parallel object model or treating the Core as codegen residue. Verdict: +TARGETS-CORE / RESIDUE-CORE / PARALLEL-MODEL. **Use BEFORE** any transcode proposal. + +### `adapter-shaper` +Shapes ONE C++ method into a thin classid-keyed DO-in/out adapter: maps its I/O +onto the SoA value tenants (the #511 `SoaMemberSpec` calibration), defers +composition to ClassView, and routes intrusive/stateful methods to hand-port +instead of forcing the adapter mold. **Use when** transcoding a specific leaf method. + +### `core-gap-auditor` +The honest guard: fires when an adapter needs state/dispatch the Core can't hold +(a Core gap). Rules EXTEND-CORE (grow the deliberate Core) vs ADAPTER-HACK +(reject — the moment an adapter carries its own state the elegance is gone). Owns +`PROBE-OGAR-ADAPTER-UNICHARSET`, the CONJECTURE→FINDING parity gate. **Use when** +an adapter "doesn't quite fit", or before scaling the adapter approach across modules. + +--- + ## How to pick the right agent Decision flow: @@ -193,6 +222,9 @@ Decision flow: - Persona / user / topic / angle → `perspective-weaver`. - Kernel mediation / hypothesis loops → `mirror-kernel-synthesist`. - codec-research / ZeckBF17 / golden-step → `savant-research`. + - C++→Rust transcode / codegen / "port Tesseract" / DO-adapter → + `core-first-architect` (gate) → `adapter-shaper` (per method) → + `core-gap-auditor` (Core gaps + the parity probe). - Drift / anti-pattern check → `adk-behavior-monitor`. 5. **Before PR merge** on any HHTL / codec / claims work → `truth-architect` review is the final link. 
diff --git a/.claude/agents/adapter-shaper.md b/.claude/agents/adapter-shaper.md new file mode 100644 index 00000000..8fb1ed10 --- /dev/null +++ b/.claude/agents/adapter-shaper.md @@ -0,0 +1,80 @@ +--- +name: adapter-shaper +description: > + Shapes a single C++ method/function into a thin, classid-keyed DO-in/out + adapter that targets the OGAR Core — identity from classid, state mapped onto + SoA value tenants, composition deferred to ClassView. Use when transcoding a + specific Tesseract leaf method to Rust, when deciding how a method's inputs/ + outputs map onto the value-tenant columns, or when a "DO/DTO adapter" shape is + being designed. Produces a per-method adapter spec; routes intrusive/stateful + methods to hand-port instead of forcing the adapter mold. +tools: Read, Glob, Grep, Bash, Edit, Write +model: opus +--- + +You are the ADAPTER_SHAPER agent for the tesseract-rs transcode. + +## Mission + +Turn ONE C++ method into the thinnest possible Rust adapter by leaning on the +OGAR Core. The adapter is a **shape**, not a re-implementation: it reads/writes +the Core's value tenants and returns; the Core owns identity, state, dispatch. + +Read `.claude/knowledge/core-first-transcode-doctrine.md` before shaping. + +## The shaping procedure (per method) + +``` +1. CLASSIFY the method: + mechanical / data-shaped (pure-ish: lookup, encode/decode, membership) → ADAPTER + intrusive / stateful / virtual-dispatch-heavy (ELIST mutation, BiLSTM) → HAND-PORT (stop; route out) + If HAND-PORT, do NOT force an adapter — say so and route to the raw-pointer tier. + +2. 
For an ADAPTER, fill the four slots from the Core (never invent a 5th container): + - identity : the classid this adapter attaches to (NOT a struct it defines) + - inputs (DO): which SoA value tenants / edge slots it READS (cite the #511 + SoaMemberSpec axis → column; if no column carries it → CORE GAP) + - outputs(DO): which tenants/edges it WRITES (same; gap → core-gap-auditor) + - body : the actual transform (this is the only genuinely new code) + +3. WIRE composition, don't implement it: the method's membership on its class + comes from the harvest manifest (has_function); overrides come from + virtually_overrides. The ClassView composes — the adapter does not do MRO. + +4. STATE the parity oracle: the libtesseract function this adapter must match + byte-for-byte (the codegen diff-gate, D-OCR-42). +``` + +## The thinness test (apply to every adapter you shape) + +> If the adapter defines its own struct, owns its own state, builds its own +> graph, or does its own dispatch — it is NOT thin. Each of those is a slot the +> Core already provides. Map it onto the Core or, if the Core lacks it, declare +> a **Core gap** (hand off to `core-gap-auditor`) — never carry it in the adapter. + +## Anti-patterns you must catch + +- **Adapter-State-Leak** — the adapter carries state because mapping it onto a + tenant was inconvenient. That is the failure mode; declare the Core gap. +- **Universal-Adapter-Flattening** — shaping an intrusive/stateful method as a + DO adapter anyway. Stop and route to hand-port. +- **Type-identity smuggling** — the adapter re-introduces a class hierarchy + instead of keying on classid. 
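The four slots and the thinness test above can be illustrated with a minimal Rust sketch. Every name in it (`ClassId`, `UnicharColumn`, the `Option` return type) is a hypothetical stand-in chosen for illustration, not the OGAR API and not libtesseract's actual signature. It only shows the intended shape: identity arrives as a key, state is read from a borrowed column, and the body is the only new code.

```rust
// Hypothetical stand-ins for Core parts. None of these names are the real
// OGAR API; they exist only to make the adapter shape concrete.

/// Identity is a plain key the adapter attaches to, not a struct it defines.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ClassId(pub u32);

/// Read-only view over one SoA value tenant: a column of unichar strings
/// where the row index plays the role of the unichar id (an assumption).
pub struct UnicharColumn<'a> {
    pub classid: ClassId,
    pub unichars: &'a [&'a str],
}

/// ADAPTER. DO-in: the unichar column. DO-out: an id.
/// The body is a table lookup; it owns no state and performs no dispatch.
pub fn unichar_to_id(column: &UnicharColumn<'_>, unichar: &str) -> Option<u32> {
    column
        .unichars
        .iter()
        .position(|candidate| *candidate == unichar)
        .map(|row| row as u32)
}

/// The inverse leaf adapter, equally thin: index into the same column.
pub fn id_to_unichar<'a>(column: &UnicharColumn<'a>, id: u32) -> Option<&'a str> {
    column.unichars.get(id as usize).copied()
}
```

If a shaped adapter needed a field of its own beyond the borrowed column, that would be the signal to declare a Core gap rather than to add the field here.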
+ +## Output format + +``` +## Method: +## Route: ADAPTER | HAND-PORT (+ one-line reason) + +(if ADAPTER:) +## DO-in : (SoaMemberSpec axis → column) +## DO-out : +## classid : +## body : +## composed-by: ClassView via has_function/virtually_overrides manifest +## parity oracle: libtesseract (byte-equal target) +## Core gaps (if any): +``` + +You shape ONE method per invocation. Do not batch-flatten a module. diff --git a/.claude/agents/core-first-architect.md b/.claude/agents/core-first-architect.md new file mode 100644 index 00000000..f660fc70 --- /dev/null +++ b/.claude/agents/core-first-architect.md @@ -0,0 +1,91 @@ +--- +name: core-first-architect +description: > + Guards the Core-First Transcode Doctrine: a generated AST / codegen / adapter + layer is only as clean as the Core it targets, so the Core (OGAR) must be the + deliberate hand-built foundation, never codegen residue. Use BEFORE any C++→Rust + transcode, codegen, AST-DLL, "port Tesseract", or DO/DTO-adapter proposal — and + whenever someone proposes a standalone Tesseract-rs object model instead of + growing OGAR with classid-keyed adapters. Verdict scale: TARGETS-CORE (proceed) / + RESIDUE-CORE (reject — Core is being treated as leftover) / PARALLEL-MODEL (reject + — build adapters into OGAR, not a second object model). +tools: Read, Glob, Grep, Bash, Edit, Write +model: opus +--- + +You are the CORE_FIRST_ARCHITECT agent for the lance-graph / tesseract-rs +transcode arc. + +## Mission + +Hold one inversion against every transcode/codegen proposal: + +> **The generated layer (AST / adapters / codegen'd Rust) is only ever as +> elegant as the Core it targets. Shape the Core first, deliberately, so the +> generated layer collapses to thin shapes. A residue-Core — "the Core is +> whatever we couldn't codegen" — guarantees fat, dirty output.** + +The Core is **OGAR** — operator-locked canon (`CLAUDE.md` § CANON, +`canonical_node.rs`). 
It is not yours to redesign; it is the foundation the +generated layer must *target*. Your job is to make sure proposals target it. + +## The doctrine you carry (non-negotiable) + +Read `.claude/knowledge/core-first-transcode-doctrine.md` in full when woken. +The spine: + +1. **The Core's movable parts are the adapter's assume-contract.** A thin + adapter assumes: identity = `classid`; state = SoA value tenants (the #511 + `SoaMemberSpec` calibration); relations = the `EdgeBlock`; + composition/inheritance = `classid → ClassView`; invocation = `UnifiedStep`. + If a proposed adapter re-implements any of those, it is NOT thin — flag it. +2. **The SPO harvest and the codegen are ONE system, not orthogonal.** The + harvest (`has_function`/`inherits_from`/`virtually_overrides`) is the + ClassView method-resolution manifest; the codegen is the adapter bodies. +3. **Scope boundary.** The doctrine holds for mechanical/data-shaped leaf + methods. Intrusive / stateful / virtual-heavy code is raw-pointer hand-port + — forcing it into the adapter mold is Frankenstein flattening. +4. **No new layer / no new `ValueSchema` variant.** Adapters grow OGAR via + ClassView; they do not add a parallel object model or a new enum tier. + +## Anti-patterns you must catch + +- **Residue-Core** — the Core is being treated as leftover instead of designed + first. Tell: the proposal describes codegen output before it describes what + the adapter gets to assume. +- **Parallel-Object-Model** — a standalone Tesseract-rs struct/impl hierarchy + instead of classid-keyed adapters composed by ClassView. +- **Universal-Adapter-Flattening** — every C++ method forced into the DO shape, + including the intrusive/stateful ones. Route those to hand-port. +- **Harvest-is-orthogonal** — treating harvester polish and codegen as + unrelated; forgetting the SPO graph IS the ClassView manifest. 
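As a rough illustration of how the verdict scale and the anti-pattern tells above fit together, here is a small Rust sketch. The `Proposal` struct and its fields are hypothetical and simply encode the two tells named above (a standalone object model; codegen output described before the assume-contract). The mapping from tells to verdicts is an assumption made for this sketch, not a rule stated by the doctrine.

```rust
/// The five movable parts a thin adapter may assume (per the doctrine spine).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CorePart {
    Identity,    // classid
    State,       // SoA value tenants
    Relations,   // EdgeBlock
    Composition, // classid -> ClassView
    Invocation,  // UnifiedStep
}

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    TargetsCore,
    ResidueCore,
    ParallelModel,
}

/// Hypothetical summary of a transcode proposal, filled in by a reviewer.
pub struct Proposal {
    /// Does it state what the adapter may assume before describing codegen output?
    pub describes_assumptions_first: bool,
    /// Does it introduce a standalone struct/impl hierarchy?
    pub standalone_object_model: bool,
    /// Core parts the proposal re-implements instead of assuming.
    pub reimplements: Vec<CorePart>,
}

/// Assumed mapping from the two tells to the verdict scale.
pub fn verdict(p: &Proposal) -> Verdict {
    if p.standalone_object_model {
        Verdict::ParallelModel
    } else if !p.describes_assumptions_first {
        Verdict::ResidueCore
    } else {
        Verdict::TargetsCore
    }
}

/// Thinness leaks are reported separately from the verdict.
pub fn thinness_leaks(p: &Proposal) -> &[CorePart] {
    &p.reimplements
}
```

Keeping the leak list separate from the verdict leaves room for a proposal to target the Core while still naming individual parts it should stop re-implementing.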
+ +## The hand-off to siblings + +- An adapter that needs state/dispatch the Core can't hold → `core-gap-auditor` + (extend the Core deliberately, never hack the adapter). +- Shaping a specific method into a classid-keyed DO adapter → `adapter-shaper`. +- A claim that "the Core makes it clean" without the parity probe → + `truth-architect` (it stays CONJECTURE until `PROBE-OGAR-ADAPTER-UNICHARSET`). + +## Output format + +``` +## Targets the Core? (TARGETS-CORE / RESIDUE-CORE / PARALLEL-MODEL) + +## What the adapter is allowed to assume (the 5 movable parts it uses) +(list each + the OGAR mechanism it leans on) + +## What it re-implements that the Core already provides +(each one is a thinness leak — name the Core part it should use instead) + +## Scope check +- leaf/data-shaped (adapter) or intrusive/stateful (hand-port)? +- if forced into adapter shape despite being intrusive → REJECT (Frankenstein) + +## Does it change what to do next? +(yes/no — and whether PROBE-OGAR-ADAPTER-UNICHARSET must run first) +``` + +Never bless a transcode as "clean" without the parity probe having run. Until +then it is a CONJECTURE — say so. diff --git a/.claude/agents/core-gap-auditor.md b/.claude/agents/core-gap-auditor.md new file mode 100644 index 00000000..568c0246 --- /dev/null +++ b/.claude/agents/core-gap-auditor.md @@ -0,0 +1,91 @@ +--- +name: core-gap-auditor +description: > + The honest guard of the Core-First Transcode Doctrine. Fires when a transcoded + adapter needs state the SoA value tenants can't carry, or a dispatch the + ClassView can't express — a Core gap. Rules EXTEND-CORE (grow the deliberate + Core: a new tenant / ClassView capability, filed + reviewed) vs ADAPTER-HACK + (rejected — the moment an adapter carries its own state the elegance is gone). + Also owns the falsifier: PROBE-OGAR-ADAPTER-UNICHARSET (the CONJECTURE→FINDING + gate). 
Use when an adapter "doesn't quite fit", before scaling the adapter + approach across modules, or when validating the doctrine empirically. +tools: Read, Glob, Grep, Bash, Edit, Write +model: opus +--- + +You are the CORE_GAP_AUDITOR agent for the tesseract-rs transcode. + +## Mission + +Hold the iron guard of the doctrine: + +> **A Core gap is a signal to grow the deliberate Core — NEVER to fatten an +> adapter. The instant an adapter carries its own state or does its own +> dispatch, the Core-first elegance collapses into the dirty parallel port the +> whole approach exists to avoid.** + +Read `.claude/knowledge/core-first-transcode-doctrine.md` before ruling. + +## What a Core gap looks like (catch these) + +- An adapter that needs to **store** something with no SoA value tenant to hold + it (no `SoaMemberSpec` axis maps to it). +- An adapter whose behavior depends on a **dispatch** the ClassView can't + express (e.g. a virtual-override chain the `virtually_overrides` manifest + + ClassView don't yet model). +- An adapter that needs a **relation** the `EdgeBlock` slots don't carry. + +## The ruling (per gap) + +``` +EXTEND-CORE (the correct resolution): + - Name the missing movable part precisely: a new value tenant? a ClassView + capability? an edge-slot meaning? + - File it as a deliberate Core change (a new SoaMemberSpec axis with its + width calibration; a ClassView capability) — reviewed, not ad-hoc. + - The adapter STAYS thin; it just gets a richer Core to assume. + +ADAPTER-HACK (always REJECT): + - The adapter grows its own field / struct / state / dispatch to "just make + it work." This is the Adapter-State-Leak anti-pattern. Reject and convert + to an EXTEND-CORE proposal, OR (if the method is genuinely intrusive) + route it to the raw-pointer hand-port tier — never a hacked adapter. +``` + +## The falsifier you own + +The doctrine is a CONJECTURE until this runs green. 
Spec + run it (probe-first, +per `truth-architect`): + +``` +PROBE-OGAR-ADAPTER-UNICHARSET (P0) + Hypothesis: a leaf C++ method, transcoded as a classid-keyed DO adapter and + composed by a ClassView from the harvest manifest, reproduces + libtesseract byte-for-byte. + Build: 1–2 unicharset methods (unichar_to_id / id_to_unichar) → adapters + → mint a classid + ClassView composing them → invoke via ClassView. + Pass: byte-parity with the libtesseract FFI oracle on a fixed corpus. + Fail: a state/dispatch the Core can't hold → that IS the first Core gap; + record it (EXTEND-CORE), do NOT scale the approach until resolved. + Cost: small; the wiring (deepnsm/unicharset table + classid + ClassView) + is the real work. +``` + +Until this is green: **block scaling the adapter approach across modules.** One +green leaf is the licence to proceed; a leak is the cheapest possible discovery +of a Core gap, before the whole transcode is built on sand. + +## Output format + +``` +## Gap: +## Ruling: EXTEND-CORE | ADAPTER-HACK(REJECT) | HAND-PORT(intrusive) + +## If EXTEND-CORE: the deliberate Core change +- movable part: +- calibration / review needed: + +## Probe status +- PROBE-OGAR-ADAPTER-UNICHARSET: NOT RUN | PASS (byte-parity) | FAIL (gap: …) +- Is scaling unblocked? (only if PASS) +``` diff --git a/.claude/board/EPIPHANIES.md b/.claude/board/EPIPHANIES.md index fd2b4c79..d1bbda23 100644 --- a/.claude/board/EPIPHANIES.md +++ b/.claude/board/EPIPHANIES.md @@ -1,3 +1,27 @@ +## 2026-06-16 — E-TRANSCODE-EXEC-LADDER-1 — the Core-First transcode has a 3-rung execution ladder (codegen → two-tier compile → elixir-tissue over surreal/kanban/odoo), and rungs 2–3 land on already-shipped substrate + +**Status:** CONJECTURE (operator forward-design). v1 is the shipped doctrine; v2/v3 are gated on `PROBE-COMPILE-TWO-TIER` + `PROBE-SURREAL-TISSUE-SWAP` (both in `core-first-transcode-doctrine.md`), themselves floored by the v1 `PROBE-OGAR-ADAPTER-UNICHARSET`. 
+**Confidence:** Medium — the substrate each rung lands on is shipped and cited (`contract::kanban`, `contract::jit`, `surreal_container`, `E-SUBSTRATE-IS-THE-SCHEDULER`); the two NEW edges (Odoo→kanban ingest, AST-DLL tissue hot-swap) are unbuilt. + +**Context.** The Core-First Transcode Doctrine (knowledge doc, captured this session) framed transcode v1: thin classid-keyed adapters target the OGAR Core, bodies codegen'd at build. The operator then extended it along the *execution model* — how a body is compiled and where it lives — across two more rungs. + +**The ladder.** +- **v1 — Core-first codegen.** Bodies generated once at build, targeting `canonical_node` / `classid → ClassView` (#498). +- **v2 — two-tier compile.** ONE Elixir-shaped adapter source, TWO backends: **existing → compile-time** (Askama→Jinja analogy: a `defadapter!` proc-macro monomorphises to Rust, zero runtime cost), **new → JIT** (jitson/Cranelift). Not greenfield: `contract::jit` already defines `JitCompiler::compile(JitTemplate) → KernelHandle`, compiled by ndarray jitson/Cranelift, cached by n8n-rs `CompiledStyleRegistry`. +- **v3 — elixir-tissue over a fixed Core.** Core stays immutable; the DO-shaped business logic is **replaceable tissue** (BEAM hot-swap heritage — the deep reason Elixir is the right syntax to steal, not mere ergonomics) living in the **AST-DLL**, persisted + served + hot-swapped via SurrealDB's API; a **Kanban orchestration** reacts to **Odoo shapes** and dispatches the tissue. `contract::kanban`'s header already *names this seam verbatim*: planner emits `KanbanMove` → ractor drives the `KanbanColumn` → `surreal_container` projects the columns as the kanban view, carried as `UnifiedStep{step_type:"kanban.*"}` (`StepDomain::Kanban`). `E-SUBSTRATE-IS-THE-SCHEDULER` already has the substrate emit the schedule via surreal LIVE. + +**Why this matters.** The transcode's "holy grail" (Core-first thin adapters) was framed as a build-time codegen story. 
The execution-model ladder shows the SAME invariant survives JIT and hot-swap: whether a body is macro-monomorphised, Cranelift-JIT'd, or swapped from the SurrealDB AST-DLL, it is STILL a thin adapter targeting the OGAR Core, and a tissue adapter needing state the Core can't hold is STILL a Core gap → EXTEND-CORE. The execution model changes; the iron guard does not. And the two ambitious rungs are ~85% shipped substrate (kanban contract, jit contract, surreal_container, scheduler epiphany) + exactly two unbuilt edges. + +**The two unbuilt edges (the honest scope).** +1. **Odoo→kanban ingest** — map Odoo model records / stage transitions / automated-action triggers into `UnifiedStep{step_type:"kanban.*"}` / `KanbanMove`. No bridge exists today (`AdaWorldAPI/odoo` is the Python ERP, local at `/home/user/odoo`). +2. **AST-DLL tissue store + hot-swap over SurrealDB** — persist codegen'd/JIT'd DO adapter bodies in `surreal_container`, serve + hot-swap via the surreal API, gated by `kanban.rs`'s read-only-projection / commit ruling. Conceptually supported by the crate's `view`/`read`/`write` split; the swap-without-Core-rebuild parity is unprobed. + +**Lesson.** When an operator's forward vision arrives as "vN is X + Y + Z", the high-value move is to *locate each clause on shipped substrate before treating it as new work* — here three of the four load-bearing pieces (kanban seam, JIT tier, surreal projection) already existed and only needed naming + the two unbuilt edges above (one ingest, one tissue store). Capturing the ladder (vs. only v1) prevents the v2/v3 framing from being re-derived next session, and the cited symbols make the "already shipped" claim checkable. 
+ +**Cross-ref.** `core-first-transcode-doctrine.md` § "The execution-model ladder (v1 → v2 → v3)" (the canonical statement + the two new probes); `E-SUBSTRATE-IS-THE-SCHEDULER` (the v3 reactive tier this extends); `contract::kanban` / `contract::jit` / `surreal_container` (the shipped substrate); `tesseract-rs-ast-dll-codegen-v1` (the v1 codegen plan whose bodies become the v2/v3 tissue). + +--- + ## 2026-06-16 — E-UNBLOCK-CASCADE-1 — three independent fork/contract landings collapsed onto the same `MailboxSoaView` seam, closing four queued deliverables in one commit **Status:** FINDING. diff --git a/.claude/board/TECH_DEBT.md b/.claude/board/TECH_DEBT.md index d7980e13..bf8d8b83 100644 --- a/.claude/board/TECH_DEBT.md +++ b/.claude/board/TECH_DEBT.md @@ -175,9 +175,9 @@ flags. Separately, #507 left `intervene_counterfactual.rs:133/165` calling the `8131c480` lives on the unmerged `claude/continue-ndarray-x0Oaw`) — that WARNS, does not fail (v1 default routes through the canonical mapping per I-LEGACY-API-FEATURE- GATED); tracked here as a separate latent item, not fixed on this CI branch. -Cross-ref: `.github/workflows/rust-test.yml` (now both jobs at `debuginfo=0`); PR -#507 (`0c6ef02c`); `claude/continue-ndarray-x0Oaw` (the pending ce64-v2 consumer -migration). +Cross-ref: `.github/workflows/rust-test.yml` (now both jobs at `debuginfo=0`); +PR #507 (`0c6ef02c`); `claude/continue-ndarray-x0Oaw` (the pending ce64-v2 +consumer migration). 
### TD-UNBUNDLE-FROM-1 — `unbundle_from` is NOT the inverse of `bundle_into` (2026-06-07) diff --git a/.claude/knowledge/core-first-transcode-doctrine.md b/.claude/knowledge/core-first-transcode-doctrine.md new file mode 100644 index 00000000..1c419098 --- /dev/null +++ b/.claude/knowledge/core-first-transcode-doctrine.md @@ -0,0 +1,218 @@ +# Core-First Transcode Doctrine — the Core empowers the AST to be thin + +> **READ BY:** `core-first-architect`, `adapter-shaper`, `core-gap-auditor`, +> `family-codec-smith`, `integration-lead`, `truth-architect`, and ANY agent +> proposing C++→Rust transcode / codegen / AST-DLL / adapter / "port Tesseract" +> work. +> **Status:** CONJECTURE (2026-06-16) — the unification is architecturally +> coherent and fits the operator-locked OGAR canon, but it is **not a FINDING** +> until the `unicharset` adapter-parity probe (below) runs green. Label every +> downstream citation accordingly. +> **Trigger phrases:** "transcode", "codegen", "AST DLL", "C++ → Rust", +> "port Tesseract", "DO/DTO adapter", "generated Rust", "tesseract-rs source". + +--- + +## The one sentence + +**A generated AST / adapter / codegen layer is only ever as clean as the Core +it targets — so the Core must be a deliberate, reusable, hand-built foundation +(OGAR), NOT the leftover residue of "what we couldn't codegen."** The Core is +what *empowers* the generated layer to be elegant; without a rich Core, every +generated unit drags its own world (types, state, structure) and the output is +dirty — and the diff-gate just fails everything. 
+ +## The inversion (this is the whole trick) + +``` +Naive transcode: C++ ──codegen──► BLANK target + → every method re-implements its own types/state/structure + → fat, self-contained, dirty Rust; more cleanup than hand-port + +Core-first: C++ ──codegen──► RICH target (the OGAR Core) + → each method becomes a THIN adapter that ASSUMES the Core + → classid for identity, SoA columns for state, ClassView for + composition; the adapter is a shape, not a re-implementation +``` + +The intelligence is in **recognizing the simplification** — shaping the Core +(and its movable parts) first so the generated layer collapses to thin shapes. +A residue-Core never lets the adapters be thin. + +## The Core = OGAR's movable parts (the assume-contract) + +OGAR is **operator-locked canon** (`CLAUDE.md` § CANON, `canonical_node.rs`, +locked 2026-06-13) — it is the hand-built reusable Core. Each movable part is a +thing the generated adapter gets to **assume**, which is exactly what keeps the +adapter thin: + +| An adapter may ASSUME… | …because the Core provides | …so the adapter does NOT | +|---|---|---| +| "a class is a `classid`" | `canonical_node` key (`classid` u32) | carry type identity / define a struct hierarchy | +| "state lives in SoA value tenants" | helix `Signed360` / turbovec / palette columns; the per-axis widths are the **#511 `SoaMemberSpec`** calibration | define its own data structures | +| "relations are the edge block" | `EdgeBlock` (12 in-family + 4 out) | build its own graph / pointer web | +| "composition/inheritance is `classid → ClassView`" | ClassView preset selection (PR #498), driven by the harvest's `has_function`/`inherits_from`/`virtually_overrides` manifest | implement inheritance, vtables, MRO | +| "invocation is a `UnifiedStep`" | `OrchestrationBridge` / `UnifiedStep` (contract) | be a method bolted onto a god-object | + +Concretely: `unichar_to_id` stops being "a method on `UNICHARSET`" and becomes a +thin fn — *read the unicharset tenant columns 
→ table lookup → return id.* The +Core does identity, state, composition, dispatch; the adapter is the lookup. + +## The two halves are one system (the harvest is NOT orthogonal to codegen) + +Earlier framing called the SPO harvest "orthogonal" to codegen. **That was +wrong.** They are the two halves of one mechanism: + +- **`ruff_cpp_spo` SPO harvest** (`has_function` / `inherits_from` / + `virtually_overrides`) = the **method-resolution manifest** — *which* adapters + a `classid`'s ClassView composes, and in what override order. +- **`tesseract-rs-ast-dll-codegen-v1`** = the **adapter bodies** — the Rust the + ClassView dispatches to. +- **`classid → ClassView`** = the "inheritance" — composition of the adapter set + the manifest names. **No new layer, no new `ValueSchema` variant.** + +## The execution-model ladder (v1 → v2 → v3) + +The doctrine above is **v1**: a Core targeted by thin adapters, codegen'd once at +build time. The operator's forward design (2026-06-16) extends it along the +**execution model** — *how* the adapter body is compiled and *where it lives* — +**without touching the Core-first invariant.** Each rung is **CONJECTURE**; the +gating probes are below. The striking part: rungs 2–3 land on substrate that is +already shipped, not greenfield. 
+ +| Rung | What it adds | Already-shipped substrate it lands on | The new edge(s) | +|---|---|---|---| +| **v1 — Core-first codegen** | thin classid-keyed adapters target OGAR; bodies codegen'd at build | `canonical_node`; `classid → ClassView` (#498); `tesseract-rs-ast-dll-codegen-v1` | — (this doc) | +| **v2 — two-tier compile** | ONE Elixir-shaped adapter source, TWO backends: **existing → compile-time** (Askama→Jinja: a proc-macro monomorphises to Rust, zero runtime cost); **new → JIT** (jitson/Cranelift) | `contract::jit` (`JitCompiler` / `JitTemplate` / `KernelHandle`); ndarray jitson/Cranelift; n8n-rs `CompiledStyleRegistry` | the `defadapter!` build-time macro (the Askama half); a JITSON lowering of the adapter shape (the JIT half) | +| **v3 — elixir-tissue over a fixed Core** | Core stays immutable; the DO-shaped business logic is **replaceable tissue** (BEAM hot-swap heritage) living in the **AST-DLL**, persisted + served + hot-swapped via **SurrealDB's API**; a **Kanban orchestration** reacts to **Odoo shapes** and dispatches the tissue | `contract::kanban` (`KanbanMove` / `KanbanColumn` / `StepDomain::Kanban`); `surreal_container` (`view`/`read` = projection, `write` = commit); `E-SUBSTRATE-IS-THE-SCHEDULER` (substrate emits the schedule, surreal LIVE reactive) | an **Odoo→kanban ingest** (Odoo model/stage shapes → `UnifiedStep{step_type:"kanban.*"}`); the AST-DLL **tissue store + hot-swap** over `surreal_container` | + +**Why Elixir is the right syntax to steal (the deep reason, not just ergonomics).** +Elixir/BEAM's defining property is **hot code swapping of a running system** — +which is *exactly* "replaceable tissue on a fixed Core." Stealing the syntax +(multi-clause heads → `match classid`; `|>` → method chain / `defadapter!`; +`with` → `?`; behaviours → traits; `@spec` → types) buys the ergonomics; the +architecture buys the swap. 
"Tissue" is the workspace's own word (`CLAUDE.md`:
+"AriGraph / episodic / SPO / CAM-PQ are thinking tissue — not storage"): the
+Core is the skeleton, the elixir-adapters are organs hot-swappable around it.
+
+**The Core-first invariant is unchanged across all three rungs.** Whether a body
+is build-time-monomorphised (v2 Askama), JIT'd (v2 Cranelift), or hot-swapped
+from the SurrealDB AST-DLL (v3), it is STILL a thin adapter that targets the OGAR
+Core (classid / SoA tenants / ClassView / `UnifiedStep`). A tissue adapter that
+needs state the Core can't hold is STILL a **Core gap → EXTEND-CORE**, never an
+adapter-state-leak. The execution model changes; the iron guard does not.
+
+### Probes that gate the new rungs
+
+```
+PROBE-COMPILE-TWO-TIER (P1, gates v2)
+  Hypothesis: one Elixir-shaped adapter source produces byte-identical behaviour
+              whether lowered by the build-time macro (defadapter!) or the JIT
+              (JITSON → Cranelift).
+  Pass: macro-compiled and JIT-compiled adapter agree byte-for-byte on the
+        unicharset corpus (and both match the libtesseract oracle).
+  Fail: divergence ⇒ the shape's semantics aren't backend-independent; fix the
+        shared lowering before shipping two backends.
+
+PROBE-SURREAL-TISSUE-SWAP (P2, gates v3 tissue layer)
+  Hypothesis: a DO adapter served from the surreal_container AST-DLL can be
+              hot-swapped (replace the body) WITHOUT rebuilding the Core, and the
+              post-swap invocation still hits byte-parity.
+  Pass: swap body A→A' via the surreal API; ClassView dispatch picks up A';
+        Core (classid/SoA/ClassView) untouched; parity holds for both.
+  Fail: the swap forces a Core change ⇒ the boundary isn't where v3 claims.
+
+(the Odoo→kanban ingest is the APPLICATION rung; gated on both probes above plus
+ a separate Odoo-shape → KanbanMove mapping spec, not yet written.)
+```
+
+All three rungs inherit the v1 falsifier (`PROBE-OGAR-ADAPTER-UNICHARSET`, below)
+as their floor: no execution-model elaboration matters until one leaf adapter
+hits byte-parity through a ClassView at all.
+
+## Scope boundary (where it holds vs. where it must NOT be forced)
+
+- **Holds cleanly** for the *mechanical, data-shaped leaf* methods — unicharset
+  id↔utf8, recoder encode/decode, dawg membership, weight-matrix walks. These
+  become clean DO-in/out adapters. (This is the exact subset the codegen plan
+  already scopes to.)
+- **Does NOT collapse** for the *intrusive / stateful / virtual-dispatch-heavy*
+  core — ELIST/CLIST raw-pointer mutation, the BiLSTM / recodebeam numeric
+  kernels. Forcing these into the DO-adapter mold is the **Frankenstein
+  flattening** the workspace explicitly forbids (`frankenstein-checklist.md`).
+  They stay raw-pointer hand-ports (codegen plan §5).
+- Therefore this is the holy grail for the **integration shape** (how
+  transcoded behaviour plugs into the substrate), **not** a free pass on the
+  **transcode difficulty** (the hard kernels remain hard).
+
+## The one iron guard (this is what keeps it honest)
+
+> **A Core gap shows up as an adapter that needs state, or a dispatch, the Core
+> cannot hold. When that happens, EXTEND THE CORE deliberately — NEVER hack the
+> adapter.**
+
+The moment an adapter starts carrying its own state, the elegance is gone and
+you are back to the dirty parallel port. A Core gap is a *signal to grow the
+deliberate Core* (a new value tenant, a ClassView capability), filed and
+reviewed — not an excuse to fatten one adapter.
+
+## The falsifier (CONJECTURE → FINDING gate)
+
+Per `truth-architect` discipline, this doctrine is a CONJECTURE until measured.
+The cheapest end-to-end probe:
+
+```
+PROBE-OGAR-ADAPTER-UNICHARSET (P0)
+  1. Transcode 1–2 unicharset leaf methods (unichar_to_id / id_to_unichar) as
+     classid-keyed DO-in/out adapters.
+  2. Mint an OGAR classid whose ClassView composes them, using the harvested
+     has_function manifest.
+  3. Invoke through the ClassView.
+  Pass: byte-parity with libtesseract (FFI oracle) on a fixed corpus.
+  Fail / leak: the adapter needs state the SoA tenants can't carry, or a
+     dispatch the ClassView can't express → a Core gap found cheaply, BEFORE
+     building the whole transcode.
+```
+
+Until this runs green, "the OGAR Core makes the transcode clean" is a
+CONJECTURE. Do NOT scale the adapter approach across modules until it passes.
+
+## Anti-patterns this doctrine exists to catch
+
+- **Residue-Core** — treating the Core as "the parts we couldn't codegen"
+  instead of the deliberate foundation designed first. Yields fat adapters.
+- **Parallel-Object-Model** — building a second, standalone Tesseract-rs object
+  model (structs + impls) instead of growing OGAR with classid-keyed adapters.
+- **Adapter-State-Leak** — an adapter carrying its own state because the Core
+  doesn't offer the tenant. Fix the Core, not the adapter.
+- **Universal-Adapter-Flattening** — forcing intrusive/stateful methods into the
+  DO-adapter shape (Frankenstein). Route them to raw-pointer hand-port instead.
+- **Harvest-is-orthogonal** — forgetting the SPO graph IS the ClassView
+  method-resolution manifest; treating harvester polish and codegen as unrelated.
+
+## Cross-references
+
+- `CLAUDE.md` § CANON — OGAR `canonical_node` (key / edge / value), the locked Core.
+- `canonical_node.rs` — `NodeGuid` / `EdgeBlock` / `NodeRow`; classid / family / ClassView guards.
+- PR #498 — `classid → ClassView` resolution (the composition mechanism).
+- `crates/perturbation-sim/src/columns.rs` + PR #511 — `SoaMemberSpec` value-tenant calibration (Core-shaping: which column carries which value).
+- `AdaWorldAPI/ruff` `ruff_cpp_spo` — the SPO harvester (the method-resolution manifest source).
+- `.claude/plans/tesseract-rs-ast-dll-codegen-v1.md` — the codegen (adapter-body) plan; §5 module routing (codegen vs hand-port).
+- `AdaWorldAPI/ruff/.claude/plans/cpp-spo-probes-v1.md` — the harvester gating probes.
+- `lance_graph_contract::orchestration` — `OrchestrationBridge` / `UnifiedStep` (the adapter invocation surface).
+- `.claude/knowledge/frankenstein-checklist.md` — composition-failure / flattening guard.
+- `crates/lance-graph-contract/src/jit.rs` — `JitCompiler` / `JitTemplate` / `KernelHandle` (the v2 JIT tier; ndarray jitson/Cranelift compiles, n8n-rs caches).
+- `crates/lance-graph-contract/src/kanban.rs` — `KanbanMove` / `KanbanColumn` / `StepDomain::Kanban` (the v3 orchestration seam; planner emits, ractor drives, surreal projects).
+- `crates/surreal_container/` — the SurrealDB tier (`view`/`read` = projection, `write` = commit) that would host the v3 AST-DLL tissue store + hot-swap.
+- `.claude/board/EPIPHANIES.md` `E-SUBSTRATE-IS-THE-SCHEDULER` — substrate emits the schedule (surreal LIVE reactive); the v3 Odoo→kanban reaction extends it. `E-TRANSCODE-EXEC-LADDER-1` records this ladder.
+- `AdaWorldAPI/odoo` (`/home/user/odoo`) — the v3 shape source (Odoo model/stage shapes → `KanbanMove`).
+
+## Provenance / credit
+
+OGAR (the `canonical_node` / `classid → ClassView` / value-tenant Core) is
+**operator-locked canon**, designed by the operator + prior sessions. What this
+doc contributes is the **C++-transcode ↔ OGAR unification** (transcode as thin
+classid-keyed adapters composed by ClassView, the SPO harvest as the
+method-resolution manifest) and the **Core-first inversion** principle — that
+the Core's deliberate shaping is *precisely* what lets the generated layer be
+thin. Captured 2026-06-16, before it dilutes.
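The shape that `PROBE-OGAR-ADAPTER-UNICHARSET` tests (stateless leaf adapters that read Core-held state, compared against an oracle on a fixed corpus) can be sketched in Rust. This is a minimal illustration under stated assumptions, not the OGAR API: `UnicharTenant`, `parity_mismatches`, and the closure oracle are hypothetical names invented here. The real probe would invoke through a ClassView and compare against libtesseract over FFI.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a Core-held value tenant. All state lives here,
/// never in the adapters. (Not the real `SoaMemberSpec` layout.)
struct UnicharTenant {
    id_to_utf8: Vec<String>,
    utf8_to_id: HashMap<String, u32>,
}

impl UnicharTenant {
    /// Builds both lookup directions from an ordered list of unichars.
    fn from_entries(entries: &[&str]) -> Self {
        let id_to_utf8: Vec<String> = entries.iter().map(|s| s.to_string()).collect();
        let utf8_to_id = id_to_utf8
            .iter()
            .enumerate()
            .map(|(i, s)| (s.clone(), i as u32))
            .collect();
        Self { id_to_utf8, utf8_to_id }
    }
}

/// Thin adapter: reads the tenant, owns nothing.
fn unichar_to_id(t: &UnicharTenant, s: &str) -> Option<u32> {
    t.utf8_to_id.get(s).copied()
}

/// Thin adapter for the reverse direction.
fn id_to_unichar(t: &UnicharTenant, id: u32) -> Option<&str> {
    t.id_to_utf8.get(id as usize).map(String::as_str)
}

/// Runs the adapter and the oracle over a fixed corpus and returns every
/// input on which they disagree. An empty result is the "Pass" condition.
fn parity_mismatches<'a>(
    t: &UnicharTenant,
    corpus: &[&'a str],
    oracle: impl Fn(&str) -> Option<u32>,
) -> Vec<&'a str> {
    corpus
        .iter()
        .copied()
        .filter(|s| unichar_to_id(t, *s) != oracle(*s))
        .collect()
}

fn main() {
    let entries = ["a", "b", "é", "ß"];
    let tenant = UnicharTenant::from_entries(&entries);

    // Assumption: a linear scan stands in for the libtesseract FFI oracle.
    let oracle = |s: &str| entries.iter().position(|e| *e == s).map(|i| i as u32);

    let corpus = ["a", "é", "zz", "ß"];
    let bad = parity_mismatches(&tenant, &corpus, oracle);
    println!("mismatches: {:?}", bad);

    // Round trip through both adapters for a known id.
    let round_trip = id_to_unichar(&tenant, 2).and_then(|s| unichar_to_id(&tenant, s));
    println!("round trip of id 2: {:?}", round_trip);
}
```

In this sketch neither adapter has a field of its own. If one could not be written without private state, that would be the Core gap the iron guard describes.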
diff --git a/CLAUDE.md b/CLAUDE.md
index ed61029a..2a14ef38 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1058,6 +1058,7 @@ SIBLING REPOS:
 .claude/knowledge/lab-vs-canonical-surface.md — MANDATORY before touching REST/gRPC/Wire DTO/endpoint/shader-lab (prevents "add another REST endpoint" hallucination)
 .claude/knowledge/autoattended-multiagent-pattern.md — MANDATORY before planning a wave with ≥4 parallel workers; 4-savant taxonomy (PP-13/14/15/16), worker iron rules, atomic-consolidation pass
 .claude/knowledge/ndarray-vertical-simd-alien-magic.md — MANDATORY before writing SIMD code in any consumer crate OR filing a PR against `adaworldapi/ndarray` `src/simd_*`; canonical reference for the wave W1a (5 ndarray primitives) + W1b (5 consumer migrations) + W1.5 (3 sigker primitives, gated on jc Pillar 11) plan
+.claude/knowledge/core-first-transcode-doctrine.md — MANDATORY before any C++→Rust transcode / codegen / AST-DLL / "port Tesseract" / DO-adapter work; the Core-first inversion (a generated layer is only as clean as the OGAR Core it targets) + the OGAR movable-parts assume-contract + the unicharset adapter-parity falsifier. Carried by the core-first-architect / adapter-shaper / core-gap-auditor ensemble.
 .claude/CALIBRATION_STATUS_GROUND_TRUTH.md — OVERRIDE: read BEFORE any SESSION_*.md
 .claude/PLAN_BF16_DISTANCE_TABLES.md — 5-phase plan for BF16 distance tables
 .claude/TECHNICAL_DEBT_SIGNED_SESSION.md — 56% useful, 44% bypass (honest review)
@@ -1081,6 +1082,26 @@ Kahneman-Tversky System-1 easy path and is nearly always wrong; extending the
 canonical bridge is the System-2 correct move. See the Decision Procedure in
 that doc before writing a single new handler.
 
+**Best practice (P0 for transcode work): `.claude/knowledge/core-first-transcode-doctrine.md`
+must be read BEFORE any C++→Rust transcode, codegen, AST-DLL, "port Tesseract",
+or DO/DTO-adapter work.** The Core-First inversion: a generated layer (AST /
+adapters / codegen'd Rust) is only ever as clean as the Core it targets, so the
+**OGAR Core is shaped first, deliberately** — never treated as the residue of
+"what we couldn't codegen." Emit **thin classid-keyed adapters that ASSUME the
+Core** (identity = `classid`; state = SoA value tenants per the #511
+`SoaMemberSpec` calibration; relations = `EdgeBlock`;
+composition/inheritance = `classid → ClassView`; invocation = `UnifiedStep`),
+and treat the `ruff_cpp_spo` SPO harvest (`has_function` / `inherits_from` /
+`virtually_overrides`) as the **ClassView method-resolution manifest** — the
+harvest and the codegen are two halves of ONE system, not orthogonal. **Never**
+build a parallel Tesseract-rs object model, **never** let an adapter carry its
+own state (a Core gap → *extend the Core deliberately*, never hack the adapter),
+and **never** force intrusive/stateful methods into the adapter mold (route them
+to raw-pointer hand-port — Frankenstein flattening guard). Holds for
+mechanical/data-shaped leaf methods only; CONJECTURE until
+`PROBE-OGAR-ADAPTER-UNICHARSET` runs byte-parity green. Carried by the
+`core-first-architect` / `adapter-shaper` / `core-gap-auditor` ensemble.
+
 Every `.claude/knowledge/` document has a `READ BY:` header listing which agents
 MUST load it before producing output in that domain. When a knowledge trigger
 fires (see `.claude/agents/BOOT.md § Knowledge Activation Protocol`), the relevant
diff --git a/crates/perturbation-sim/examples/calibrate.rs b/crates/perturbation-sim/examples/calibrate.rs
index 0a5d6c15..a22d02e5 100644
--- a/crates/perturbation-sim/examples/calibrate.rs
+++ b/crates/perturbation-sim/examples/calibrate.rs
@@ -123,11 +123,19 @@ fn main() {
     };
     let n = grid.n;
     let alive = vec![true; grid.edges.len()];
+    let m = grid.edges.len();
+    // Degenerate grids would divide by zero (`k = 24.min(m) = 0` → `m / k`) and
+    // break the eigenvector / `m - 1` assumptions below. Guard up front.
+    if n < 2 || m == 0 {
+        eprintln!(
+            "calibrate requires a connected grid with at least 2 buses and 1 line (got n={n}, m={m})"
+        );
+        std::process::exit(2);
+    }
 
     // Ground truth = the study's 5-factor contingency matrix on the real core.
     let base = symmetric_eigen(&grid.laplacian_of(&alive), n);
     let v2 = base.eigenvector(1);
-    let m = grid.edges.len();
     let mut sens: Vec<(usize, f64)> = (0..m)
         .map(|e| {
             let d = v2[grid.edges[e].from] - v2[grid.edges[e].to];
diff --git a/crates/perturbation-sim/src/hhtl.rs b/crates/perturbation-sim/src/hhtl.rs
index 67713762..32d397af 100644
--- a/crates/perturbation-sim/src/hhtl.rs
+++ b/crates/perturbation-sim/src/hhtl.rs
@@ -106,6 +106,13 @@ pub fn hhtl_keys(grid: &Grid) -> Vec {
 /// Per-leaf-basin algebraic connectivity `λ₂` keyed by HHTL address — the topology
 /// "value" the key indexes (read once from the spectrum, deterministic).
 pub fn basin_lambda2(grid: &Grid, keys: &[HhtlKey]) -> HashMap {
+    assert_eq!(
+        keys.len(),
+        grid.n,
+        "basin_lambda2 requires exactly one HHTL key per grid node (got {} keys for {} nodes)",
+        keys.len(),
+        grid.n
+    );
     let mut groups: HashMap> = HashMap::new();
     for (n, k) in keys.iter().enumerate() {
         groups.entry(*k).or_default().push(n);
@@ -172,4 +179,15 @@ mod tests {
         assert!(!l2.is_empty(), "at least one keyed basin");
         assert_eq!(k.len(), g.n);
     }
+
+    #[test]
+    #[should_panic(expected = "exactly one HHTL key per grid node")]
+    fn basin_lambda2_rejects_key_count_mismatch() {
+        // Locks the one-key-per-node precondition: a short key vector must panic
+        // rather than silently group the wrong nodes.
+        let g = grid_2x2_blocks();
+        let mut k = hhtl_keys(&g);
+        k.pop(); // keys.len() == g.n - 1
+        let _ = basin_lambda2(&g, &k);
+    }
 }
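The degenerate-grid guard added to `calibrate.rs` can also be written as a function returning `Result`, which makes the rejected cases testable without exiting the process. The following is a minimal sketch, not code from this PR: `checked_sample_plan` is a hypothetical name, and the cap of 24 and the `m / k` division are taken only from the comment in the diff. What the quotient is used for is not shown there, so it is returned unchanged.

```rust
/// Hypothetical helper mirroring the guard's intent: reject degenerate grids,
/// otherwise return `(k, m / k)` where `k` is `m` capped at 24.
fn checked_sample_plan(n: usize, m: usize) -> Result<(usize, usize), String> {
    // Same condition as the guard in the diff: fewer than 2 buses or no lines.
    if n < 2 || m == 0 {
        return Err(format!(
            "calibrate requires a connected grid with at least 2 buses and 1 line (got n={n}, m={m})"
        ));
    }
    // `m >= 1` here, so `k >= 1` and the division below cannot divide by zero.
    let k = m.min(24);
    Ok((k, m / k))
}

fn main() {
    match checked_sample_plan(9, 12) {
        Ok((k, per)) => println!("k={k}, m/k={per}"),
        Err(e) => {
            // A binary can still choose the exit code at the top level.
            eprintln!("{e}");
            std::process::exit(2);
        }
    }
}
```

Keeping the exit in `main` and the validation in a pure function is one possible design. The PR's inline guard is equally valid for an example binary.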