| 1 | +# ADR: V3 as the spine + the polyglot transpiler (Rust / Python / C#) |
| 2 | + |
| 3 | +**Status:** Proposed (RFC). Design contract; not yet implemented or |
| 4 | +compile-verified. |
| 5 | +**Date:** 2026-06-28 |
| 6 | +**Context:** completes the "spine vs adapter" question left open by |
| 7 | +`SURREAL-AST-AS-ADAPTER.md` + `SURREAL-AST-TRAP-PREFLIGHT.md`, and names the |
| 8 | +transpiler superpower (re-emit the OGAR AST to any language via adapter). |
| 9 | + |
| 10 | +--- |
| 11 | + |
| 12 | +## Decision |
| 13 | + |
| 14 | +1. **V3 (the content-addressed rail record) is the spine.** SurrealQL / |
| 15 | + ClickHouse / PostgreSQL / TTL DDL are demoted to **peer adapters** that |
| 16 | + lower *from* V3 + `ClassView`. SurrealQL stops being a spine candidate. |
| 17 | +2. **The V3 record is dual-mode and tenant-structured** (below). |
| 18 | +3. **Codegen is an adapter family**: just as DDL adapters project the schema, |
| 19 | + `LangBackend` adapters re-emit *source code* (Rust / Python / C#) from the |
| 20 | + same IR. The IR is the interlingua; codegen is the transpiler. |
| 21 | + |
| 22 | +--- |
| 23 | + |
| 24 | +## 1 · The V3 record (the spine primitive) |
| 25 | + |
| 26 | +### 1.1 Dual-mode facet: `12 B = 96 bits = 6×16 = 4×24`; the classid tags which mode applies |
| 27 | +``` |
| 28 | +FacetCascade { facet_classid: u32, payload: [u8; 12] } // 16 B, content address |
| 29 | + |
| 30 | + classid tag = Cascade → [FacetTier; 6] // 6 × (part_of:8, is_a:8) — POSITION (hierarchy) |
| 31 | + classid tag = Triplet → [SpoTriple; 4] // 4 × (subject:8, pred:8, object:8) — LOCAL EDGES (graph) |
| 32 | +``` |
| 33 | +- Cascade = depth-with-implied-predicates (mereology:taxonomy); subsumption is |
| 34 | + a bit-op. Triplet = breadth-with-explicit-predicates; **an SPO triple is a |
| 35 | + triplet-mode facet**, which unifies the SPO corpus with the facet primitive |
| 36 | + (today they are unjoined substrates). |
| 37 | +- The tag rides in the classid (zero extra bytes; precedent: `TailVariant`). |
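The two readings of the 12-byte payload can be sketched in Rust. This is a minimal illustration, not the ADR's implementation: the struct and field names follow the layout block above, but `TRIPLET_TAG` (top bit of the classid selects the mode) and the `FacetView` enum are assumptions made for the example, since the ADR only says the tag "rides in the classid".

```rust
/// Hypothetical mode tag: this sketch assumes the top bit of the classid
/// selects triplet mode. The ADR does not fix which bits carry the tag.
pub const TRIPLET_TAG: u32 = 1 << 31;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FacetTier {
    pub part_of: u8,
    pub is_a: u8,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SpoTriple {
    pub subject: u8,
    pub pred: u8,
    pub object: u8,
}

/// 4-byte classid + 12-byte payload = 16 bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FacetCascade {
    pub facet_classid: u32,
    pub payload: [u8; 12],
}

/// Hypothetical decoded view of the payload (not a name from the ADR).
#[derive(Debug, PartialEq, Eq)]
pub enum FacetView {
    Cascade([FacetTier; 6]),
    Triplet([SpoTriple; 4]),
}

impl FacetCascade {
    /// Decode the same 96 bits as either 6 x 16-bit tiers or 4 x 24-bit triples.
    pub fn view(&self) -> FacetView {
        if self.facet_classid & TRIPLET_TAG != 0 {
            let mut triples = [SpoTriple { subject: 0, pred: 0, object: 0 }; 4];
            for (i, c) in self.payload.chunks_exact(3).enumerate() {
                triples[i] = SpoTriple { subject: c[0], pred: c[1], object: c[2] };
            }
            FacetView::Triplet(triples)
        } else {
            let mut tiers = [FacetTier { part_of: 0, is_a: 0 }; 6];
            for (i, c) in self.payload.chunks_exact(2).enumerate() {
                tiers[i] = FacetTier { part_of: c[0], is_a: c[1] };
            }
            FacetView::Cascade(tiers)
        }
    }
}
```

Because the mode lives in the classid, both views share one 16-byte record and no extra discriminant byte is stored.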
| 38 | + |
| 39 | +### 1.2 The 512-byte record = 32 tenants |
| 40 | +``` |
| 41 | +NodeRow 512 B ≡ [Facet; 32] (AoS row) |
| 42 | + ≡ 32 tenants × [GUID; N] (SoA — "tenant" = a GUID member column) |
| 43 | + tenant 0 Self GUID |
| 44 | + tenant 1 Edges (EdgeBlock 12+4) |
| 45 | + tenants 2..31 30 composition slots → GUID references to other classes |
| 46 | +``` |
| 47 | +`ClassView::tenant_schema(classid) -> [TenantRole; 32]`, **static per classid** |
| 48 | +(keeps each tenant a homogeneous, SIMD-scannable GUID column). Roles: |
| 49 | +`{ Self, Edges, Structural, Do, Think, Adapter }` (+ `nested`). The |
| 50 | +`Do` (ActionDef / do-arm) · `Think` (cognitive plane) · `Adapter` (projection) |
| 51 | +tenants are the three arms reached *through* the classid. Nesting = a |
| 52 | +content-addressed FK column → a columnar composition DAG. |
| 53 | + |
| 54 | +> Reconciliation with current code: today `NodeRow` = `key(16) | edges(16) | |
| 55 | +> value(480)` with `value` **opaque**. The `[Facet; 32]` / `tenant_schema` is |
| 56 | +> the typed schema this ADR imposes on those same bytes — `ClassView` is the |
| 57 | +> missing brick that turns the 480-byte slab into 30 typed tenant slots. |
| 58 | + |
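The arithmetic behind the tenant layout (512 B = 32 × 16 B, with `key(16) | edges(16) | value(480)` mapping to tenant 0, tenant 1, and 30 value slots) can be checked with a small sketch. Everything here is illustrative: `NodeRow` is modelled as a bare byte array, `tenant_schema` is a free function standing in for the proposed `ClassView::tenant_schema`, and the variant `SelfGuid` replaces the role `Self` only because `Self` is a reserved word in Rust.

```rust
pub const ROW_BYTES: usize = 512;
pub const FACET_BYTES: usize = 16;
pub const TENANTS: usize = ROW_BYTES / FACET_BYTES; // 32

/// Roles listed in the ADR. `SelfGuid` stands in for `Self` (a Rust keyword).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TenantRole {
    SelfGuid,
    Edges,
    Structural,
    Do,
    Think,
    Adapter,
}

/// Stand-in for the 512-byte row; the real `NodeRow` has typed fields.
pub struct NodeRow(pub [u8; ROW_BYTES]);

impl NodeRow {
    /// Borrow tenant `i` as its 16-byte slot, or `None` if `i >= 32`.
    pub fn tenant(&self, i: usize) -> Option<&[u8]> {
        if i < TENANTS {
            Some(&self.0[i * FACET_BYTES..(i + 1) * FACET_BYTES])
        } else {
            None
        }
    }
}

/// Hypothetical stand-in for `ClassView::tenant_schema(classid)`.
/// A real schema would vary per classid; this one marks every
/// composition slot as `Structural` to keep the example small.
pub fn tenant_schema(_classid: u32) -> [TenantRole; TENANTS] {
    let mut roles = [TenantRole::Structural; TENANTS];
    roles[0] = TenantRole::SelfGuid; // key(16)
    roles[1] = TenantRole::Edges; // edges(16)
    roles
}
```

Tenants 2..31 start at byte offset 32, which is exactly where the opaque 480-byte `value` slab begins today, so the typed view overlays the existing bytes without moving them.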
| 59 | +### 1.3 Capacity is the SoC lint, not a limit |
| 60 | +`>64 fields` · `>256/tier` · `>6 deep` · `>4 edges` · `>30 slots` → the class |
| 61 | +lacks separation of concerns. **The encoding makes good SoC the only |
| 62 | +representable shape**: overflow in any dimension is the signal; "reference |
| 63 | +another class" (grow a limb) is always the fix. We own OGAR, so minting the |
| 64 | +new limb is free and convergence keeps it shared. Detector and refactor are |
| 65 | +the same mechanism. (The law is already written as a falsifier in |
| 66 | +`ruff_spo_address/examples/medcare_probe.rs` §[G]; promote it to a `ruff` |
| 67 | +diagnostic.) |
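As a rough sketch of what such a diagnostic could check, the five thresholds above can be expressed as one function over a class's shape. The `ClassShape` struct, the `SocOverflow` enum, and `soc_lint` are hypothetical names invented for this example; only the numeric limits come from the text.

```rust
/// Which capacity dimension overflowed (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SocOverflow {
    Fields,    // > 64 fields
    TierWidth, // > 256 entries in one tier (8-bit index)
    Depth,     // > 6 cascade tiers
    Edges,     // > 4 local edges
    Slots,     // > 30 composition slots
}

/// Hypothetical summary of a class, as a harvester might report it.
pub struct ClassShape {
    pub fields: usize,
    pub max_tier_width: usize,
    pub depth: usize,
    pub edges: usize,
    pub slots: usize,
}

/// Return every dimension in which the class exceeds what the
/// encoding can represent. An empty result means the class fits.
pub fn soc_lint(shape: &ClassShape) -> Vec<SocOverflow> {
    let checks = [
        (shape.fields > 64, SocOverflow::Fields),
        (shape.max_tier_width > 256, SocOverflow::TierWidth),
        (shape.depth > 6, SocOverflow::Depth),
        (shape.edges > 4, SocOverflow::Edges),
        (shape.slots > 30, SocOverflow::Slots),
    ];
    checks
        .iter()
        .filter(|(hit, _)| *hit)
        .map(|(_, overflow)| *overflow)
        .collect()
}
```

Each reported overflow names the dimension in which the class should be split, which is the sense in which detector and refactor are the same mechanism.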
| 68 | + |
| 69 | +--- |
| 70 | + |
| 71 | +## 2 · The transpiler (the superpower) |
| 72 | + |
| 73 | +The IR (`ruff_spo_triplet::ModelGraph`) is already bidirectional |
| 74 | +(`expand` ⇄ `reassemble`), and `ruff_cpp_codegen` already proves |
| 75 | +`ModelGraph → Rust source`. Generalize that one backend into an adapter family: |
| 76 | + |
| 77 | +``` |
| 78 | +SOURCE (py/cpp/cs) ─ruff_*_spo─▶ ModelGraph ─mint─▶ Facet (content address, dedup across langs) |
| 79 | + │ |
| 80 | +TARGET (py/rust/cs) ◀─LangBackend─── ModelGraph |
| 81 | +``` |
| 82 | +- `trait LangBackend { fn render(&self, graph: &ModelGraph) -> String; }`: one adapter per |
| 83 | + target, each a peer of the DDL adapters. |
| 84 | +- Rust ◀ `ruff_cpp_codegen` (exists) · Python ◀ extend `ruff_python_codegen` |
| 85 | + (the formatter's generator) · C# ◀ new `ruff_csharp_codegen`. |
| 86 | +- Content-addressing gives **cross-language dedup**: the same construct in |
| 87 | + Python/C++/C# mints the same `Facet` (CI convergence test). |
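A minimal sketch of the adapter family, assuming a drastically simplified IR. The `ModelGraph`, `Field`, and `PrimType` types below are toy stand-ins, not the real `ruff_spo_triplet::ModelGraph`, and `RustBackend` / `PythonBackend` are hypothetical names; the point is only that one IR value feeds several `render` implementations.

```rust
/// Toy stand-in for the IR; the real `ModelGraph` is far richer.
pub enum PrimType {
    Int,
    Text,
}

pub struct Field {
    pub name: String,
    pub ty: PrimType,
}

pub struct ModelGraph {
    pub class_name: String,
    pub fields: Vec<Field>,
}

/// One adapter per target language.
pub trait LangBackend {
    fn render(&self, graph: &ModelGraph) -> String;
}

pub struct RustBackend;
pub struct PythonBackend;

impl LangBackend for RustBackend {
    /// Emit a plain struct definition (shape only, no behaviour).
    fn render(&self, graph: &ModelGraph) -> String {
        let mut out = format!("pub struct {} {{\n", graph.class_name);
        for f in &graph.fields {
            let ty = match f.ty {
                PrimType::Int => "i64",
                PrimType::Text => "String",
            };
            out.push_str(&format!("    pub {}: {},\n", f.name, ty));
        }
        out.push_str("}\n");
        out
    }
}

impl LangBackend for PythonBackend {
    /// Emit a dataclass shell (the caller supplies the import).
    fn render(&self, graph: &ModelGraph) -> String {
        let mut out = format!("@dataclass\nclass {}:\n", graph.class_name);
        if graph.fields.is_empty() {
            out.push_str("    pass\n");
        }
        for f in &graph.fields {
            let ty = match f.ty {
                PrimType::Int => "int",
                PrimType::Text => "str",
            };
            out.push_str(&format!("    {}: {}\n", f.name, ty));
        }
        out
    }
}
```

Both backends emit type shells only, with no method bodies.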
| 88 | + |
| 89 | +### Honest boundary — structure transpiles, behaviour does not |
| 90 | +`OGAR-AS-IR.md`: "the behavioural arm cannot survive lowering and stays in the |
| 91 | +IR." The existing backend renders `MethodSig` *signatures*, not method bodies. |
| 92 | +So the deliverable is a **schema / interface / DTO / ORM-model transpiler** |
| 93 | +(API contracts, type defs, model shells) — enormous on its own. Full behaviour |
| 94 | +transpilation (method bodies → executable logic) is a later arm via |
| 95 | +`ActionDef` / `KausalSpec`, explicitly out of this ADR. |
| 96 | + |
| 97 | +--- |
| 98 | + |
| 99 | +## 3 · Consequences |
| 100 | + |
| 101 | +- **Positive:** one content-addressed spine; SurrealQL/DDL become honest peer |
| 102 | + adapters (closes the trap); the SPO corpus and the facet primitive unify; |
| 103 | + capacity-as-lint is enforced structurally; codegen-via-adapter gives polyglot |
| 104 | + re-export; cross-app/cross-language dedup for free. |
| 105 | +- **Costs / risks:** (1) the rail address is **lossy** — it is a CAM *key*, not |
| 106 | + the content; the lossless shape lives in `ClassView` + the value tenants. |
| 107 | + (2) **Minting governance** — a content address is only stable if the |
| 108 | + rank-minter is frozen; the cross-language convergence test must be |
| 109 | + CI-enforced *before* scaling. (3) **"Everything in OGAR" = OGAR is the fleet |
| 110 | + bottleneck** — the zero-dep contract crate must be the only stable surface; |
| 111 | + the `#[deprecated]` `*Bridge` churn already shows the strain. |
| 112 | +- **Scale honesty:** the substrate is ~11 K nodes / ~24 K triples today, not |
| 113 | + the aspirational 2 M; the `ruff_spo_triplet` per-language pipeline is the |
| 114 | + lever that scales it. |
| 115 | + |
| 116 | +## 4 · Status of the pieces (verified against `main`) |
| 117 | +Real: `ModelGraph` interlingua (bidirectional), `ruff_cpp_spo` / `ruff_ruby_spo` |
| 118 | +frontends, `ruff_csharp_spo` loader, the 16-byte mint (`ruff_spo_address`), one |
| 119 | +backend (`ruff_cpp_codegen` → Rust), `bridge_codebook_convergence` (identity). |
| 120 | +To build: Python→ModelGraph normalization, C# harvester generalization, the |
| 121 | +`LangBackend` trait + Python/C# backends, the dual-mode `FacetMode`, the |
| 122 | +`tenant_schema`, the round-trip + convergence CI, the `OGAR-SOC` lint. |
| 123 | + |
| 124 | +## 5 · Companion |
| 125 | +Implementation plan: `ruff` PR "OGAR Polyglot AST Integration (RFC)". |