| 1 | +# AST-as-(part_of:is_a)-address — sinking compiled semantics into the GUID |
| 2 | + |
| 3 | +> **READ BY:** integration-lead, truth-architect, core-first-architect, |
| 4 | +> family-codec-smith, baton-handoff-auditor |
| 5 | +> **Status:** CONJECTURE (design; the **carrier is now SHIPPED** — see V3 |
| 6 | +> alignment below — only the rank-minter is unbuilt/unmeasured). |
| 7 | +> **Cross-ref:** `crates/lance-graph-contract/src/facet.rs` (the **SHIPPED** |
| 8 | +> `FacetCascade` substrate, #613/#614), `canonical_node.rs` |
| 9 | +> (`TailVariant::V3` / `mint_for` / `CLASSID_OSINT_V3`, #615), |
| 10 | +> `guid-canon-and-prefix-routing.md` (the GUID canon), |
| 11 | +> `core-first-transcode-doctrine.md` (harvest → ClassView → codegen), |
| 12 | +> OGAR `SURREAL-AST-AS-ADAPTER.md` + `SURREAL-AST-TRAP-PREFLIGHT.md` |
| 13 | +> (DDL is an adapter, never the spine), `encoding-ecosystem.md`. |
| 14 | +
| 15 | +> **V3 alignment (2026-06-26).** This doc was first drafted without awareness that |
| 16 | +> the V3 substrate **already shipped** (#613/#614 `FacetCascade`, #615 `mint_for`). |
| 17 | +> Three corrections, applied throughout: |
| 18 | +> 1. **The slot-count "open gate" is CLOSED.** `facet.rs::FacetCascade` =
| 19 | +> `facet_classid(4) | 6×(8:8) = 16 B` (4 + 6×2 bytes), i.e. **6 tiers**
| 20 | +> (`HEEL·HIP·TWIG·LEAF·family·identity`), `const _`-asserted. That IS the
| 21 | +> full-key 6-pair / 12-slot answer: the *key* carries only the 4-tier routing
| 22 | +> prefix (`NiblePath::from_guid_prefix_v3`), while the complete 6-pair address
| 23 | +> lives in the `FacetCascade` **value facet**. Nothing left to ratify.
| 24 | +> 2. **The carrier is NOT missing.** `FacetCascade` is **content-blind** and |
| 25 | +> already lists `(part_of:is_a)` as a consumer projection: `hi_chain()` = |
| 26 | +> part_of, `lo_chain()` = is_a, with `hi_distance`/`lo_distance` the two |
| 27 | +> orthogonal prefix metrics. AST-as-(part_of:is_a) is one reading of this shipped |
| 28 | +> substrate; the only genuinely-new brick is the deterministic **rank-minter**. |
| 29 | +> 3. **The classid row is `0x1000_0700`-shaped** (shipped `CLASSID_OSINT_V3`: V3 |
| 30 | +> marker in the HIGH u16, domain routed on the LOW u16). Its `(part_of:is_a)` |
| 31 | +> *ordering* is **pending the operator's Canon:Custom correction** (canon→high) — |
| 32 | +> flagged on the row, not settled here. |
| 33 | +
| 34 | +## The claim |
| 35 | + |
| 36 | +The structural AST of a transcode *source* (C#/Python/C++/Ruby) can be stored |
| 37 | +**as the (part_of:is_a) GUID address itself**, in the lance-graph SoA — not as a |
| 38 | +SurrealQL AST/DDL, and not as a raw syntax tree. The GUID *is* the AST node's |
| 39 | +structural identity; the value columns + `CausalEdge64` edges hold the behavior. |
| 40 | +This closes the loop from the `ruff_*_spo` harvest to the OGAR Core, and makes |
| 41 | +an LSP (`ruff-lsp`) the natural read/serve surface. |
| 42 | + |
| 43 | +``` |
| 44 | +source (C#/…) ──ruff_*_spo harvest──► SPO triples ──► AR-shaped Model / ClassView |
| 45 | +                                                      │ rank-mint (NEW brick)
| 46 | +                                                      ▼
| 47 | +                                                      (part_of:is_a) GUID ──► lance-graph SoA
| 48 | +                                                      │ serve
| 49 | +                                                      ▼
| 50 | +                                                      ruff-lsp ──► editor
| 51 | +``` |
| 52 | + |
| 53 | +## Why this is the convergence, not a detour |
| 54 | + |
| 55 | +1. **An AST has exactly two structural relations, and they ARE the two tile axes.** |
| 56 | +   - **is_a** (taxonomy / typing): `Patient is_a DbBase`, `kdnr is_a Property`,
| 57 | +     `Save is_a Function`. Source: harvest `inherits_from` + `rdf:type`.
| 58 | +     Target: the **is_a byte (TISSUE / what)** of a tier.
| 59 | +   - **part_of** (mereology / membership): `kdnr part_of Patient`,
| 60 | +     `Patient part_of MedCare.Models`. Source: harvest `has_field` /
| 61 | +     `has_function` (inverted). Target: the **part_of byte (PLACE / where)** of a tier.
| 62 | + |
| 63 | +2. **SurrealQL is an adapter, not the spine.** Storing the AST as |
| 64 | + `DEFINE TABLE`/`DEFINE FIELD` DDL is the "negative-beauty hijack" |
| 65 | + `SURREAL-AST-AS-ADAPTER.md` §0 rejects — DDL can't carry the behavioral arm |
| 66 | + and it makes the schema the spine. The (part_of:is_a) GUID + lance SoA is the |
| 67 | + spine; SurrealQL is at most a query projection over it. |
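
The two axes in item 1 can be read as a routing rule over harvested triples. The following is a minimal sketch, assuming triples arrive as plain `(subject, predicate, object)` strings. `Axis`, `AxisEdge`, and `classify` are hypothetical names introduced for illustration and are not part of `ruff_spo_triplet`; only the predicate names come from this doc.

```rust
/// The two structural axes, plus a bucket for everything that is not structure.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Axis {
    IsA,
    PartOf,
    Behavioral,
}

/// An edge normalised so that `from` is the child and `to` is the parent.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AxisEdge {
    pub axis: Axis,
    pub from: String,
    pub to: String,
}

/// Classify one harvested triple (hypothetical helper, not a shipped API).
pub fn classify(subject: &str, predicate: &str, object: &str) -> AxisEdge {
    match predicate {
        // Typing predicates already point child -> parent.
        "inherits_from" | "rdf:type" => AxisEdge {
            axis: Axis::IsA,
            from: subject.to_string(),
            to: object.to_string(),
        },
        // Containment is harvested owner -> member, so invert it to
        // "member part_of owner".
        "has_field" | "has_function" => AxisEdge {
            axis: Axis::PartOf,
            from: object.to_string(),
            to: subject.to_string(),
        },
        // Assumption of this sketch: any other predicate (reads_field,
        // raises, ...) is treated as behavioural and stays out of the address.
        _ => AxisEdge {
            axis: Axis::Behavioral,
            from: subject.to_string(),
            to: object.to_string(),
        },
    }
}
```

For example, `classify("Patient", "has_field", "kdnr")` yields the edge `kdnr part_of Patient`, which is the inversion the second bullet describes.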
| 68 | + |
| 69 | +## The class wrapper + the rails-shaped (declarative) AST |
| 70 | + |
| 71 | +Two requirements, one insight: |
| 72 | + |
| 73 | +- **Class wrapper** = the OGAR `ClassView` / the `ruff_spo_triplet::Model` IR. |
| 74 | + The GUID is its *address*; the ClassView is the resolved declaration set the |
| 75 | + address points at ("the key prerenders nodes, zero value decode"). |
| 76 | +- **Rails-shaped, not syntax-tree.** You do NOT sink a raw `CSharpSyntaxTree` |
| 77 | + (imperative, positional — wrong shape). You sink the **declarative class-body |
| 78 | + shape** — exactly what `ruff_ruby_spo` harvests as the ActiveRecord shape: a |
| 79 | + class flattened to a *bag of typed declarations*, every one of which is a |
| 80 | + part_of or is_a edge. That is why `ruff_spo_triplet`'s IR is a `Model` with |
| 81 | + sibling declaration `Vec<…>` fields rather than a syntax tree — it is the |
| 82 | + language-agnostic, already-(part_of:is_a)-shaped class wrapper that every |
| 83 | + frontend (Python/C++/Ruby/C#) fills. |
| 84 | + |
| 85 | +## The GUID decomposition — 6×(part_of:is_a) (RESOLVED — the shipped `FacetCascade`) |
| 86 | + |
| 87 | +Operator framing (session): the GUID carries **6 (part_of:is_a) pairs = 12 |
| 88 | +slots**, read across the *whole* key, giving six levels of (composition × type) |
| 89 | +resolution — enough to encode a deep AST node's full structural path, each level |
| 90 | +capturing both where it sits (part_of) and what it is (is_a): |
| 91 | + |
| 92 | +| GUID region | proposed (part_of : is_a) reading | |
| 93 | +| --- | --- | |
| 94 | +| `classid` (4B) | **shipped** as hi u16 : lo u16 = custom/render : canon/concept (`CLASSID_OSINT_V3 = 0x1000_0700` — V3 marker hi, domain routed on lo). **OPEN (Canon:Custom):** the operator's correction flips this to canon(hi):custom(lo) so the prefix sorts by shared concept, not render skin — so the classid `(part_of:is_a)` half-order is **not settled** (orthogonal to the per-tier minting). | |
| 95 | +| `HEEL` / `HIP` / `TWIG` (2B each) | namespace·root-type / class·base / member-slot·member-kind | |
| 96 | +| `basin·leaf` + `identity` (6B) | basin `po_rank` : `ia_rank` (OGAR `family = (po_rank3<<8)\|ia_rank3`) |
| 97 | + |
| 98 | +**Reconciliation — RESOLVED by shipped code** (was framed as "the gate before any |
| 99 | +packer"). `facet.rs::FacetCascade` settles it: `facet_classid(4) | 6×(8:8)` = |
| 100 | +**6 tiers** (`HEEL·HIP·TWIG·LEAF·family·identity`), i.e. the **full-key 6-pair / |
| 101 | +12-slot** reading — *not* path-only. The "path = 6 bytes = CAM-PQ 6×256 = 3 tiers" |
| 102 | +canon describes the **key routing prefix** (`NiblePath::from_guid_prefix_v3`, |
| 103 | +4-tier); the *complete* 6-pair address is the 16-byte `FacetCascade` **value |
| 104 | +facet** (it does not fit the 64-bit key; facet.rs says so explicitly). So the
| 105 | +"path-only vs full-key" question resolves to **both**: the key carries the routing
| 106 | +prefix, and the `FacetCascade` carries the full address. The rank-minter therefore
| 107 | +targets the 6 `FacetCascade` tiers (`tier.hi` = part_of chain, `tier.lo` = is_a
| 108 | +chain). No allocation is left to ratify, and the only open ordering is the classid
| 109 | +half-order above (Canon:Custom), which is orthogonal to per-tier packing.
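
The byte arithmetic behind `facet_classid(4) | 6×(8:8) = 16 B` can be illustrated with a small stand-in type. This is a sketch only: `TierSketch`, `FacetSketch`, `shared_prefix`, and `split_classid` are hypothetical names, not the shipped `FacetCascade` API, and the big-endian byte order is an assumption made here so that byte order matches prefix order.

```rust
/// One tier: hi byte = part_of rank, lo byte = is_a rank (the 8:8 pair).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct TierSketch {
    pub hi: u8,
    pub lo: u8,
}

/// Hypothetical stand-in for the carrier: classid (4 B) + 6 tiers (2 B each).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct FacetSketch {
    pub classid: u32,
    pub tiers: [TierSketch; 6],
}

impl FacetSketch {
    /// Serialise to 16 bytes: 4 classid bytes, then hi/lo for each tier.
    pub fn to_bytes(&self) -> [u8; 16] {
        let mut out = [0u8; 16];
        // Assumption: big-endian classid, so the hi u16 comes first.
        out[..4].copy_from_slice(&self.classid.to_be_bytes());
        for (i, tier) in self.tiers.iter().enumerate() {
            out[4 + 2 * i] = tier.hi;
            out[5 + 2 * i] = tier.lo;
        }
        out
    }

    /// The part_of chain: the hi byte of every tier, root first.
    pub fn hi_chain(&self) -> [u8; 6] {
        self.tiers.map(|t| t.hi)
    }

    /// The is_a chain: the lo byte of every tier, root first.
    pub fn lo_chain(&self) -> [u8; 6] {
        self.tiers.map(|t| t.lo)
    }
}

/// Number of leading tiers on which two chains agree.
pub fn shared_prefix(a: &[u8; 6], b: &[u8; 6]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Split a classid into its (hi u16, lo u16) halves.
pub fn split_classid(classid: u32) -> (u16, u16) {
    ((classid >> 16) as u16, classid as u16)
}
```

Under these assumptions `split_classid(0x1000_0700)` returns `(0x1000, 0x0700)`, and two facets that differ only in one tier's `hi` byte still share their whole `lo_chain()`, which is the sense in which the two chains are independent hierarchies.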
| 110 | + |
| 111 | +## The missing brick — a deterministic part_of/is_a rank-minter |
| 112 | + |
| 113 | +Between the SPO harvest and the GUID is the **one** genuinely-new component — the |
| 114 | +carrier (`FacetCascade`) is **already shipped**, so this is the only brick to |
| 115 | +build: a pure-Rust, dependency-free rank-minter. Given the corpus's part_of edges |
| 116 | ++ is_a edges, assign each node its `(po_rank, ia_rank)` at each tier and write |
| 117 | +them into the 6 `FacetCascade` tiers (`tier.hi = po_rank`, `tier.lo = ia_rank`); |
| 118 | +`FacetCascade::to_bytes()` is then the 16-byte facet and `hi_chain()`/`lo_chain()` |
| 119 | +the two prefix-routable hierarchies — no new layout, no new type. |
| 120 | + |
| 121 | +- **part_of tree** (namespace → class → member): deterministic sibling/ |
| 122 | + topological rank. |
| 123 | +- **is_a lattice** (class → base, member → kind): deterministic type rank. |
| 124 | +- For a **finite, known** AST this is deterministic assignment — **not** learned |
| 125 | + PQ centroids — so it is exact and roundtrip-lossless (no quantization error on |
| 126 | + a known class graph). Iron-rule clean per **I-VSA-IDENTITIES**: it encodes |
| 127 | + *identity positions* (which class, which base, which slot), never bundles |
| 128 | + content. |
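
One way the deterministic assignment above might look for the part_of tree is sketched below, using only `std`. The function names, the lexicographic sibling order, and the convention that rank 0 means "no node at this tier" are all assumptions of this sketch, not decisions recorded in this doc. It covers the tree case only (one parent per child); the is_a lattice would need its own rule for nodes with several parents.

```rust
use std::collections::BTreeMap;

/// Assign each child a rank among its siblings under the same parent.
/// `edges` are (child, parent) pairs on one axis. Ranks start at 1; rank 0
/// is reserved here for "no node at this tier" (assumption).
/// Returns None if any parent has more than 255 children, because a rank
/// has to fit in one byte.
pub fn mint_sibling_ranks(edges: &[(&str, &str)]) -> Option<BTreeMap<String, u8>> {
    let mut by_parent: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(child, parent) in edges {
        by_parent.entry(parent).or_default().push(child);
    }
    let mut ranks = BTreeMap::new();
    for children in by_parent.values_mut() {
        // Sorting makes the result independent of input order.
        children.sort_unstable();
        children.dedup();
        if children.len() > 255 {
            return None;
        }
        for (i, child) in children.iter().enumerate() {
            ranks.insert(child.to_string(), (i + 1) as u8);
        }
    }
    Some(ranks)
}

/// Follow child -> parent links up to the root and return the root-first
/// rank chain, zero-padded to 6 tiers. Returns None if the path is deeper
/// than 6 tiers (which also stops on cycles).
pub fn rank_chain(
    node: &str,
    edges: &[(&str, &str)],
    ranks: &BTreeMap<String, u8>,
) -> Option<[u8; 6]> {
    let parent_of: BTreeMap<&str, &str> = edges.iter().copied().collect();
    let mut path = Vec::new();
    let mut cur = node;
    // A root has no rank of its own, so the walk ends there.
    while let Some(&rank) = ranks.get(cur) {
        path.push(rank);
        if path.len() > 6 {
            return None;
        }
        cur = *parent_of.get(cur)?;
    }
    path.reverse();
    let mut chain = [0u8; 6];
    chain[..path.len()].copy_from_slice(&path);
    Some(chain)
}
```

Because siblings are sorted before ranking, feeding the same edges in a different order gives the same ranks, which is the property that separates this from learned centroids. The resulting chain is what would be written into the `hi` byte of each tier.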
| 129 | + |
| 130 | +## The boundary — skeleton in the address, muscle in the edges |
| 131 | + |
| 132 | +The (part_of:is_a) address holds the **THINK arm**: the class/member/type graph, |
| 133 | +the ClassView method-resolution manifest — precisely the harvest's *structural* |
| 134 | +predicates. It does **not** hold the **DO arm**: method bodies, control flow, the |
| 135 | +transcode logic. Those are not part_of/is_a relations; they live in |
| 136 | +`ActionDef`/`KausalSpec` and the harvest's behavioral edges (`reads_field`, |
| 137 | +`raises`, `traverses_relation`) as `CausalEdge64`, **keyed by** the GUID, not |
| 138 | +encoded in it. (Lance compresses the value/edges arbitrarily; the key stays |
| 139 | +transparent — compression never costs addressability.) |
| 140 | + |
| 141 | +## The serve surface — ruff-lsp (and it's language-agnostic) |
| 142 | + |
| 143 | +`AdaWorldAPI/ruff-lsp` today is a vanilla fork of the deprecated Python |
| 144 | +ruff-lsp (no SPO/lance wiring) — a clean slate for the *read* end. The core LSP |
| 145 | +operations **are** part_of/is_a queries: |
| 146 | + |
| 147 | +| LSP request | graph query | axis | |
| 148 | +| --- | --- | --- | |
| 149 | +| `textDocument/prepareTypeHierarchy` + `typeHierarchy/supertypes` / `typeHierarchy/subtypes` | walk the is_a lattice | **is_a** |
| 150 | +| `textDocument/definition` | resolve symbol → ClassView node (GUID) | address lookup |
| 151 | +| `textDocument/references` / `callHierarchy/incomingCalls` / `callHierarchy/outgoingCalls` | walk part_of / behavioral edges | **part_of** + edges |
| 152 | +| `textDocument/documentSymbol` / `workspace/symbol` | the class→member part_of tree | **part_of** |
| 153 | + |
| 154 | +Backed by the (part_of:is_a) lance store, an LSP is no longer a Python-only |
| 155 | +linter surface — it is a **language-agnostic semantic-navigation surface** over |
| 156 | +whatever frontend filled the graph (Python / C++ / **C#-MedCare** / Ruby), |
| 157 | +because at that layer it is all the same `Model`/ClassView keyed by GUIDs. |
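
The table rows reduce to two graph walks. The sketch below is a minimal illustration over in-memory edge lists, not the lance store or any LSP library: `supertypes`, `members`, and the `Entity` base class in the usage note are hypothetical, and edges are assumed to be `(child, parent)` string pairs.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Supertypes of `start`, nearest first: a breadth-first walk up the
/// is_a lattice (roughly what a type-hierarchy "supertypes" request needs).
pub fn supertypes(start: &str, is_a: &[(&str, &str)]) -> Vec<String> {
    let mut parents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(child, parent) in is_a {
        parents.entry(child).or_default().push(parent);
    }
    let mut seen = BTreeSet::new();
    seen.insert(start); // guards against cycles leading back to the start
    let mut queue = VecDeque::from([start]);
    let mut out = Vec::new();
    while let Some(node) = queue.pop_front() {
        for &parent in parents.get(node).into_iter().flatten() {
            if seen.insert(parent) {
                out.push(parent.to_string());
                queue.push_back(parent);
            }
        }
    }
    out
}

/// Direct members of `owner` on the part_of axis, sorted by name
/// (roughly what a document-symbol outline for one class needs).
pub fn members(owner: &str, part_of: &[(&str, &str)]) -> Vec<String> {
    let mut found: Vec<String> = part_of
        .iter()
        .filter(|(_, parent)| *parent == owner)
        .map(|(child, _)| child.to_string())
        .collect();
    found.sort();
    found
}
```

With is_a edges `Patient → DbBase → Entity`, `supertypes("Patient", …)` returns `DbBase` then `Entity`; with part_of edges `kdnr → Patient` and `Save → Patient`, `members("Patient", …)` returns `Save` and `kdnr`. Neither function looks at source language, which is the point of the paragraph above.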
| 158 | + |
| 159 | +## Next bricks (ordered; each gated on the prior) |
| 160 | + |
| 161 | +1. **Slot allocation — LOCKED (shipped).** The 6-pair / 12-slot layout is the |
| 162 | + shipped `FacetCascade` (#613/#614); no ratification needed. The ONLY remaining |
| 163 | + ordering decision is the classid `(part_of:is_a)` half-order (the operator's |
| 164 | + Canon:Custom correction) — orthogonal to per-tier ranking, so it does **not** |
| 165 | + block brick 2. |
| 166 | +2. **Build the deterministic rank-minter** (`ruff_spo_address`, pure std): SPO |
| 167 | + graph → `(po_rank, ia_rank)` per level → packed slots. Verifiable in |
| 168 | + isolation. |
| 169 | +3. **Probe on MedCare**: `ruff_csharp_spo` harvest → mint → lance SoA → |
| 170 | + `typeHierarchy`/`definition` query → diff the class graph against the C# |
| 171 | + original, with `MedCareV2` as the parity oracle. |
| 172 | + |
| 173 | +The per-tier layout is **locked** (the shipped `FacetCascade`), so brick 2 (the |
| 174 | +rank-minter) can proceed now — it writes into existing tiers, inventing no type. |
| 175 | +Only the classid half-order (Canon:Custom) remains a decision, and it is |
| 176 | +orthogonal to per-tier packing, so it does not gate the minter. |