Commit b7eb02c

Merge pull request #616 from AdaWorldAPI/claude/medcare-bridge-lance-graph-wmx76z
knowledge: AST-as-(part_of:is_a)-address design doc
2 parents 26abde5 + acc3721 commit b7eb02c

1 file changed

Lines changed: 176 additions & 0 deletions

@@ -0,0 +1,176 @@
1+
# AST-as-(part_of:is_a)-address — sinking compiled semantics into the GUID
2+
3+
> **READ BY:** integration-lead, truth-architect, core-first-architect,
4+
> family-codec-smith, baton-handoff-auditor
5+
> **Status:** CONJECTURE (design; the **carrier is now SHIPPED** — see V3
6+
> alignment below — only the rank-minter is unbuilt/unmeasured).
7+
> **Cross-ref:** `crates/lance-graph-contract/src/facet.rs` (the **SHIPPED**
8+
> `FacetCascade` substrate, #613/#614), `canonical_node.rs`
9+
> (`TailVariant::V3` / `mint_for` / `CLASSID_OSINT_V3`, #615),
10+
> `guid-canon-and-prefix-routing.md` (the GUID canon),
11+
> `core-first-transcode-doctrine.md` (harvest → ClassView → codegen),
12+
> OGAR `SURREAL-AST-AS-ADAPTER.md` + `SURREAL-AST-TRAP-PREFLIGHT.md`
13+
> (DDL is an adapter, never the spine), `encoding-ecosystem.md`.
14+
15+
> **V3 alignment (2026-06-26).** This doc was first drafted without awareness that
16+
> the V3 substrate **already shipped** (#613/#614 `FacetCascade`, #615 `mint_for`).
17+
> Three corrections, applied throughout:
18+
> 1. **The slot-count "open gate" is CLOSED.** `facet.rs::FacetCascade` =
19+
> `facet_classid(4) | 6×(8:8) = 16 B` = **6 tiers** (`HEEL·HIP·TWIG·LEAF·family·
20+
> identity`), `const _`-asserted. That IS the full-key 6-pair / 12-slot answer;
21+
> the *key* carries only the 4-tier routing prefix
22+
> (`NiblePath::from_guid_prefix_v3`), the complete 6-pair address lives in the
23+
> `FacetCascade` **value facet**. Nothing left to ratify.
24+
> 2. **The carrier is NOT missing.** `FacetCascade` is **content-blind** and
25+
> already lists `(part_of:is_a)` as a consumer projection: `hi_chain()` =
26+
> part_of, `lo_chain()` = is_a, with `hi_distance`/`lo_distance` the two
27+
> orthogonal prefix metrics. AST-as-(part_of:is_a) is one reading of this shipped
28+
> substrate; the only genuinely-new brick is the deterministic **rank-minter**.
29+
> 3. **The classid row is `0x1000_0700`-shaped** (shipped `CLASSID_OSINT_V3`: V3
30+
> marker in the HIGH u16, domain routed on the LOW u16). Its `(part_of:is_a)`
31+
> *ordering* is **pending the operator's Canon:Custom correction** (canon→high) —
32+
> flagged on the row, not settled here.
33+
34+
## The claim
35+
36+
The structural AST of a transcode *source* (C#/Python/C++/Ruby) can be stored
37+
**as the (part_of:is_a) GUID address itself**, in the lance-graph SoA — not as a
38+
SurrealQL AST/DDL, and not as a raw syntax tree. The GUID *is* the AST node's
39+
structural identity; the value columns + `CausalEdge64` edges hold the behavior.
40+
This closes the loop from the `ruff_*_spo` harvest to the OGAR Core, and makes
41+
an LSP (`ruff-lsp`) the natural read/serve surface.
42+
43+
```
source (C#/…) ──ruff_*_spo harvest──► SPO triples ──► AR-shaped Model / ClassView
                                                              │ rank-mint (NEW brick)
                                                              ▼
                                              (part_of:is_a) GUID ──► lance-graph SoA
                                                                            │ serve
                                                                            ▼
                                                                  ruff-lsp ──► editor
```
52+
53+
## Why this is the convergence, not a detour
54+
55+
1. **An AST has exactly two structural relations, and they ARE the two tile axes.**
56+
- **is_a** (taxonomy / typing): `Patient is_a DbBase`, `kdnr is_a Property`,
57+
`Save is_a Function` ← harvest `inherits_from` + `rdf:type` ← the **is_a
58+
byte (TISSUE / what)** of a tier.
59+
- **part_of** (mereology / membership): `kdnr part_of Patient`,
60+
`Patient part_of MedCare.Models` ← harvest `has_field` / `has_function`
61+
(inverted) ← the **part_of byte (PLACE / where)** of a tier.
62+
63+
2. **SurrealQL is an adapter, not the spine.** Storing the AST as
64+
`DEFINE TABLE`/`DEFINE FIELD` DDL is the "negative-beauty hijack"
65+
`SURREAL-AST-AS-ADAPTER.md` §0 rejects — DDL can't carry the behavioral arm
66+
and it makes the schema the spine. The (part_of:is_a) GUID + lance SoA is the
67+
spine; SurrealQL is at most a query projection over it.
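
The predicate-to-axis mapping in point 1 can be sketched as a small classifier. This is an illustration only: the predicate strings are the ones named above, but their exact spelling in the harvest output, and the `Axis`, `StructuralEdge`, and `classify` names, are assumptions made for this example.

```rust
/// The two structural axes of the address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Axis {
    IsA,
    PartOf,
}

/// One structural edge, normalised so that `child` is the node being
/// addressed and `parent` is its type (is_a) or its container (part_of).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StructuralEdge {
    pub axis: Axis,
    pub child: String,
    pub parent: String,
}

/// Classify one harvested (subject, predicate, object) triple.
/// Returns `None` for predicates that are not structural (for example
/// behavioral edges such as `reads_field`).
pub fn classify(subject: &str, predicate: &str, object: &str) -> Option<StructuralEdge> {
    match predicate {
        // Typing predicates already point child -> type.
        "inherits_from" | "rdf:type" => Some(StructuralEdge {
            axis: Axis::IsA,
            child: subject.to_string(),
            parent: object.to_string(),
        }),
        // Membership predicates point container -> member, so they are
        // inverted to obtain `member part_of container`.
        "has_field" | "has_function" => Some(StructuralEdge {
            axis: Axis::PartOf,
            child: object.to_string(),
            parent: subject.to_string(),
        }),
        _ => None,
    }
}
```

With this shape, `Patient has_field kdnr` becomes `kdnr part_of Patient`, while `Patient inherits_from DbBase` stays `Patient is_a DbBase`.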
68+
69+
## The class wrapper + the rails-shaped (declarative) AST
70+
71+
Two requirements, one insight:
72+
73+
- **Class wrapper** = the OGAR `ClassView` / the `ruff_spo_triplet::Model` IR.
74+
The GUID is its *address*; the ClassView is the resolved declaration set the
75+
address points at ("the key prerenders nodes, zero value decode").
76+
- **Rails-shaped, not syntax-tree.** You do NOT sink a raw `CSharpSyntaxTree`
77+
(imperative, positional — wrong shape). You sink the **declarative class-body
78+
shape** — exactly what `ruff_ruby_spo` harvests as the ActiveRecord shape: a
79+
class flattened to a *bag of typed declarations*, every one of which is a
80+
part_of or is_a edge. That is why `ruff_spo_triplet`'s IR is a `Model` with
81+
sibling declaration `Vec<…>` fields rather than a syntax tree — it is the
82+
language-agnostic, already-(part_of:is_a)-shaped class wrapper that every
83+
frontend (Python/C++/Ruby/C#) fills.
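
To make the "bag of typed declarations" concrete, here is a hedged sketch of a class wrapper with sibling `Vec` fields and its flattening into part_of / is_a edges. `ModelSketch` and its field names are hypothetical stand-ins chosen for this example; they are not the real `ruff_spo_triplet::Model` definition.

```rust
/// Hypothetical stand-in for a declarative class wrapper. Field names are
/// illustrative and do not mirror the real `ruff_spo_triplet::Model`.
#[derive(Debug, Clone, Default)]
pub struct ModelSketch {
    pub name: String,
    pub namespace: String,
    pub bases: Vec<String>,
    pub fields: Vec<String>,
    pub functions: Vec<String>,
}

impl ModelSketch {
    /// Flatten the class into (child, relation, parent) triples, where the
    /// relation is always either "part_of" or "is_a".
    pub fn edges(&self) -> Vec<(String, &'static str, String)> {
        let mut out = Vec::new();
        // The class sits inside its namespace.
        out.push((self.name.clone(), "part_of", self.namespace.clone()));
        // Each base class is one is_a edge.
        for base in &self.bases {
            out.push((self.name.clone(), "is_a", base.clone()));
        }
        // Every member contributes one membership edge and one kind edge.
        for field in &self.fields {
            out.push((field.clone(), "part_of", self.name.clone()));
            out.push((field.clone(), "is_a", "Property".to_string()));
        }
        for function in &self.functions {
            out.push((function.clone(), "part_of", self.name.clone()));
            out.push((function.clone(), "is_a", "Function".to_string()));
        }
        out
    }
}
```

No positional or control-flow information appears anywhere in the output, which is the point of sinking the declarative shape rather than a syntax tree.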
84+
85+
## The GUID decomposition — 6×(part_of:is_a) (RESOLVED — the shipped `FacetCascade`)
86+
87+
Operator framing (session): the GUID carries **6 (part_of:is_a) pairs = 12
88+
slots**, read across the *whole* key, giving six levels of (composition × type)
89+
resolution — enough to encode a deep AST node's full structural path, each level
90+
capturing both where it sits (part_of) and what it is (is_a):
91+
92+
| GUID region | proposed (part_of : is_a) reading |
| --- | --- |
| `classid` (4B) | **shipped** as hi u16 : lo u16 = custom/render : canon/concept (`CLASSID_OSINT_V3 = 0x1000_0700` — V3 marker hi, domain routed on lo). **OPEN (Canon:Custom):** the operator's correction flips this to canon(hi):custom(lo) so the prefix sorts by shared concept, not render skin — so the classid `(part_of:is_a)` half-order is **not settled** (orthogonal to the per-tier minting). |
| `HEEL` / `HIP` / `TWIG` (2B each) | namespace·root-type / class·base / member-slot·member-kind |
| `basin·leaf` + `identity` (6B) | basin `po_rank` : `ia_rank` (OGAR `family = (po_rank3<<8)|ia_rank3`) |
97+
98+
**Reconciliation — RESOLVED by shipped code** (was framed as "the gate before any
99+
packer"). `facet.rs::FacetCascade` settles it: `facet_classid(4) | 6×(8:8)` =
100+
**6 tiers** (`HEEL·HIP·TWIG·LEAF·family·identity`), i.e. the **full-key 6-pair /
101+
12-slot** reading — *not* path-only. The "path = 6 bytes = CAM-PQ 6×256 = 3 tiers"
102+
canon describes the **key routing prefix** (`NiblePath::from_guid_prefix_v3`,
103+
4-tier); the *complete* 6-pair address is the 16-byte `FacetCascade` **value
104+
facet** (it does not fit the 64-bit key — facet.rs says so explicitly). So
105+
"path-only vs full-key" is answered **both**: the key carries the routing prefix,
106+
the FacetCascade carries the full address. The rank-minter therefore targets the
107+
6 `FacetCascade` tiers (`tier.hi` = part_of chain, `tier.lo` = is_a chain) — no
108+
allocation is left to ratify, and the only open ordering is the classid half-order
109+
above (Canon:Custom), which is orthogonal to per-tier packing.
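
The 4 + 6×(8:8) = 16-byte arithmetic above can be checked with a small stand-in type. This is a sketch under stated assumptions, not the shipped `FacetCascade`: `FacetSketch`, `Tier`, `classid_halves`, and `shared_prefix_len` are hypothetical names, the big-endian byte order is assumed, and `shared_prefix_len` is only one plausible prefix metric (the definitions of the shipped `hi_distance` / `lo_distance` are not reproduced here).

```rust
/// One (part_of : is_a) tier: `hi` is the part_of byte, `lo` the is_a byte.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct Tier {
    pub hi: u8,
    pub lo: u8,
}

/// Hypothetical stand-in for the 16-byte facet: a 4-byte classid followed by
/// six 2-byte tiers.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct FacetSketch {
    pub classid: u32,
    pub tiers: [Tier; 6],
}

// Compile-time check of the layout arithmetic: 4 + 6 * 2 = 16 bytes.
const _: () = assert!(4 + 6 * 2 == 16);

impl FacetSketch {
    /// Serialise as classid (big-endian, an assumption) then hi, lo per tier.
    pub fn to_bytes(&self) -> [u8; 16] {
        let mut out = [0u8; 16];
        out[..4].copy_from_slice(&self.classid.to_be_bytes());
        for (i, tier) in self.tiers.iter().enumerate() {
            out[4 + 2 * i] = tier.hi;
            out[5 + 2 * i] = tier.lo;
        }
        out
    }

    /// Inverse of `to_bytes`.
    pub fn from_bytes(bytes: [u8; 16]) -> Self {
        let classid = u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
        let mut tiers = [Tier::default(); 6];
        for (i, tier) in tiers.iter_mut().enumerate() {
            *tier = Tier { hi: bytes[4 + 2 * i], lo: bytes[5 + 2 * i] };
        }
        Self { classid, tiers }
    }

    /// The part_of chain: the six `hi` bytes in tier order.
    pub fn hi_chain(&self) -> [u8; 6] {
        self.tiers.map(|t| t.hi)
    }

    /// The is_a chain: the six `lo` bytes in tier order.
    pub fn lo_chain(&self) -> [u8; 6] {
        self.tiers.map(|t| t.lo)
    }
}

/// Split a classid into its (high u16, low u16) halves.
pub fn classid_halves(classid: u32) -> (u16, u16) {
    ((classid >> 16) as u16, classid as u16)
}

/// Number of leading tiers on which two chains agree.
pub fn shared_prefix_len(a: &[u8; 6], b: &[u8; 6]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}
```

For `0x1000_0700`, `classid_halves` yields `(0x1000, 0x0700)`, matching the "V3 marker hi, domain on lo" reading. Two nodes that differ only in a part_of byte still share their whole is_a chain, which is what makes the two chains independently prefix-routable.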
110+
111+
## The missing brick — a deterministic part_of/is_a rank-minter
112+
113+
Between the SPO harvest and the GUID is the **one** genuinely-new component — the
114+
carrier (`FacetCascade`) is **already shipped**, so this is the only brick to
115+
build: a pure-Rust, dependency-free rank-minter. Given the corpus's part_of edges
116+
+ is_a edges, assign each node its `(po_rank, ia_rank)` at each tier and write
117+
them into the 6 `FacetCascade` tiers (`tier.hi = po_rank`, `tier.lo = ia_rank`);
118+
`FacetCascade::to_bytes()` is then the 16-byte facet and `hi_chain()`/`lo_chain()`
119+
the two prefix-routable hierarchies — no new layout, no new type.
120+
121+
- **part_of tree** (namespace → class → member): deterministic sibling/
122+
topological rank.
123+
- **is_a lattice** (class → base, member → kind): deterministic type rank.
124+
- For a **finite, known** AST this is deterministic assignment — **not** learned
125+
PQ centroids — so it is exact and roundtrip-lossless (no quantization error on
126+
a known class graph). Iron-rule clean per **I-VSA-IDENTITIES**: it encodes
127+
*identity positions* (which class, which base, which slot), never bundles
128+
content.
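
A minimal sketch of what "deterministic assignment" could look like, using only `std`. The ranking rule here (lexicographic order of names, rank 0 reserved for an empty tier, at most 255 entries per byte) is an assumption chosen for illustration; the document only requires that the rank be deterministic. `sibling_ranks` and `type_ranks` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Rank each child among its siblings under the same parent.
/// Input edges are (child, parent). The result maps (parent, child) to a
/// 1-based rank; rank 0 is left free to mean "empty tier" (an assumption).
pub fn sibling_ranks(
    part_of: &[(&str, &str)],
) -> Result<BTreeMap<(String, String), u8>, String> {
    // Group children per parent; BTreeSet gives a sorted, de-duplicated order
    // that does not depend on the order of the input edges.
    let mut by_parent: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    for &(child, parent) in part_of {
        by_parent.entry(parent).or_default().insert(child);
    }
    let mut ranks = BTreeMap::new();
    for (parent, children) in &by_parent {
        if children.len() > 255 {
            return Err(format!("{} has more than 255 members", parent));
        }
        for (i, child) in children.iter().enumerate() {
            ranks.insert((parent.to_string(), child.to_string()), (i + 1) as u8);
        }
    }
    Ok(ranks)
}

/// Rank every distinct is_a target (base class or member kind).
/// Input edges are (node, kind); the result maps kind to a 1-based rank.
pub fn type_ranks(is_a: &[(&str, &str)]) -> Result<BTreeMap<String, u8>, String> {
    let kinds: BTreeSet<&str> = is_a.iter().map(|&(_, kind)| kind).collect();
    if kinds.len() > 255 {
        return Err("more than 255 distinct is_a targets".to_string());
    }
    Ok(kinds
        .into_iter()
        .enumerate()
        .map(|(i, kind)| (kind.to_string(), (i + 1) as u8))
        .collect())
}
```

Because the ranks depend only on the set of edges, shuffling the harvest output yields the same address, and the rank-to-name tables can be inverted exactly. One caveat of this particular rule: adding a sibling can shift the ranks of later siblings, so a real minter may want a stabler scheme.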
129+
130+
## The boundary — skeleton in the address, muscle in the edges
131+
132+
The (part_of:is_a) address holds the **THINK arm**: the class/member/type graph,
133+
the ClassView method-resolution manifest — precisely the harvest's *structural*
134+
predicates. It does **not** hold the **DO arm**: method bodies, control flow, the
135+
transcode logic. Those are not part_of/is_a relations; they live in
136+
`ActionDef`/`KausalSpec` and the harvest's behavioral edges (`reads_field`,
137+
`raises`, `traverses_relation`) as `CausalEdge64`, **keyed by** the GUID, not
138+
encoded in it. (Lance compresses the value/edges arbitrarily; the key stays
139+
transparent — compression never costs addressability.)
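
The "keyed by, not encoded in" boundary can be illustrated with a toy store. Everything here is hypothetical: `Address`, `Behavior`, and `Store` are names invented for this sketch, and the enum does not model the real `CausalEdge64` encoding.

```rust
use std::collections::BTreeMap;

/// The 16-byte structural address (skeleton).
pub type Address = [u8; 16];

/// Behavioral edges (muscle), named after the harvest predicates above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Behavior {
    ReadsField(Address),
    Raises(String),
    TraversesRelation(Address),
}

/// Behavior hangs off the address; it never changes the address itself.
#[derive(Debug, Default)]
pub struct Store {
    edges: BTreeMap<Address, Vec<Behavior>>,
}

impl Store {
    /// Attach one behavioral edge to the node at `key`.
    pub fn attach(&mut self, key: Address, behavior: Behavior) {
        self.edges.entry(key).or_default().push(behavior);
    }

    /// All behavioral edges of the node at `key` (empty if none).
    pub fn behaviors(&self, key: &Address) -> &[Behavior] {
        self.edges.get(key).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

Adding or removing a method body changes only the values under a key, so structural lookups and prefix routing are unaffected by behavioral churn.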
140+
141+
## The serve surface — ruff-lsp (and it's language-agnostic)
142+
143+
`AdaWorldAPI/ruff-lsp` today is a vanilla fork of the deprecated Python
144+
ruff-lsp (no SPO/lance wiring) — a clean slate for the *read* end. The core LSP
145+
operations **are** part_of/is_a queries:
146+
147+
| LSP request | graph query | axis |
| --- | --- | --- |
| `textDocument/typeHierarchy` | walk the is_a lattice | **is_a** |
| `textDocument/definition` | resolve symbol → ClassView node (GUID) | address lookup |
| `references` / `callHierarchy` | walk part_of / behavioral edges | **part_of** + edges |
| `documentSymbol` / `workspace/symbol` | the class→member part_of tree | **part_of** |
153+
154+
Backed by the (part_of:is_a) lance store, an LSP is no longer a Python-only
155+
linter surface — it is a **language-agnostic semantic-navigation surface** over
156+
whatever frontend filled the graph (Python / C++ / **C#-MedCare** / Ruby),
157+
because at that layer it is all the same `Model`/ClassView keyed by GUIDs.
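
As an illustration of "LSP operations are graph queries", the supertype half of a type-hierarchy request reduces to a breadth-first walk over is_a edges. This sketch uses plain string names instead of GUIDs and an in-memory map instead of the lance store; `supertypes` is a hypothetical helper, and the `Entity` base in the usage note is invented for the example.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Walk the is_a edges upward from `start` and return every ancestor,
/// nearest first. `is_a` maps a node to its direct bases. Already-seen
/// nodes are skipped, so the walk terminates even on a cyclic input.
pub fn supertypes<'a>(
    is_a: &BTreeMap<&'a str, Vec<&'a str>>,
    start: &'a str,
) -> Vec<&'a str> {
    let mut out: Vec<&'a str> = Vec::new();
    let mut queue: VecDeque<&'a str> = VecDeque::new();
    queue.push_back(start);
    while let Some(node) = queue.pop_front() {
        if let Some(bases) = is_a.get(node) {
            for &base in bases {
                if !out.contains(&base) {
                    out.push(base);
                    queue.push_back(base);
                }
            }
        }
    }
    out
}
```

Given `Patient is_a DbBase` and `DbBase is_a Entity`, `supertypes(&map, "Patient")` returns `["DbBase", "Entity"]` regardless of which frontend harvested the edges.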
158+
159+
## Next bricks (ordered; each gated on the prior)
160+
161+
1. **Slot allocation — LOCKED (shipped).** The 6-pair / 12-slot layout is the
162+
shipped `FacetCascade` (#613/#614); no ratification needed. The ONLY remaining
163+
ordering decision is the classid `(part_of:is_a)` half-order (the operator's
164+
Canon:Custom correction) — orthogonal to per-tier ranking, so it does **not**
165+
block brick 2.
166+
2. **Build the deterministic rank-minter** (`ruff_spo_address`, pure std): SPO
167+
graph → `(po_rank, ia_rank)` per level → packed slots. Verifiable in
168+
isolation.
169+
3. **Probe on MedCare**: `ruff_csharp_spo` harvest → mint → lance SoA →
170+
`typeHierarchy`/`definition` query → diff the class graph against the C#
171+
original, with `MedCareV2` as the parity oracle.
172+
173+
The per-tier layout is **locked** (the shipped `FacetCascade`), so brick 2 (the
174+
rank-minter) can proceed now — it writes into existing tiers, inventing no type.
175+
Only the classid half-order (Canon:Custom) remains a decision, and it is
176+
orthogonal to per-tier packing, so it does not gate the minter.
