
Commit cbd99e4

plans: identity architecture exists-vs-needs map + integration plan (v1)
Grounded map of the structured 128-bit identity (UUIDv8 = HHTL nibble-address formalized) against the existing substrate, from first-hand reads + two cross-repo sweeps. Four findings: (1) the 128-bit identity space is empty (no committed u128/Uuid/[u8;16]-as-id); (2) every GUID FIELD already exists as a committed scalar -> compose, do not re-invent (SchemaPtr + NiblePath + StructuralSignature + EdgeRef); (3) the cross-store transport is already solved by EntityKey(&[u8]) -- smb-bridge key_to_filter already length-branches; (4) the cold path has no stable structured identity today (node_id:u32 + String props, SpoStore u64 dn_hash) -- the identity fills a real gap. 6-layer exists inventory + 7 build gaps (N1-N7 unblocked, N8 surreal BLOCKED on fork coords) + phased plan A-H. Substrate is ~80% present; the work is composition/wiring, not green-field. One open decision: SchemaPtr.entity_type vs NiblePath-prefix as the class carrier. https://claude.ai/code/session_014A4JuRCqKP2yNENrQ9Ha7H
1 parent 030b3f3 commit cbd99e4

1 file changed

Lines changed: 218 additions & 0 deletions

@@ -0,0 +1,218 @@
# Identity Architecture: What Exists vs What Needs Building (v1)

> **Status:** INTEGRATION MAP + PLAN. Grounded by first-hand reads + two parallel
> cross-repo sweeps (2026-06-09). Companion to
> `cognitive-write-roundtrip-substrate-v1.md` (the round-trip mechanism).
> **Branch:** `claude/nice-edison-g4rhhl`.

## Thesis

The hot path should carry a lean **128-bit structured immutable identity** (a
UUIDv8 = the HHTL nibble-address *formalized + namespaced*); heavy content stays
in consumer stores keyed by it. The identity does five jobs as register reads of
one object: **resolve** (class-from-address), **route** (delegate switch),
**witness** (immutable audit + merkle), **ground-truth** (shape_hash drift), and
**dispatch-to-store** (EntityKey → consumer). This doc maps what already exists
against what must be built, and phases the integration.

## Four headline findings (grounded)

1. **The 128-bit identity space is empty:** no committed `u128`/`Uuid`(binary)/
   `[u8;16]`-as-id exists (the single `[u8;16]`, `atoms.rs:74 I4x32`, is a
   thinking-style vector, doc-confirmed *not* an identity). A new GUID won't
   byte-collide. *(Agent A sweep, lance-graph + ndarray.)*

2. **But every GUID FIELD already exists as a committed scalar**, so the iron
   mandate is **compose existing fields, do NOT re-invent**: `namespace` =
   `NamespaceId(u8)` inside `SchemaPtr.packed:u32 = [ns:8|entity_type:16|kind:8]`;
   `class/address` = `NiblePath` + `ClassId(u16)` + `EdgeRef{family:u8,local:u16}`;
   `shape_hash` = `StructuralSignature`; `local` = `EdgeRef.local`. A parallel
   re-pack would duplicate ratified discriminators (`OD-CLASSID-WIDTH`,
   `I-VSA-IDENTITIES`). *(Agent A finding #2.)*

3. **The cross-store transport is ALREADY solved** by `EntityKey<'a>(pub &'a [u8])`
   (repository.rs:12), an opaque length-agnostic key both consumer repos use;
   `smb-bridge::key_to_filter` already branches on length (12→ObjectId, else→
   `Bson::Binary`) on Mongo *and* Lance. A 16-byte GUID is "just another length"
   the tested plumbing handles. *(Agent B sweep.)*

4. **The cold path has NO stable structured identity today:** it keys nodes by
   bare `node_id:u32` (no edge id; `String` label + `HashMap<String,String>`
   props), the SPO hot path keys by a `u64` *content* `dn_hash` (not stable),
   `CogRecord` carries no id ("id is the external dn_hash"), and durable identity
   is ad-hoc `Uuid→String` (learning crate) + `OgitUri(String)`. **The structured
   identity fills a real gap**, provided it *subsumes* `SchemaPtr` + `EdgeRef`
   and never parallels them. *(Agent A finding #3.)*

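Finding 3 rests on length-branching over an opaque byte key. The following is a minimal sketch of that idea, not the actual `smb-bridge` code: `KeyFilter` and `key_to_filter_sketch` are hypothetical names introduced here, and the real `key_to_filter` builds store-specific values rather than this enum.

```rust
/// Opaque, length-agnostic key, shaped like the `EntityKey<'a>(pub &'a [u8])` cited above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EntityKey<'a>(pub &'a [u8]);

/// Hypothetical filter type for this sketch only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum KeyFilter {
    /// Exactly 12 bytes: treated as a Mongo-style ObjectId.
    ObjectId([u8; 12]),
    /// Any other length: passed through as opaque binary.
    Binary(Vec<u8>),
}

/// Branch on key length only; the key's contents are never interpreted.
pub fn key_to_filter_sketch(key: EntityKey<'_>) -> KeyFilter {
    match <[u8; 12]>::try_from(key.0) {
        Ok(oid) => KeyFilter::ObjectId(oid),
        Err(_) => KeyFilter::Binary(key.0.to_vec()),
    }
}
```

Under this shape a 16-byte GUID takes the `Binary` arm without any new branch, which is the sense in which it is "just another length".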
## WHAT EXISTS: grounded inventory (Layers 0-6, file:line)

### Layer 0: address / discriminator scalars (the GUID's fields)
| Type | Width (bits) | Role | Status | Evidence |
|---|---|---|---|---|
| `NiblePath{path:u64,depth:u8}` | 72 | HHTL tree address (basin/child/is_ancestor_of, 16ⁿ) | **[G]** | hhtl.rs |
| `SchemaPtr{packed:u32=[ns:8\|entity_type:16\|kind:8], ctx:u32}` | 64 | schema/type pointer | **[G]** | namespace.rs:119 |
| `NamespaceId(u8)` | 8 | OGIT namespace ordinal | **[G]** | namespace.rs:24 |
| `ClassId = u16` | 16 | per-row shape discriminator ("never a content hash") | **[G]** | class_view.rs:53 |
| `EntityTypeId = u16` | 16 | per-row object-type (Palantir) | **[G]** | ontology.rs:81 |
| `FieldMask(u64)` + `inherit` | 64 | presence bitmask, parent-OR-delta | **[G]** | class_view.rs:69,136 |
| `StructuralSignature` (shape_hash) | hash | "deterministic hash over property-id set" | **[G] type / [H] live-wire** | odoo_blueprint::class_signature |
| `EdgeRef{family:u8,local:u16}` | 24 | episodic HHTL family+local address | **[G]** | episodic_edges.rs:34 |

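To make the `[ns:8|entity_type:16|kind:8]` packing and the parent-OR-delta mask concrete, here is a small sketch. It assumes `ns` occupies the most significant byte of the `u32`, which the table does not state, and the function names are hypothetical rather than the committed API in namespace.rs or class_view.rs.

```rust
/// Pack `[ns:8 | entity_type:16 | kind:8]` into a u32.
/// ASSUMPTION: `ns` is the most significant byte; the real bit order may differ.
pub fn pack_schema_ptr(ns: u8, entity_type: u16, kind: u8) -> u32 {
    (u32::from(ns) << 24) | (u32::from(entity_type) << 8) | u32::from(kind)
}

/// Inverse of `pack_schema_ptr`: plain shifts plus truncating casts.
pub fn unpack_schema_ptr(packed: u32) -> (u8, u16, u8) {
    ((packed >> 24) as u8, (packed >> 8) as u16, packed as u8)
}

/// Parent-OR-delta presence mask: a child has every parent field plus its own.
/// ASSUMPTION: this is what `inherit` means; the real method may do more.
pub fn inherit_mask(parent: u64, delta: u64) -> u64 {
    parent | delta
}
```

Because each field sits at a fixed offset, reading `entity_type` back out is one shift and one cast, which is what makes the class-from-address lookup cheap.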
### Layer 1: edge / handoff carriers (the LE "sound members")
| Type | Width (bits) | Role | Status / Evidence |
|---|---|---|---|
| `EpisodicEdges64(u64)` = 4×EdgeRef, MRU promote/evict, `to_le_bytes` | 64 | AriGraph episodic edges | **[G]** episodic_edges.rs |
| `CausalEdge64(u64)` (NARS 10+10 ×1023) | 64 | baton/causal edge payload | **[G]** ndarray causal_diff.rs:153 |
| Baton `(target:u16, edge:u64)` | 80 | inter-mailbox handoff | **[G]** collapse_gate.rs:235 |
| `MailboxId=u32`, `MailboxRow{mailbox_ref:u32,row_idx:u32}` | 32/64 | mailbox + row address | **[G]** |

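The "NARS 10+10 ×1023" note reads as two 10-bit fields, frequency and confidence, each scaled by 1023. A hedged sketch of that quantization follows; where the 20 bits sit inside `CausalEdge64` is not stated, so placing them in the low bits of a `u32` is purely an assumption, as are the function names.

```rust
const SCALE: f32 = 1023.0;

/// Map a value in [0, 1] onto a 10-bit integer (0..=1023), rounding to nearest.
pub fn quantize(x: f32) -> u16 {
    (x.clamp(0.0, 1.0) * SCALE).round() as u16
}

/// Map a 10-bit integer back to [0, 1].
pub fn dequantize(q: u16) -> f32 {
    f32::from(q.min(1023)) / SCALE
}

/// Pack (frequency, confidence) as two adjacent 10-bit fields.
/// ASSUMPTION: frequency in bits 10..20, confidence in bits 0..10.
pub fn pack_truth(f: f32, c: f32) -> u32 {
    (u32::from(quantize(f)) << 10) | u32::from(quantize(c))
}

/// Inverse of `pack_truth`.
pub fn unpack_truth(bits: u32) -> (f32, f32) {
    (
        dequantize(((bits >> 10) & 0x3FF) as u16),
        dequantize((bits & 0x3FF) as u16),
    )
}
```

With round-to-nearest the round-trip error per field is at most half a step, which stays inside a 1/1023 tolerance.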
### Layer 2: cold-path stores (TODAY: thin + inconsistent)
| Store | Key | Status / Evidence |
|---|---|---|
| `MetadataStore`: `NodeRecord{node_id:u32, label:String, properties:HashMap<String,String>}`, `EdgeRecord{source:u32,target:u32,edge_type:String}` | u32 + **STRING label/props (legacy Cypher)** | **[G]** metadata.rs:60,86 |
| `SpoStore`: `HashMap<u64 dn_hash, SpoRecord>` | u64 **content-hash** (not stable id) | **[G]** spo/store.rs:38 |
| ndarray `CogRecord{meta,cam,btree,embed}` | **no id** ("id is external dn_hash") | **[G]** cogrecord.rs:56 |
| `WitnessId(u64)` (arigraph witness) | 64 opaque handle | **[G]** witness_corpus.rs:63 |

### Layer 3: resolution (class-from-address)
| Surface | Status / Evidence |
|---|---|
| `RegistryClassView: ClassView` (fields/template/dolce_category_id) | **[G] resolve / [H] field-enum deferred** class_resolver.rs |
| `OntologyRegistry`: `resolve_uri`, `enumerate_first_with_entity_type_id(u16)`, `resolve_iri_in` | **[G]** registry.rs |

### Layer 4: commit + witness (the membrane)
| Surface | Status / Evidence |
|---|---|
| `SoaEnvelope` trait + `ColumnDescriptor` (container-LE geometry) | **[G] trait / [H] ZERO impls** soa_envelope.rs |
| `MailboxSoaView`/`MailboxSoaOwner` (read airgap + Rubicon `try_advance_phase`) | **[G]** soa_view.rs |
| `commit_event` sole-writer + `ExternalMembrane::project` + `CommitFilter`/`MembraneGate` | **[G]** lance_membrane.rs:315 |
| `CognitiveEventRow` (scalar audit event, VSA stripped) | **[G]** external_intent.rs:113 |
| `MerkleRoot(u64)` ×3 (audit/SPO/unified) + `AuditSink` (jsonl/lance) | **[G]** audit_sink/, merkle.rs |
| `SlaPolicy`, `TenantScope` | **[G] types** sla.rs |

### Layer 5: cross-store transport (the consumer boundary)
| Surface | Status / Evidence |
|---|---|
| `EntityKey<'a>(pub &'a [u8])`: opaque length-agnostic key | **[G]** repository.rs:12 |
| `EntityStore`/`EntityWriter`/`Batch` traits | **[G]** repository.rs |
| `smb-bridge`: implements both for Mongo+Lance, `key_to_filter` length-branch | **[G]** smb-bridge/mongo.rs:79, lance.rs:92 |
| MedCare-rs: MySQL i64 PKs; DMS `sha256`(NOT NULL)+`storage_key`; imports EntityKey | **[G]** dms.rs:14, graph_contract.rs:31 |
| smb-office-rs: Mongo `ObjectId`(12B) + `String` refs; actively impls repository | **[G]** base.rs:92 |

### Layer 6: round-trip / substrate-hardening
| Surface | Status / Evidence |
|---|---|
| `TripletProjection` trait + `roundtrip_eq` / `RoundTripFailure` | **[G]** codegen_spine.rs:107 |
| cognitive-write projection (mailbox SoA → SPO+edges) | **[H] does not exist** |

## WHAT NEEDS BUILDING: 8 gaps, 7 unblocked (each: what it REUSES [G] + what it ADDS [H])

| # | Gap | Reuses (exists [G]) | Adds [H] | Blocked? |
|---|---|---|---|---|
| **N1** | **`NodeGuid`/`EdgeGuid`** 128-bit identity type | `SchemaPtr` + `NiblePath` + `StructuralSignature` + `EdgeRef.local` | the UUIDv8 composition + layout version + the 5 readings | no |
| **N2** | wire `StructuralSignature` into live `RegistryClassView` | `StructuralSignature` type, `ClassView` | the field-enum from `MappingRow` (the deferred D-CLS audit) | no |
| **N3** | `SoaEnvelope` **implementor** for `MailboxSoA<N>` | `SoaEnvelope` trait, `MailboxSoaView` | the zero-copy impl (mailbox bytes == cold bytes) | no |
| **N4** | cognitive-write `TripletProjection` + `roundtrip_eq` | `TripletProjection`, `EpisodicEdges64`/`CausalEdge64` `to_le_bytes` | the project/decompile over the identity graph | no |
| **N5** | `project_graph` emitter through the gate | `commit_event`, `CommitFilter`/`MembraneGate`, `ExternalMembrane` | the node/edge projection (today emits scalar `CognitiveEventRow`) | no |
| **N6** | **`MetadataStore` string→identity migration** | `MetadataStore`, `EntityKey` | `NodeRecord`/`EdgeRecord` keyed by `NodeGuid` not `String` label/props | no (I-LEGACY-API gated) |
| **N7** | GUID-as-`EntityKey` wiring + MedCare `external_ref` | `EntityKey`, `EntityStore`/`EntityWriter`, smb `key_to_filter` | pass 16-byte key + **one** MedCare column (or reuse `sha256`) | no |
| **N8** | surreal_container SurrealQL read glove | `surreal_container` skeleton | the kv-lance read path | **BLOCKED(C)** fork coords |

**Only N8 is blocked.** N1-N7 need no surrealdb coords.

## N1: the identity type as a COMPOSITION (the iron mandate from Agent A #2)

```rust
// crates/lance-graph-contract/src/identity.rs (NEW, zero-dep)
// EVERY field is an existing committed type. No re-invention.

/// 128-bit immutable structured node identity (UUIDv8, RFC 9562).
/// Frozen at write; the class is RE-RESOLVED from the address (never stored mutable).
#[repr(C, align(16))]
pub struct NodeGuid([u8; 16]);
// bits   0..32  : SchemaPtr.packed [ns:8 | entity_type:16 | kind:8]   ← REUSE namespace.rs:119
// bits  32..74  : NiblePath prefix (path bits + small depth; ver nibble carved at 48..52)
// bits  74..98  : StructuralSignature (shape_hash, truncated)         ← REUSE odoo_blueprint
// bits  98..122 : local instance (EdgeRef.local widened)              ← REUSE episodic_edges
// bits  48..52  : version = 8 · bits 64..66 : variant = 0b10          ← RFC 9562 reserved (6 b)
// NOTE: the version and variant bits both fall inside the 32..74 span, so as
// listed the prefix keeps 36 payload bits and bits 122..128 are unassigned.
// The final layout has to either shift the later fields or accept that.

/// 128-bit edge identity: source address ⊕ the episodic EdgeRef.
#[repr(C, align(16))]
pub struct EdgeGuid([u8; 16]);
// = [ source SchemaPtr/NiblePath | EdgeRef{family:u8, local:u16} | shape_hash ]  ← REUSE EpisodicEdges64
```
145+
**The five readings (register reads of one key):**
146+
- **resolve** `guid.schema_ptr() → entity_type → ClassView` (class-from-address, O(1) bit-shift + cache)
147+
- **route** `guid.niblepath().is_ancestor_of(...)` → delegate switch (HHTL bit-shift, through `OrchestrationBridge`)
148+
- **witness** frozen `[u8;16]` + `MerkleRoot` chain (immutable, examined-in-place)
149+
- **ground-truth** `guid.shape_hash() != resolve(addr).shape_hash_now` → drift (read-time diff)
150+
- **dispatch-to-store** `EntityKey(guid.as_bytes())` → consumer (Layer-5 transport, already [G])
151+
152+
**Immutability law (ratified this session):** `class_id` never updates — it's the
153+
lineage id, re-resolved from the address for free; the GUID is write-once; drift
154+
*repair* is a **new immutable version** (Lance is versioned), never an in-place
155+
mutation. `I-VSA-IDENTITIES` Test 0: the GUID is a register key (points to
156+
content), never VSA-bundled.
157+
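The **route** reading above depends on a prefix comparison over nibbles. The sketch below shows one way such a check can work. It assumes nibbles are stored most-significant-first in `path`, treats a path as an ancestor of itself, and is not the committed `NiblePath` in hhtl.rs, whose internal layout may differ.

```rust
/// Sketch of a nibble-addressed tree path: up to 16 nibbles in a u64.
/// ASSUMPTION: the first nibble occupies the top 4 bits of `path`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NiblePath {
    path: u64,
    depth: u8,
}

impl NiblePath {
    pub const ROOT: NiblePath = NiblePath { path: 0, depth: 0 };

    /// Append one nibble; `None` if the path is full or the value exceeds 4 bits.
    pub fn child(self, nibble: u8) -> Option<NiblePath> {
        if self.depth >= 16 || nibble > 0xF {
            return None;
        }
        let shift = 60 - 4 * u32::from(self.depth);
        Some(NiblePath {
            path: self.path | (u64::from(nibble) << shift),
            depth: self.depth + 1,
        })
    }

    /// True if `self` is a prefix of `other` (ancestor-or-self).
    pub fn is_ancestor_of(self, other: NiblePath) -> bool {
        if self.depth > other.depth {
            return false;
        }
        if self.depth == 0 {
            return true; // the root is a prefix of everything
        }
        // Compare only the top `depth` nibbles of both paths.
        let shift = 64 - 4 * u32::from(self.depth);
        (self.path >> shift) == (other.path >> shift)
    }
}
```

The comparison is two shifts and an equality test, with no lookup table or cache involved, which is the property the routing argument leans on.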
### ⚠ One open DECISION (yours to pin; both grounded, bijective)
The class can be carried two ways; pick the **stored** form, resolve the other:
- **(D1) `SchemaPtr.entity_type:u16`**: reuse the existing dense pointer (Agent A "compose existing"). Compact, exact.
- **(D2) `NiblePath` prefix**: identity-IS-address (ADR-1374, your "nibble = the GUID class"). O(1) ancestry-routing without a cache hit.
- **Recommendation:** store **SchemaPtr (exact) + a truncated NiblePath prefix (for routing)**. SchemaPtr resolves deep paths exactly; the prefix gives branchless `is_ancestor_of`. Costs ~42 bits for the prefix; worth it for probe-free routing.

## Phased integration plan (A→H; each phase = one landable PR)

| Phase | Gap | Crate | Deliverable | DoD | Dep |
|---|---|---|---|---|---|
| **A** | N1 | contract | `NodeGuid`/`EdgeGuid` as composition of existing fields + layout version | byte-decompose round-trips to `SchemaPtr`/`NiblePath`/`StructuralSignature`/`local`; UUIDv8 validates; zero-dep; clippy/fmt clean | none |
| **B** | N2 | ontology | wire `StructuralSignature` into `RegistryClassView` (enumerate field-set from `MappingRow`) | `shape_hash(class_id)` returns a stable signature; the deferred D-CLS field-enum closed | A |
| **C** | N3 | shader-driver | `impl SoaEnvelope for MailboxSoA<N>` (zero-copy) | `as_le_bytes().as_ptr()==backing`; `verify_layout()` green | none |
| **D** | N4 | lance-graph | cognitive-write `TripletProjection` + `roundtrip_eq` over the identity graph | passes the `account.move` fixture; corrupt-pack fails; (f,c) within 1/1023 | A, C |
| **E** | N5 | callcenter | `project_graph` (node/edge emitter) through `commit_event`+gate | committed cycle queryable as `NodeGuid` nodes + `EdgeGuid` edges; version ticks; RBAC applies | A, D |
| **F** | N6 | lance-graph core | `MetadataStore` string→identity: `NodeRecord`/`EdgeRecord` keyed by `NodeGuid` (label/props → resolved-from-identity) | old string path feature-gated/migrated; field-isolation tests (I-LEGACY-API); query parity | A, B, E |
| **G** | N7 | consumers | GUID-as-`EntityKey`(16B) + MedCare `external_ref` (or `sha256` reuse) | smb: 16-byte key resolves via existing `key_to_filter`; MedCare: GUID→row reverse lookup | A |
| **H** | N8 | surreal_container | SurrealQL read glove | DEFERRED: **BLOCKED(C)** fork coords | E |

**Critical path:** (A, C) → D → E → F, with B (after A) feeding F. G hangs off A (parallel). H is gated.
**Smallest unblocked first brick:** Phase A (the `NodeGuid` composition, zero-dep contract) OR Phase C (the `SoaEnvelope` impl); both are leaves, both are needed by D.

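The Dep column of the plan can be checked mechanically. The sketch below encodes the column as data and derives a valid landing order plus the longest dependency chain ending at each phase. The names `PHASES`, `landing_order` and `chain_len` are illustrative only, and the external block on H is not modelled.

```rust
/// (phase, phases it depends on), transcribed from the Dep column.
static PHASES: [(char, &[char]); 8] = [
    ('A', &[]),
    ('B', &['A']),
    ('C', &[]),
    ('D', &['A', 'C']),
    ('E', &['A', 'D']),
    ('F', &['A', 'B', 'E']),
    ('G', &['A']),
    ('H', &['E']),
];

/// Repeatedly land every phase whose dependencies have all landed.
pub fn landing_order() -> Vec<char> {
    let mut done: Vec<char> = Vec::new();
    while done.len() < PHASES.len() {
        let before = done.len();
        for (phase, deps) in PHASES.iter() {
            if !done.contains(phase) && deps.iter().all(|d| done.contains(d)) {
                done.push(*phase);
            }
        }
        if done.len() == before {
            break; // no progress: the table would contain a cycle
        }
    }
    done
}

/// Number of phases on the longest dependency chain ending at `phase`.
pub fn chain_len(phase: char) -> usize {
    let deps: &[char] = PHASES
        .iter()
        .find(|(p, _)| *p == phase)
        .map(|(_, d)| *d)
        .unwrap_or(&[]);
    1 + deps.iter().map(|d| chain_len(*d)).max().unwrap_or(0)
}
```

For this table `chain_len('F')` is 4, which matches a chain of a leaf phase, then D, then E, then F.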
## Honest ledger

- **[G] (exists, reuse):** across the layers above: `NiblePath`, `SchemaPtr`, `ClassId`,
  `StructuralSignature` (type), `EdgeRef`, `EpisodicEdges64`/`CausalEdge64` LE,
  `commit_event`+gate, `MerkleRoot`+`AuditSink`, `SlaPolicy`/`TenantScope`,
  `EntityKey`+`EntityStore`/`EntityWriter`, `TripletProjection`. **The substrate is
  ~80% present.**
- **[H] (build):** N1-N7, but each is a *composition/wiring* of [G] parts, not a
  green-field invention. The largest is N6 (cold-path string→identity migration).
- **[BLOCKED(C)]:** N8 only (surrealdb fork coords; human gate; lance-graph P0
  "STOP and ask").
- **One open [DECISION]:** D1 vs D2 (SchemaPtr-entity_type vs NiblePath-prefix as
  the class carrier). Recommendation: both (exact + routing prefix).

## Guards (iron rules this plan must not violate)

- **I-VSA-IDENTITIES:** the GUID is a register key that POINTS TO content; never
  VSA-bundle it, never intern open content (only the closed vocabulary). Identities
  intern; scanned papers / free text stay in consumer stores (Layer 5).
- **Compose, don't parallel (Agent A #2):** N1 MUST subsume `SchemaPtr` +
  `EdgeRef`, not re-pack ns/class/family beside them.
- **I-LEGACY-API-FEATURE-GATED:** N6's string→identity layout reclaim needs a
  version gate + field-isolation matrix tests.
- **Sole-writer / no-&mut-during-compute:** N5 reads SoA (`&self`), builds owned
  identity rows, `commit_event` is the gated write-back; drift *repair* is a new
  version, never in-place mutation (the immutability law).
- **AGI-as-SoA:** the GUID is per-NODE at the membrane, NOT a 16-byte-per-row SoA
  column (the hot SoA keeps its lean `u16 class_id`).

209+
## Provenance
210+
211+
First-hand reads (2026-06-09): hhtl.rs · soa_envelope.rs · soa_view.rs ·
212+
class_resolver.rs · class_view.rs · episodic_edges.rs · metadata.rs:60-94 ·
213+
registry.rs · namespace.rs · wikidata_hhtl.rs · lance_membrane.rs:315-429 ·
214+
external_intent.rs:113 · sla.rs · codegen_spine.rs · atoms.rs:74 · audit_sink/.
215+
Cross-repo sweeps: Agent A (lance-graph + ndarray identity-type inventory) ·
216+
Agent B (MedCare-rs + smb-office-rs store keys — `EntityKey`, MySQL i64 / Mongo
217+
ObjectId, DMS `sha256`/`storage_key`). Companion:
218+
`cognitive-write-roundtrip-substrate-v1.md`.
