Commit 43d2ae4

feat(contract): node_rows_from_le_bytes — zero-copy SoA read contract (surrealdb second-brain primitive)
The inverse of NodeRowPacket::as_le_bytes (the WRITE cast): a CHECKED zero-copy read of a &[NodeRow] out of an external LE byte buffer — the load-bearing primitive for "a store hands lance-graph its SoA view without a copy."

node_rows_from_le_bytes(&[u8]) -> Option<&[NodeRow]>
- Some(view) iff len % 512 == 0 AND ptr % align_of::<NodeRow>() (64) == 0
- None otherwise (caller copies rather than risk UB); empty -> Some(&[])
- SAFETY documented: NodeRow is #[repr(C, align(64))], size 512 (const-asserted), every 512-byte pattern is a valid NodeRow.

This IS the LE contract a backing store satisfies: hand a 64-aligned, 512-multiple LE buffer and its bytes ARE the SoA the cognitive shader reads in place.

Intended consumer: surrealdb kv-lance storing each node as an UNcompressed arrow FixedSizeBinary(512) blob (64-aligned value buffer) — then surrealdb's bytes are a zero-copy &[NodeRow], a "second brain" sharing one byte layout with no serialize boundary.

Brutal caveats documented in the fn + EPIPHANY:
- A variable-length Binary column does NOT qualify (no fixed stride / no align); the SoA value path needs FixedSizeBinary(512).
- Value zero-copy holds only if the column is UNcompressed (a compressed column decodes to a buffer first = one copy, still no per-field deserialize). The KEY (address) is always zero-copy (OGAR canon: the key is never compressed).

Re-exported from lib.rs (+ NodeRowPacket). 2 tests (zero-copy round-trip with ptr-identity assert; rejects non-multiple-512 + a misaligned-but-correct-length window). 712 contract lib green; clippy -D warnings (default + guid-v2-tail) + fmt clean.

Board: AGENT_LOG (cont.14), LATEST_STATE IN-PR entry, EPIPHANIES E-SURREALDB-SECOND-BRAIN-IS-ZERO-COPY-IFF-FIXEDSIZEBINARY.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
1 parent 3350d78 commit 43d2ae4
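
The commit message says the caller "copies rather than risk UB" when the checked cast returns None. A minimal sketch of that caller-side fallback follows. It is not part of the commit: the `NodeRow` here is a hypothetical stand-in with the same size and alignment but plain byte-array fields, `rows_as_le_bytes` and `rows_or_copy` are illustrative names, and the checked cast is re-implemented locally so the snippet compiles on its own.

```rust
use std::borrow::Cow;
use std::mem::{align_of, size_of, size_of_val};

/// Hypothetical stand-in for the contract's `NodeRow`: same size (512) and
/// alignment (64), but with fields simplified to raw byte arrays.
#[derive(Clone, Copy)]
#[repr(C, align(64))]
pub struct NodeRow {
    pub key: [u8; 16],
    pub edges: [u8; 16],
    pub value: [u8; 480],
}

const NODE_ROW_STRIDE: usize = 512;
const _: () = assert!(size_of::<NodeRow>() == NODE_ROW_STRIDE);

/// Write-side cast (the role `NodeRowPacket::as_le_bytes` plays in the commit).
pub fn rows_as_le_bytes(rows: &[NodeRow]) -> &[u8] {
    // SAFETY: the stand-in has no padding (16 + 16 + 480 == 512) and u8 has alignment 1.
    unsafe { std::slice::from_raw_parts(rows.as_ptr().cast::<u8>(), size_of_val(rows)) }
}

/// Checked read-side cast, mirroring the logic of `node_rows_from_le_bytes`.
pub fn node_rows_from_le_bytes(bytes: &[u8]) -> Option<&[NodeRow]> {
    if bytes.is_empty() {
        return Some(&[]);
    }
    if bytes.len() % NODE_ROW_STRIDE != 0 {
        return None;
    }
    if (bytes.as_ptr() as usize) % align_of::<NodeRow>() != 0 {
        return None;
    }
    let n = bytes.len() / NODE_ROW_STRIDE;
    // SAFETY: length and alignment were checked above, and every bit pattern
    // is a valid stand-in NodeRow (byte arrays only).
    Some(unsafe { std::slice::from_raw_parts(bytes.as_ptr().cast::<NodeRow>(), n) })
}

/// Hypothetical helper: borrow when the cast succeeds, copy when only the
/// alignment is wrong, give up when the length is not a whole number of rows.
pub fn rows_or_copy(bytes: &[u8]) -> Option<Cow<'_, [NodeRow]>> {
    if let Some(view) = node_rows_from_le_bytes(bytes) {
        return Some(Cow::Borrowed(view));
    }
    if bytes.len() % NODE_ROW_STRIDE != 0 {
        return None; // copying cannot repair a truncated row
    }
    let n = bytes.len() / NODE_ROW_STRIDE;
    let zero = NodeRow { key: [0; 16], edges: [0; 16], value: [0; 480] };
    let mut owned = vec![zero; n]; // a Vec<NodeRow> allocation is 64-aligned
    // SAFETY: `owned` holds exactly n * 512 writable bytes and cannot overlap `bytes`.
    unsafe {
        std::ptr::copy_nonoverlapping(bytes.as_ptr(), owned.as_mut_ptr().cast::<u8>(), bytes.len());
    }
    Some(Cow::Owned(owned))
}
```

Returning a `Cow` keeps the fast path allocation-free while letting a caller treat both outcomes uniformly as `&[NodeRow]`. A misaligned but whole-row buffer costs one bulk copy; a wrong-length buffer is still rejected.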

5 files changed

Lines changed: 132 additions & 2 deletions

.claude/board/AGENT_LOG.md

Lines changed: 8 additions & 0 deletions
@@ -1,3 +1,11 @@
+## 2026-06-20 (cont.¹⁴) — zero-copy SoA read contract (`node_rows_from_le_bytes`) — the surrealdb "second brain" primitive
+
+**Main thread (Opus), autoattended.** Operator: "create a contract … that ensures LE contract to the lance-graph SoA view → zero-copy symbiont; surrealdb becomes a second brain inside lance-graph." Brutal feasibility pass against real code on both sides (see EPIPHANY `E-SURREALDB-SECOND-BRAIN-IS-ZERO-COPY-IFF-FIXEDSIZEBINARY`):
+- lance-graph side already zero-copy-ready: `NodeRow` `#[repr(C, align(64))]` 512B LE; `NodeRowPacket::as_le_bytes` is the WRITE cast. **Shipped the missing READ inverse:** `canonical_node::node_rows_from_le_bytes(&[u8]) -> Option<&[NodeRow]>` — checked (`len % 512`, `ptr % 64`), `None` on violation (caller copies, no UB), empty→Some(empty). Re-exported from lib.rs; +`NodeRowPacket` re-export. 2 tests (zero-copy round-trip with ptr-identity assert; rejects non-multiple + misaligned-but-correct-length window). 712 lib green, clippy `-D warnings` both configs + fmt clean.
+- surrealdb side does NOT yet qualify: `.claude/lance-backend/lance/schema.rs` stores `val: DataType::Binary` (variable BinaryArray, no fixed stride / no align) → not castable. Needs `FixedSizeBinary(512)` SoA value column + deps the zero-dep contract + reads through `node_rows_from_le_bytes`. Caveat: value zero-copy iff stored UNcompressed (compressed = one decode-copy; key always zero-copy). That's the surrealdb-side plan (the lance-backend wiring), not done here.
+
+Rides on the jirak branch (PR #564 arc — the symbiont contract surface: OGAR activation + SoA zero-copy reader). Next: the surrealdb-side FixedSizeBinary(512) SoA path plan.
+
 ## 2026-06-20 (cont.¹³) — clean separation: NEW `lance-graph-ogar` activation crate (OGAR AR surface), #563 merged

 **Main thread (Opus), autoattended.** Operator: "what about clean separation — lance-graph-ontology OGIT / lance-graph-ogar OGAR" + correction "OGAR isn't just vocab, it's classes, ClassView, active-record shape" + "563 merged". Rebased jirak onto new main (ff1a3452 = merged #563, so `contract::ogar_codebook` is now ON main).

.claude/board/EPIPHANIES.md

Lines changed: 16 additions & 0 deletions
@@ -1,3 +1,19 @@
+## 2026-06-20 — E-SURREALDB-SECOND-BRAIN-IS-ZERO-COPY-IFF-FIXEDSIZEBINARY — surrealdb (kv-lance) can become a zero-copy "second brain" inside lance-graph ONLY if it stores each node as an uncompressed `FixedSizeBinary(512)` LE blob; the contract a store satisfies is `node_rows_from_le_bytes(&[u8]) -> Option<&[NodeRow]>` (the inverse of `NodeRowPacket::as_le_bytes`), and a variable-length `Binary` column does NOT qualify
+
+**Status:** FINDING (brutal feasibility pass, 2026-06-20; contract primitive shipped, surrealdb side planned).
+
+Operator proposal: "a contract in surrealdb that ensures LE contract to the lance-graph SoA view → zero-copy symbiont; surrealdb becomes a second brain inside lance-graph." Verified against the real code on both sides:
+
+**lance-graph side — already zero-copy-ready:** `NodeRow` is `#[repr(C, align(64))]`, 512 bytes (const-asserted), key(16)+edges(16)+value(480), all canon-LE. `NodeRowPacket::as_le_bytes` already casts `&[NodeRow] → &[u8]` (the WRITE path). The missing piece — the READ path — is now shipped: **`node_rows_from_le_bytes(&[u8]) -> Option<&[NodeRow]>`** (checked: `len % 512 == 0` AND `ptr % 64 == 0`, else `None` → caller copies rather than risk UB). That function IS the LE contract a backing store satisfies.
+
+**surrealdb side — NOT zero-copy as scaffolded:** the `.claude/lance-backend` Arrow schema stores `val: DataType::Binary` (variable-length `BinaryArray` = offsets + an unaligned, non-fixed-stride value buffer). That **cannot** be cast to `&[NodeRow]`. The SoA value path needs `DataType::FixedSizeBinary(512)` (a single contiguous N×512 buffer, which arrow-rs allocates 64-byte aligned) so the column buffer IS a `&[NodeRow]` in place.
+
+**The two honest caveats (brutal):**
+1. **Zero-copy on the VALUE requires the column be UNcompressed.** A Lance-compressed `FixedSizeBinary` decodes to a contiguous buffer first — that's ONE copy (still no per-field deserialize, far better than serde, but not literally zero). The KEY (16 bytes) is always addressable zero-copy (OGAR canon: "the key is never compressed"). So: zero-copy address always; zero-copy value iff stored uncompressed.
+2. **Direction of the contract:** the LE layout is OWNED by `lance-graph-contract` (`NodeRow` + `node_rows_from_le_bytes`); surrealdb SATISFIES it by depending on the **zero-dep** contract (the handshake, not the engine — OGAR-pattern) and storing `FixedSizeBinary(512)`. "Second brain" = surrealdb's kv-lance bytes ARE the SoA the cognitive shader reads in place; surrealdb adds SurrealQL + MVCC + Lance versioning over the SAME bytes, no serialize boundary.
+
+Shipped: `node_rows_from_le_bytes` + round-trip/reject tests (712 lib green, clippy+fmt clean). Planned: surrealdb-side `FixedSizeBinary(512)` SoA value path + read-through (the `.claude/lance-backend` 12-day wiring). Cross-ref: AGENT_LOG 2026-06-20 (cont.¹⁴); `soa_envelope::SoaEnvelope`; surrealdb `.claude/lance-backend/lance/schema.rs`; OGAR canon "the GUID is the key of key-value … the key is never compressed."
+
 ## 2026-06-20 — E-OGAR-IS-AR-CORE-AUTOACTIVATED-BY-CARGO-PRESENCE — OGAR is the Active-Record Core (Class + ClassView), not "just vocab"; it already `impl`s the contract's `ClassView`, so a lance-graph-side `lance-graph-ogar` crate AUTO-ACTIVATES the real AR surface wherever it is compiled in (Cargo presence = the switch, no runtime detection), guarded against drift by a parity fuse — while OGAR stays headless-capable and the contract stays zero-dep

 **Status:** FINDING (operator clean-separation lock, 2026-06-20; shipped `crates/lance-graph-ogar`).
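
The stride argument in the epiphany entry above can be checked with plain address arithmetic. The sketch below is illustrative only and does not touch Arrow: `fixed_stride_starts`, `variable_starts`, and `all_aligned` are hypothetical helpers, and the offsets are modelled as `usize` for brevity (Arrow's `Binary` layout uses 32-bit offsets). It shows that a 64-aligned base plus a 512-byte stride keeps every row start 64-aligned, while variable-length offsets give no such guarantee.

```rust
const ROW_ALIGN: usize = 64;
const ROW_STRIDE: usize = 512;

/// Start addresses of `n` rows in a fixed-stride buffer beginning at `base`.
pub fn fixed_stride_starts(base: usize, n: usize) -> Vec<usize> {
    (0..n).map(|i| base + i * ROW_STRIDE).collect()
}

/// Start addresses of values in a variable-length layout, where value `i`
/// occupies `offsets[i]..offsets[i + 1]` of the value buffer at `base`.
pub fn variable_starts(base: usize, offsets: &[usize]) -> Vec<usize> {
    offsets
        .iter()
        .take(offsets.len().saturating_sub(1)) // the last offset is only an end marker
        .map(|o| base + o)
        .collect()
}

/// True if every start address satisfies the 64-byte row alignment.
pub fn all_aligned(starts: &[usize]) -> bool {
    starts.iter().all(|a| a % ROW_ALIGN == 0)
}
```

With a base of `0x1000`, four fixed-stride rows start at `0x1000`, `0x1200`, `0x1400`, `0x1600`, all multiples of 64. With offsets `[0, 512, 1000, 1512]`, the third value starts at `0x1000 + 1000`, which is not a multiple of 64, so the buffer as a whole cannot be viewed as a row slice.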

.claude/board/LATEST_STATE.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@

 ---

+> **2026-06-20 — IN PR (`claude/jirak-math-theorems-harvest-rfii13`)** — **Zero-copy SoA read contract: `node_rows_from_le_bytes` (the surrealdb "second brain" primitive).** The inverse of `NodeRowPacket::as_le_bytes` (WRITE) — `canonical_node::node_rows_from_le_bytes(&[u8]) -> Option<&[NodeRow]>`, a CHECKED zero-copy cast (`len % 512 == 0` AND `ptr % 64 == 0`, else `None` → caller copies, no UB; empty→Some(empty)). This IS the LE contract a backing store satisfies so its bytes ARE the SoA the cognitive shader reads in place. **Brutal verdict:** lance-graph side now zero-copy-ready end-to-end; surrealdb's kv-lance does NOT qualify as scaffolded (`val: DataType::Binary` variable-length → needs `FixedSizeBinary(512)`), and value zero-copy holds only if stored UNcompressed (key/address always zero-copy). 712 contract lib green, clippy `-D warnings` both configs + fmt clean. Refs: AGENT_LOG 2026-06-20 (cont.¹⁴), EPIPHANIES `E-SURREALDB-SECOND-BRAIN-IS-ZERO-COPY-IFF-FIXEDSIZEBINARY`.
+>
 > **2026-06-20 — IN PR (`claude/jirak-math-theorems-harvest-rfii13`)** — **Clean separation: NEW `lance-graph-ogar` activation crate (OGAR Active-Record surface).** The OGAR half of `ontology=OGIT / ogar=OGAR`. OGAR is the AR Core and ALREADY `impl`s the contract: `ogar-class-view::OgarClassView impl lance_graph_contract::ClassView` (32 concepts), `ogar-vocab::Class` = AR shape, `canonical_concept_id == ClassId`. NEW `crates/lance-graph-ogar` (EXCLUDED, own `[workspace]`, git-deps OGAR@main + lance-graph-contract@main = ONE source, no `[patch]`) re-exports the full AR surface (ogar-vocab + ogar-class-view + ogar-ontology + ogar-adapter-surrealql) + a **parity-guard** (`assert_codebook_parity`: bijective `ogar_codebook::CODEBOOK ⇄ ogar_vocab::class_ids::ALL` + domain agreement, FAILS build on drift). Features: `default` (light, emit-only), `surrealql-parser` (parser half), `serde`. **Auto-activation = Cargo presence**: pull the crate → real OGAR AR + drift fuse; don't → contract's zero-dep mirror + bare ClassView trait (OGAR stays headless). `cargo test --manifest-path crates/lance-graph-ogar/Cargo.toml` **3/3** green, clippy + fmt clean, contract = ONE source (git main #ff1a3452). Refs: AGENT_LOG 2026-06-20 (cont.¹³), EPIPHANIES `E-OGAR-IS-AR-CORE-AUTOACTIVATED-BY-CARGO-PRESENCE`, plan D-OVC-5. **(#563 D-OVC contract realign now MERGED to main.)**
 >
 > **2026-06-20 — MERGED #563 (`claude/jirak-math-theorems-harvest-rfii13`)** — **D-OVC: contract classids realigned to OGAR `0xDDCC` + `contract::ogar_codebook` wire-compat mirror.** Resolved ISS-CLASSID-OGAR-DRIFT (operator-signed). **Realigned (layout-preserving const values, no `ENVELOPE_LAYOUT_VERSION` bump):** `CLASSID_OSINT 0x0007 → 0x0700` (OSINT domain root, `>>8 == 0x07`), `CLASSID_FMA 0x0008 → 0x0901` (anatomy concept in Health domain, `0x0900` = root). **Minted:** `CLASSID_PROJECT = 0x0100` + `CLASSID_ERP = 0x0200` with `ReadMode::{PROJECT, ERP}` (Cognitive/CoarseOnly) registered in `BUILTIN_READ_MODES`; `soa_graph::{PROJECT, ERP}` DomainSpecs. **NEW `contract::ogar_codebook`** (zero-dep, **wire-compat — NO OGAR↔contract dependency**): `ConceptDomain` (7 domains, `id>>8` route), `canonical_concept_domain`, `classid_concept_domain` (D-OVC-4 classid→domain), `source_domain_concept`, `CODEBOOK` (26 project `0x01XX` + 6 commerce `0x02XX`, mirrored from OGAR `ogar-vocab` `lib.rs:1073`), `canonical_concept_id`, `LabelDTO::from_canonical` + `id_le`. Drift-guard test pins the shared `0xDDCC` ids. Contract **710** lib (default) / **716** (`guid-v2-tail`), callcenter `--features query` **211** green; clippy `-D warnings` + fmt clean both configs. Refs: AGENT_LOG 2026-06-20 (cont.¹²), plan `ogar-vocab-contract-codebook-migration-v1.md` (D-OVC-1/2/4 SHIPPED, D-OVC-3 PARTIAL), ISSUES `ISS-CLASSID-OGAR-DRIFT` (RESOLVING).

crates/lance-graph-contract/src/canonical_node.rs

Lines changed: 104 additions & 0 deletions
@@ -968,10 +968,114 @@ impl<'a> SoaEnvelope for NodeRowPacket<'a> {
     }
 }

+/// Zero-copy **read** of a [`NodeRow`] slice out of an external LE byte buffer —
+/// the inverse of [`NodeRowPacket::as_le_bytes`] and the load-bearing primitive
+/// for "a store hands lance-graph its SoA view without a copy."
+///
+/// This is the **LE contract a backing store satisfies**: hand a byte slice that
+/// is (a) a whole number of 512-byte rows and (b) aligned to
+/// `align_of::<NodeRow>()` (64), and you get back a `&[NodeRow]` viewing the SAME
+/// bytes — no allocation, no deserialize. Returns `None` if either invariant
+/// fails (a sliced/offset buffer that lost 64-byte alignment, or a length that
+/// isn't a multiple of the stride), so the caller can fall back to a copy rather
+/// than risk UB.
+///
+/// Intended consumer: a Lance-backed key-value store (e.g. surrealdb's `kv-lance`)
+/// that persists each node as a fixed-size 512-byte LE blob
+/// (`arrow::FixedSizeBinary(512)`, whose value buffer arrow-rs allocates 64-byte
+/// aligned). The store's value buffer is then directly a `&[NodeRow]` the
+/// cognitive shader reads in place — surrealdb's bytes ARE the SoA. (A
+/// *variable-length* `Binary` column does NOT qualify: it has no fixed stride and
+/// no alignment guarantee; the store must use `FixedSizeBinary(512)` for the SoA
+/// value path. And the buffer must be uncompressed for the read to be literally
+/// zero-copy — a Lance-compressed column decodes to a contiguous buffer first,
+/// which is one copy, still no per-field deserialize.)
+///
+/// The bytes are interpreted in canon-LE order exactly as [`NodeGuid`]/[`EdgeBlock`]
+/// wrote them, so no endianness translation happens at the boundary.
+#[inline]
+#[must_use]
+pub fn node_rows_from_le_bytes(bytes: &[u8]) -> Option<&[NodeRow]> {
+    if bytes.is_empty() {
+        return Some(&[]);
+    }
+    if !bytes.len().is_multiple_of(NODE_ROW_STRIDE) {
+        return None;
+    }
+    if !(bytes.as_ptr() as usize).is_multiple_of(core::mem::align_of::<NodeRow>()) {
+        return None;
+    }
+    let n = bytes.len() / NODE_ROW_STRIDE;
+    // SAFETY: NodeRow is #[repr(C, align(64))], size_of == 512 == NODE_ROW_STRIDE
+    // (const-asserted above). We checked (1) bytes.len() is an exact multiple of
+    // the stride, so n rows span the whole slice with no trailing bytes, and (2)
+    // the pointer is aligned to align_of::<NodeRow>() (64). Every bit pattern in
+    // the 512 bytes is a valid NodeRow (NodeGuid is bytes, EdgeBlock is [u8;16],
+    // value is [u8;480] — no niche/enum to invalidate), so the reinterpretation
+    // is sound. The returned slice borrows `bytes` for its lifetime (no copy).
+    Some(unsafe { core::slice::from_raw_parts(bytes.as_ptr().cast::<NodeRow>(), n) })
+}
+
 #[cfg(test)]
 mod tests {
     use super::*;

+    #[test]
+    fn node_rows_le_bytes_round_trip_zero_copy() {
+        // Build a small SoA, view it as LE bytes (the write path), then read it
+        // back as &[NodeRow] (the inverse) — same bytes, no copy.
+        let rows = vec![
+            NodeRow {
+                key: NodeGuid::new(NodeGuid::CLASSID_OSINT, 1, 2, 3, 0xAB, 0xCD),
+                edges: EdgeBlock::default(),
+                value: [7u8; 480],
+            },
+            NodeRow {
+                key: NodeGuid::new(NodeGuid::CLASSID_PROJECT, 4, 5, 6, 0x11, 0x22),
+                edges: EdgeBlock::default(),
+                value: [9u8; 480],
+            },
+        ];
+        let packet = NodeRowPacket::new(&rows, 0);
+        let bytes = packet.as_le_bytes();
+        assert_eq!(bytes.len(), 2 * NODE_ROW_STRIDE);
+
+        let view = node_rows_from_le_bytes(bytes).expect("aligned, 512-multiple");
+        assert_eq!(view.len(), 2);
+        assert_eq!(view[0].key.classid(), NodeGuid::CLASSID_OSINT);
+        assert_eq!(view[1].key.classid(), NodeGuid::CLASSID_PROJECT);
+        assert_eq!(view[0].value, [7u8; 480]);
+        // Truly zero-copy: the view aliases the SAME backing store as `rows`.
+        assert_eq!(view.as_ptr().cast::<u8>(), rows.as_ptr().cast::<u8>());
+    }
+
+    #[test]
+    fn node_rows_from_le_bytes_rejects_bad_inputs() {
+        let rows = vec![
+            NodeRow {
+                key: NodeGuid::local(1),
+                edges: EdgeBlock::default(),
+                value: [0u8; 480],
+            },
+            NodeRow {
+                key: NodeGuid::local(2),
+                edges: EdgeBlock::default(),
+                value: [0u8; 480],
+            },
+        ];
+        let packet = NodeRowPacket::new(&rows, 0);
+        let bytes = packet.as_le_bytes(); // 1024 bytes, 64-aligned
+        // empty → Some(empty)
+        assert_eq!(node_rows_from_le_bytes(&[]).map(<[_]>::len), Some(0));
+        // not a whole number of rows → None (length check)
+        assert!(node_rows_from_le_bytes(&bytes[..NODE_ROW_STRIDE - 1]).is_none());
+        // a 512-length window offset by 1 off the 64-aligned base: correct length
+        // but misaligned → None via the alignment check (no UB cast).
+        let misaligned = &bytes[1..1 + NODE_ROW_STRIDE];
+        assert_eq!(misaligned.len(), NODE_ROW_STRIDE);
+        assert!(node_rows_from_le_bytes(misaligned).is_none());
+    }
+
     #[test]
     fn defaults_are_zero_and_bootstrap() {
         let g = NodeGuid::local(0x00_00CD);
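
The SAFETY comment in this file relies on `NodeRow` being exactly 512 bytes at 64-byte alignment ("const-asserted above"). The assertion itself is outside this hunk, so the following is only a sketch of what such compile-time layout checks can look like. The `NodeRow` below is a hypothetical stand-in with byte-array fields, and the 0/16/32 field offsets are an assumption drawn from the key(16)+edges(16)+value(480) description, not from the real struct definition.

```rust
use std::mem::{align_of, offset_of, size_of};

/// Hypothetical stand-in for the contract's `NodeRow` (fields simplified).
#[repr(C, align(64))]
pub struct NodeRow {
    pub key: [u8; 16],
    pub edges: [u8; 16],
    pub value: [u8; 480],
}

// Compile-time layout checks: the build fails if any of these stop holding.
const _: () = assert!(size_of::<NodeRow>() == 512);
const _: () = assert!(align_of::<NodeRow>() == 64);
const _: () = assert!(offset_of!(NodeRow, key) == 0);
const _: () = assert!(offset_of!(NodeRow, edges) == 16);
const _: () = assert!(offset_of!(NodeRow, value) == 32);
// The size is a multiple of the alignment, so row i of an aligned buffer is
// itself aligned and a single base-pointer check covers every row.
const _: () = assert!(size_of::<NodeRow>() % align_of::<NodeRow>() == 0);

/// Runtime view of the same facts:
/// (size, align, key offset, edges offset, value offset).
pub fn node_row_layout() -> (usize, usize, usize, usize, usize) {
    (
        size_of::<NodeRow>(),
        align_of::<NodeRow>(),
        offset_of!(NodeRow, key),
        offset_of!(NodeRow, edges),
        offset_of!(NodeRow, value),
    )
}
```

Pinning the field offsets as well as the total size means a reordered or resized field breaks the build instead of silently changing the bytes a backing store is expected to hand over.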

crates/lance-graph-contract/src/lib.rs

Lines changed: 2 additions & 2 deletions
@@ -121,8 +121,8 @@ pub mod world_model;

 // Re-exports for the most commonly used collapse_gate types.
 pub use canonical_node::{
-    classid_read_mode, EdgeBlock, EdgeCodecFlavor, GuidParts, NodeGuid, NodeRow, ReadMode,
-    ValueSchema, ValueTenant, VALUE_TENANTS,
+    classid_read_mode, node_rows_from_le_bytes, EdgeBlock, EdgeCodecFlavor, GuidParts, NodeGuid,
+    NodeRow, NodeRowPacket, ReadMode, ValueSchema, ValueTenant, VALUE_TENANTS,
 };
 pub use class_view::{ClassId, ClassProjection, ClassView, FieldMask, RenderRow};
 pub use collapse_gate::{GateDecision, MailboxId, MergeMode};
