| 1 | +# Integration Plan: ndarray's role in the four-repo convergence |
| 2 | + |
| 3 | +**This repo**: `AdaWorldAPI/ndarray` — SIMD distance kernels + tensor primitives, shared across the stack. |
| 4 | + |
| 5 | +**Status**: planning document. Companion plans at the same path in the other repos: |
| 6 | +- `AdaWorldAPI/lance-graph:.claude/plans/integration-plan.md` |
| 7 | +- `AdaWorldAPI/surrealdb:.claude/plans/integration-plan.md` |
| 8 | +- `AdaWorldAPI/sea-orm:.claude/plans/integration-plan.md` |
| 9 | + |
| 10 | +--- |
| 11 | + |
| 12 | +## 1. The convergence target |
| 13 | + |
| 14 | +Across all four repos: |
| 15 | + |
| 16 | +> *Foundry-style ontology + BEAM-style supervision + ClickHouse-style analytic + Postgres-style ACID + cognitive primitives — all on one Arrow substrate, surfaced to consumers as a typed sea-orm API.* |
| 17 | + |
| 18 | +Four glue crates close the gap: |
| 19 | + |
| 20 | +| # | Glue crate | Owner repo | Bridges | |
| 21 | +|---|---|---|---| |
| 22 | +| 1 | `surrealdb-ractor` | surrealdb | `cf` / live queries → ractor mailboxes | |
| 23 | +| 2 | `lance-graph-tikv-provider` | lance-graph | TiKV ranges → Arrow `TableProvider` | |
| 24 | +| 3 | `sea-orm-ractor` | sea-orm | `Entity::PK` → ractor process registry | |
| 25 | +| 4 | `cognitive-shader-actor` | lance-graph | cognitive shaders → `ractor::Actor` adapter | |
| 26 | + |
| 27 | +**This repo owns no glue crate.** It owns the **shared low-level numeric substrate** that the other three depend on — SIMD distance kernels (cosine, L1, L2, Linf), `F64x8` polyfills, `heel_f64x8` helpers, `hpc-extras` feature. |
| 28 | + |
| 29 | +### Integration principle: additive contract shape (this repo IS the canonical case) |
| 30 | + |
| 31 | +**This repo is the load-bearing example of the contract-shape discipline.** Every symbol this repo exposes is consumed by surrealdb-core (`idx/trees/vector.rs`) and lance-graph cognitive crates (`bgz-tensor`, `holograph`, `deepnsm`, `causal-edge`). One signature change breaks the entire stack. The discipline: |
| 32 | + |
| 33 | +1. **Existing stable APIs never change signature.** Period. If a hypothetical improvement requires a different signature, the new signature ships as a new function next to the old one. The old function stays forever or for a 5+-version deprecation runway, whichever is longer. |
| 34 | +2. **New kernels are added as new functions in new or existing modules.** Adding `F32x16` doesn't touch `F64x8`. Adding `hamming_u8_simd` doesn't touch `cosine_f64_simd`. |
| 35 | +3. **Internal SIMD backends (AVX2/AVX-512/NEON paths) are not public surface.** They can change without notice. Only the public entry points are load-bearing. |
| 36 | +4. **The `[patch.crates-io]` block in surrealdb's root Cargo.toml is the diamond-dep guard.** Together, this repo and that patch line make downstream `ort` (ONNX runtime) link the same `ndarray` as surrealdb-core. Breaking the patch contract breaks ONNX interop. |
| 37 | + |
| 38 | +**Per-repo enforcement**: every Sprint item below is read as "add this; don't change what's there." |
| 39 | + |
| 40 | +### Contracts (existing + new) |
| 41 | + |
| 42 | +| Contract | Owner repo | Status today | This plan adds | |
| 43 | +|---|---|---|---| |
| 44 | +| `ndarray::hpc::F64x8` + `heel_f64x8::*` | **this repo** | 0.17 fork, stable per §5 below | **unchanged — only new kernels (e.g. `F32x16`, int8, Hamming) added in new symbols** | |
| 45 | +| `[patch.crates-io] ndarray = ...` in surrealdb root Cargo.toml | surrealdb | active (diamond-dep guard) | not touched | |
| 46 | +| `lance-graph-contract` (for cognitive shader / IR vocabulary) | lance-graph | 0.1.x → 0.2.0 additive | not touched by us | |
| 47 | +| surrealdb `MvccSource` / `CfStream` | surrealdb | new additive traits | not touched by us | |
| 48 | +| sea-orm `EntityActor` / `SelectArrowExt` | sea-orm | new additive trait/derive | not touched by us | |
| 49 | + |
| 50 | +--- |
| 51 | + |
| 52 | +## 2. Architecture diagram |
| 53 | + |
| 54 | +``` |
| 55 | +   ┌──────────────────────────────────────────┐ |
| 56 | +   │              consumer crate              │ |
| 57 | +   └──────────────────┬───────────────────────┘ |
| 58 | +                      │ typed entities |
| 59 | +                      ▼ |
| 60 | +   ┌──────────────────────────────────────────┐ |
| 61 | +   │            sea-orm-arrow 2.0             │ |
| 62 | +   └────┬─────────────────┬───────────────┬───┘ |
| 63 | +        │                 │               │ |
| 64 | +        ▼                 ▼               ▼ |
| 65 | +  ┌───────────┐     ┌───────────┐   ┌───────────┐ |
| 66 | +  │  ractor   │◄────│ surrealdb │   │lance-graph│ |
| 67 | +  │ (actors,  │ #1  │ (cf +     │   │ (Cypher,  │ |
| 68 | +  │ mailboxes,│     │  live     │   │ ontology, │ |
| 69 | +  │ supervis.)│     │  queries) │   │cognitive) │ |
| 70 | +  └─────┬─────┘     └─────┬─────┘   └─────┬─────┘ |
| 71 | +        │ #3              │               │ #2,#4 |
| 72 | +        ▼                 ▼               ▼ |
| 73 | +  ┌─────────────────────────────────────────────┐ |
| 74 | +  │     TiKV substrate (Raft + Percolator)      │ |
| 75 | +  └─────────────────────────────────────────────┘ |
| 76 | +                         │ |
| 77 | +                         ▼ |
| 78 | +           ┌────────────────────────────┐ |
| 79 | +           │  THIS REPO (ndarray)       │ |
| 80 | +           │  - hpc-extras feature      │ |
| 81 | +           │  - F64x8 polyfill          │ |
| 82 | +           │  - heel_f64x8 distances    │ |
| 83 | +           │  - diamond-dep guard       │ |
| 84 | +           └────────────────────────────┘ |
| 85 | +``` |
| 86 | + |
| 87 | +--- |
| 88 | + |
| 89 | +## 3. Role of ndarray in the integration |
| 90 | + |
| 91 | +This is the **shared low-level numeric substrate**. The AdaWorldAPI fork of ndarray 0.17 with `hpc-extras` lives at the bottom of the stack. Two direct consumers: |
| 92 | + |
| 93 | +1. **surrealdb-core** |
| 94 | + - `core/Cargo.toml:71-77` — `vector-hpc` feature flips on cfg-gated dispatch in `idx/trees/vector.rs` |
| 95 | + - `core/src/idx/trees/vector.rs` — distance helpers (l1/l2/linf) inlined here, using this repo's SIMD kernels |
| 96 | + - Comment from surrealdb's root `Cargo.toml:88-93`: |
| 97 | + > *Always the AdaWorldAPI fork — never crates.io. Direct git dep at the workspace level. Distance helpers (l1/l2/linf) are inlined in surrealdb/core/src/idx/trees/vector.rs.* |
| 98 | + |
| 99 | +2. **lance-graph cognitive crates** |
| 100 | + - `crates/bgz-tensor/` — element-wise ops use ndarray's `Zip` + `F64x8` chunks |
| 101 | + - `crates/holograph/` — holographic distance metrics |
| 102 | + - `crates/deepnsm/` — neural state machine distance kernels |
| 103 | + - `crates/causal-edge/` — causality scoring uses cosine over embedding vectors |
| 104 | + |
| 105 | +Indirectly via sea-orm and the planner, every vector / distance / similarity operation in the stack lands here. |
| 106 | + |
| 107 | +--- |
| 108 | + |
| 109 | +## 4. Current state — what makes this fork special |
| 110 | + |
| 111 | +### `F64x8` polyfill |
| 112 | + |
| 113 | +`hpc-extras` feature exposes an 8-wide `f64` SIMD vector type that works on: |
| 114 | +- **x86_64 AVX-512** — native 8-wide |
| 115 | +- **x86_64 AVX2** — two 4-wide ops, software-packed |
| 116 | +- **aarch64 NEON** — four 2-wide ops on 128-bit NEON registers, software-packed |
| 117 | +- **other archs** — scalar fallback |
| 118 | + |
| 119 | +This is the kernel both surrealdb's `idx/trees/vector.rs` and lance-graph's cognitive shaders rely on. |
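
To make the "scalar fallback" row concrete, here is a minimal sketch of what an 8-lane `f64` type reduces to when no SIMD backend is available. `F64x8Sketch` and `dot_chunked` are hypothetical names invented for this illustration; the real `ndarray::hpc::F64x8` methods are not listed in this plan and may differ.

```rust
/// Hypothetical stand-in for the scalar-fallback path: eight f64 lanes
/// held in a plain array. Not the fork's actual `F64x8`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct F64x8Sketch([f64; 8]);

impl F64x8Sketch {
    /// Broadcast one value into all eight lanes.
    pub fn splat(v: f64) -> Self {
        Self([v; 8])
    }

    /// Load eight lanes from the front of `s`. Panics if `s.len() < 8`.
    pub fn from_slice(s: &[f64]) -> Self {
        let mut lanes = [0.0; 8];
        lanes.copy_from_slice(&s[..8]);
        Self(lanes)
    }

    /// Lane-wise `self * b + acc`.
    pub fn mul_add(self, b: Self, acc: Self) -> Self {
        let mut out = [0.0; 8];
        for i in 0..8 {
            out[i] = self.0[i] * b.0[i] + acc.0[i];
        }
        Self(out)
    }

    /// Horizontal sum of the eight lanes.
    pub fn reduce_sum(self) -> f64 {
        self.0.iter().sum()
    }
}

/// Dot product: full 8-lane chunks first, then a scalar tail for the
/// remaining `len % 8` elements.
pub fn dot_chunked(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    let mut acc = F64x8Sketch::splat(0.0);
    let mut ca = a.chunks_exact(8);
    let mut cb = b.chunks_exact(8);
    for (x, y) in ca.by_ref().zip(cb.by_ref()) {
        acc = F64x8Sketch::from_slice(x).mul_add(F64x8Sketch::from_slice(y), acc);
    }
    let tail: f64 = ca
        .remainder()
        .iter()
        .zip(cb.remainder())
        .map(|(x, y)| x * y)
        .sum();
    acc.reduce_sum() + tail
}
```

The lane-striped accumulation sums in a different order from a plain left-to-right loop, which is why a numeric tolerance against the scalar reference is part of the contract in §5.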
| 120 | + |
| 121 | +### `heel_f64x8` distance kernels |
| 122 | + |
| 123 | +Functions composing `F64x8` chunks into a distance: |
| 124 | + |
| 125 | +``` |
| 126 | +heel_f64x8::cosine_f64_simd(a: &[f64], b: &[f64]) -> f64 |
| 127 | +heel_f64x8::l1_f64_simd    (a: &[f64], b: &[f64]) -> f64 |
| 128 | +heel_f64x8::l2_f64_simd    (a: &[f64], b: &[f64]) -> f64 |
| 129 | +heel_f64x8::linf_f64_simd  (a: &[f64], b: &[f64]) -> f64 |
| 130 | +``` |
| 131 | + |
| 132 | +### Diamond-dep guard |
| 133 | + |
| 134 | +The `[patch.crates-io]` block at the bottom of surrealdb's root `Cargo.toml`: |
| 135 | + |
| 136 | +```toml |
| 137 | +[patch.crates-io] |
| 138 | +ndarray = { git = "https://github.com/AdaWorldAPI/ndarray.git" } |
| 139 | +``` |
| 140 | + |
| 141 | +ensures any transitive consumer of `ndarray = "0.17.x"` from crates.io lands on this fork. Without the patch, `ort` (ONNX runtime, optional `ml` feature in surrealdb) would link a separate `ndarray` and surrealdb-core would link this one — two distinct `TypeId`s, no interop. |
| 142 | + |
| 143 | +**This repo's existence is what makes the patch work.** Without it, the diamond-dep workaround has no target to redirect to. |
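
The "two distinct `TypeId`s" point can be illustrated without Cargo at all. In this sketch two modules stand in for two separately linked copies of the crate; the module and type names are hypothetical. Structurally identical types defined in different places are still different types to the compiler, which is what happens when `ort` and surrealdb-core each link their own `ndarray`.

```rust
use std::any::TypeId;

/// Stand-in for the copy of the crate resolved from crates.io.
pub mod ndarray_from_crates_io {
    pub struct Array1(pub Vec<f64>);
}

/// Stand-in for the copy of the crate resolved from the fork.
pub mod ndarray_from_fork {
    pub struct Array1(pub Vec<f64>);
}

/// True only when `A` and `B` are the very same type.
pub fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}
```

With the patch in place both consumers resolve to one copy, so a value produced on one side is accepted on the other without conversion.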
| 144 | + |
| 145 | +### The `lance-index` 0.16 gap (known) |
| 146 | + |
| 147 | +From surrealdb root `Cargo.toml:100-101`: |
| 148 | + |
| 149 | +> *Scope: 0.17 line only. `lance-index 4.0` depends on `ndarray = "0.16"`, a separate major version that this patch does not affect; eliminating that crates.io 0.16 entry requires upstream `lance-index` to bump.* |
| 150 | + |
| 151 | +**Plan**: watch upstream `lance-index` for the bump to `ndarray` 0.17 (see §6 Sprint 2). When it lands, the crates.io 0.16 entry disappears and the diamond-dep guard becomes single-version-clean. |
| 152 | + |
| 153 | +--- |
| 154 | + |
| 155 | +## 5. API stability commitment (this repo's contract) |
| 156 | + |
| 157 | +This repo doesn't own a glue *crate* — it owns the **API contract that the SIMD layer of the downstream repos depends on**: surrealdb and lance-graph directly, sea-orm only transitively (§8). The commitment is absolute: |
| 158 | + |
| 159 | +### Stable public surface (no break without major bump, none planned) |
| 160 | + |
| 161 | +| Symbol | Kind | |
| 162 | +|---|---| |
| 163 | +| `ndarray::hpc::F64x8` | type — layout, lane count (8) frozen | |
| 164 | +| `ndarray::hpc::heel_f64x8::cosine_f64_simd(a, b) -> f64` | signature frozen | |
| 165 | +| `ndarray::hpc::heel_f64x8::l1_f64_simd(a, b) -> f64` | signature frozen | |
| 166 | +| `ndarray::hpc::heel_f64x8::l2_f64_simd(a, b) -> f64` | signature frozen | |
| 167 | +| `ndarray::hpc::heel_f64x8::linf_f64_simd(a, b) -> f64` | signature frozen | |
| 168 | +| feature `hpc-extras` | name + what it enables frozen | |
| 169 | + |
| 170 | +**"Frozen" means**: no signature change, no rename, no semantic drift. If we want to refine — e.g., a fused multiply-add variant of cosine — we add `cosine_f64_simd_fma(a, b) -> f64` as a NEW function. Both coexist forever (or 5+ versions, whichever is longer). |
| 171 | + |
| 172 | +### Internal / unstable |
| 173 | + |
| 174 | +- Polyfill backends (AVX2/AVX-512/NEON paths) — implementation detail |
| 175 | +- Auto-dispatch heuristics — can change without notice |
| 176 | +- Numeric tolerance in non-cancellation-prone paths — within `f64::EPSILON * len` of scalar reference |
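
A minimal sketch of how the tolerance bullet could be checked in a test. `within_tolerance` and `agrees` are hypothetical helper names. The plan does not say whether `f64::EPSILON * len` is an absolute or a relative bound; this sketch reads it as absolute, which is an assumption.

```rust
/// True when `candidate` is within `f64::EPSILON * len` of `reference`,
/// read here as an absolute bound (an assumption, see lead-in).
pub fn within_tolerance(candidate: f64, reference: f64, len: usize) -> bool {
    (candidate - reference).abs() <= f64::EPSILON * len as f64
}

/// Run a kernel and a scalar reference on the same inputs and compare
/// the two results under the tolerance above.
pub fn agrees(
    kernel: impl Fn(&[f64], &[f64]) -> f64,
    reference: impl Fn(&[f64], &[f64]) -> f64,
    a: &[f64],
    b: &[f64],
) -> bool {
    within_tolerance(kernel(a, b), reference(a, b), a.len())
}
```

An absolute bound does not scale with the magnitude of the result, so the Sprint 0 doc-tests would need to state which reading is intended.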
| 177 | + |
| 178 | +### Doc commitment |
| 179 | + |
| 180 | +- Each stable function gets a doc-test |
| 181 | +- Cross-arch behaviour documented in `docs/hpc-stability.md` (Sprint 0) |
| 182 | +- A CI matrix runs the doc-tests on x86_64-AVX2, x86_64-AVX-512, aarch64-NEON, and scalar-fallback |
| 183 | + |
| 184 | +--- |
| 185 | + |
| 186 | +## 6. Sprint sequence (this repo) |
| 187 | + |
| 188 | +All work is **additive** — new symbols in new or existing modules; no existing symbol changes signature. |
| 189 | + |
| 190 | +### Sprint 0 — API freeze + doc (1 week) |
| 191 | +- Mark stable APIs with `#[stable]`-style doc tag (custom attribute or doc-comment convention) |
| 192 | +- Write `docs/hpc-stability.md` listing the commitment from §5 |
| 193 | +- Add CI cross-arch doc-test matrix |
| 194 | +- Cross-link from this plan |
| 195 | + |
| 196 | +### Sprint 1 — `bgz-tensor` direct coupling (1 week) |
| 197 | +- `bgz-tensor` (lance-graph crate) takes a direct dep on this fork (additive: new dep line, no existing dep changes) |
| 198 | +- Ensures `bgz-tensor` users always get the SIMD kernels regardless of feature-flag composition |
| 199 | +- Coordinate with lance-graph plan §4 |
| 200 | + |
| 201 | +### Sprint 2 — `lance-index` 0.17 readiness (timing depends on upstream) |
| 202 | +- Watch upstream `lance-index` for the bump to `ndarray` 0.17 |
| 203 | +- Have a forked `lance-index` built against `ndarray` 0.17 ready to slot in if upstream delays |
| 204 | +- Once available, extend the surrealdb `[patch.crates-io]` block to cover both 0.16 (if still needed) and 0.17 |
| 205 | +- This is purely additive on this repo's side (we add no symbols; we are the target of the patch) |
| 206 | + |
| 207 | +### Sprint 3 — additional kernels as needed (ad-hoc; all additive) |
| 208 | +- Add `F32x16` polyfill if cognitive shaders migrate to f32 (NEW type, F64x8 unchanged) |
| 209 | +- Add quantised int8 distance kernels for embedding compression (NEW module `heel_i8x32::*`) |
| 210 | +- Add Hamming distance kernel for binary embeddings (NEW function `heel_u8x32::hamming_u8_simd`) |
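
As a reference point for the Hamming item, this is a scalar sketch of the quantity such a kernel would compute over packed binary embeddings. The name `hamming_u8_scalar` and the `u64` return type are assumptions; the plan names `heel_u8x32::hamming_u8_simd` but does not give its signature.

```rust
/// Number of differing bits between two equal-length byte slices.
/// Scalar reference only; a SIMD kernel would process many bytes per step.
pub fn hamming_u8_scalar(a: &[u8], b: &[u8]) -> u64 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(x, y)| u64::from((x ^ y).count_ones()))
        .sum()
}
```

Because the result is an exact integer, a Hamming kernel needs no numeric tolerance: SIMD and scalar paths must agree bit for bit.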
| 211 | + |
| 212 | +--- |
| 213 | + |
| 214 | +## 7. Examples |
| 215 | + |
| 216 | +### Example 1 — surrealdb using the fork's SIMD |
| 217 | + |
| 218 | +```rust |
| 219 | +// surrealdb/core/src/idx/trees/vector.rs — sketch of what's already wired |
| 220 | +use ndarray::hpc::heel_f64x8; |
| 221 | + |
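| 222 | +pub fn cosine_distance(a: &[f64], b: &[f64]) -> f64 { |
| 223 | +    debug_assert_eq!(a.len(), b.len()); |
| 224 | +    #[cfg(feature = "vector-hpc")] |
| 225 | +    { 1.0 - heel_f64x8::cosine_f64_simd(a, b) } // kernel assumed to return a similarity (as in Example 2); convert to a distance |
| 226 | +    #[cfg(not(feature = "vector-hpc"))] |
| 227 | +    { scalar_cosine(a, b) } // scalar_cosine must likewise return a distance so both branches agree |
| 228 | +} |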
| 229 | +``` |
| 230 | + |
| 231 | +### Example 2 — lance-graph cognitive shader using the fork |
| 232 | + |
| 233 | +```rust |
| 234 | +// lance-graph/crates/holograph/src/distance.rs |
| 235 | +use ndarray::hpc::heel_f64x8; |
| 236 | +use crate::HolographEmbedding; |
| 237 | + |
| 238 | +impl HolographEmbedding { |
| 239 | +    pub fn similarity(&self, other: &Self) -> f64 { |
| 240 | +        heel_f64x8::cosine_f64_simd(self.as_slice(), other.as_slice()) |
| 241 | +    } |
| 242 | +} |
| 243 | +``` |
| 244 | + |
| 245 | +### Example 3 — `bgz-tensor` element-wise ops via the fork |
| 246 | + |
| 247 | +```rust |
| 248 | +// lance-graph/crates/bgz-tensor/src/ops.rs |
| 249 | +use ndarray::hpc::F64x8; |
| 250 | +use ndarray::Zip; |
| 251 | + |
| 252 | +impl BgzTensor<f64> { |
| 253 | +    pub fn elementwise_mul(&self, other: &Self) -> Self { |
| 254 | +        let mut out = self.clone(); |
| 255 | +        Zip::from(&mut out.data) |
| 256 | +            .and(&other.data) |
| 257 | +            .for_each(|a, &b| *a *= b); |
| 258 | +        // F64x8-chunked path handled by ndarray's Zip internals for large tensors. |
| 259 | +        out |
| 260 | +    } |
| 261 | +} |
| 262 | +``` |
| 263 | + |
| 264 | +### Example 4 — The diamond-dep guard (replicated for cross-reference) |
| 265 | + |
| 266 | +```toml |
| 267 | +# surrealdb root Cargo.toml (already in place; documented here so the |
| 268 | +# fork knows what surfaces are load-bearing). |
| 269 | +[patch.crates-io] |
| 270 | +ndarray = { git = "https://github.com/AdaWorldAPI/ndarray.git" } |
| 271 | +``` |
| 272 | + |
| 273 | +Without this patch: |
| 274 | +- `ort` pulls `ndarray = "0.17.2"` from crates.io |
| 275 | +- `surrealdb-core` pulls this fork |
| 276 | +- They have distinct `TypeId`s → no interop between ONNX outputs and surrealdb's index code |
| 277 | + |
| 278 | +With this patch, both link the same crate. **This fork's stability is the diamond-dep fix.** |
| 279 | + |
| 280 | +### Example 5 — New kernel landing as a new symbol (additive) |
| 281 | + |
| 282 | +Hypothetical: a fused multiply-add cosine variant lands. Old + new coexist: |
| 283 | + |
| 284 | +```rust |
| 285 | +// crates/ndarray/src/hpc/heel_f64x8.rs — new function, existing unchanged |
| 286 | +pub fn cosine_f64_simd(a: &[f64], b: &[f64]) -> f64 { /* existing */ } |
| 287 | + |
| 288 | +/// FMA variant. Lower latency on AVX-512 + AVX2-FMA hosts. |
| 289 | +/// Agrees with `cosine_f64_simd` to within f64::EPSILON * len. |
| 290 | +pub fn cosine_f64_simd_fma(a: &[f64], b: &[f64]) -> f64 { /* new */ } |
| 291 | +``` |
| 292 | + |
| 293 | +Consumers pick. Nothing breaks. |
| 294 | + |
| 295 | +--- |
| 296 | + |
| 297 | +## 8. What this plan asks of the other repos |
| 298 | + |
| 299 | +Nothing structural — only that consumers stay on the stable surface (§5) and report breakage promptly. Specifically: |
| 300 | + |
| 301 | +- **surrealdb**: `idx/trees/vector.rs` should only use `ndarray::hpc::*` items listed in §5. Anything else is a non-stable detail and may break without notice. |
| 302 | +- **lance-graph**: cognitive crates should use `heel_f64x8` distance kernels; if a kernel is missing (e.g. Hamming), file an issue here rather than implementing locally. |
| 303 | +- **sea-orm**: no direct dep on this fork; touches it only transitively if a consumer uses sea-orm-arrow with `f64` Arrow columns. |
| 304 | + |
| 305 | +--- |
| 306 | + |
| 307 | +## 9. Open questions |
| 308 | + |
| 309 | +1. **`F32x16` priority** — is a cognitive shader consumer planning to move to f32? If yes, Sprint 3 fast-track. If no, defer. |
| 310 | +2. **Quantised int8 distance kernels** — trigger Sprint 3 item when a concrete consumer surfaces. |
| 311 | +3. **WASM target** — surrealdb has a WASM build path. Does it need `vector-hpc`? Today the scalar fallback covers it. Confirm with surrealdb plan. |
| 312 | +4. **Numeric tolerance documentation** — currently "within `f64::EPSILON * len`"; doc-test it in Sprint 0. |
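| 313 | +5. **`#[stable]` attribute convention** — rustc's `#[stable]` attribute is part of the nightly-only `staged_api` feature, which exists for the standard library rather than for third-party crates, so a doc-comment convention is the portable choice. Revisit only if Rust gains a stability attribute usable by ordinary crates. |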
| 314 | + |
| 315 | +--- |
| 316 | + |
| 317 | +## 10. Cross-references |
| 318 | + |
| 319 | +- **Glue #1** (surrealdb-ractor): `AdaWorldAPI/surrealdb:.claude/plans/integration-plan.md` §5 |
| 320 | +- **Glue #2** (TiKV TableProvider): `AdaWorldAPI/lance-graph:.claude/plans/integration-plan.md` §5 |
| 321 | +- **Glue #3** (sea-orm-ractor): `AdaWorldAPI/sea-orm:.claude/plans/integration-plan.md` §5 |
| 322 | +- **Glue #4** (cognitive-shader-actor): `AdaWorldAPI/lance-graph:.claude/plans/integration-plan.md` §6 |
| 323 | +- **Cognitive crate consumers** (the load-bearing reason this fork exists): `AdaWorldAPI/lance-graph:.claude/plans/integration-plan.md` §3 + §4 |
| 324 | +- **surrealdb's `vector-hpc` feature**: `AdaWorldAPI/surrealdb:.claude/plans/integration-plan.md` §4 (`core/Cargo.toml:71-77`) |
| 325 | +- **`lance-projection` sibling** (analytic view of cognitive crate outputs): `AdaWorldAPI/surrealdb:.claude/plans/integration-plan.md` §6 |