Commit 697fb96

Merge pull request #160 from AdaWorldAPI/claude/lance-surrealdb-analysis-LXmug
feat(hpc): l1/l2/linf SIMD kernels + stability docs (Sprint 0)
2 parents e63158e + f102959 commit 697fb96

4 files changed

Lines changed: 1757 additions & 0 deletions

.claude/plans/integration-plan.md

Lines changed: 325 additions & 0 deletions
@@ -0,0 +1,325 @@
1+
# Integration Plan: ndarray's role in the four-repo convergence
2+
3+
**This repo**: `AdaWorldAPI/ndarray` — SIMD distance kernels + tensor primitives, shared across the stack.
4+
5+
**Status**: planning document. Companion plans at the same path in the other repos:
6+
- `AdaWorldAPI/lance-graph:.claude/plans/integration-plan.md`
7+
- `AdaWorldAPI/surrealdb:.claude/plans/integration-plan.md`
8+
- `AdaWorldAPI/sea-orm:.claude/plans/integration-plan.md`
9+
10+
---
11+
12+
## 1. The convergence target
13+
14+
Across all four repos:
15+
16+
> *Foundry-style ontology + BEAM-style supervision + ClickHouse-style analytic + Postgres-style ACID + cognitive primitives — all on one Arrow substrate, surfaced to consumers as a typed sea-orm API.*
17+
18+
Four glue crates close the gap:
19+
20+
| # | Glue crate | Owner repo | Bridges |
|---|---|---|---|
| 1 | `surrealdb-ractor` | surrealdb | `cf` / live queries → ractor mailboxes |
| 2 | `lance-graph-tikv-provider` | lance-graph | TiKV ranges → Arrow `TableProvider` |
| 3 | `sea-orm-ractor` | sea-orm | `Entity::PK` → ractor process registry |
| 4 | `cognitive-shader-actor` | lance-graph | cognitive shaders → `ractor::Actor` adapter |
26+
27+
**This repo owns no glue crate.** It owns the **shared low-level numeric substrate** that the other three depend on — SIMD distance kernels (cosine, L1, L2, Linf), `F64x8` polyfills, `heel_f64x8` helpers, `hpc-extras` feature.
28+
29+
### Integration principle: additive contract shape (this repo IS the canonical case)
30+
31+
**This repo is the load-bearing example of the contract-shape discipline.** Every symbol this repo exposes is consumed by surrealdb-core (`idx/trees/vector.rs`) and lance-graph cognitive crates (`bgz-tensor`, `holograph`, `deepnsm`, `causal-edge`). One signature change breaks the entire stack. The discipline:
32+
33+
1. **Existing stable APIs never change signature.** If an improvement requires a different signature, the new signature ships as a new function next to the old one. The old function stays indefinitely; if it is ever retired, it first gets a deprecation runway of at least five versions.
34+
2. **New kernels are added as new functions in new or existing modules.** Adding `F32x16` doesn't touch `F64x8`. Adding `hamming_u8_simd` doesn't touch `cosine_f64_simd`.
35+
3. **Internal SIMD backends (AVX2/AVX-512/NEON paths) are not public surface.** They can change without notice. Only the public entry points are load-bearing.
36+
4. **The `[patch.crates-io]` block in surrealdb's root Cargo.toml is the diamond-dep guard.** This repo's existence + that patch line is what makes downstream `ort` (ONNX runtime) link the same `ndarray` as surrealdb-core. Breaking the patch contract breaks ONNX interop.
37+
38+
**Per-repo enforcement**: every Sprint item below is read as "add this; don't change what's there."
39+
40+
### Contracts (existing + new)
41+
42+
| Contract | Owner repo | Status today | This plan adds |
|---|---|---|---|
| `ndarray::hpc::F64x8` + `heel_f64x8::*` | **this repo** | 0.17 fork, stable per §5 below | **unchanged; only new kernels (e.g. `F32x16`, int8, Hamming) added as new symbols** |
| `[patch.crates-io] ndarray = ...` in surrealdb root Cargo.toml | surrealdb | active (diamond-dep guard) | not touched |
| `lance-graph-contract` (for cognitive shader / IR vocabulary) | lance-graph | 0.1.x → 0.2.0 additive | not touched by us |
| surrealdb `MvccSource` / `CfStream` | surrealdb | new additive traits | not touched by us |
| sea-orm `EntityActor` / `SelectArrowExt` | sea-orm | new additive trait/derive | not touched by us |
49+
50+
---
51+
52+
## 2. Architecture diagram
53+
54+
```
 ┌──────────────────────────────────────────┐
 │              consumer crate              │
 └──────────────────┬───────────────────────┘
                    │ typed entities

 ┌──────────────────────────────────────────┐
 │            sea-orm-arrow 2.0             │
 └────┬─────────────────┬───────────────┬───┘
      │                 │               │
      ▼                 ▼               ▼
┌───────────┐     ┌───────────┐   ┌───────────┐
│  ractor   │◄────│ surrealdb │   │lance-graph│
│ (actors,  │ #1  │  (cf +    │   │ (Cypher,  │
│ mailboxes,│     │   live    │   │ ontology, │
│ supervis.)│     │  queries) │   │cognitive) │
└─────┬─────┘     └─────┬─────┘   └─────┬─────┘
      │ #3              │               │ #2,#4
      ▼                 ▼               ▼
┌─────────────────────────────────────────────┐
│     TiKV substrate (Raft + Percolator)      │
└─────────────────────────────────────────────┘


        ┌────────────────────────────┐
        │ THIS REPO (ndarray)        │
        │ - hpc-extras feature       │
        │ - F64x8 polyfill           │
        │ - heel_f64x8 distances     │
        │ - diamond-dep guard        │
        └────────────────────────────┘
```
86+
87+
---
88+
89+
## 3. Role of ndarray in the integration
90+
91+
This is the **shared low-level numeric substrate**. The AdaWorldAPI fork of ndarray 0.17 with `hpc-extras` lives at the bottom of the stack. Two direct consumers:
92+
93+
1. **surrealdb-core**
94+
- `core/Cargo.toml:71-77`: the `vector-hpc` feature flips on cfg-gated dispatch in `idx/trees/vector.rs`
95+
- `core/src/idx/trees/vector.rs` — distance helpers (l1/l2/linf) inlined here, using this repo's SIMD kernels
96+
- Comment from surrealdb's root `Cargo.toml:88-93`:
97+
> *Always the AdaWorldAPI fork — never crates.io. Direct git dep at the workspace level. Distance helpers (l1/l2/linf) are inlined in surrealdb/core/src/idx/trees/vector.rs.*
98+
99+
2. **lance-graph cognitive crates**
100+
- `crates/bgz-tensor/` — element-wise ops use ndarray's `Zip` + `F64x8` chunks
101+
- `crates/holograph/` — holographic distance metrics
102+
- `crates/deepnsm/` — neural state machine distance kernels
103+
- `crates/causal-edge/` — causality scoring uses cosine over embedding vectors
104+
105+
Indirectly via sea-orm and the planner, every vector / distance / similarity operation in the stack lands here.
106+
107+
---
108+
109+
## 4. Current state — what makes this fork special
110+
111+
### `F64x8` polyfill
112+
113+
`hpc-extras` feature exposes an 8-wide `f64` SIMD vector type that works on:
114+
- **x86_64 AVX-512** — native 8-wide
115+
- **x86_64 AVX2** — two 4-wide ops, software-packed
116+
- **aarch64 NEON**: four 2-wide ops via 128-bit NEON registers (each register holds two `f64` lanes), software-packed
117+
- **other archs** — scalar fallback
118+
119+
This is the kernel both surrealdb's `idx/trees/vector.rs` and lance-graph's cognitive shaders rely on.
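The scalar-fallback row above can be pictured with a minimal sketch. `Lanes8` and `dot_lanes8` are hypothetical names invented for this illustration; they are not the fork's `F64x8` API, whose methods this plan does not list. The sketch only shows the shape of an 8-lane type backed by a plain array, plus the usual "full chunks, then remainder" loop.

```rust
/// Hypothetical stand-in for an 8-lane f64 vector (NOT the fork's `F64x8`).
/// Backed by a plain array, which is what a scalar fallback amounts to.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes8([f64; 8]);

impl Lanes8 {
    /// Load eight lanes from the start of `chunk`; panics if `chunk.len() < 8`.
    pub fn load(chunk: &[f64]) -> Self {
        let mut lanes = [0.0; 8];
        lanes.copy_from_slice(&chunk[..8]);
        Lanes8(lanes)
    }

    /// Lane-wise multiply.
    pub fn mul(self, rhs: Self) -> Self {
        let mut out = [0.0; 8];
        for i in 0..8 {
            out[i] = self.0[i] * rhs.0[i];
        }
        Lanes8(out)
    }

    /// Horizontal sum of all eight lanes.
    pub fn reduce_sum(self) -> f64 {
        self.0.iter().sum()
    }
}

/// Dot product: process full 8-element chunks, then the scalar remainder.
pub fn dot_lanes8(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    let mut total = 0.0;
    let mut chunks_a = a.chunks_exact(8);
    let mut chunks_b = b.chunks_exact(8);
    for (ca, cb) in chunks_a.by_ref().zip(chunks_b.by_ref()) {
        total += Lanes8::load(ca).mul(Lanes8::load(cb)).reduce_sum();
    }
    // Elements that did not fill a whole chunk are handled one by one.
    for (x, y) in chunks_a.remainder().iter().zip(chunks_b.remainder()) {
        total += x * y;
    }
    total
}
```

A real backend would swap the array loop in `mul` for AVX-512, AVX2, or NEON intrinsics while keeping the same outward shape, which is why the backends can stay internal.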
120+
121+
### `heel_f64x8` distance kernels
122+
123+
Functions composing `F64x8` chunks into a distance:
124+
125+
```
heel_f64x8::cosine_f64_simd(a: &[f64], b: &[f64]) -> f64
heel_f64x8::l1_f64_simd    (a: &[f64], b: &[f64]) -> f64
heel_f64x8::l2_f64_simd    (a: &[f64], b: &[f64]) -> f64
heel_f64x8::linf_f64_simd  (a: &[f64], b: &[f64]) -> f64
```
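For orientation, here is a scalar reference sketch of what the four signatures plausibly compute. The `*_ref` names are hypothetical, and two readings are assumptions rather than facts from this plan: that the cosine kernel returns a similarity (Example 1 in §7 subtracts it from `1.0`), and that `l2` is the Euclidean distance rather than its square.

```rust
/// Cosine similarity: dot(a, b) / (|a| * |b|). Assumed to be a similarity, not a distance.
pub fn cosine_ref(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    let (mut dot, mut na, mut nb) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    dot / (na.sqrt() * nb.sqrt())
}

/// L1 (Manhattan) distance: sum of absolute differences.
pub fn l1_ref(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// L2 (Euclidean) distance: square root of the sum of squared differences (assumed unsquared).
pub fn l2_ref(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f64>()
        .sqrt()
}

/// L-infinity (Chebyshev) distance: largest absolute difference.
pub fn linf_ref(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).abs())
        .fold(0.0, f64::max)
}
```

References like these are the natural baseline for the cross-arch doc-tests described in §5: each SIMD kernel is compared against its scalar counterpart.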
131+
132+
### Diamond-dep guard
133+
134+
The `[patch.crates-io]` block at the bottom of surrealdb's root `Cargo.toml`:
135+
136+
```toml
[patch.crates-io]
ndarray = { git = "https://github.com/AdaWorldAPI/ndarray.git" }
```
140+
141+
ensures any transitive consumer of `ndarray = "0.17.x"` from crates.io lands on this fork. Without the patch, `ort` (ONNX runtime, optional `ml` feature in surrealdb) would link a separate `ndarray` and surrealdb-core would link this one — two distinct `TypeId`s, no interop.
142+
143+
**This repo's existence is what makes the patch work.** Without it, the diamond-dep workaround has no target to redirect to.
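The "two distinct `TypeId`s" point can be illustrated by analogy within a single file. The two modules below are hypothetical stand-ins for two separately compiled copies of the same crate; real crates cannot be duplicated inside one source file, so this is only a sketch of the effect, not of the Cargo mechanism.

```rust
use std::any::TypeId;

/// Stand-in for the copy of the crate resolved from crates.io (hypothetical).
pub mod ndarray_from_crates_io {
    pub struct Array1(pub Vec<f64>);
}

/// Stand-in for the copy of the crate resolved from the fork (hypothetical).
pub mod ndarray_from_fork {
    pub struct Array1(pub Vec<f64>);
}

/// True only when `A` and `B` are the very same type.
pub fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}
```

The two `Array1` definitions are textually identical, yet the compiler treats them as unrelated types, so a value of one cannot be passed where the other is expected. Redirecting every `0.17` consumer to one source removes the second copy.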
144+
145+
### The `lance-index` 0.16 gap (known)
146+
147+
From surrealdb root `Cargo.toml:100-101`:
148+
149+
> *Scope: 0.17 line only. `lance-index 4.0` depends on `ndarray = "0.16"`, a separate major version that this patch does not affect; eliminating that crates.io 0.16 entry requires upstream `lance-index` to bump.*
150+
151+
**Plan**: watch upstream `lance-index` for the 0.17 bump (see §6 Sprint 2). When it lands, the diamond-dep guard becomes single-version-clean.
152+
153+
---
154+
155+
## 5. API stability commitment (this repo's contract)
156+
157+
This repo doesn't own a glue *crate* — it owns the **API contract that the SIMD layer of three downstream repos depends on**. The commitment is absolute:
158+
159+
### Stable public surface (no break without major bump, none planned)
160+
161+
| Symbol | Kind |
|---|---|
| `ndarray::hpc::F64x8` | type; layout and lane count (8) frozen |
| `ndarray::hpc::heel_f64x8::cosine_f64_simd(a, b) -> f64` | signature frozen |
| `ndarray::hpc::heel_f64x8::l1_f64_simd(a, b) -> f64` | signature frozen |
| `ndarray::hpc::heel_f64x8::l2_f64_simd(a, b) -> f64` | signature frozen |
| `ndarray::hpc::heel_f64x8::linf_f64_simd(a, b) -> f64` | signature frozen |
| feature `hpc-extras` | name + what it enables frozen |
169+
170+
**"Frozen" means**: no signature change, no rename, no semantic drift. To refine a kernel, e.g. with a fused multiply-add variant of cosine, we add `cosine_f64_simd_fma(a, b) -> f64` as a NEW function. Both coexist indefinitely; any eventual retirement gets a deprecation runway of at least five versions.
171+
172+
### Internal / unstable
173+
174+
- Polyfill backends (AVX2/AVX-512/NEON paths) — implementation detail
175+
- Auto-dispatch heuristics — can change without notice
176+
- Numeric tolerance in non-cancellation-prone paths — within `f64::EPSILON * len` of scalar reference
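As a rough picture of what "auto-dispatch" could look like, the sketch below picks a backend label using the standard library's runtime feature-detection macros. `detect_backend` and its string labels are hypothetical; the fork's actual heuristics are internal and, per the list above, may differ or change.

```rust
/// Report which backend a dispatcher might select on the current host.
/// Hypothetical helper; the labels are illustrative only.
pub fn detect_backend() -> &'static str {
    #[cfg(target_arch = "x86_64")]
    {
        // Prefer the widest vector unit the CPU reports at runtime.
        if std::arch::is_x86_feature_detected!("avx512f") {
            return "avx512";
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            return "avx2";
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return "neon";
        }
    }
    // Every other target, or a CPU without the features above.
    "scalar"
}
```

Because callers only ever see the public kernel functions, a selection rule like this can be retuned without touching the frozen surface.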
177+
178+
### Doc commitment
179+
180+
- Each stable function gets a doc-test
181+
- Cross-arch behaviour documented in `docs/hpc-stability.md` (Sprint 0)
182+
- A CI matrix runs the doc-tests on x86_64-AVX2, x86_64-AVX-512, aarch64-NEON, and scalar-fallback
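A tolerance check of the kind such doc-tests would need might look like the sketch below. It reads `f64::EPSILON * len` as an absolute bound, which is an assumption: the plan does not say whether the bound is absolute or relative. `within_tolerance`, `l1_scalar`, and `l1_chunked` are hypothetical names; `l1_chunked` only imitates a SIMD kernel's different summation order with eight partial sums.

```rust
/// Assumed reading of the tolerance: |candidate - reference| <= EPSILON * len.
pub fn within_tolerance(candidate: f64, reference: f64, len: usize) -> bool {
    (candidate - reference).abs() <= f64::EPSILON * len as f64
}

/// Scalar reference: accumulate left to right.
pub fn l1_scalar(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Imitates a SIMD kernel: eight independent partial sums, combined at the end.
pub fn l1_chunked(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    let mut lanes = [0.0_f64; 8];
    for (i, (x, y)) in a.iter().zip(b).enumerate() {
        lanes[i % 8] += (x - y).abs();
    }
    lanes.iter().sum()
}
```

The two functions add the same terms in a different order, which is the main source of cross-backend drift that the tolerance has to absorb.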
183+
184+
---
185+
186+
## 6. Sprint sequence (this repo)
187+
188+
All work is **additive** — new symbols in new or existing modules; no existing symbol changes signature.
189+
190+
### Sprint 0 — API freeze + doc (1 week)
191+
- Mark stable APIs with `#[stable]`-style doc tag (custom attribute or doc-comment convention)
192+
- Write `docs/hpc-stability.md` listing the commitment from §5
193+
- Add CI cross-arch doc-test matrix
194+
- Cross-link from this plan
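If the doc-comment route is chosen for the stability tag, a marked function could look roughly like this. The `# Stability` heading, its wording, the function name `l1_distance`, and the crate name `stability_sketch` in the doc-test are all hypothetical; the plan leaves the exact convention open.

```rust
/// L1 (Manhattan) distance between two equal-length slices.
///
/// # Stability
///
/// **stable**: signature and semantics are frozen; refinements ship as new functions.
///
/// # Examples
///
/// ```
/// use stability_sketch::l1_distance; // hypothetical crate name
/// assert_eq!(l1_distance(&[1.0, 2.0], &[2.0, 4.0]), 3.0);
/// ```
pub fn l1_distance(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}
```

A plain heading like this works on stable Rust, shows up in rustdoc output, and can be checked in CI with a simple text search over the public items.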
195+
196+
### Sprint 1 — `bgz-tensor` direct coupling (1 week)
197+
- `bgz-tensor` (lance-graph crate) takes a direct dep on this fork (additive: new dep line, no existing dep changes)
198+
- Ensures `bgz-tensor` users always get the SIMD kernels regardless of feature-flag composition
199+
- Coordinate with lance-graph plan §4
200+
201+
### Sprint 2 — `lance-index` 0.17 readiness (timing depends on upstream)
202+
- Watch upstream `lance-index` for the 0.17 bump
203+
- Have a forked `lance-index` 0.17 ready to slot in if upstream delays
204+
- Once available, extend the surrealdb `[patch.crates-io]` block to cover both 0.16 (if still needed) and 0.17
205+
- This is purely additive on this repo's side (we add no symbols; we are the target of the patch)
206+
207+
### Sprint 3 — additional kernels as needed (ad-hoc; all additive)
208+
- Add `F32x16` polyfill if cognitive shaders migrate to f32 (NEW type, F64x8 unchanged)
209+
- Add quantised int8 distance kernels for embedding compression (NEW module `heel_i8x32::*`)
210+
- Add Hamming distance kernel for binary embeddings (NEW function `heel_u8x32::hamming_u8_simd`)
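Scalar references for two of these candidate kernels are short enough to sketch. The function names are hypothetical and are not the planned `heel_i8x32` / `heel_u8x32` symbols; treating the int8 kernel as a squared L2 distance is also an assumption, since the plan does not name the metric.

```rust
/// Hamming distance over packed binary embeddings: number of differing bits.
pub fn hamming_u8_ref(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Squared L2 distance over int8-quantised embeddings.
/// Differences are widened to i64 before squaring so the sum cannot overflow
/// for any realistic embedding length.
pub fn l2_squared_i8_ref(a: &[i8], b: &[i8]) -> i64 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            let d = i64::from(x) - i64::from(y);
            d * d
        })
        .sum()
}
```

Both would land as new symbols next to the `f64` kernels, so they fit the additive rule without touching `F64x8`.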
211+
212+
---
213+
214+
## 7. Examples
215+
216+
### Example 1 — surrealdb using the fork's SIMD
217+
218+
```rust
// surrealdb/core/src/idx/trees/vector.rs: sketch of what's already wired
use ndarray::hpc::heel_f64x8;

pub fn cosine_distance(a: &[f64], b: &[f64]) -> f64 {
    debug_assert_eq!(a.len(), b.len());
    // SIMD path: the kernel result is treated as a similarity and turned into a distance.
    #[cfg(feature = "vector-hpc")]
    { 1.0 - heel_f64x8::cosine_f64_simd(a, b) }
    // Scalar path: `scalar_cosine` (defined elsewhere) is expected to return the same distance.
    #[cfg(not(feature = "vector-hpc"))]
    { scalar_cosine(a, b) }
}
```
230+
231+
### Example 2 — lance-graph cognitive shader using the fork
232+
233+
```rust
// lance-graph/crates/holograph/src/distance.rs
use ndarray::hpc::heel_f64x8;
use crate::HolographEmbedding;

impl HolographEmbedding {
    pub fn similarity(&self, other: &Self) -> f64 {
        heel_f64x8::cosine_f64_simd(self.as_slice(), other.as_slice())
    }
}
```
244+
245+
### Example 3 — `bgz-tensor` element-wise ops via the fork
246+
247+
```rust
// lance-graph/crates/bgz-tensor/src/ops.rs
use ndarray::hpc::F64x8;
use ndarray::Zip;

impl BgzTensor<f64> {
    pub fn elementwise_mul(&self, other: &Self) -> Self {
        let mut out = self.clone();
        Zip::from(&mut out.data)
            .and(&other.data)
            .for_each(|a, &b| *a *= b);
        // F64x8-chunked path handled by ndarray's Zip internals for large tensors.
        out
    }
}
```
263+
264+
### Example 4 — The diamond-dep guard (replicated for cross-reference)
265+
266+
```toml
# surrealdb root Cargo.toml (already in place; documented here so the
# fork knows what surfaces are load-bearing).
[patch.crates-io]
ndarray = { git = "https://github.com/AdaWorldAPI/ndarray.git" }
```
272+
273+
Without this patch:
274+
- `ort` pulls `ndarray = "0.17.2"` from crates.io
275+
- `surrealdb-core` pulls this fork
276+
- They have distinct `TypeId`s → no interop between ONNX outputs and surrealdb's index code
277+
278+
With this patch, both link the same crate. **This fork's stability is the diamond-dep fix.**
279+
280+
### Example 5 — New kernel landing as a new symbol (additive)
281+
282+
Hypothetical: a fused multiply-add cosine variant lands. Old + new coexist:
283+
284+
```rust
// crates/ndarray/src/hpc/heel_f64x8.rs: new function, existing unchanged
pub fn cosine_f64_simd(a: &[f64], b: &[f64]) -> f64 { /* existing */ }

/// FMA variant. Lower latency on AVX-512 + AVX2-FMA hosts.
/// Agrees with `cosine_f64_simd` to within f64::EPSILON * len.
pub fn cosine_f64_simd_fma(a: &[f64], b: &[f64]) -> f64 { /* new */ }
```
292+
293+
Consumers pick. Nothing breaks.
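A runnable miniature of the same pattern, using a dot product instead of the full cosine kernel: `dot_plain` and `dot_fma` are hypothetical names, and `f64::mul_add` stands in for the fused step. It performs one rounding per multiply-add; whether it compiles to a hardware FMA instruction depends on the target.

```rust
/// Existing-style accumulation: round after the multiply, then after the add.
pub fn dot_plain(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// New-style accumulation: fused multiply-add, one rounding per step.
pub fn dot_fma(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .fold(0.0_f64, |acc, (&x, &y)| x.mul_add(y, acc))
}
```

The two functions can disagree in the last few bits on inexact inputs, which is why the new variant ships under its own name instead of silently replacing the old one.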
294+
295+
---
296+
297+
## 8. What this plan asks of the other repos
298+
299+
Nothing structural — only that consumers stay on the stable surface (§5) and report breakage promptly. Specifically:
300+
301+
- **surrealdb**: `idx/trees/vector.rs` should only use `ndarray::hpc::*` items listed in §5. Anything else is a non-stable detail and may break without notice.
302+
- **lance-graph**: cognitive crates should use `heel_f64x8` distance kernels; if a kernel is missing (e.g. Hamming), file an issue here rather than implementing locally.
303+
- **sea-orm**: no direct dep on this fork; touches it only transitively if a consumer uses sea-orm-arrow with `f64` Arrow columns.
304+
305+
---
306+
307+
## 9. Open questions
308+
309+
1. **`F32x16` priority** — is a cognitive shader consumer planning to move to f32? If yes, Sprint 3 fast-track. If no, defer.
310+
2. **Quantised int8 distance kernels** — trigger Sprint 3 item when a concrete consumer surfaces.
311+
3. **WASM target** — surrealdb has a WASM build path. Does it need `vector-hpc`? Today the scalar fallback covers it. Confirm with surrealdb plan.
312+
4. **Numeric tolerance documentation** — currently "within `f64::EPSILON * len`"; doc-test it in Sprint 0.
313+
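5. **`#[stable]` attribute convention**: rustc's `#[stable]` attribute is gated behind the nightly-only `staged_api` feature, which exists for the standard library's own use, so a crate built on stable Rust cannot use it. A doc-comment convention is the likely choice for portability; revisit only if a stability attribute becomes available to ordinary crates.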
314+
315+
---
316+
317+
## 10. Cross-references
318+
319+
- **Glue #1** (surrealdb-ractor): `AdaWorldAPI/surrealdb:.claude/plans/integration-plan.md` §5
320+
- **Glue #2** (TiKV TableProvider): `AdaWorldAPI/lance-graph:.claude/plans/integration-plan.md` §5
321+
- **Glue #3** (sea-orm-ractor): `AdaWorldAPI/sea-orm:.claude/plans/integration-plan.md` §5
322+
- **Glue #4** (cognitive-shader-actor): `AdaWorldAPI/lance-graph:.claude/plans/integration-plan.md` §6
323+
- **Cognitive crate consumers** (the load-bearing reason this fork exists): `AdaWorldAPI/lance-graph:.claude/plans/integration-plan.md` §3 + §4
324+
- **surrealdb's `vector-hpc` feature**: `AdaWorldAPI/surrealdb:.claude/plans/integration-plan.md` §4 (`core/Cargo.toml:71-77`)
325+
- **`lance-projection` sibling** (analytic view of cognitive crate outputs): `AdaWorldAPI/surrealdb:.claude/plans/integration-plan.md` §6
