docs(plan): lite-unified-surrealql-lance-v1 (feature-gated, test-don't-commit)

claude · claude · commit 2821a0336843 · 2026-06-18T13:01:11.000Z
Capture the "lite unified" bet as a CONJECTURE plan to test behind a feature gate (NOT a default-build change): collapse the two query engines (datafusion + SurrealQL) + two stores (lance + rocksdb) to ONE store (lance-KV) + ONE primary query surface (SurrealQL/AR-API); datafusion feature-gated for analytical SQL; rocksdb dropped; DO-arm ExecTarget::SurrealQl becomes the primary exec path.

Win for graph/AR/cognitive (Cypher->SurrealQL is a better lowering than Cypher->datafusion-SQL); downgrade for analytical SQL (datafusion kept feature-gated).

Falsifier: datafusion_planner query-shape coverage in SurrealQL.
Blockers: kv-lance not feature-wired, polyglot->SurrealQL lowering missing.
Gated on a convergence+cross-domain+truth-architect probe before any promotion.
Board: INTEGRATION_PLANS prepended.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
diff --git a/.claude/board/INTEGRATION_PLANS.md b/.claude/board/INTEGRATION_PLANS.md
@@ -1,3 +1,15 @@
+## 2026-06-18 — lite-unified-surrealql-lance-v1 (one store + one query surface, feature-gated; CONJECTURE, test-don't-commit)
+
+**Status:** CONJECTURE / design — feature-gated test path, NOT a default-build change. **Plan file:** `.claude/plans/lite-unified-surrealql-lance-v1.md`.
+**Owns:** the "lite unified" bet — collapse the two query engines (datafusion + SurrealQL) + two stores (lance + rocksdb) to **ONE store (lance-KV) + ONE primary query surface (SurrealQL/AR-API)**, datafusion feature-gated (`datafusion-analytical`), rocksdb dropped; the DO-arm `ExecTarget::SurrealQl` becomes the primary exec path.
+- **Win** (graph/AR/CRUD/cognitive/vector): Cypher→SurrealQL is a better lowering than Cypher→datafusion-SQL (surreal is natively graph); drops the rocksdb C++ build + makes datafusion optional. **Downgrade** (heavy analytical SQL): datafusion's strength → kept feature-gated, not deleted.
+- **Falsifier (truth-architect):** lance-graph `datafusion_planner` test queries → can SurrealQL express each? Covered → drop datafusion for that path; gaps → keep `datafusion-analytical`. Measure footprint (proxy: lance-graph ≈889 crates, surreal-all ≈1148, SurrealQL-engine marginal ~260, rocksdb separate C++).
+- **Blockers (OQ-LU-1/2/3):** surreal kv-lance not yet feature-wired (`surrealdb/core/src/kvs/lance/` module implemented, no `kv-lance` feature); polyglot→SurrealQL lowering doesn't exist (today polyglot→datafusion); SPARQL/Gremlin lowering cleanliness unknown.
+**Gate before any promotion:** convergence + cross-domain (mechanism-vs-rhyme) + truth-architect (query-shape coverage). Do NOT touch the default build until green.
+**Repos:** lance-graph (+ surrealdb fork for kv-lance). Surfaced from the footprint discussion (drop datafusion+rocksdb) on branch `claude/soa-write-deinterlace-inc2`.
+
+---
+
 ## 2026-06-18 — mailbox-belief-update-and-substrate-test-v1 ("what did I learn" = NARS-revision delta + two-axis test; 5+3-ratified; slots S2.5b)
 
 **Status:** CONJECTURE / design — 5+3 COMPLETE. **Plan file:** `.claude/plans/mailbox-belief-update-and-substrate-test-v1.md`. Parent: `bindspace-singleton-to-mailbox-soa-v1` §11 + `E-SOA-CYCLE-OWNERSHIP`.
diff --git a/.claude/plans/lite-unified-surrealql-lance-v1.md b/.claude/plans/lite-unified-surrealql-lance-v1.md
@@ -0,0 +1,81 @@
+# lite-unified-surrealql-lance-v1 — one store + one query surface, behind a feature gate
+
+> **Status:** CONJECTURE / design. **Test via feature gate; do NOT commit the
+> stack change.** Needs a convergence + cross-domain + truth-architect probe
+> (mechanism-vs-rhyme + the query-shape measurement) before any promotion.
+> **Date:** 2026-06-18. **Parent threads:** the DO-arm (`ExecTarget::SurrealQl`,
+> `lance-graph-contract::action`), `docs/STACK_SCAFFOLD.md`, the
+> "cold TS + kanban stay Lance-native" ruling.
+
+## Epiphany (less is more)
+
+Today there are **two query engines over the same lance storage** (lance-graph's
+*datafusion* planner + surreal's *SurrealQL*) and **two storage engines**
+(lance vs rocksdb). The "lite unified" bet collapses both: **ONE store (lance-KV)
++ ONE primary query surface (SurrealQL via the AR-API adapter)**, datafusion
+**feature-gated**, rocksdb **dropped**. Cypher/SQL/neo4j lower to SurrealQL, which is
+*natively* graph (`->edge->`) and a better Cypher target than datafusion SQL.
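+
+A minimal sketch of why the arrow form is a direct lowering target, assuming a
+single one-hop Cypher pattern. `OneHop` and `lower_to_surrealql` are hypothetical
+names for illustration only; the polyglot→SurrealQL lowering does not exist yet,
+and a real one would have to handle far more than this shape.
+
+```rust
+/// Hypothetical IR for one Cypher hop: (a:src)-[:edge]->(b:dst).
+/// Not a lance-graph type; illustration only.
+struct OneHop<'a> {
+    src_table: &'a str,
+    edge: &'a str,
+    dst_table: &'a str,
+}
+
+/// Lower the hop to a SurrealQL graph traversal (`->edge->target`).
+fn lower_to_surrealql(hop: &OneHop) -> String {
+    format!(
+        "SELECT ->{}->{} FROM {};",
+        hop.edge, hop.dst_table, hop.src_table
+    )
+}
+
+fn main() {
+    // Roughly: MATCH (a:person)-[:knows]->(b:person) RETURN b
+    let hop = OneHop {
+        src_table: "person",
+        edge: "knows",
+        dst_table: "person",
+    };
+    println!("{}", lower_to_surrealql(&hop));
+}
+```
+
+The hop maps onto one traversal expression with no join to synthesize, which is
+the sense in which SurrealQL is the closer target. Result shapes differ (SurrealQL
+returns the traversal per source record), so the two queries are only roughly
+equivalent.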
+
+## The bet, as a feature gate (default-OFF)
+
+A `lite-unified` feature that, when ON:
+1. **Storage = surreal kv-lance** (one store; drop rocksdb). *Blocked on:* surreal
+   kv-lance is implemented as a module but not yet feature-wired
+   (`surrealdb/core/src/kvs/lance/`, the `.claude/lance-backend` integration).
+2. **Query/exec = SurrealQL** via the AR-API adapter. The polyglot parser
+   (Cypher/GQL/Gremlin/SPARQL/neo4j) lowers to **SurrealQL** (or the DO-arm
+   `ActionInvocation`) instead of datafusion SQL. *Missing today:* the
+   polyglot→SurrealQL lowering (today it's polyglot→datafusion).
+3. **datafusion = `optional`, OFF** on this path. Kept behind a separate
+   `datafusion-analytical` feature for the workloads that genuinely need
+   vectorized/analytical SQL (joins, aggregations) — SurrealQL's weak spot.
+4. The DO-arm `ExecTarget::SurrealQl` becomes the **primary** exec path, not one
+   of four.
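+
+The selection logic implied by the four points above can be sketched as follows.
+This is an assumption-laden model, not the crate's code: `Features`, `Workload`,
+`Engine`, and `select_engine` are hypothetical, and the two Cargo features are
+mirrored as plain booleans so both branches can be exercised (a real crate would
+check them with `cfg`). The default-build branch assumes today's
+polyglot→datafusion path.
+
+```rust
+/// Which engine executes a query. `SurrealQl` mirrors the DO-arm
+/// `ExecTarget::SurrealQl`; `DatafusionSql` is a hypothetical stand-in
+/// for the analytical path.
+#[derive(Debug, PartialEq, Clone, Copy)]
+enum Engine {
+    SurrealQl,
+    DatafusionSql,
+}
+
+/// Hypothetical runtime mirror of the two Cargo features.
+struct Features {
+    lite_unified: bool,
+    datafusion_analytical: bool,
+}
+
+/// Coarse workload split (an assumption of this sketch).
+enum Workload {
+    GraphOrCrud,
+    HeavyAnalytical,
+}
+
+fn select_engine(f: &Features, w: Workload) -> Engine {
+    if !f.lite_unified {
+        // Gate is OFF: the default build is left untouched.
+        return Engine::DatafusionSql;
+    }
+    match w {
+        Workload::GraphOrCrud => Engine::SurrealQl,
+        // Analytical SQL keeps datafusion only if its own feature is ON.
+        Workload::HeavyAnalytical if f.datafusion_analytical => Engine::DatafusionSql,
+        // Otherwise it falls back to SurrealQL, its known weak spot.
+        Workload::HeavyAnalytical => Engine::SurrealQl,
+    }
+}
+
+fn main() {
+    let lite = Features { lite_unified: true, datafusion_analytical: false };
+    println!("{:?}", select_engine(&lite, Workload::GraphOrCrud));
+    println!("{:?}", select_engine(&lite, Workload::HeavyAnalytical));
+}
+```
+
+Keeping `datafusion-analytical` independent of `lite-unified` is what lets the
+analytical path survive as an opt-in rather than being deleted.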
+
+## What stays regardless (NOT datafusion)
+
+lance vector search, CAM-PQ / bgz17 codec stack, the cognitive substrate
+(BindSpace→MailboxSoA, the write contract, the SPO/AriGraph tissue). These are
+orthogonal to the query-engine choice.
+
+## Where it's a win vs a downgrade (the honest split)
+
+- **Win (the bulk):** graph traversal, AR CRUD, cognitive/SPO, vector search.
+  SurrealQL-on-lance fits these, and Cypher→SurrealQL is a *better* graph lowering.
+  Footprint: drop the rocksdb C++ build outright; make datafusion (a large Rust
+  dep) optional.
+- **Downgrade:** heavy analytical SQL (multi-way joins, aggregations, columnar
+  scan) — datafusion's strength, SurrealQL's weakness. Hence datafusion stays
+  feature-gated, not deleted.
+
+## Falsifier (truth-architect — measure before promoting)
+
+Take lance-graph's `datafusion_planner` test queries (the Cypher→SQL cases) and
+check whether **SurrealQL can express each**. Covered → drop datafusion for that path;
+analytical gaps → keep `datafusion-analytical` for those only. Also measure the
+real footprint delta (`cargo tree --no-default-features` + release `cargo bloat`)
+once kv-lance is feature-wired — the proxy is lance-graph ≈ 889 crates, surreal
+(all backends) ≈ 1148; the marginal SurrealQL-engine cost is ~260 crates, rocksdb
+is a separate C++ build.
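+
+The coverage rule above can be tallied with something as small as the sketch
+below. `Shape`, `Verdict`, `verdict`, and `coverage` are hypothetical names, and
+the two rows in `main` are placeholders, not real `datafusion_planner` test names
+or measured results; the `surrealql_covers` flag is meant to be filled in by hand
+during the probe.
+
+```rust
+/// Outcome for one query shape, following the falsifier rule.
+#[derive(Debug, PartialEq)]
+enum Verdict {
+    DropDatafusion,
+    KeepDatafusionAnalytical,
+}
+
+/// One probed query shape. Nothing here is measured.
+struct Shape {
+    name: &'static str,
+    surrealql_covers: bool,
+}
+
+/// Covered → drop datafusion for that path; gap → keep it gated.
+fn verdict(shape: &Shape) -> Verdict {
+    if shape.surrealql_covers {
+        Verdict::DropDatafusion
+    } else {
+        Verdict::KeepDatafusionAnalytical
+    }
+}
+
+/// Returns (covered, total) over all probed shapes.
+fn coverage(shapes: &[Shape]) -> (usize, usize) {
+    let covered = shapes.iter().filter(|s| s.surrealql_covers).count();
+    (covered, shapes.len())
+}
+
+fn main() {
+    // Placeholder rows, NOT real probe results.
+    let shapes = [
+        Shape { name: "one_hop_traversal", surrealql_covers: true },
+        Shape { name: "multi_way_join_aggregate", surrealql_covers: false },
+    ];
+    for s in &shapes {
+        println!("{}: {:?}", s.name, verdict(s));
+    }
+    let (covered, total) = coverage(&shapes);
+    println!("covered {covered}/{total}");
+}
+```
+
+A per-shape verdict (rather than one global pass/fail) matches the "for those
+only" wording: datafusion is retained exactly where a gap is recorded.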
+
+## Increments (all behind `lite-unified`, none committed to the default path)
+
+1. **Probe (no code):** convergence + cross-domain (mechanism-vs-rhyme) +
+   truth-architect (the datafusion_planner query-shape coverage check). Gate.
+2. **Wire surreal kv-lance** as a feature (finish the `.claude/lance-backend`
+   integration; add the `kv-lance` feature + lance dep + `mod lance` in `kvs/mod.rs`).
+3. **Polyglot→SurrealQL lowering** — the missing front-end leg (parallel to the
+   existing polyglot→datafusion).
+4. **`datafusion` → `optional`** + a `datafusion-analytical` feature; default the
+   common path to SurrealQL-on-lance under `lite-unified`.
+5. **Measure** footprint + query-shape coverage; promote CONJECTURE→FINDING or
+   correct.
+
+## Blockers / open questions
+
+- **OQ-LU-1:** surreal kv-lance feature-wiring (the integration TODOs).
+- **OQ-LU-2:** does SurrealQL cover the lance-graph datafusion_planner query
+  shapes the live workloads actually use? (the falsifier).
+- **OQ-LU-3:** is the polyglot→SurrealQL lowering cleaner than polyglot→datafusion
+  for the non-graph dialects (SPARQL/Gremlin)?
+- Do NOT touch the default build until the probe is green.