
doc(knowledge): substrate-b consumer integration — NEW-stack capability shape + plans#465

Merged
AdaWorldAPI merged 6 commits into main from doc/knowledge-old-stack-capability-parity
Jun 4, 2026
AdaWorldAPI (Owner) commented Jun 4, 2026

Summary

Adds a knowledge doc capturing the substrate-b consumer integration shape: which lance-graph + ractor + surrealdb capabilities compose, the three load-bearing primitives consumers must understand, and an honest capability roadmap (built / partial / not-yet) so consumer integration sequencing isn't guessed.

This is the lance-graph-side complement to lab-vs-canonical-surface.md (the rule) + hollow-wire-failure-modes.md (the failure mode) — adding "the capability shape this rule + failure mode protect."

What the doc captures

§1 The seven-capability composition

A substrate-b integration of lance-graph composes:

  1. Storage (lance-graph + Lance versions as temporal axis)
  2. Distributed KV (surrealdb kv-lance via #35/#36, or external TiKV)
  3. Search (Tantivy)
  4. Analytics (DataFusion over Lance versions)
  5. Actors / dispatch (ractor + MessagingErr::Saturated via AdaWorldAPI/ractor#1)
  6. In-process event bus (LanceVersionWatcher, std::sync, never tokio per I-2)
  7. OIDC / IAM (external Zitadel + in-proc auth-plug JWT validation)
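
To make capability 6 concrete, here is a minimal sketch of an in-process fan-out built only on `std::sync`, with no tokio. It is not the actual `LanceVersionWatcher`: the `VersionWatcher` type, its `publish` method, and the use of a bare `u64` as the event payload are assumptions made for illustration.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical stand-in for a version watcher: fans out committed
/// version numbers to in-process subscribers using only std::sync.
pub struct VersionWatcher {
    subscribers: Mutex<Vec<Sender<u64>>>,
}

impl VersionWatcher {
    pub fn new() -> Self {
        Self { subscribers: Mutex::new(Vec::new()) }
    }

    /// Register a subscriber. The receiver sees every version
    /// published after this call.
    pub fn subscribe(&self) -> Receiver<u64> {
        let (tx, rx) = channel();
        self.subscribers.lock().unwrap().push(tx);
        rx
    }

    /// Publish a committed version to all subscribers. Subscribers whose
    /// receiver was dropped are pruned. Returns the number of live
    /// subscribers that were sent the version.
    pub fn publish(&self, version: u64) -> usize {
        let mut subs = self.subscribers.lock().unwrap();
        subs.retain(|tx| tx.send(version).is_ok());
        subs.len()
    }
}
```

A consumer thread would block on `rx.recv()` and react to each version number; an async runtime is only needed where events leave the process.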

§2 Three load-bearing NEW-stack primitives

  1. Lance versions are multi-purpose. One primitive serves three capabilities a consumer would otherwise build separately: checkout_version(V) = point-in-time; the version log = time-series; append-only immutability = audit. Consumers should NOT introduce separate stores.
  2. Per-element auth = palette256 + Hamming popcount on Binary16K. Per-vertex bitmap materialised on write; check on read via bit-intersection; uncached / immediate-effect by construction. ACL changes at version V are in effect at every read at version ≥ V — no auth-cache to invalidate.
  3. ractor Actor + Lance-version-as-state-machine = Rubicon phase machine. Typed state enum + state-enter side-effect fires the Lance commit at the Decision state + deferred-events-before-Decision + per-state timeouts. The actor's state history IS the Lance version log on its dataset.
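
The bit-intersection check in primitive 2 can be sketched as follows. This is an illustration under stated assumptions, not the lance-graph implementation: it assumes Binary16K means a 16,384-bit bitmap stored as 256 `u64` words, it does not model palette256, and the names `AclBitmap`, `grant`, `overlap`, and `can_read` are hypothetical.

```rust
/// Assumed layout: 16,384 bits stored as 256 x u64 words.
const WORDS: usize = 256;

/// Hypothetical per-vertex (or per-principal) permission bitmap.
#[derive(Clone)]
pub struct AclBitmap([u64; WORDS]);

impl AclBitmap {
    pub fn empty() -> Self {
        Self([0; WORDS])
    }

    /// Set permission bit `bit`. Panics if `bit >= 16_384`.
    pub fn grant(&mut self, bit: usize) {
        self.0[bit / 64] |= 1u64 << (bit % 64);
    }

    /// Popcount of the intersection: how many bits are set in both bitmaps.
    pub fn overlap(&self, other: &AclBitmap) -> u32 {
        self.0.iter().zip(other.0.iter()).map(|(a, b)| (a & b).count_ones()).sum()
    }

    /// Hamming distance: how many bit positions differ.
    pub fn hamming(&self, other: &AclBitmap) -> u32 {
        self.0.iter().zip(other.0.iter()).map(|(a, b)| (a ^ b).count_ones()).sum()
    }
}

/// Read check: allowed when the vertex bitmap and the principal bitmap
/// share at least one set bit. The check is computed on every read,
/// so there is no cached decision to invalidate.
pub fn can_read(vertex_acl: &AclBitmap, principal: &AclBitmap) -> bool {
    vertex_acl.overlap(principal) > 0
}
```

Because the bitmap is stored with the vertex and evaluated per read, a bitmap written at version V is what every read at version ≥ V sees.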

§3 Honest capability roadmap

Built today: Lance versions, LanceVersionWatcher (std::sync), MessagingErr::Saturated, surrealdb kv-lance, planner 16 strategies + 12 thinking styles + NARS, auth-plug, palette256+Hamming primitives, cognitive-shader-driver, EpisodicEdges64 Phase A, OGAR Sprint 5/6 (#5/#6/#7/#8).

Partial: lance-graph consumer surface, DataFusion OLAP surface, distributed actor topology, OGIT data-model coverage.

Not yet: Tantivy wiring, OGAR Sprint 7 (gated on protoc-build access), peer-Raft consensus pick (openraft / surreal-cluster / TiKV), the migration endpoint router (consumer-side), WS/gRPC Layer-3 outbound.

Gated: DemotionSink Phase C cold-tier impl (OQ-11.6), EpisodicWitness64 Phase D SoA column (needs MailboxSoA<N>).

§4 The migration endpoint contract

The substrate-b dual-stack ground-truth surface — same workload replayed against substrate-b AND the system being replaced, per-endpoint §14 verdict. The contract (minimal shape consumers extend per workload):

  • POST /v1/{entity,edge,traverse,query,graphql,audit} + version-pinnable reads via ?at=V
  • WS /v1/stream proxies LanceVersionWatcher::subscribe() (Layer-3 tokio outbound)
  • POST /v1/dispatch is substrate-b-specific (no comparison)
  • Lifecycle markers, never hard delete
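
A toy model of two items in this contract, version-pinnable reads via `?at=V` and lifecycle markers instead of hard deletes, might look like the sketch below. It is an in-memory stand-in, not Lance and not the endpoint router: `VersionedStore`, `Entity`, `Lifecycle`, and `parse_at` are hypothetical names, and each commit stores a full snapshot purely for simplicity.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
pub enum Lifecycle { Active, Retired }

#[derive(Clone, Debug, PartialEq)]
pub struct Entity { pub name: String, pub lifecycle: Lifecycle }

/// Toy append-only store: every write appends a new snapshot.
/// Version V lives at index V - 1.
pub struct VersionedStore {
    versions: Vec<BTreeMap<u64, Entity>>,
}

impl VersionedStore {
    pub fn new() -> Self {
        Self { versions: Vec::new() }
    }

    fn commit(&mut self, id: u64, entity: Entity) -> u64 {
        let mut next = self.versions.last().cloned().unwrap_or_default();
        next.insert(id, entity);
        self.versions.push(next);
        self.versions.len() as u64
    }

    /// Create or overwrite an entity; returns the new version number.
    pub fn put(&mut self, id: u64, name: &str) -> u64 {
        self.commit(id, Entity { name: name.to_string(), lifecycle: Lifecycle::Active })
    }

    /// "Delete" only flips the lifecycle marker; earlier versions keep the
    /// active entity. Returns None if the id is unknown.
    pub fn retire(&mut self, id: u64) -> Option<u64> {
        let mut entity = self.versions.last()?.get(&id)?.clone();
        entity.lifecycle = Lifecycle::Retired;
        Some(self.commit(id, entity))
    }

    /// Read at the pinned version `at`, or at the latest version if None.
    pub fn get(&self, id: u64, at: Option<u64>) -> Option<&Entity> {
        let v = at.unwrap_or(self.versions.len() as u64);
        let snapshot = self.versions.get((v as usize).checked_sub(1)?)?;
        snapshot.get(&id)
    }
}

/// Extract `V` from a raw query string such as "limit=5&at=2".
pub fn parse_at(query: &str) -> Option<u64> {
    query
        .split('&')
        .find_map(|kv| kv.strip_prefix("at=").and_then(|v| v.parse::<u64>().ok()))
}
```

With this shape, a read pinned to a version before the `retire` call still returns the entity as `Active`, while an unpinned read returns it as `Retired`.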

§5 Five integration patterns that fall out of the primitives

  1. Three OLD components collapse to one Lance-versions primitive (consumers shouldn't introduce separate stores).
  2. ACL changes immediate-by-construction (no auth cache to invalidate).
  3. State history IS the version log (no separate state-machine event store).
  4. In-process events are std::sync per I-2 (tokio reserved for Layer-3 outbound); this boundary is where the failure modes described in hollow-wire-failure-modes.md tend to appear.
  5. OGAR is the data-model entry point (5-step integration sequence from Class IR → MappingProposalOntologyRegistry → planner dispatch → LanceMembrane projection → LanceVersionWatcher fan-out).

§6 OGAR carrier integration sequence

5-step pattern for wiring new domain models through OGAR → lance-graph-ontology → lance-graph-planner → lance-graph-callcenter's LanceMembrane.

§7 Process rule

4-step strip-back check before proposing a new lance-graph integration trait / contract / coordination surface (capability roadmap → encoding ecosystem → lab-vs-canonical → hollow-wire).

Why it lives here (not in consumer repos)

Any substrate-b consumer integrating lance-graph + ractor + surrealdb hits the same correspondence questions: what's built, what's partial, which primitive serves which design pattern. Documenting it once upstream — alongside the rule (lab-vs-canonical-surface.md) and the failure mode (hollow-wire-failure-modes.md) — lets every consumer reuse the answer without re-deriving.

What's NOT in this doc

This is the substrate-b shape doc — capability composition + integration patterns + roadmap honesty. It does NOT cross consumer-internal boundaries: no specifications, versions, or shapes from any specific consumer system are referenced. Every cross-reference is to a lance-graph / surrealdb / ractor / OGAR PR or knowledge doc.

Board hygiene (per CLAUDE.md mandatory rule)

Same PR includes:

  • .claude/board/EPIPHANIES.md PREPEND: E-SUBSTRATE-B-CAPABILITY-ROADMAP — codifies the three load-bearing NEW-stack primitives.
  • .claude/board/AGENT_LOG.md PREPEND: D-SUBSTRATE-B-CONSUMER-DOC SHIPPED.

Asks

  1. Confirm doc placement (.claude/knowledge/) — happy to relocate.
  2. Confirm the "old-stack-capability-parity" filename — happy to rename to substrate-b-capability-shape.md or similar if that lands better given the rewrite's NEW-stack focus.
  3. Review the §3 capability roadmap — happy to refine the built / partial / not-yet classifications.

Severity

P2 — preventive documentation. The capability roadmap + integration patterns are the natural reference any substrate-b consumer needs; this codifies them upstream so consumers don't have to re-derive them. Companion to existing lab-vs-canonical-surface.md + hollow-wire-failure-modes.md.

Summary by CodeRabbit

  • Chores
    • Updated internal documentation and knowledge base entries to reflect system architecture and integration patterns.

…spondence

Companion to lab-vs-canonical-surface.md (the rule) + hollow-wire-failure-modes.md
(the failure mode). Captures the confirmed substrate-b ↔ OLD HIRO/Bardioc stack
correspondence from three ground-truth sources:

- Almato's own published OSS manifest (bitbucket.org/almatoag/opensource, the
  list r7.1 Product Description §5.5 cites)
- OGIT closed-PR harvest (~493 PRs 2017-2026)
- Almato Bardioc r7.1 Product Description (Oct 2025, 33 pages)

20-row correspondence table covers every confirmed OLD component → substrate-b
primitive: Zitadel-stays + Security-Mesh bit-ops = palette256+Hamming + signed-
audit = Lance-version-log + Kafka 0.8.2.2+ZooKeeper 3.4.6 = LanceVersionWatcher
(in-proc) + Titan 0.4.4 → lance-graph + Gremlin 2.4 → lance-graph-planner +
Cassandra → TiKV/surrealdb kv-lance + ES 1.7+Lucene 4.10 → Tantivy + ClickHouse+
Spark → DataFusion + InfluxDB → Lance versions + S3 → Lance fragments + swarm/
libcluster → ractor + sbroker/pobox → bounded mailbox + lru_cache/con_cache →
dn_redis + rafted_value → openraft + gen_statem → Rubicon-as-ractor-actor
(shape-exact) + expr → planner thinking + Jetty WS → substrate-b WS + Jena →
OGIT compile-time check.

Three load-bearing structural findings:
1. Three OLD components (Historisation + TSDB + audit) collapse to one NEW
   primitive (Lance versions)
2. Security Mesh bit-ops = palette256+Hamming (shape-exact)
3. gen_statem is the confirmed OLD-stack precedent for the Rubicon model
   (rafted_value uses it; state_enter/postpone/timeouts ARE Rubicon semantics)

Boundary collapse documented: OLD 8 boundaries + 4 concurrency models → NEW 0
in-binary application boundaries + 1 retained external Go IAM + 1 concurrency
model + honest Raft consensus tax.

OGIT data model captured: 10 production workloads (BGFS, Auth/Device, three-
layer identity Person/Account/DataScope, Documents, Automation, Knowledge,
Tickets, OSINT, Org-lifecycle, Trust, Forms, MARS-survives-within-OGIT).
… NEW (Lance versions); Security Mesh = palette256+Hamming; gen_statem = Rubicon precedent

PREPEND-only per board-hygiene rule. Records the three load-bearing structural
findings from the substrate-b ↔ OLD HIRO/Bardioc capability correspondence,
grounded in Almato's own published OSS manifest + the OGIT closed-PR harvest.
…ility-parity.md (companion to lab-vs-canonical-surface + hollow-wire)

PREPEND-only per board-hygiene rule.
…ty shape + plans

REWRITE focusing on substrate-b NEW-stack capabilities only — no consumer-
internal specifications cross the upstream boundary. Captures: seven-capability
composition (lance-graph + surrealdb kv-lance + Tantivy + DataFusion + ractor +
LanceVersionWatcher + external Zitadel); three load-bearing primitives (Lance
versions multi-purpose / palette256+Hamming per-element auth / ractor Actor +
Lance-version-as-state-machine = Rubicon); built-today capability roadmap honest
accounting; the migration endpoint contract as substrate-b's dual-stack ground-
truth surface (POST /v1/{entity,edge,traverse,query,graphql,audit} + WS /v1/
stream + POST /v1/dispatch); five consumer integration patterns that fall out
of the primitives (3-in-1 collapse / ACL changes immediate-by-construction /
state history IS the version log / in-proc events are std::sync per I-2 /
OGAR is the data-model entry point); process rule for substrate-b consumers.
…ack primitives codified; consumer integration shape documented
…integration shape (companion to lab-vs-canonical-surface + hollow-wire-failure-modes)
coderabbitai Bot commented Jun 4, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Three documentation files create a unified reference for the substrate-b integration architecture: a comprehensive knowledge document specifying the capability shape, load-bearing primitives (Lance versioning, per-element authorization, Rubicon ractor state machine), capability roadmap, and migration endpoint contract; an epiphanies entry codifying the key discoveries; and an agent log recording the shipment.

Changes

Substrate-b Integration Documentation

| Layer / File(s) | Summary |
|---|---|
| Substrate-b integration specification and primitives: `.claude/knowledge/old-stack-capability-parity.md`, `.claude/board/EPIPHANIES.md`, `.claude/board/AGENT_LOG.md` | Creates comprehensive substrate-b integration reference spanning seven cooperating capabilities, three load-bearing primitives (Lance versions, Binary16K bitmap per-element auth with Hamming popcount, ractor Rubicon phase machine), capability roadmap, dual-stack migration endpoint contract with version-pinnable reads and WS subscription preservation, and integration patterns. Epiphanies entry codifies primitives and consumer wiring patterns. AGENT_LOG documents the knowledge artifact shipment. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • AdaWorldAPI/lance-graph#418: Documents the same Rubicon ractor/actor state-machine semantics tied to irreversibility and ractor lifecycle, with this PR extending the pattern to Lance-version commit/history.
  • AdaWorldAPI/lance-graph#454: Overlaps on Lance versions as temporal/snapshot primitive and version log replication semantics, sharing the same foundational concepts around versioning and consensus.
  • AdaWorldAPI/lance-graph#434: Formalizes the same Rubicon workflow and Lance-version-as-temporal-commit primitive concepts in its unified SOA convergence plan, making the architectural patterns tightly aligned.

Poem

🐰 A docstring hops through substrate-b,
Lance versions bloom like carrots, oh so free,
Per-element auth with popcount's gleam,
Rubicon ractor joins the team,
The knowledge grows, the pattern's bright,
Our system dances through the night! 🥕✨

🚥 Pre-merge checks | ✅ 5 passed

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and accurately summarizes the main change: a new knowledge document describing the substrate-b consumer integration shape with its capability composition and roadmap for the NEW-stack. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 03149d7a4e



- **Point-in-time query** = `dataset.checkout_version(V_ref)` — pin an immutable snapshot at any version
- **Time-series** = the version log itself — every commit is a versioned event with a timestamp
- **Immutable audit** = append-only by construction — versions never disappear; the log IS the audit trail

P1: Preserve audit retention outside prunable Lance versions

For deployments that run Lance version cleanup, this premise is unsafe: Lance 7.0.0 exposes Dataset::cleanup_old_versions and lance.auto_cleanup.* settings that can remove old versions, so the version log is not guaranteed to be an immutable audit trail unless consumers explicitly disable cleanup/tag retained versions. Because this doc later tells substrate-b consumers not to introduce separate audit storage, following it can make historical audit reads disappear after cleanup.


@AdaWorldAPI AdaWorldAPI merged commit 6c93d48 into main Jun 4, 2026
1 check was pending
AdaWorldAPI added a commit that referenced this pull request Jun 4, 2026
…-policy-gated, not by-construction-immutable (codex P1 on #465)

§2.1 audit bullet: renamed from 'Immutable audit' to 'Audit (retention-policy-
gated)'; explicit retention guidance (disable auto-cleanup OR tag versions OR
route to separate sink); regulatory-grade audit requires external signed
write-once sink — Lance versions alone NOT a substitute.

§5.1 collapse pattern: renamed from 'Three OLD components collapse to one' to
'Two-and-a-half OLD components collapse to one'; non-regulatory audit (with
retention configured) shares Lance versions; regulatory audit remains a
separate concern.

The three-primitives codification (E-SUBSTRATE-B-CAPABILITY-ROADMAP) survives.
Multi-purpose-Lance-versions claim still load-bearing — what changes is the
audit guarantee + the consumer-default guidance.

Codex P1 finding on #465: Lance 7.0+ exposes Dataset::cleanup_old_versions +
lance.auto_cleanup.*; following the original 'introduce no separate store'
guidance could make historical audit reads disappear after cleanup.
AdaWorldAPI added a commit that referenced this pull request Jun 4, 2026
…rsions-as-audit claim corrected to retention-policy-gated (codex P1 on #465)
AdaWorldAPI added a commit that referenced this pull request Jun 4, 2026
…bility-parity-fix

fix(knowledge): audit retention caveat — Lance versions are retention-policy-gated, not by-construction-immutable (codex P1 on #465)