
feat(graph): Backend::MailboxSoa — classid node-match + CLAM/CAKES neighborhood (Inc 0)#544

Open
AdaWorldAPI wants to merge 9 commits into main from claude/inc0-mailbox-soa-backend
@AdaWorldAPI AdaWorldAPI commented Jun 18, 2026

Owner

Inc 0 — Backend::MailboxSoa: the substrate-is-the-graph dispatch table, two facets landed

The substrate IS the graph (E-GUID-IS-THE-GRAPH); a query routes to the cheapest facet that answers it, off the GUID key, zero value decode. This PR lands the two key-only facets.

Facet 1 — classid node-match (MATCH (n:Label))

  • match_nodes_by_class(view, class_id) — classid prefix-route; reads only the class column.
  • match_node_by_local_key(view, local_key) — point lookup via MailboxSoaView::row_for_local_key (None-fallback to positional).
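A minimal sketch of the two Facet 1 routes, for readers without the diff open. `KeyView`, `DemoView`, and the field layout are hypothetical stand-ins, not the real `MailboxSoaView` contract, and the positional fallback shown here is only one reading of "None-fallback to positional".

```rust
/// Hypothetical key-only slice of the view contract (not the real `MailboxSoaView`).
pub trait KeyView {
    /// Number of populated (logical) rows.
    fn n_rows(&self) -> usize;
    /// Class column, one entry per row; the only column the class route reads.
    fn class_id(&self) -> &[u16];
    /// Key index lookup; `None` when no index entry exists for this key.
    fn row_for_local_key(&self, local_key: u32) -> Option<usize>;
}

/// A matched row; the real `NodeMatch` also carries a backend tag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeMatch {
    pub row: usize,
}

/// `MATCH (n:Label)`: scan the class column only, clamped to logical rows.
pub fn match_nodes_by_class<V: KeyView>(view: &V, class_id: u16) -> Vec<NodeMatch> {
    view.class_id()
        .iter()
        .take(view.n_rows())
        .enumerate()
        .filter(|(_, c)| **c == class_id)
        .map(|(row, _)| NodeMatch { row })
        .collect()
}

/// Point lookup. ASSUMPTION: on an index miss the key is tried as a positional row.
pub fn match_node_by_local_key<V: KeyView>(view: &V, local_key: u32) -> Option<NodeMatch> {
    let row = view.row_for_local_key(local_key).or_else(|| {
        let pos = local_key as usize;
        (pos < view.n_rows()).then_some(pos)
    })?;
    Some(NodeMatch { row })
}

/// Toy view for the example: a class column plus a (key, row) index.
pub struct DemoView {
    pub classes: Vec<u16>,
    pub index: Vec<(u32, usize)>,
}

impl KeyView for DemoView {
    fn n_rows(&self) -> usize {
        self.classes.len()
    }
    fn class_id(&self) -> &[u16] {
        &self.classes
    }
    fn row_for_local_key(&self, local_key: u32) -> Option<usize> {
        self.index.iter().find(|(k, _)| *k == local_key).map(|(_, r)| *r)
    }
}
```

Neither function has access to a value column at all in this sketch, which is the property the F2 gate checks in the real code.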

Facet 2 — CLAM/CAKES neighborhood (proximity) — panCAKES ≡ radix trie ≡ HHTL

The CLAM cluster tree isn't a separate structure — it is the radix trie of the classid·HEEL·HIP·TWIG nibble paths already in the keys (E-PANCAKES-IS-RADIX-IS-HHTL). So neighborhood = pure prefix arithmetic:

  • NiblePath::common_prefix_depth (contract) — the radix-trie NN measure (CAKES attraction = longest-common-prefix).
  • MailboxSoaView::hhtl_path_at (contract) — per-row HHTL path, deferred-binding default None (canon NodeRow already carries key(16)).
  • clam_contained (is_ancestor_of = the radix subtree = CLAM cluster) + cakes_nearest (common_prefix_depth k-NN). Zero value decode.
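The prefix arithmetic can be sketched as follows. `Path` is a hypothetical simplification of `NiblePath` (one nibble per byte, ancestry treated as inclusive), and the functions take a plain slice of optional paths instead of a view; the real signatures may differ.

```rust
use std::cmp::Reverse;

/// Hypothetical nibble path: up to 16 nibbles, stored one per byte for clarity.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Path {
    nibbles: [u8; 16],
    depth: u8,
}

impl Path {
    pub fn new(ns: &[u8]) -> Self {
        assert!(ns.len() <= 16, "at most 16 nibbles");
        let mut nibbles = [0u8; 16];
        for (dst, src) in nibbles.iter_mut().zip(ns) {
            *dst = *src & 0x0F;
        }
        Path { nibbles, depth: ns.len() as u8 }
    }

    /// Number of leading nibbles the two paths share.
    pub fn common_prefix_depth(self, other: Path) -> u8 {
        let max = self.depth.min(other.depth) as usize;
        (0..max)
            .take_while(|&i| self.nibbles[i] == other.nibbles[i])
            .count() as u8
    }

    /// ASSUMPTION: inclusive ancestry, so a path is an ancestor of itself.
    pub fn is_ancestor_of(self, other: Path) -> bool {
        self.depth <= other.depth && self.common_prefix_depth(other) == self.depth
    }
}

/// Rows whose path lies in the radix subtree rooted at `cluster`.
/// Rows with no bound path (`None`) are skipped, never guessed.
pub fn clam_contained(paths: &[Option<Path>], cluster: Path) -> Vec<usize> {
    paths
        .iter()
        .enumerate()
        .filter_map(|(row, p)| match p {
            Some(p) if cluster.is_ancestor_of(*p) => Some(row),
            _ => None,
        })
        .collect()
}

/// Top-`k` rows by shared-prefix depth, deepest first, ties broken by row index.
pub fn cakes_nearest(paths: &[Option<Path>], query: Path, k: usize) -> Vec<(usize, u8)> {
    let mut ranked: Vec<(usize, u8)> = paths
        .iter()
        .enumerate()
        .filter_map(|(row, p)| p.as_ref().map(|p| (row, query.common_prefix_depth(*p))))
        .collect();
    ranked.sort_by_key(|&(row, depth)| (Reverse(depth), row));
    ranked.truncate(k);
    ranked
}

/// Illustrative fixture: three rows under 1·2, one elsewhere, one unbound.
pub fn demo_paths() -> Vec<Option<Path>> {
    vec![
        Some(Path::new(&[1, 2, 3])),
        Some(Path::new(&[1, 2, 4])),
        Some(Path::new(&[1, 2, 5])),
        Some(Path::new(&[7, 0, 0])),
        None,
    ]
}
```

With this fixture, a query for the `1·2` cluster returns rows 0, 1, 2, and a `cakes_nearest` query for `1·2·3` with `k = 3` ranks them as `[(0, 3), (1, 2), (2, 2)]`. The fixture values are a reconstruction chosen to mirror the gate descriptions, not the PR's actual test data.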

Gates (tests, all green, clippy clean)

  • node-match parity vs reference classid filter;
  • CLAM containment = the radix subtree (rows 0,1,2 under 1·2; leaf query narrows to 0);
  • CAKES ranks by shared-prefix depth [(0,3),(1,2),(2,2)];
  • deferred-None HHTL ⇒ scan yields nothing (coarser-facet fallback, never a wrong row);
  • F2 zero-value-decode: a GuardedSoa whose energy()/meta_raw() panic — all facets complete without touching the 480 B value slab.
  • 7/7 mailbox_scan, 21/21 hhtl, 674 contract green.
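The F2 gate relies on a test double whose value-slab accessors panic, so a test passes only if the code under test never calls them. A minimal sketch of that pattern, with a hypothetical `SoaView` trait and `count_class` helper standing in for the real contract and facets:

```rust
/// Hypothetical view trait: one key-side column, two value-side columns.
pub trait SoaView {
    fn class_id(&self) -> &[u16];
    fn energy(&self) -> &[f32];
    fn meta_raw(&self) -> &[u8];
}

/// Test double: key columns work, any value-slab access panics.
pub struct GuardedSoa {
    classes: Vec<u16>,
}

impl GuardedSoa {
    pub fn new(classes: Vec<u16>) -> Self {
        GuardedSoa { classes }
    }
}

impl SoaView for GuardedSoa {
    fn class_id(&self) -> &[u16] {
        &self.classes
    }
    fn energy(&self) -> &[f32] {
        panic!("F2 violated: value slab (energy) was read")
    }
    fn meta_raw(&self) -> &[u8] {
        panic!("F2 violated: value slab (meta_raw) was read")
    }
}

/// A key-only facet: completes on a `GuardedSoa` because it never decodes values.
pub fn count_class<V: SoaView>(view: &V, class_id: u16) -> usize {
    view.class_id().iter().filter(|c| **c == class_id).count()
}
```

The check is structural: if a facet touched `energy()` or `meta_raw()`, the test would abort with the panic message, whatever the facet returned.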

Dispatch-table tiers still ahead (grounded, not faked)

  • EdgeBlock typed-edge ((a)-[r:TYPE]->(b)): 12-family/4-external or 32×4 turbovec, per classid → EdgeCodecFlavor (E-ADJACENCY-IS-KEY-AND-EDGECODEC).
  • helix exact-location (Signed360, HelixResidue tenant): the one tier that IS a value decode, costed as such (E-HELIX-IS-EXACT-LOCATION).
  • CHAODA anomaly + CausalEdge64 SPO (E-CLAM-IS-THE-MANIFOLD-ENGINE).

Additive, layout-preserving — no NodeRow/stride/ENVELOPE_LAYOUT_VERSION change, no new object model (F5).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation

    • Updated design documentation with enhanced graph routing architecture and geometry ensemble clarifications.
  • Chores

    • Extended graph routing infrastructure with new backend variant for deferred edge traversal operations.
    • Implemented graph scanning operations for node matching, key resolution, and neighborhood detection.
    • Added graph topology API enhancements for prefix-depth computation across hierarchical paths.

…(Inc 0 first slice)

cypher-kanban-ast-unification-v1 Inc 0, the verified-safe half: the substrate IS
the graph (E-GUID-IS-THE-GRAPH), so MATCH (n:Label) is a classid prefix-route
over the zero-dep MailboxSoaView contract, resolved off the class column with
zero value decode.

- graph_router::Backend gains the MailboxSoa variant (the named router gap).
- graph/mailbox_scan.rs: match_nodes_by_class (classid route; reads only the
  class column) + match_node_by_local_key (local_key->row via row_for_local_key,
  None-fallback to positional address).
- Gates: parity (matched set == reference classid filter); F2 zero-value-decode
  proven structurally by a GuardedSoa whose energy()/meta_raw() panic on access;
  key-index point lookup. 4/4 green, no new clippy warnings.

Edge-traversal ((a)-[r]->(b)) deliberately deferred, grounded not faked:
CausalEdge64 (the edges_raw column) is an SPO triple, NOT a row->row adjacency
pointer, and the View exposes no EdgeBlock adjacency accessor. That is the
edge-representation boundary the 5+3 council said to pin first (verdict 4b);
it lands as the next slice once the classid-resolved edge rep + adjacency
accessor are added.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
@coderabbitai

coderabbitai Bot commented Jun 18, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 0195b240-b221-4534-b2c2-e09317816459

📥 Commits

Reviewing files that changed from the base of the PR and between 237ae65 and 8fa4238.

📒 Files selected for processing (1)
  • .claude/board/EPIPHANIES.md
📝 Walkthrough


Adds Backend::MailboxSoa variant and MailboxSoaView contract extensions (hhtl_path_at, edge_block_at), implements radix-trie shared-prefix depth utility, introduces a mailbox_scan module with NodeMatch-based class/local-key node matching and zero-decode neighborhood facets (CLAM containment, CAKES nearest, coarse edge slots), plus comprehensive guarded tests and design documentation clarifying helix location encoding, decode-cost ladders, and multi-facet adjacency representation.

Changes

MailboxSoA Node-Match Routing and Zero-Decode Neighborhood Facets

| Layer | File(s) | Summary |
|---|---|---|
| Backend::MailboxSoa enum variant and module export | crates/lance-graph/src/graph/graph_router.rs, crates/lance-graph/src/graph/mod.rs | Adds MailboxSoa variant to the Backend enum with documentation describing canonical GUID-keyed substrate routing and deferred edge traversal; exports the new mailbox_scan submodule. |
| MailboxSoaView contract extensions | crates/lance-graph-contract/src/soa_view.rs | Adds optional per-row accessors hhtl_path_at(row) -> Option<NiblePath> and edge_block_at(row) -> Option<EdgeBlock> with #[inline] default None implementations for deferred per-row data materialization. |
| Radix-trie shared-prefix depth utility | crates/lance-graph-contract/src/hhtl.rs | Implements NiblePath::common_prefix_depth(self, other) -> u8 computing the longest matching prefix depth between two HHTL paths, plus unit tests verifying correct depth for identical, divergent, ancestor, disjoint, symmetric, and empty-path cases. |
| NodeMatch struct and basic scan functions | crates/lance-graph/src/graph/mailbox_scan.rs | Defines NodeMatch carrying row index and Backend::MailboxSoa tag; implements match_nodes_by_class scanning only the class_id SoA column and match_node_by_local_key resolving via row_for_local_key, both without value-slab column access. |
| Neighborhood geometry facets: CLAM, CAKES, and coarse edge slots | crates/lance-graph/src/graph/mailbox_scan.rs | Implements clam_contained filtering rows by CLAM subtree ancestry; cakes_nearest computing per-row shared-prefix depth vs. query and sorting descending; EdgeNeighbors struct and edge_slots_coarse reading EdgeBlock for CoarseOnly flavor and extracting non-zero slot bytes. |
| Comprehensive zero-decode test suite | crates/lance-graph/src/graph/mailbox_scan.rs | Adds extensive unit tests using GuardedSoa test double that panics on value-slab access, validating class/local-key resolution, CLAM containment, CAKES depth sorting, and edge-slot flavor gating across positive/empty/skip-when-path-missing cases. |
| Design documentation: helix location and multi-facet adjacency | .claude/board/EPIPHANIES.md | Prepends five epiphany entries documenting tenant-switch costed sweep, panCAKES/GUID-key prefix unification, CLAM manifold geometry engine with cost classes, helix decode-cost ladder (containment → deterministic Place → optional HelixResidue), and multi-facet adjacency representation (HHTL/CLAM prefix cascade for zero-decode neighborhood, EdgeBlock typed edges, and CausalEdge64 causal arcs). |

Sequence Diagram(s)

sequenceDiagram
  participant CypherRouter
  participant match_nodes_by_class
  participant match_node_by_local_key
  participant MailboxSoaView

  rect rgba(100, 149, 237, 0.5)
    Note over CypherRouter,MailboxSoaView: MATCH (n:Label) — class route
    CypherRouter->>match_nodes_by_class: view, class_id
    match_nodes_by_class->>MailboxSoaView: class_id() column only
    MailboxSoaView-->>match_nodes_by_class: entity_type rows
    match_nodes_by_class-->>CypherRouter: Vec<NodeMatch {row, Backend::MailboxSoa}>
  end

  rect rgba(144, 238, 144, 0.5)
    Note over CypherRouter,MailboxSoaView: MATCH (n {key}) — local-key route
    CypherRouter->>match_node_by_local_key: view, local_key
    match_node_by_local_key->>MailboxSoaView: row_for_local_key(local_key)
    MailboxSoaView-->>match_node_by_local_key: Option<usize>
    match_node_by_local_key-->>CypherRouter: Option<NodeMatch>
  end

  rect rgba(255, 218, 185, 0.5)
    Note over CypherRouter,MailboxSoaView: Neighborhood facets — zero-decode geometry
    CypherRouter->>match_nodes_by_class: clam_contained(view, query)
    match_nodes_by_class->>MailboxSoaView: hhtl_path_at(row) per candidate
    MailboxSoaView-->>match_nodes_by_class: Option<NiblePath>
    match_nodes_by_class-->>CypherRouter: Vec<NodeMatch> in CLAM subtree
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • AdaWorldAPI/lance-graph#437: Introduced the MailboxSoaView contract that this PR extends with hhtl_path_at and edge_block_at methods to enable zero-decode neighborhood facet implementations.
  • AdaWorldAPI/lance-graph#542: Added MailboxSoaView::row_for_local_key which the new match_node_by_local_key function directly calls to resolve local keys to row indices.

Poem

🐇 Hop through the SoA rows with care,
No value-slab decodes to spare!
class_id column, prefix depth measured true,
CLAM containment, CAKES through and through.
The helix ladder climbs with grace—
Exact coordinates find their place! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: introducing Backend::MailboxSoa with classid node-matching and CLAM/CAKES neighborhood facets, which are the three key-resident dispatch facets landed in this PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6e2b3e1c4


Comment on lines +61 to +64

    let classes = view.class_id();
    classes
        .iter()
        .enumerate()


P1 Badge Clamp class scans to logical rows

This iterates the entire class_id() slice instead of the view's logical row count. That is fine for the test GuardedSoa, but the real in-memory MailboxSoA<N> reports n_rows() == populated while entity_type() returns the full backing array capacity, initialized to zeros (cognitive-shader-driver/src/mailbox_soa.rs documents this phantom-row guard around n_rows). In contexts using that view, MATCH can return unpopulated padding rows (for example an empty mailbox queried for class 0, or stale padding after the logical size shrinks), corrupting node-match results; please clamp the scan with view.n_rows()/take(view.n_rows()).
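To make the phantom-row hazard concrete, here is a small sketch. `PaddedMailbox` is a hypothetical stand-in for a fixed-capacity mailbox whose class column is zero-initialised to full capacity; it is not the real `MailboxSoA<N>`.

```rust
/// Hypothetical fixed-capacity mailbox: the backing column is zero-initialised
/// to full capacity, while `populated` tracks the logical row count.
pub struct PaddedMailbox {
    entity_type: [u16; 8],
    populated: usize,
}

impl PaddedMailbox {
    pub fn new() -> Self {
        PaddedMailbox { entity_type: [0; 8], populated: 0 }
    }
    pub fn push(&mut self, class_id: u16) {
        assert!(self.populated < self.entity_type.len(), "mailbox full");
        self.entity_type[self.populated] = class_id;
        self.populated += 1;
    }
    /// Logical row count.
    pub fn n_rows(&self) -> usize {
        self.populated
    }
    /// Full backing column, padding included.
    pub fn class_id(&self) -> &[u16] {
        &self.entity_type
    }
}

/// The pattern the review flags: walks padding as if it were data.
pub fn scan_unclamped(view: &PaddedMailbox, class_id: u16) -> Vec<usize> {
    view.class_id()
        .iter()
        .enumerate()
        .filter(|(_, c)| **c == class_id)
        .map(|(row, _)| row)
        .collect()
}

/// The suggested fix: stop at the logical row count.
pub fn scan_clamped(view: &PaddedMailbox, class_id: u16) -> Vec<usize> {
    view.class_id()
        .iter()
        .take(view.n_rows())
        .enumerate()
        .filter(|(_, c)| **c == class_id)
        .map(|(row, _)| row)
        .collect()
}
```

On an empty mailbox queried for class 0, the unclamped scan reports all eight padding rows while the clamped scan reports none.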


claude added 4 commits June 18, 2026 20:24
…rep boundary (§4b)

Operator correction: adjacency lives in two places (items 1 and 2 below), with
causal arcs as a separate third facet; all are classid/key-resolved, never
query-guessed:
1. HHTL cascade in the GUID key = CLAM hierarchical neighborhood (NiblePath
   is_ancestor_of/prefix; graph/neighborhood/clam.rs) — free, zero value decode.
2. 16-byte EdgeBlock = explicit typed edges per EdgeCodecFlavor: CoarseOnly
   (16x8) = 12 in-family + 4 external; Pq32x4 (32x4) = turbovec residue edges.
3. edges_raw = CausalEdge64 = SPO causal arcs (separate facet).
The class picks the rep (classid -> ClassView -> EdgeCodecFlavor). Unblocks the
deferred edge half of #544: next slice exposes the HHTL/key + EdgeBlock per row
on MailboxSoaView, then CLAM prefix-route + EdgeBlock slot-deref, both
zero-value-decode.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…lix Signed360 is the exact orthogonal point, "where" is a decode ladder

Operator correction: the helix is not more adjacency, it is the EXACT orthogonal
LOCATION. Adjacency (HHTL/CLAM near, EdgeBlock connected) is relational; helix
Signed360 (ValueTenant::HelixResidue, signed full-sphere golden-spiral, 6B) is
the absolute exact coordinate. "Where" is a decode-cost ladder:
1. HHTL/CLAM containment - key prefix, zero value decode (which cluster).
2. Helix PLACE - deterministic from the address, zero value decode.
3. Helix RESIDUE - Signed360 6B in the value slab, one value-tenant decode (exact).
Router consequence: proximity query = key (free); exact-position query = read the
HelixResidue tenant (a value decode, costed as such). Grounded in canonical_node
ValueTenant::HelixResidue + ValueSchema::Compressed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
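One way to read the "where" ladder as a routing rule: pick the cheapest rung whose precision satisfies the query. The enum names and the precision ordering below are illustrative assumptions, not types from the PR.

```rust
/// How precisely a "where" query must be answered (assumed ordering).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Precision {
    Cluster,
    Place,
    Exact,
}

/// Decode cost class, cheapest first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum DecodeCost {
    ZeroValueDecode,
    OneTenantDecode,
}

/// The three rungs named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Rung {
    Containment,
    HelixPlace,
    HelixResidue,
}

impl Rung {
    pub const LADDER: [Rung; 3] = [Rung::Containment, Rung::HelixPlace, Rung::HelixResidue];

    pub fn precision(self) -> Precision {
        match self {
            Rung::Containment => Precision::Cluster,
            Rung::HelixPlace => Precision::Place,
            Rung::HelixResidue => Precision::Exact,
        }
    }

    pub fn cost(self) -> DecodeCost {
        match self {
            // Both resolve from the key/address alone.
            Rung::Containment | Rung::HelixPlace => DecodeCost::ZeroValueDecode,
            // Reads the Signed360 residue out of the value slab.
            Rung::HelixResidue => DecodeCost::OneTenantDecode,
        }
    }
}

/// Cheapest rung that is precise enough; ties resolve to the earlier rung.
pub fn route_where(required: Precision) -> Rung {
    Rung::LADDER
        .iter()
        .copied()
        .filter(|r| r.precision() >= required)
        .min_by_key(|r| r.cost())
        .expect("HelixResidue satisfies every precision")
}
```

Under these assumptions only an exact-position query pays for a value decode; cluster-level and place-level queries stay on the key.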
…LFD, not a containment check; the full geometry-of-a-node surface

Operator: ndarray also has chaoda. Grounded in ndarray/src/hpc/clam.rs (CAKES
Alg1/4/6, panCAKES Alg2, CHAODA Phase 4 anomaly_scores from LFD) + perturbation-
sim chaoda (CHAODA-lite, names ndarray ClamTree as production). The CLAM facet
is the manifold engine: containment + CAKES ranked-NN (attraction) + CHAODA
anomaly (repulsion) + panCAKES compression, one tree, LFD the shared measure.

Synthesized geometry-of-a-node: off one GUID the substrate answers which-cluster
(CLAM, free), nearest-similar (CAKES), how-anomalous (CHAODA), exact-location
(helix Signed360, value decode), connected-to (EdgeBlock), caused-by
(CausalEdge64) - a complete geometric+relational surface, each at its own decode
cost; the router dispatches a query to the cheapest facet that answers it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…ixes (Inc 0, the manifold facet)

panCAKES == radix trie == HHTL (E-PANCAKES-IS-RADIX-IS-HHTL): the CLAM cluster
tree IS the radix trie of the classid·HEEL·HIP·TWIG nibble paths already in the
keys, so the structural neighborhood is pure prefix arithmetic, zero value decode.

- NiblePath::common_prefix_depth (contract) — the radix-trie nearest-neighbor
  measure; longest-common-prefix = CAKES attraction. +1 unit test.
- MailboxSoaView::hhtl_path_at (contract) — per-row HHTL NiblePath, deferred-
  binding default None (canon NodeRow already carries key(16); the override
  exposes what's there).
- graph::mailbox_scan::clam_contained (is_ancestor_of = the radix subtree =
  CLAM cluster) + cakes_nearest (common_prefix_depth ranking, k-NN). Both
  key-only, zero value decode.
- Tests: containment = radix subtree (rows 0,1,2 under 1·2; leaf narrows to 0);
  CAKES ranks by shared depth [(0,3),(1,2),(2,2)]; deferred-None yields nothing
  (coarser-facet fallback); F2 zero-value-decode extended to CLAM/CAKES (the
  GuardedSoa value columns still panic-guarded). 7/7 mailbox_scan + 21/21 hhtl
  green, clippy clean.

This is the first dispatch-table facet beyond the classid node-match: proximity/
neighborhood resolves on the key (CLAM/CAKES), free, per E-CLAM-IS-THE-MANIFOLD-
ENGINE. Edge-deref (EdgeBlock) + helix exact-location (value decode) are the
next tiers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
@AdaWorldAPI AdaWorldAPI changed the title feat(graph): Backend::MailboxSoa + the classid-route node-match (Inc 0 first slice) feat(graph): Backend::MailboxSoa — classid node-match + CLAM/CAKES neighborhood (Inc 0) Jun 18, 2026
claude added 4 commits June 18, 2026 20:58
…ly/4-external slot decode (Inc 0)

The third dispatch-table facet: explicit typed edges (a)-[r]->... under the
classid-resolved EdgeCodecFlavor (E-ADJACENCY-IS-KEY-AND-EDGECODEC). EdgeBlock is
bytes 16..32 (the edge region), NOT the value slab, so still zero value decode.

- MailboxSoaView::edge_block_at(row) -> Option<EdgeBlock> (contract, deferred
  default None; the canon NodeRow carries edges(16), the override exposes it).
- graph::mailbox_scan::{EdgeNeighbors, edge_slots_coarse}: under CoarseOnly,
  decode the 12 in-family + 4 external slots to their populated (non-zero) refs,
  family vs external. Pq32x4 (turbovec residue) / CoarseResidue are refused -
  they are NOT adjacency, never coerced to slots (boundary 4b: classid-resolved,
  not query-guessed).
- Slot-byte -> neighbor-row resolution is deliberately deferred (the basin-local-
  index convention + zero-collision is the next encoding decision, analogous to
  local_key->row); this facet lands the structure (which slots are edges, family
  vs external, under which flavor), never fakes the row resolution.
- Tests: populated decode ([2,5] family + [1] external), all-zero = no edges,
  no-block = None, non-Coarse flavors refused. F2 zero-value-decode extended.
  9/9 mailbox_scan, clippy clean (sort_by_key + Reverse).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
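A sketch of the coarse slot decode described in this commit. The slot order (12 in-family bytes first, then 4 external) and the plain `[u8; 16]` block type are assumptions made for illustration; only the 12 + 4 split, the non-zero rule, and the flavor refusal come from the commit message.

```rust
/// Edge codec flavors named in the PR; only `CoarseOnly` is slot-addressable.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Flavor {
    CoarseOnly,
    Pq32x4,
    CoarseResidue,
}

/// Populated slot bytes, split by family vs external.
/// Slot byte -> neighbor row resolution is NOT done here (deferred in the PR).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EdgeNeighbors {
    pub family: Vec<u8>,
    pub external: Vec<u8>,
}

/// Keep only populated (non-zero) slot bytes.
fn populated(slots: &[u8]) -> Vec<u8> {
    slots.iter().copied().filter(|b| *b != 0).collect()
}

/// Returns `None` when the row has no block or the flavor is not `CoarseOnly`.
pub fn edge_slots_coarse(block: Option<[u8; 16]>, flavor: Flavor) -> Option<EdgeNeighbors> {
    if flavor != Flavor::CoarseOnly {
        // PQ / residue codes are not adjacency; never coerce them to slots.
        return None;
    }
    let bytes = block?;
    // ASSUMPTION: bytes 0..12 are in-family slots, bytes 12..16 are external.
    let (family, external) = bytes.split_at(12);
    Some(EdgeNeighbors {
        family: populated(family),
        external: populated(external),
    })
}
```

An all-zero block yields `Some` with two empty lists (a row with no edges), which is distinct from `None` (no block bound, or a flavor that is not adjacency).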
…anded facets

CI fmt --check failed on three blocks (let-chain filter, EdgeNeighbors collect,
cakes assert). Applied cargo fmt. Also refreshed the stale module doc header
(it still described only the node-match half and called CLAM/EdgeBlock
"deferred") to document the three landed key-resident facets + the genuinely
deferred tiers (slot->row convention, helix/CHAODA/SPO costed tier). No logic
change; 9/9 mailbox_scan green, fmt clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
…switch 16K-from-an-angle compare is the costed value sweep, composing with the free key facets as a two-stage cascade

"Switch tenant + compare across the 16K mailbox from an angle" decomposes as:
batch Hamming sweep (hamming_top_k over a contiguous identity plane) = the right
use of popcount on the homogeneous 16K fingerprint; "angle" = which plane
(content/topic/angle) + query; "tenant switch" = column selector. Load-bearing:
it composes with #544's free key facets as a two-stage HHTL cascade - CLAM/CAKES
prefix prune (free, zero decode) then angle-Hamming rank over the pruned set
(costed). Key prune + content rank = two halves of one query. Cost-class
boundary: this facet deliberately decodes the value plane, NOT in the F2
zero-decode class, lands on its own branch with its own cost gate. Grounded in
MailboxSoA content/topic/angle_row + ndarray hamming_top_k + cycle snapshot.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
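The second stage of that cascade, sketched under stated assumptions: a toy 256-bit fingerprint, a hypothetical `Planes` struct for the three columns, and a `rank_pruned` helper that is a stand-in for (not a binding to) ndarray's `hamming_top_k`. The `survivors` slice represents whatever rows the key-side prune kept.

```rust
/// Toy fingerprint width (4 x 64 bits); the real plane width is not assumed here.
pub type Fingerprint = [u64; 4];

/// "Angle" / tenant switch: which identity plane to compare against.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Plane {
    Content,
    Topic,
    Angle,
}

/// Hypothetical column-per-plane storage.
pub struct Planes {
    pub content: Vec<Fingerprint>,
    pub topic: Vec<Fingerprint>,
    pub angle: Vec<Fingerprint>,
}

impl Planes {
    /// The tenant switch is just a column selector.
    pub fn select(&self, plane: Plane) -> &[Fingerprint] {
        match plane {
            Plane::Content => &self.content,
            Plane::Topic => &self.topic,
            Plane::Angle => &self.angle,
        }
    }
}

/// Hamming distance via XOR + popcount.
pub fn hamming(a: &Fingerprint, b: &Fingerprint) -> u32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Stage 2: rank only the rows that survived the key-side prefix prune.
/// Out-of-range survivor rows are ignored.
pub fn rank_pruned(
    plane: &[Fingerprint],
    survivors: &[usize],
    query: &Fingerprint,
    k: usize,
) -> Vec<(usize, u32)> {
    let mut ranked: Vec<(usize, u32)> = survivors
        .iter()
        .filter_map(|&row| plane.get(row).map(|fp| (row, hamming(fp, query))))
        .collect();
    ranked.sort_by_key(|&(row, dist)| (dist, row));
    ranked.truncate(k);
    ranked
}
```

Rows pruned away in stage 1 are never read from the value plane, which is where the cost saving over a full sweep comes from.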
…p; IVF coarse quantizer IS the HHTL/CLAM prefix

Operator: "sweep or just 90° fingerprint vector cam index" → CAM index. The
value-side rank is a CAM-PQ ADC (distance-table lookups + IVF probe), never a
linear Hamming sweep (that's only the no-index fallback). The load-bearing
correction: CAM-PQ = IVF-PQ, and its IVF coarse quantizer IS the HHTL/CLAM
prefix (turbovec: palette256 = coarse quantizer) while the PQ residual IS the
turbovec Pq32x4 / value-slab codes. So #544's cakes_nearest prefix prune is
literally the IVF coarse-quantization stage, not a prefilter on a scan. 90° =
content/topic/angle orthogonal axes each get their own distance table; tenant
switch = which orthogonal table to ADC against. Retitled the entry (was
"…-SWEEP-IS-PRUNE-THEN-RANK"); fixed an accidental duplicate E-PANCAKES header.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CcpLeEC3XK8Eye53GKBVvi
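For readers unfamiliar with product-quantization ADC (asymmetric distance computation), a generic sketch of the lookup this commit refers to. It assumes a 32-subquantizer, 4-bit code packed into 16 bytes with the low nibble first, and an integer distance table; none of these layout details are confirmed by the PR, and `demo_table` is purely illustrative.

```rust
/// Subquantizer count and centroids per subquantizer for a 32 x 4-bit code.
pub const M: usize = 32;
pub const K: usize = 16;

/// Per-query table: `table[m][c]` = distance from the query's m-th sub-vector
/// to centroid `c` of subquantizer `m`. One table per orthogonal plane.
pub type DistanceTable = [[u32; K]; M];

/// Extract the m-th 4-bit code. ASSUMPTION: low nibble holds the even index.
pub fn nibble(code: &[u8; 16], m: usize) -> usize {
    let byte = code[m / 2];
    if m % 2 == 0 {
        (byte & 0x0F) as usize
    } else {
        (byte >> 4) as usize
    }
}

/// ADC: the distance estimate is a sum of M table lookups, no vector decode.
pub fn adc_distance(table: &DistanceTable, code: &[u8; 16]) -> u32 {
    (0..M).map(|m| table[m][nibble(code, m)]).sum()
}

/// Rank the codes inside one probed coarse cell (here: rows sharing a key prefix).
pub fn adc_rank(table: &DistanceTable, cell: &[(usize, [u8; 16])], k: usize) -> Vec<(usize, u32)> {
    let mut ranked: Vec<(usize, u32)> = cell
        .iter()
        .map(|(row, code)| (*row, adc_distance(table, code)))
        .collect();
    ranked.sort_by_key(|&(row, dist)| (dist, row));
    ranked.truncate(k);
    ranked
}

/// Illustrative table where the distance to centroid `c` is simply `c`.
pub fn demo_table() -> DistanceTable {
    let mut table = [[0u32; K]; M];
    for row in table.iter_mut() {
        for (c, d) in row.iter_mut().enumerate() {
            *d = c as u32;
        }
    }
    table
}
```

The `cell` argument is where the coarse stage plugs in: in the framing of this commit, the rows passed to `adc_rank` would be the ones selected by the key-prefix prune, and switching tenant means switching which `DistanceTable` is used.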