Commit f5a3f8a

Merge remote-tracking branch 'origin/main' into HEAD
2 parents 15ced2d + 094f06d commit f5a3f8a

13 files changed

Lines changed: 2144 additions & 9 deletions

.claude/board/EPIPHANIES.md

Lines changed: 56 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,59 @@
1+
## 2026-06-06 — E-DEINTERLACE-TWO-SCALES — deinterlace is one operation at two scales; no-cross-cycle-lag = byte-scale deinterlace
2+
3+
**Status:** FINDING (source-grounded; `temporal.rs` PR #468 confirms row-scale; byte-scale is a documented gap)
4+
**Confidence:** High
5+
6+
**The synthesis:** temporal causality in the SoA system must be enforced at two
7+
independent scales that share the same monotonic clock:
8+
9+
```text
Row/query scale   → HLC tick + DependsClosure  → temporal.rs::deinterlace() (SHIPPED, PR #468)
Byte/column scale → SoaEnvelope::cycle() stamp → MailboxSoA Arc-swap COW (GAP — plan written)
```
13+
14+
Both are the SAME operation — "sort by the causal clock and project the result
15+
into the reader's reference frame" — but at different granularities.
16+
17+
**Row scale (PR #468 confirms):**
18+
`temporal.rs:18-20` defines the standing wave correctly: "merge-sort by HLC
19+
tick and every field's row lands on one timeline. The result IS the standing
20+
wave / kanban SoA." The `deinterlace()` function + `EpistemicMode` (Strict /
21+
Aware / Retro) + `DependsClosure` implement this. 8 tests pass.
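
A minimal sketch of the "merge-sort by tick onto one timeline" idea, to make the row-scale operation concrete. This is not the code in `temporal.rs`: `FieldRow`, `merge_by_tick`, and the `ref_tick` cut-off are hypothetical names chosen for illustration, and `EpistemicMode` / `DependsClosure` are deliberately left out because their semantics are not shown here.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical row: one field's value stamped with a causal tick.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FieldRow {
    pub tick: u64,
    pub field: &'static str,
    pub value: i64,
}

/// K-way merge of per-field streams (each assumed already sorted by tick)
/// into a single timeline, keeping only rows at or before `ref_tick`.
pub fn merge_by_tick(streams: &[Vec<FieldRow>], ref_tick: u64) -> Vec<FieldRow> {
    // Min-heap keyed by (tick, stream index, position in stream).
    let mut heap = BinaryHeap::new();
    for (s, stream) in streams.iter().enumerate() {
        if let Some(row) = stream.first() {
            heap.push(Reverse((row.tick, s, 0usize)));
        }
    }

    let mut out = Vec::new();
    while let Some(Reverse((tick, s, i))) = heap.pop() {
        if tick > ref_tick {
            // The heap yields ticks in ascending order, so nothing later qualifies.
            break;
        }
        out.push(streams[s][i].clone());
        if let Some(next) = streams[s].get(i + 1) {
            heap.push(Reverse((next.tick, s, i + 1)));
        }
    }
    out
}
```

The cut-off at `ref_tick` stands in for "project into the reader's reference frame": rows the reader could not yet have seen are simply not emitted.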
22+
23+
**Byte scale (current gap):**
24+
Nothing in the codebase prevents a reader from holding column data from SoA
25+
cycle N and cycle N+1 in the same SIMD sweep. The `SoaEnvelope::cycle()` stamp
26+
exists but is not enforced as a snapshot barrier.
27+
28+
**The fix (plan: `cycle-coherent-soa-snapshot-v1.md`):**
29+
Arc-swap COW at column granularity in `MailboxSoa::advance_phase`:
30+
1. Writer increments `cycle`, then swaps the `Arc<[u8]>` of each mutated column.
31+
2. Reader snapshots all column Arcs under one cycle stamp (lock-free retry).
32+
3. The resulting `MailboxSoaSnapshot { cycle, cols }` is structurally coherent.
33+
34+
**The boundary:**
35+
`MultiLaneColumn` in ndarray stays layout-only. The Arc-swap policy lives in
36+
lance-graph's `MailboxSoa`. ndarray does not learn that cycles exist.
37+
38+
**The clock is one clock:**
39+
`SoaEnvelope::cycle()` (byte scale) and `QueryReference::ref_version` (row
40+
scale) are the same monotonic sequence. Threading `snapshot.cycle` into
41+
`QueryReference` closes the loop: row-scale and byte-scale deinterlace use
42+
the same clock.
43+
44+
**Standing wave clarification (Q3 probe result):**
45+
The "standing wave" is NOT a compute recurrence. It is the deinterlaced
46+
projection over Lance versions — provided by Lance versioning itself (O(1)
47+
90° lookup). Do not implement a standing wave in compute.
48+
49+
**Cross-ref:**
50+
- PR #468 (`temporal.rs`) — row-scale (SHIPPED)
51+
- PR #477 (`soa_envelope.rs`) — envelope contract (IN REVIEW)
52+
- `.claude/plans/cycle-coherent-soa-snapshot-v1.md` — byte-scale fix plan
53+
- `docs/probes/q3-standing-wave-falsification.md` — probe confirming no wave in compute
54+
55+
---
56+
157
## 2026-06-04 — E-AUDIT-RETENTION-CAVEAT — substrate-b consumer doc Lance-versions-as-audit claim was overstated; corrected to retention-policy-gated (codex P1 on #465)
258

359
**Status:** CORRECTION (codex P1 on PR #465, 2026-06-04; merged + immediate follow-up correction per the no-silent-edit discipline — the FIX appends; the original epiphany E-SUBSTRATE-B-CAPABILITY-ROADMAP stands as the corrected reference now reads).

.claude/board/TECH_DEBT.md

Lines changed: 11 additions & 0 deletions
@@ -15,6 +15,17 @@
1515

1616
## Open Debt
1717

18+
### TD-UNBUNDLE-FROM-1 — `unbundle_from` is NOT the inverse of `bundle_into` (2026-06-07)
19+
20+
**Open.** In `crates/lance-graph-planner/src/cache/kv_bundle.rs`, `unbundle_from`
21+
uses `wrapping_sub` as the "undo" of `bundle_into`. But `bundle_into` is a
22+
weighted average: `(old * w_self + new * w_new) / total`. Subtraction is not the
23+
inverse. `AttentionMatrix::set` calls both in sequence, silently corrupting the
24+
gestalt ~1 bit per epoch. Measurable after ~100 epochs. Function is marked
25+
`#[deprecated]` with a doc warning; callers use `#[allow(deprecated)]` + FIXME.
26+
**Paid by:** switch to raw-sum + count tracking so exact integer subtraction is
27+
possible. Cross-ref: `kv_bundle.rs:28-33`.
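
A small worked sketch of why subtraction cannot undo a weighted average, and of the raw-sum + count alternative named above. The element type (`u8`), the `u32` weights, and the `SumCount` type are assumptions for illustration only; the real signatures in `kv_bundle.rs` are not shown in this diff.

```rust
/// Weighted average as described in the debt entry, using integer division.
/// Assumes `w_self + w_new > 0`.
pub fn bundle_into(old: u8, w_self: u32, new: u8, w_new: u32) -> u8 {
    let total = w_self + w_new;
    ((old as u32 * w_self + new as u32 * w_new) / total) as u8
}

/// The "undo" the entry warns about: plain wrapping subtraction.
pub fn unbundle_wrapping(bundled: u8, removed: u8) -> u8 {
    bundled.wrapping_sub(removed)
}

/// Hypothetical accumulator that keeps the raw weighted sum and the total
/// weight, so a contribution can be removed exactly.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct SumCount {
    pub sum: u32,
    pub count: u32,
}

impl SumCount {
    pub fn add(&mut self, v: u8, w: u32) {
        self.sum += v as u32 * w;
        self.count += w;
    }

    /// Exact inverse of `add`, provided the same (v, w) was added earlier.
    pub fn remove(&mut self, v: u8, w: u32) {
        self.sum -= v as u32 * w;
        self.count -= w;
    }

    pub fn mean(&self) -> Option<u8> {
        if self.count == 0 {
            None
        } else {
            Some((self.sum / self.count) as u8)
        }
    }
}
```

With `old = 100`, `w_self = 3`, `new = 40`, `w_new = 1`, the bundled value is `(300 + 40) / 4 = 85`. Subtracting 40 gives 45 rather than the original 100, whereas the sum/count form returns to `300 / 3 = 100` after removal.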
28+
1829
### TD-HELIX-OVERLAP-1 (D-HELIX-1) — `helix` re-derives existing CERTIFIED primitives (clean-room by directive)
1930

2031
**Open.** `crates/helix` ships as a zero-dep clean-room codec per the user directive "scoped only to crate." ~80% of its pipeline duplicates existing, in-places-CERTIFIED workspace code: Fisher-Z/arctanh→i8 (`bgz-tensor::projection::Base17Fz`, `bgz-tensor::fisher_z::FamilyGamma` ρ≥0.999), golden-spiral azimuth (`jc::weyl`), stride-4 coupling (`thinking-engine::reencode_safety`, `highheelbgz`), EULER_GAMMA hand-off (`jc::precond`, `bgz-tensor::euler_fold`), 256-palette/L1 (`bgz17::palette`). Genuinely new = the `√u` equal-area hemisphere placement + the PLACE/RESIDUE doctrine. **Paid by** (when it graduates from clean-room): the consolidation path in `crates/helix/KNOWLEDGE.md` § Overlap & Consolidation — re-export `FamilyGamma` behind a feature; route coupling through the canonical `(i·11)%17`/stride-4 zipper; feed `ResidueEdge` into the existing HIP/TWIG CAKES tier. **Also owed:** a fidelity-vs-ground-truth probe (the naive-u8 floor gate ≥0.9980 Pearson is currently CONJECTURE — NOT RUN) before promotion. Cross-ref: E-HELIX-OVERLAP, `encoding-ecosystem.md`.

.claude/plans/cycle-coherent-soa-snapshot-v1.md

Lines changed: 176 additions & 0 deletions
@@ -0,0 +1,176 @@
1+
# Plan: Cycle-Coherent SoA Snapshot — No-Cross-Cycle-Lag Guarantee
2+
3+
**Version:** v1
4+
**Date:** 2026-06-06
5+
**Status:** Queued
6+
**D-ids:** D-SOA-SNAP-1 through D-SOA-SNAP-6
7+
8+
---
9+
10+
## The problem
11+
12+
`temporal.rs` (PR #468) closes the row-scale deinterlace gap: HLC tick →
13+
`classify/deinterlace` → causally-coherent row sequence. But there is a
14+
parallel byte-scale gap: nothing prevents a reader from holding a mix of
15+
column data from cycle N and cycle N+1 within the same SIMD sweep. This is
16+
the **cross-cycle lag problem** — a SIMD sweep that is not internally
17+
single-cycle is not coherent.
18+
19+
The deinterlace operation is one operation at two scales:
20+
21+
```text
Row/query scale   → HLC tick + DependsClosure  → temporal.rs (SHIPPED, PR #468)
Byte/column scale → SoaEnvelope::cycle() stamp → MailboxSoA Arc-swap (THIS PLAN)
```
25+
26+
---
27+
28+
## The mechanism: Arc-swap COW at column granularity
29+
30+
The SoA mailbox carries its columns as `Arc<[u8]>` slices (via
31+
`MultiLaneColumn` in ndarray). The invariant is:
32+
33+
> **A reader that snapshots all column Arcs at the same `cycle()` stamp sees
34+
> a single coherent cycle. No column can be from a prior cycle.**
35+
36+
### Write path (in `lance-graph`, `MailboxSoa::advance_phase`)
37+
38+
On every `advance_phase(to: KanbanPhase)`:
39+
40+
1. Increment `cycle` counter on the envelope.
41+
2. For each mutated column: swap the `Arc` pointer — `Arc::make_mut` on the
42+
backing `Arc<[u8]>` of the `MultiLaneColumn`, write the new data, then
43+
publish the new Arc via an `ArcSwap` (or `RwLock<Arc<MultiLaneColumn>>`).
44+
3. The cycle increment is a `SeqCst` store (fence) BEFORE the column Arc
45+
swaps. Readers who observe the new cycle will see the new column data.
46+
47+
### Read path (in `lance-graph`, `MailboxSoaView`)
48+
49+
On `snapshot()`:
50+
51+
1. Load cycle stamp.
52+
2. Clone all column Arcs under the same cycle stamp (atomic snapshot loop:
53+
re-read cycle after loading all Arcs; retry if it changed — lock-free
54+
single-retry is sufficient because writers are serialized through
55+
`advance_phase`).
56+
3. Return `MailboxSoaSnapshot { cycle, cols: [...] }`.
57+
58+
The snapshot guarantees all column data is from the same cycle.
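
The write and read paths above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the planned `MailboxSoa` code: `Mailbox`, `Snapshot`, and `advance` are hypothetical names, columns are bare `Arc<[u8]>` rather than `MultiLaneColumn`, and the counter follows a seqlock-style even/odd convention (odd while a write is in flight). That convention is a stricter variant than a single increment before the swaps; it is assumed here so that a reader racing a half-finished write can detect it. The sketch also loops until it gets a stable read instead of retrying once. Column loads take a brief `RwLock` read lock, so only the retry loop is lock-free.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the mailbox: a sequence counter plus columns.
pub struct Mailbox {
    /// Even = stable, odd = write in progress. Published cycle = seq / 2.
    seq: AtomicU32,
    cols: Vec<RwLock<Arc<[u8]>>>,
}

/// Point-in-time view: every column belongs to `cycle`.
pub struct Snapshot {
    pub cycle: u32,
    pub cols: Vec<Arc<[u8]>>,
}

impl Mailbox {
    pub fn new(n_cols: usize, len: usize) -> Self {
        let cols: Vec<RwLock<Arc<[u8]>>> = (0..n_cols)
            .map(|_| RwLock::new(Arc::from(vec![0u8; len])))
            .collect();
        Self { seq: AtomicU32::new(0), cols }
    }

    /// Single-writer step (callers are assumed to be serialized): publish a
    /// new cycle whose columns are filled with the cycle number modulo 256.
    pub fn advance(&self) {
        let start = self.seq.fetch_add(1, Ordering::SeqCst); // seq becomes odd
        let cycle = start / 2 + 1;
        for col in &self.cols {
            let len = col.read().unwrap().len();
            let fresh: Arc<[u8]> = Arc::from(vec![cycle as u8; len]);
            *col.write().unwrap() = fresh; // swap the Arc; old readers keep theirs
        }
        self.seq.fetch_add(1, Ordering::SeqCst); // seq becomes even again
    }

    /// Retry until the counter is even and unchanged across the column loads.
    pub fn snapshot(&self) -> Snapshot {
        loop {
            let before = self.seq.load(Ordering::SeqCst);
            if before % 2 == 1 {
                std::hint::spin_loop();
                continue;
            }
            let cols: Vec<Arc<[u8]>> = self
                .cols
                .iter()
                .map(|c| c.read().unwrap().clone())
                .collect();
            if self.seq.load(Ordering::SeqCst) == before {
                return Snapshot { cycle: before / 2, cols };
            }
        }
    }
}
```

Because each column records the cycle that wrote it, a reader can check coherence directly: every byte in every column of a snapshot should equal `snapshot.cycle as u8`.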
59+
60+
### Boundary: ndarray stays layout-only
61+
62+
`MultiLaneColumn` in ndarray is `Arc<[u8]>` with typed lane iterators —
63+
**layout-only**. The Arc-swap policy (when to swap, how to snapshot, the
64+
cycle fence) belongs in `lance-graph`'s `MailboxSoa`. ndarray never learns
65+
that cycles or snapshots exist. The boundary is:
66+
67+
```text
ndarray::simd::MultiLaneColumn — Arc<[u8]>, lane iters, Send + Sync, zero-copy reads
lance-graph::MailboxSoa        — Arc-swap on advance_phase, cycle fence, snapshot()
```
71+
72+
### Connection to temporal.rs
73+
74+
`SoaEnvelope::cycle()` is the byte-scale clock. `QueryReference::ref_version`
75+
is the row-scale clock (a Lance version). They are the same monotonic clock
76+
at different granularities — Lance version N corresponds to SoA cycle C(N).
77+
When `temporal.rs::deinterlace` runs at query time, the `V_ref` it uses should
78+
align with the `cycle()` of the snapshot being queried.
79+
80+
Wiring: `VersionScheduler::on_version(&view, at, exec)` provides the Lance
81+
version; the `MailboxSoaSnapshot` that went into that version carries its
82+
`cycle`. Threading `snapshot.cycle` into `QueryReference` closes the loop so
83+
row-scale and byte-scale deinterlace use the same clock.
84+
85+
---
86+
87+
## Deliverables
88+
89+
### D-SOA-SNAP-1 — `MailboxSoaSnapshot` type in lance-graph-contract
90+
91+
A `MailboxSoaSnapshot` struct: `cycle: u32`, `cols: Vec<Arc<MultiLaneColumn>>`.
92+
Snapshot is `Send + Sync`. No reference to the originating `MailboxSoa`.
93+
This is a point-in-time read — immutable after creation.
94+
95+
### D-SOA-SNAP-2 — `SnapshotProvider` trait in lance-graph-contract
96+
97+
```rust
pub trait SnapshotProvider {
    fn snapshot(&self) -> MailboxSoaSnapshot;
}
```
102+
103+
Zero deps in contract. `MailboxSoa` in lance-graph implements it.
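
A sketch of how the contract-side types from D-SOA-SNAP-1 and D-SOA-SNAP-2 might fit together using only the standard library. `Column` is a hypothetical stand-in for `MultiLaneColumn`, and `Frozen` is a toy provider invented for illustration; neither is part of the plan.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for ndarray's `MultiLaneColumn` (layout only).
pub struct Column(pub Arc<[u8]>);

/// Point-in-time read: holds no reference back to the mailbox it came from.
pub struct MailboxSoaSnapshot {
    pub cycle: u32,
    pub cols: Vec<Arc<Column>>,
}

pub trait SnapshotProvider {
    fn snapshot(&self) -> MailboxSoaSnapshot;
}

/// Toy provider that always hands out the same frozen columns.
pub struct Frozen {
    cycle: u32,
    cols: Vec<Arc<Column>>,
}

impl Frozen {
    pub fn new(cycle: u32, data: &[&[u8]]) -> Self {
        let cols = data
            .iter()
            .map(|d| Arc::new(Column(Arc::from(*d))))
            .collect();
        Self { cycle, cols }
    }
}

impl SnapshotProvider for Frozen {
    fn snapshot(&self) -> MailboxSoaSnapshot {
        // Cloning the Vec clones Arc pointers only; column bytes are shared.
        MailboxSoaSnapshot { cycle: self.cycle, cols: self.cols.clone() }
    }
}

/// Compiles only if `T` is `Send + Sync`.
pub fn assert_send_sync<T: Send + Sync>() {}
```

Calling `assert_send_sync::<MailboxSoaSnapshot>()` turns the "Snapshot is `Send + Sync`" requirement into a compile-time check, and the trait as written can be used behind `Box<dyn SnapshotProvider>`.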
104+
105+
### D-SOA-SNAP-3 — Arc-swap write path in `MailboxSoa::advance_phase`
106+
107+
In lance-graph (not contract, not ndarray): implement the cycle fence +
108+
column Arc-swap on every `advance_phase`. Use `std::sync::RwLock<Arc<MultiLaneColumn>>`
109+
per column (no external arc-swap crate needed unless benchmarks show
110+
contention; add as a feature flag if needed).
111+
112+
### D-SOA-SNAP-4 — `snapshot()` implementation on `MailboxSoa`
113+
114+
Lock-free snapshot: load cycle, clone all column Arcs, re-read cycle, retry
115+
once if changed. Return `MailboxSoaSnapshot`.
116+
117+
### D-SOA-SNAP-5 — No-cross-cycle-lag falsification test
118+
119+
```rust
// Spawn a writer thread: advance_phase in a loop (100 cycles).
// Spawn 8 reader threads: each calls snapshot() in a loop.
// Assert: every snapshot has all columns reporting the same cycle.
// Assert: no snapshot mixes data from two different cycles.
```
125+
126+
The test is the formal statement of the guarantee. If it passes, the
127+
invariant is mechanically enforced, not just documented.
128+
129+
### D-SOA-SNAP-6 — Wire `snapshot.cycle` into `QueryReference`
130+
131+
In the planner: when a query resolves a `MailboxSoaSnapshot`, thread
132+
`snapshot.cycle` through `QueryReference::hlc_tick` (or a new
133+
`QueryReference::soa_cycle: Option<u32>` field) so `deinterlace` at
134+
row scale uses the same cycle boundary as the snapshot at byte scale.
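
One possible shape for the second option (a separate `soa_cycle` field), sketched with hypothetical types. `QueryRef` is a stand-in for `QueryReference`, and the `admits` rule ("a row is admissible if its cycle is at or before the pinned snapshot cycle") is an assumption made for illustration, not a rule stated in the plan.

```rust
/// Hypothetical stand-in for `QueryReference`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct QueryRef {
    /// Row-scale clock: the Lance version the query is pinned to.
    pub ref_version: u64,
    /// Byte-scale clock: the cycle of the snapshot being queried, if any.
    pub soa_cycle: Option<u32>,
}

impl QueryRef {
    /// Pin the query to the cycle carried by a resolved snapshot.
    pub fn with_snapshot(mut self, cycle: u32) -> Self {
        self.soa_cycle = Some(cycle);
        self
    }

    /// Assumed admission rule: with no pinned cycle everything passes;
    /// otherwise only rows stamped at or before the pinned cycle do.
    pub fn admits(&self, row_cycle: u32) -> bool {
        self.soa_cycle.map_or(true, |c| row_cycle <= c)
    }
}
```

Keeping `soa_cycle` optional leaves existing row-scale queries unchanged until a snapshot is actually resolved.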
135+
136+
---
137+
138+
## Prerequisite gap fixes (order matters)
139+
140+
These mechanical fixes should land before or alongside D-SOA-SNAP-1
141+
(they settle the column shape):
142+
143+
1. Remove `MailboxSoA::emit()` + `CollapseGateEmission` from source.
144+
2. Rename `last_emission_cycle` to `last_active_cycle` in MailboxSoA.
145+
3. Drop `entity_type: u16` from SoA row — MailboxId IS NiblePath.
146+
4. Fix `OntologyRegistry::enumerate_first_with_entity_type_id` linear scan.
147+
5. Remove `MappingRow.thinking_style` — Kanban owns thinking styles.
148+
6. Fix `unbundle_from` in `kv_bundle.rs:29`: `wrapping_sub` is not the
149+
inverse of weighted-average `bundle_into`.
150+
151+
Items 1-5 settle the column shape before the Arc-swap schema is frozen.
152+
Item 6 is independent but should not be deferred (correctness bug).
153+
154+
---
155+
156+
## Non-goals
157+
158+
- No recurrence / standing wave implementation. The standing wave is the
159+
deinterlaced Lance version projection, provided by Lance versioning
160+
(O(1) 90° lookup). Do not implement it in compute.
161+
- No baton. No emission. No inter-mailbox handoff type. The snapshot is
162+
consumed in-place; nothing is transmitted.
163+
- ndarray does not learn about cycles, snapshots, or advance_phase.
164+
165+
---
166+
167+
## Cross-references
168+
169+
- `temporal.rs` (PR #468) — row-scale deinterlace (SHIPPED)
170+
- `soa_envelope.rs` (PR #477) — envelope LE contract (IN REVIEW)
171+
- `soa-three-tier-model.md` — three-tier lifecycle model
172+
- `q3-standing-wave-falsification.md` — falsification: standing wave = Lance
173+
versioning, not compute recurrence
174+
- `.claude/board/EPIPHANIES.md` E-DEINTERLACE-TWO-SCALES — the synthesis
175+
- `ndarray/src/simd_soa.rs`: `MultiLaneColumn` (layout-only; Arc-swap lives
176+
in lance-graph, not here)

CLAUDE.md

Lines changed: 22 additions & 0 deletions
@@ -1,5 +1,27 @@
11
# CLAUDE.md — lance-graph
22

3+
## P0 — AdaWorldAPI forks ONLY, NEVER crates.io upstream
4+
5+
**Always depend on the AdaWorldAPI fork of any crate that has one. NEVER use the
6+
upstream crates.io version of a forked crate.** Non-negotiable; applies to every
7+
`Cargo.toml` and every dependency decision in this repo. Every repo in this
8+
workspace is local — prefer the local/fork source over the registry, always.
9+
10+
- Crates with an `AdaWorldAPI/<name>` fork — e.g. `ndarray`, `lance` /
11+
`lancedb` / `lance-index` / `lance-linalg` / `lance-namespace`, `surrealdb`,
12+
and any other — MUST be wired via the fork (`path` / `git` / `[patch.crates-io]`),
13+
never the registry version.
14+
- If a fork's coordinates (git URL, branch/tag, feature flag) are unknown,
15+
**STOP and ask**. Do NOT fall back to crates.io as a convenience or to make a
16+
build pass.
17+
- `"warning: Patch <crate> ... was not used in the crate graph"` is a policy
18+
alert. It can indicate missing fork wiring OR a transitive semver mismatch
19+
that prevents the patch from applying. Do not ignore it: verify direct
20+
`Cargo.toml` patch entries and `Cargo.lock` wiring, then track/resolve any
21+
transitive blocker explicitly before closing the issue.
22+
- crates.io is permitted ONLY for crates that have no AdaWorldAPI fork / no local
23+
source.
24+
325
> **Updated**: 2026-04-21 (categorical-algebraic inference click)
426
> **Role**: The obligatory spine — query engine, codec stack, semantic transformer, and orchestration contract
527
> **Status**: 22 crates, 7 in workspace, 15 excluded (standalone/DTO), Phases 1-2 DONE, Phases 6-7 DONE (grammar + governance), Phase 3 IN PROGRESS

crates/lance-graph-contract/src/lib.rs

Lines changed: 1 addition & 0 deletions
@@ -93,6 +93,7 @@ pub mod scheduler;
9393
pub mod sensorium;
9494
pub mod sigma_propagation;
9595
pub mod sla;
96+
pub mod soa_envelope;
9697
pub mod soa_view;
9798
pub mod splat;
9899
pub mod tax;
