Merged
66 changes: 59 additions & 7 deletions .claude/knowledge/pr-x12-canon-resolutions-delta.md
@@ -9,15 +9,16 @@

## 0. What's actually new

The merged canon (`bc9da4ad`) argued the architecture; canon-resolutions makes it falsifiable. Five categories of novel content survive the delta filter:
The merged canon (`bc9da4ad`) argued the architecture; canon-resolutions makes it falsifiable. Six categories of novel content survive the delta filter:

1. **Concrete trait signatures** — R-1 (`Basis<T>` + `LinearReduce` split), §8 surface (`PredictiveSignal`, `CurveOrder<const N>`, `RdoMetric`)
2. **Quantified budgets** — R-3 LoC envelope per sub-card / per consumer + audit rule; R-4 four Plan G thresholds; R-11 4K@60fps latency budget
3. **Math identities** — R-6 SSD-via-VNNI (`||A||² - 2A·B + ||B||²`), R-7 tropical-GEMM partition (`O(4^d) → O(d²)`)
3. **Math identities** — R-6 SSD-via-VNNI (`||A||² - 2A·B + ||B||²`), R-7 tropical-GEMM partition (`O(4^d) → O(d²)`, kernel at `bgz17::scalar_sparse::tropical_spmv`)
4. **Type-level invariants** — R-2 bit-15/bit-14 split, R-9 topology-FREE codec
5. **Phasing patterns** — R-8 confidence-gate framing, R-13 Option-A-then-B for federated codebook
5. **Phasing patterns** — R-8 confidence-gate framing, R-13 Option-A-then-B for federated codebook (primitives: `cam_pq` + `bgz-hhtl-d` + `dn_tree` + `merkle_tree`)
6. **Formal-correctness + stream lane (post-merge)** — R-14 (`jc::pflug` Pillar 10 + `jc::hambly_lyons` Pillar 11), R-15 (`SignatureBasis<DEPTH>` as fifth Plan G lane)

Plus the synthesis layer: §9 falsifiability matrix (24 rows), §10 sequencing with named gates, §12 compaction-preservation contract.
Plus the synthesis layer: §9 falsifiability matrix (24+3 rows including R-14/R-15), §10 sequencing with named gates, §12 compaction-preservation contract.
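The R-6 identity listed above (`||A||² - 2A·B + ||B||²`) can be checked on a toy block. This is a minimal scalar sketch, not the VNNI kernel: the function names are hypothetical, and the only point is that SSD reduces to three integer dot products, which is the operation a VNNI-style instruction accelerates.

```rust
/// Direct sum of squared differences over two u8 blocks.
fn ssd_direct(a: &[u8], b: &[u8]) -> i64 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            let d = i64::from(x) - i64::from(y);
            d * d
        })
        .sum()
}

/// Plain integer dot product (scalar stand-in for a VNNI accumulate).
fn dot(a: &[u8], b: &[u8]) -> i64 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| i64::from(x) * i64::from(y))
        .sum()
}

/// SSD expressed as ||A||^2 - 2 A.B + ||B||^2.
fn ssd_via_dot(a: &[u8], b: &[u8]) -> i64 {
    dot(a, a) - 2 * dot(a, b) + dot(b, b)
}

fn main() {
    let a = [10u8, 200, 33, 0, 255, 7, 128, 64];
    let b = [12u8, 190, 40, 5, 250, 7, 100, 70];
    // Both formulations must agree exactly in integer arithmetic.
    assert_eq!(ssd_direct(&a, &b), ssd_via_dot(&a, &b));
    println!("ssd = {}", ssd_via_dot(&a, &b));
}
```

In a codec the `||A||²` and `||B||²` terms can be computed once per block and reused across candidates, leaving one dot product per comparison; whether the real kernel does this is not stated in this doc.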

---

@@ -216,7 +217,9 @@ Tropical-semiring (+, min) formulation:

At 4K (132K CTUs/frame): ~4 ms vs ~64 ms for partition RDO alone. At 60 fps the per-frame budget is ~16.7 ms, so this is the difference between fitting the budget and missing it.

**Dep direction:** `ndarray-codec → lance-graph::blasgraph` (tropical-GEMM kernels live in blasgraph). Allowed post-Plan-H because ndarray-codec is a sibling crate, not the bottom.
**Dep direction:** `ndarray-codec → lance-graph::blasgraph` (tropical-GEMM kernels nominally live in blasgraph). Allowed post-Plan-H because ndarray-codec is a sibling crate, not the bottom.

**Actual kernel home (current):** `lance-graph::bgz17::scalar_sparse::tropical_spmv`. The `blasgraph` namespace is the eventual abstraction; until that lands, ndarray-codec depends on bgz17 directly. Cite the symbol when wiring A6, not the namespace.

**Plan A6 (1 week) ships this.** λ-RDO knob scales edge weights; tropical-GEMM relaxation computes optimal mode tree.
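To make the "λ scales edge weights, relaxation picks the tree" statement concrete, here is a minimal dense (min, +) sketch on a four-node toy DAG. Everything in it is an assumption for illustration: the `distortion + λ·rate` weight form, the function names, and the dense matrix layout. It is not the `tropical_spmv` kernel and does not reproduce its sparse representation or its complexity.

```rust
/// Tropical "zero": the identity element for min.
const INF: f64 = f64::INFINITY;

/// Hypothetical edge weight: distortion plus lambda-scaled rate.
fn rd_cost(distortion: f64, rate_bits: f64, lambda: f64) -> f64 {
    distortion + lambda * rate_bits
}

/// One (min, +) matrix-vector step:
/// out[v] = min(dist[v], min over u of (dist[u] + w[u][v])).
fn tropical_step(w: &[Vec<f64>], dist: &[f64]) -> Vec<f64> {
    let n = dist.len();
    (0..n)
        .map(|v| (0..n).fold(dist[v], |best, u| best.min(dist[u] + w[u][v])))
        .collect()
}

/// Cheapest cost from `source` to every node; n-1 steps suffice
/// when all weights are non-negative.
fn relax(w: &[Vec<f64>], source: usize) -> Vec<f64> {
    let n = w.len();
    let mut dist = vec![INF; n];
    dist[source] = 0.0;
    for _ in 1..n {
        dist = tropical_step(w, &dist);
    }
    dist
}

/// Toy DAG 0 -> {1, 2} -> 3; branch via 2 has low distortion but high rate.
fn toy_dag(lambda: f64) -> Vec<Vec<f64>> {
    let mut w = vec![vec![INF; 4]; 4];
    w[0][1] = rd_cost(4.0, 2.0, lambda);
    w[0][2] = rd_cost(1.0, 10.0, lambda);
    w[1][3] = rd_cost(3.0, 2.0, lambda);
    w[2][3] = rd_cost(1.0, 2.0, lambda);
    w
}

fn main() {
    for lambda in [0.0, 0.5, 2.0] {
        let dist = relax(&toy_dag(lambda), 0);
        println!("lambda = {lambda}: best cost to node 3 = {}", dist[3]);
    }
}
```

Raising λ makes the rate-heavy branch through node 2 more expensive, so the cheapest route to node 3 switches to the branch through node 1; that is the knob behaviour the paragraph describes, on a graph small enough to check by hand.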

@@ -292,6 +295,16 @@ Pattern: ship simplest-that-works, measure, escalate. Don't pick best-in-theory

Wire-format hook for Option A: `WorkerId: u16` + `CodebookHash: u64` in frame header.
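A minimal sketch of how those two fields could be packed, assuming they sit contiguously in the frame header and are little-endian. The struct name, the byte order, and the 10-byte layout are assumptions made for illustration; the doc fixes only the field names and widths.

```rust
/// Hypothetical holder for the two Option A header fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CodebookRef {
    worker_id: u16,     // `WorkerId: u16`
    codebook_hash: u64, // `CodebookHash: u64`
}

impl CodebookRef {
    /// 2 bytes + 8 bytes, assuming no padding between the fields.
    const LEN: usize = 10;

    /// Serialize as little-endian bytes (byte order is an assumption).
    fn to_bytes(self) -> [u8; Self::LEN] {
        let mut out = [0u8; Self::LEN];
        out[..2].copy_from_slice(&self.worker_id.to_le_bytes());
        out[2..].copy_from_slice(&self.codebook_hash.to_le_bytes());
        out
    }

    /// Parse from the front of a buffer; returns None if it is too short.
    fn from_bytes(buf: &[u8]) -> Option<Self> {
        if buf.len() < Self::LEN {
            return None;
        }
        let worker_id = u16::from_le_bytes([buf[0], buf[1]]);
        let mut hash = [0u8; 8];
        hash.copy_from_slice(&buf[2..Self::LEN]);
        Some(Self {
            worker_id,
            codebook_hash: u64::from_le_bytes(hash),
        })
    }
}

fn main() {
    let tag = CodebookRef {
        worker_id: 0x0102,
        codebook_hash: 0xDEAD_BEEF_0000_0001,
    };
    let bytes = tag.to_bytes();
    assert_eq!(CodebookRef::from_bytes(&bytes), Some(tag));
    println!("{bytes:02x?}");
}
```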

**Implementation primitives** (already exist; PR-X12 only adds the wire format + `CodebookHandle` trait):

| Concern | Crate / module |
|---|---|
| Codebook training (k-means + CAM-PQ) | `ndarray::hpc::cam_pq::CamCodebook` |
| Deployed encoding format | `lance-graph::bgz-tensor::Codebook4096` / `bgz-hhtl-d` |
| Online plastic updates (SharedClusterWide) | `ndarray::hpc::dn_tree` |
| Integrity proof (Blake3-48 Merkle root, xor_diff) | `ndarray::hpc::merkle_tree` |
| Gossip protocol | `q2` (external) |

### 5.3 Streaming flush granularity (R-12)

Per-CTU default. `FlushUnit` 2-bit tag in frame header:
@@ -405,9 +418,48 @@ Citation IDs (R-1..R-13) stable. Canon IDs (M:E-*, M:H-*, M:H-NEW-*, M:T-*, A:E-

---

## 11. The single load-bearing paragraph (§13)
## 11. Formal-correctness layer (R-14) — post-merge addition

The substrate-binding doc (`pr-x12-cam-pq-sigker-dn-tree-substrate-bindings.md`) surfaced two formal proofs in `lance-graph::jc` that the codec inherits without re-proving:

| Pillar | Crate / module | What it proves | Status |
|---|---|---|---|
| **Pillar 10** (Pflug-Pichler) | `jc::pflug` | Nested-distance Lipschitz on Sigma DN-trees: CAM-PQ tree quantization preserves FreeEnergy within Lε | Active in default zero-dep build |
| **Pillar 11** (Hambly-Lyons) | `jc::hambly_lyons` | Signature uniqueness on tree-quotient: any path of bounded variation is uniquely determined by its signature up to tree-like equivalence (Annals 171(1), arXiv:math/0507536); the theorem concerns the full signature, and the truncated signature used here is a finite-depth approximation | Active under `--features hambly-lyons` (PR #348, 2026-05-07); probe passes (forward<1e-9, converse>0.05, ratio≥1e6) |

R-4's quality-floor rows for video / KV / gradient inherit Pillar 10's Lipschitz bound. R-15's signature lane gates on Pillar 11.

**Open work (G-4):** PR #350 corrects `sigker::signature_kernel_pde`'s known Goursat-PDE math bug; Pillar 11's probe deliberately uses `signature_truncated` (tensor-algebra) until PR #350 lands. Production-scale benchmarking pending.

---

## 12. Stream-signal codec lane (R-15) — post-merge addition

`SignatureBasis<const DEPTH: usize>: Basis<f32>` is the fifth concrete `Basis<T>` impl, complementing the four lanes in §1's table:

```rust
// New: ndarray::hpc::signature (~1 wk, wraps sigker::signature_truncated)
impl<const DEPTH: usize> Basis<f32> for SignatureBasis<DEPTH> {
    fn dim(&self) -> usize { /* truncated tensor-algebra dim */ }
    fn apply(&self, path: &[f32], signature: &mut [f32]) {
        // iterated-integral truncation via sigker::signature_truncated
    }
    fn invert(&self, _sig: &[f32], _path: &mut [f32]) {
        unimplemented!("path-from-signature is unique only up to tree-like \
                        equivalence per R-14 Pillar 11")
    }
}
```

**Plan G gets a fifth lane: "stream signal"** — audio waveforms / time-series / gesture / handwriting paths. Codec is `SignatureBasis<DEPTH=3>` + standard rANS over the four-mode taxonomy; quality floor inherits from Pillar 11 (R-14); compression target ~10× over raw f32 path samples (calibrate during Plan G).

**Why `signature_truncated` not `signature_kernel_pde`:** the PDE form ships a known divergence bug (PR #350). The tensor-algebra path is correct today and is what Pillar 11 cites.
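For readers unfamiliar with the tensor-algebra path, the following is a minimal sketch of a depth-2 truncated signature for a piecewise-linear path, built segment by segment with Chen's identity. It is an illustration written for this note, not the `sigker::signature_truncated` implementation; the function name, the `f64` element type, and the fixed depth of 2 are all assumptions.

```rust
/// Depth-2 truncated signature of a piecewise-linear path in R^D.
/// Returns (level 1, level 2). For each segment with increment `delta`,
/// Chen's identity gives: s2 += s1 (x) delta + 0.5 * delta (x) delta,
/// then s1 += delta.
fn signature_depth2<const D: usize>(points: &[[f64; D]]) -> ([f64; D], [[f64; D]; D]) {
    let mut s1 = [0.0; D];
    let mut s2 = [[0.0; D]; D];
    for pair in points.windows(2) {
        let mut delta = [0.0; D];
        for i in 0..D {
            delta[i] = pair[1][i] - pair[0][i];
        }
        for i in 0..D {
            for j in 0..D {
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j];
            }
        }
        for i in 0..D {
            s1[i] += delta[i];
        }
    }
    (s1, s2)
}

fn main() {
    // Out-and-back path: tree-like, so its signature equals that of a constant path.
    let (s1, s2) = signature_depth2::<2>(&[[0.0, 0.0], [1.0, 2.0], [0.0, 0.0]]);
    assert_eq!(s1, [0.0, 0.0]);
    assert_eq!(s2, [[0.0; 2]; 2]);

    // Two L-shaped paths with the same endpoints but opposite turn order:
    // level 1 agrees, level 2 (the signed-area term) tells them apart.
    let (a1, a2) = signature_depth2::<2>(&[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]);
    let (b1, b2) = signature_depth2::<2>(&[[0.0, 0.0], [0.0, 1.0], [1.0, 1.0]]);
    assert_eq!(a1, b1);
    assert_ne!(a2, b2);
    println!("a2 = {a2:?}, b2 = {b2:?}");
}
```

The out-and-back case is why `invert` is marked N/A: a path that retraces itself contributes nothing to the signature, so the original path cannot be recovered from it, only its class up to tree-like equivalence.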

---

## 13. The single load-bearing paragraph (canon-resolutions §13)

> *The merged canon committed to the right architectural synthesis (M:E-A, M:E-D, M:E-G, M:E-I) but left the load-bearing contracts unsigned. Canon-resolutions commits them: `Basis<T>` + `LinearReduce` are two traits not one (R-1); bit 14 of the leaf header is consumer-typed and bit 15 universal (R-2); generic codec body ≤1500 LoC with ≤200 LoC per consumer (R-3); four threshold pairs gate Plan G's pass criteria (R-4); the trajectory is Plan G (2 wks) → Plan A7 critical path (1.5 wks) → Phase 2 consumers parallel (3 wks); end state is one binary, four loads, ~2 KLoC stack demonstrating M:H-NEW-1 in ~10.5 weeks of wall-clock. Every claim in §9 has a test; Plan G's bench-harness binary is the audit. The falsifiability is the point.*
> *The merged canon committed to the right architectural synthesis (M:E-A, M:E-D, M:E-G, M:E-I) but left the load-bearing contracts unsigned. Canon-resolutions commits them: `Basis<T>` + `LinearReduce` are two traits not one (R-1); bit 14 of the leaf header is consumer-typed and bit 15 universal (R-2); generic codec body ≤1500 LoC with ≤200 LoC per consumer (R-3); four threshold pairs gate Plan G's pass criteria (R-4); the trajectory is Plan G (2 wks) → Plan A7 critical path (1.5 wks) → Phase 2 consumers parallel (3 wks); end state is one binary, four loads, ~2 KLoC stack demonstrating M:H-NEW-1 in ~10.5 weeks of wall-clock. Every claim in §9 has a test; Plan G's bench-harness binary is the audit. The falsifiability is the point. The substrate-binding follow-up (R-14, R-15) adds a formal-correctness layer via `jc` pillars and a fifth stream-signal lane via `SignatureBasis<DEPTH>`.*
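The R-2 split named in the paragraph above (bit 15 universal, bit 14 consumer-typed) can be sketched as plain bit masking. The 16-bit header width, the treatment of the low 14 bits as an opaque payload, and all names below are assumptions for illustration; this excerpt does not define what either bit means beyond that split.

```rust
/// Bit 15: the universal flag (R-2).
const UNIVERSAL_BIT: u16 = 1 << 15;
/// Bit 14: the consumer-typed flag (R-2).
const CONSUMER_BIT: u16 = 1 << 14;
/// Remaining low 14 bits, treated here as an opaque payload (assumption).
const PAYLOAD_MASK: u16 = !(UNIVERSAL_BIT | CONSUMER_BIT);

/// Hypothetical decoded view of a 16-bit leaf header.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LeafHeader {
    universal: bool,
    consumer: bool,
    payload: u16,
}

/// Split a raw header word into its two flags and payload.
fn decode(raw: u16) -> LeafHeader {
    LeafHeader {
        universal: raw & UNIVERSAL_BIT != 0,
        consumer: raw & CONSUMER_BIT != 0,
        payload: raw & PAYLOAD_MASK,
    }
}

/// Reassemble the raw header word; payload bits above bit 13 are dropped.
fn encode(h: LeafHeader) -> u16 {
    let mut raw = h.payload & PAYLOAD_MASK;
    if h.universal {
        raw |= UNIVERSAL_BIT;
    }
    if h.consumer {
        raw |= CONSUMER_BIT;
    }
    raw
}

fn main() {
    let h = decode(0xC005);
    assert!(h.universal && h.consumer);
    assert_eq!(h.payload, 5);
    assert_eq!(encode(h), 0xC005);
}
```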

---

164 changes: 155 additions & 9 deletions .claude/knowledge/pr-x12-substrate-canon-resolutions.md
@@ -24,8 +24,11 @@ were raised in review:
(R-5 through R-7 restorations)
- **§6** — three pieces of detail from session B the merge underrepresented
(R-8 through R-10 restorations)
- **§7** — three commitments missing from both originals and from the
merge (R-11 through R-13 new specs)
- **§7** — five commitments missing from both originals and from the
merge: R-11 through R-13 (latency, flush granularity, federated
codebook) plus R-14 (formal correctness via `jc` pillars) and R-15
(`SignatureBasis<DEPTH>` as fifth Plan G lane), the latter two
surfaced post-merge by the substrate-binding docs

Then five integration pieces that make the resolutions actionable:

@@ -36,9 +39,10 @@ Then five integration pieces that make the resolutions actionable:
- **§11** — end-state + trajectory (think it from the end)
- **§12** — compaction-preservation contract

Citation IDs: `R-1` through `R-13` for resolutions. Canon IDs (`M:E-*`,
`A:E-*`, `B:E-*`, `M:H-*`, `M:T-*`) remain stable; this doc adds, does
not renumber.
Citation IDs: `R-1` through `R-15` for resolutions (R-14, R-15
appended post-merge from the substrate-binding doc; numbering remains
append-only). Canon IDs (`M:E-*`, `A:E-*`, `B:E-*`, `M:H-*`, `M:T-*`)
remain stable; this doc adds, does not renumber.

Sister docs (read order):

@@ -543,6 +547,14 @@ ships tropical-GEMM kernels. No new code in ndarray; cross-repo dep
from ndarray-codec → lance-graph::blasgraph (after Plan H extraction,
this is dep-allowed because ndarray-codec is a sibling, not the bottom).

**Actual kernel home (current).** The tropical-GEMM kernel lives today
at `lance-graph::bgz17::scalar_sparse::tropical_spmv` — NOT in an
abstract `blasgraph` namespace. The codec's tropical-GEMM call is
`bgz17::scalar_sparse::tropical_spmv(edge_weights, dag)`. The
`lance-graph::blasgraph` name above is the eventual abstraction layer
(post-Plan-H extraction); until that lands, ndarray-codec depends on
bgz17 directly. Cite the symbol, not the namespace, when wiring A6.

**Plan A6 RDO (1 week) ships this.** The λ-RDO knob (per A:§10.3) and
the tropical-GEMM partition solver are the same kernel: λ scales the
edge weights, the relaxation computes the optimal mode tree.
@@ -935,10 +947,135 @@ empirically; v3 (research-grade) tries Option C.
R-4 gradient threshold (8× compression at <0.5% loss delta). At that
point, Plan F v1 escalates to Option B in a follow-up PR.

**Implementation primitives (current substrate, no new code required):**

| Concern | Crate / module |
|---------|----------------|
| Codebook training (k-means + CAM-PQ) | `ndarray::hpc::cam_pq::CamCodebook` (`train_geometric` / `train_semantic` / `train_hybrid`) |
| Deployed encoding format (per-shard) | `lance-graph::bgz-tensor::Codebook4096` and the `bgz-hhtl-d` shared-palette variant |
| Online plastic updates (`SharedClusterWide`) | `ndarray::hpc::dn_tree` (quaternary plastic memory, partial-Hamming descent) |
| Integrity proof for distributed updates | `ndarray::hpc::merkle_tree` (Blake3-48-bit, 1 KB root, `xor_diff` panCAKES compression) |
| Gossip protocol (cluster-wide) | `q2` (external — implements the wire protocol) |

The four policy modes (`LocalEphemeral` / `SharedClusterWide` /
`SharedRegional` / `PretrainedStatic`) compose these primitives
differently; the codec body exposes a `CodebookHandle` trait, and the
primitives plug in via that trait. **PR-X12 contributes the wire format
+ trait + Option A; the primitives above already exist.**

**Cite as R-13 in Plan F PR description.**

---

### R-14 — Formal correctness via `lance-graph::jc` pillars

**Problem.** Canon and resolutions describe the codec's empirical
behaviour (R-4 thresholds, R-11 latency) but never name the formal
correctness proofs the substrate already carries. Without a citation,
"the codec is correct" is unverifiable; with citations, the codec
inherits machine-checked guarantees from existing crates.

**Resolution.** Pin both pillars and what each proves.

**Two formal proofs in `lance-graph::jc`:**

- **Quantization correctness (Pillar 10, Pflug-Pichler):**
nested-distance Lipschitz on Sigma DN-trees. Proves that CAM-PQ tree
quantization preserves the FreeEnergy functional within a Lipschitz
factor Lε. **This is the proof PR-X12 cites for "wire-format
quantization is faithful."** Implementation: `jc::pflug` (active in
default build, zero-dep).
- **Path-signature correctness (Pillar 11, Hambly-Lyons):**
signature uniqueness on tree-quotient. Proves that any path of
bounded variation is uniquely determined by its signature up to
tree-like equivalence (Annals of Mathematics 171(1):109–167,
arXiv:math/0507536). The theorem concerns the full signature; the
truncated signature the codec uses is a finite-depth approximation
of that invariant. **This is the proof PR-X12 cites for the
`SignatureBasis<DEPTH>` lane (R-15).** Implementation:
`jc::hambly_lyons` (active under `--features hambly-lyons`, since
PR #348 landed on 2026-05-07).

**What the codec inherits.** Both pillars exist; the codec cites them
and does not reprove. R-4's "Quality floor" rows for video / KV /
gradient inherit Pillar 10's Lipschitz bound automatically. R-15's
signature-lane gates on Pillar 11.

**Status.**

- Pillar 10: active in default zero-dep build.
- Pillar 11: active under `--features hambly-lyons`; passes its probe
(forward < 1e-9, converse > 0.05, discrimination ratio ≥ 1e6 over
N=100 random pairs in d=3 at depth-2).
- Production-scale benchmarking + PR #350 (`signature_kernel_pde`
Goursat-PDE math correction) remain open — see Gap G-4 in
`pr-x12-cam-pq-sigker-dn-tree-substrate-bindings.md`. Pillar 11's
probe deliberately uses `signature_truncated` (tensor-algebra path),
not the buggy PDE form.

**Falsifies if.** Pillar 10 ever flips state (a regression in the
Pflug-Pichler proof bound) — Plan G's video / KV / gradient quality
floors lose their formal underwriting and become empirical-only.

**Cite as R-14 in any PR claiming "codec output is faithful to
input" or wiring `SignatureBasis` (R-15).**

---

### R-15 — `SignatureBasis<const DEPTH: usize>` as `Basis<f32>` impl

**Problem.** R-1 commits the `Basis<T>` shape; the canon lists three
concrete impls (`DctIIBasis<N>` for video, `EwaSplatBasis` for 3DGS,
`ShSpectralBasis<L>` for splat SH). No `Basis<T>` impl targets
*streams* — audio waveforms, time-series, gesture/handwriting paths.
Plan G has only four lanes; path-structured signals are unaddressed.

**Resolution.** Commit `SignatureBasis<const DEPTH: usize>: Basis<f32>`
as the fifth concrete impl, wrapping the path-signature kernel from
the external `lance-graph::sigker` crate.

```rust
// Concrete impl, lives in ndarray::hpc::signature (new module, ~1 wk)
impl<const DEPTH: usize> Basis<f32> for SignatureBasis<DEPTH> {
    fn dim(&self) -> usize { /* truncated tensor-algebra dim at DEPTH */ }
    fn apply(&self, path: &[f32], signature: &mut [f32]) {
        // iterated-integral truncation against sigker::signature_truncated
    }
    fn invert(&self, _sig: &[f32], _path: &mut [f32]) {
        // signature → path is many-to-one (tree-quotient); document as N/A
        unimplemented!("signature inversion is N/A — path unique only up to \
                        tree-like equivalence per R-14 / Pillar 11")
    }
}
```
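The `dim()` placeholder above refers to the size of the truncated tensor algebra. A minimal sketch of that count follows, assuming a path with `channels` coordinates and counting levels 1 through `depth` (the constant level-0 term is excluded). Both the function name and the level-0 convention are assumptions; the real module may count differently.

```rust
/// Number of signature coordinates at levels 1..=depth for a path in
/// R^channels: channels + channels^2 + ... + channels^depth.
/// Level 0 (the constant 1) is deliberately not counted here.
const fn truncated_sig_dim(channels: usize, depth: usize) -> usize {
    let mut total = 0;
    let mut power = 1;
    let mut level = 0;
    while level < depth {
        power *= channels;
        total += power;
        level += 1;
    }
    total
}

fn main() {
    // A 3-channel path at the depths mentioned in this doc.
    println!("d=3, depth 2: {}", truncated_sig_dim(3, 2));
    println!("d=3, depth 3: {}", truncated_sig_dim(3, 3));
}
```

Because the count grows geometrically in `depth`, raising DEPTH trades output size for discrimination power, which is relevant to the "raise DEPTH" fallback in the falsification clause.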

**Why `signature_truncated` and not `signature_kernel_pde`.** The
PDE form in sigker ships a known math bug (PR #350: Goursat-PDE form
diverges from the true kernel `I₀(2·√⟨u, v⟩)` at moderate inner
products). The tensor-algebra path (`signature_truncated`) is correct
today and is what jc Pillar 11 cites. R-15 wraps the truncated path;
the PDE form becomes available after PR #350 lands.

**Plan G gets a fifth lane.** "Stream signal" mode:

- Input: audio waveform / time-series / gesture stream
- Codec: `SignatureBasis<DEPTH=3>` truncates path signature, residuals
go through standard rANS via the four-mode taxonomy
- Quality floor: signature-uniqueness preservation per Pillar 11
- Compression target: ~10× over raw f32 path samples (estimate;
calibrate during Plan G)

**Falsifies if.** `SignatureBasis<DEPTH=3>` plus rANS fails to
reconstruct the path within ε under Pillar 11's discrimination ratio.
At that point, raise DEPTH or fall back to per-block DCT-II for the
stream lane.

**Cost.** ~1 week wrapper around `sigker::signature_truncated` +
basis-trait plumbing + Plan G fifth-lane wiring.

**Cite as R-15 in any PR adding a stream-signal codec lane or
wiring `SignatureBasis`.**

---

## 8. The canonical contracts — concrete trait signatures

All three plug-points (per M:E-E) get concrete signatures here. These
@@ -1108,6 +1245,9 @@ that decides whether each holy-grail claim is demonstrated.
| R-11 (4K 60fps SIMD-batched) | this doc | Plan G video latency assert | Per-CTU encode time | ≤210 ns/CTU on Sapphire Rapids |
| R-12 (per-CTU flush) | this doc | A8 frame-header parse + decode | First-CTU latency | First CTU decodable before frame complete |
| R-13 (Option A per-shard) | this doc | Plan F on BERT-glue | 8× compression + accuracy | Holds; else escalate to Option B |
| R-14 (Pillar 10 active) | this doc | `cargo test -p jc` (default features) | Pflug-Pichler Lipschitz bound | Pillar 10 probe green |
| R-14 (Pillar 11 active) | this doc | `cargo test -p jc --features hambly-lyons` | Signature uniqueness probe | forward < 1e-9, converse > 0.05, ratio ≥ 1e6 |
| R-15 (SignatureBasis lane) | this doc | Plan G stream-signal lane | path reconstruction under Pillar 11 | ε within discrimination ratio of Pillar 11 |

P2 Badge Make R-15 testable without requiring undefined inversion

This row defines R-15 success as path reconstruction under Pillar 11, but the same R-15 section declares invert as unimplemented! because signature-to-path is many-to-one up to tree-like equivalence (invert marked N/A). As written, the falsifiability gate is not operationally testable with the committed API contract, so downstream Plan G gating can become ambiguous or blocked. The metric should be reframed to something executable (e.g., reduced-path equivalence or signature-space fidelity) instead of direct path reconstruction.

coderabbitai[bot] marked this conversation as resolved. (Outdated)

**Every row of this matrix is a test.** Plan G's bench harness binary
emits a JSON report containing the actual measurement for each row;
@@ -1234,21 +1374,27 @@ When this doc is summarised across context windows, preserve at minimum:

1. **The five "merged well" items** from §3 (M:E-A / M:E-D / M:E-G /
M:E-I / M:E-F) — these are the canon's load-bearing synthesis.
2. **The thirteen resolutions** R-1 through R-13 with their summary
2. **The fifteen resolutions** R-1 through R-15 with their summary
commitments:
- R-1: `LinearReduce<Basis>` two-trait shape
- R-2: bit 15 universal, bit 14 consumer-typed
- R-3: ≤1500 LoC generic, ≤200 LoC per consumer
- R-4: 4 threshold pairs (video, splat, kv, grad)
- R-5: DCT crossover ~64 blocks
- R-6: SSD via VNNI ≥30× over SAD
- R-7: tropical-GEMM partition O(4^d) → O(d²)
- R-7: tropical-GEMM partition O(4^d) → O(d²) via
`bgz17::scalar_sparse::tropical_spmv`
- R-8: Plan G is confidence gate
- R-9: topology-FREE codec layer
- R-10: ~4 bits/Gaussian near target, ~1 bit stretch
- R-11: 210 ns/CTU SIMD-batched encode
- R-12: per-CTU flush default; per-bucket Plan F
- R-13: Option A (per-shard codebook) for Plan F v1
- R-13: Option A (per-shard codebook) for Plan F v1; primitives are
`cam_pq` + `bgz-hhtl-d` + `dn_tree` + `merkle_tree`
- R-14: formal correctness via `jc::pflug` (Pillar 10) +
`jc::hambly_lyons` (Pillar 11, feature-gated)
- R-15: `SignatureBasis<DEPTH>: Basis<f32>` as fifth Plan G lane
(stream signal)
3. **The trajectory** from §2 — Phase 0 → A7 → parallelise → Phase 2
4. **The five-category architecture** including `ndarray-codec`
5. **The four traits** as the canonical contracts:
@@ -1258,7 +1404,7 @@ When this doc is summarised across context windows, preserve at minimum:
7. **The falsifiability matrix in §9** — every claim has a test;
not every claim will pass; that's the design

**Citation IDs in this doc** (R-1 .. R-13) are stable. Canon IDs
**Citation IDs in this doc** (R-1 .. R-15) are stable. Canon IDs
(M:E-*, M:H-*, M:H-NEW-*, M:T-*, A:E-*, A:H-*, A:T-*, B:E-*, B:HG-*,
B:D-*) remain stable per canon's §10. Append, never renumber.
