
Commit 5c37f0c

D1.2 rotation primitives + thinking-tissue north-star epiphany
First real kernel deliverable of Phase 1: RotationKernel trait + three impls
(Identity / Hadamard / OPQ-stub) with typed RotationError. 95/95
cognitive-shader-driver tests pass under --features serve (+15 new D1.2 tests).

crates/cognitive-shader-driver/src/rotation_kernel.rs (~330 LOC):

  RotationKernel trait (object-safe, Send+Sync+Debug):
    apply(&self, &mut [f32]) -> Result<(), RotationError>
    dim() -> u32
    signature() -> u64          # feeds CodecParams::kernel_signature
    backend() -> &'static str   # "avx512" | "stub" (never "scalar")

  IdentityRotation { dim }: zero-overhead pass-through; apply() is a no-op

  HadamardRotation { dim }: REAL in-place Sylvester butterfly, O(N log N)
  add/sub, no allocations
    - validates dim is power-of-two (Sylvester requirement)
    - Rule C compliance: stays at Tier-3 F32x16 (add/sub, not matmul; AMX
      adds no value per plan appendix §12 C)
    - rustc + target-cpu=x86-64-v4 already emits AVX-512 add/sub from the
      straight-line loop → no JIT compilation needed

  OpqRotationStub { matrix_blob_id, dim }
    - real impl plugs into D1.1b CodecKernelEngine adapter +
      ndarray::hpc::jitson_cranelift::JitEngine + tile_dpbf16ps AMX matmul
      when amx_available()
    - apply() returns OpqMatrixNotLoaded (typed error) until the
      matrix-blob loader lands

  build(&Rotation, dim) -> Result<Box<dyn RotationKernel>> factory
    - dispatches on WireCodecParams.pre_rotation variant
    - returns typed errors on dim mismatch or non-pow2 Hadamard

Tests (15 new):
  Identity: noop + dim-mismatch error
  Hadamard:
    - orthogonality: H_4 · [1,0,0,0] == [1,1,1,1] (first column)
    - H · H = n · I (applying twice scales by n, verified at N=8)
    - norm² preservation up to n× scale (verified at N=16)
    - rejects non-pow2 dim (N=6)
  OPQ stub: returns OpqMatrixNotLoaded with blob_id preserved
  build(): identity / hadamard / hadamard-dim-mismatch /
    hadamard-non-pow2 / opq-stub
  Signatures: distinct across variants, stable for same shape,
    blob-id-sensitive for OPQ

Board hygiene (CLAUDE.md Mandatory rule):
  STATUS_BOARD.md: D1.2 Queued → In PR
  EPIPHANIES.md PREPEND (two entries):
    1. "Thinking styles ARE codecs over the semantic field" (north-star
       forward-looking deposit, not a work item): codec infrastructure IS
       the template for production-grade thinking tissue. Mapping table
       documents the codec→thinking correspondence:
       CodecParams↔ThinkingStyleParams, kernel_signature↔style_signature,
       token_agreement↔conclusion_agreement, etc. Phase 5+ drops in
       WireThinkCalibrate + ThinkingStyleKernelCache using the same
       scaffolding. Generalisation isn't porting; it's recognising thinking
       styles as a SPECIAL CASE of the codec pattern.
    2. "D1.2 Hadamard is pure-Rust, not a JIT-necessary primitive":
       narrows D1.1b scope by 30-40%. Only OPQ (matmul) needs Cranelift JIT
       emission; Identity (no-op) and Hadamard (butterfly) stay as
       plain-Rust Tier-3 F32x16 paths. Rustc's AVX-512 codegen under
       target-cpu=x86-64-v4 is already optimal for add/sub-structured
       kernels.

Rules honored:
  Rule A: in-place &mut [f32] slice, no allocations in apply()
  Rule B: ndarray::simd::* not needed for these shapes; compiler emits
          AVX-512 from straight-line loops
  Rule C: Hadamard stays at Tier 3 (add/sub, no AMX benefit); OPQ stub
          will route to Tier 1 AMX when matrix loaded
  Rule D: Rotation variants come from YAML via WireRotation (D0.1)
  Rule E: kernel signature() + backend() are object-methods per the
          Wire-surface-IS-SIMD-surface pattern
  Rule F: no serialization anywhere; in-memory f32 buffer only

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
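As a reading aid for the commit message above, here is a minimal sketch of the trait surface and factory it describes. It is not the contents of `rotation_kernel.rs`, which this page does not show. The `Rotation` enum, the `DimMismatch` variant, every field layout, the `mix` hashing helper, the order of checks in the stub, and the `"avx512"` label for Identity are all assumptions made for illustration. The Hadamard variant is left out for brevity.

```rust
use std::fmt::Debug;

/// Typed errors. Only `OpqMatrixNotLoaded` is named in the commit message;
/// `DimMismatch` and every field layout here are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RotationError {
    DimMismatch { expected: u32, got: usize },
    OpqMatrixNotLoaded { blob_id: u64 },
}

/// Hypothetical stand-in for the wire-level rotation selector.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Rotation {
    Identity,
    Opq { matrix_blob_id: u64, dim: u32 },
}

/// Object-safe: no generic methods and no `Self` by value,
/// so `Box<dyn RotationKernel>` is a valid type.
pub trait RotationKernel: Send + Sync + Debug {
    fn apply(&self, data: &mut [f32]) -> Result<(), RotationError>;
    fn dim(&self) -> u32;
    fn signature(&self) -> u64;
    fn backend(&self) -> &'static str;
}

/// FNV-1a-style mix over structural fields only (an assumed hashing scheme).
fn mix(tag: u64, fields: &[u64]) -> u64 {
    fields
        .iter()
        .fold(0xcbf2_9ce4_8422_2325 ^ tag, |h, &f| (h ^ f).wrapping_mul(0x0100_0000_01b3))
}

fn check_len(dim: u32, data: &[f32]) -> Result<(), RotationError> {
    if data.len() == dim as usize {
        Ok(())
    } else {
        Err(RotationError::DimMismatch { expected: dim, got: data.len() })
    }
}

#[derive(Debug)]
pub struct IdentityRotation { pub dim: u32 }

impl RotationKernel for IdentityRotation {
    fn apply(&self, data: &mut [f32]) -> Result<(), RotationError> {
        check_len(self.dim, data) // length check only; the data is untouched
    }
    fn dim(&self) -> u32 { self.dim }
    fn signature(&self) -> u64 { mix(1, &[self.dim as u64]) }
    fn backend(&self) -> &'static str { "avx512" } // assumed label
}

#[derive(Debug)]
pub struct OpqRotationStub { pub matrix_blob_id: u64, pub dim: u32 }

impl RotationKernel for OpqRotationStub {
    fn apply(&self, data: &mut [f32]) -> Result<(), RotationError> {
        check_len(self.dim, data)?;
        // No matrix is loaded yet, so report a typed error carrying the blob id.
        Err(RotationError::OpqMatrixNotLoaded { blob_id: self.matrix_blob_id })
    }
    fn dim(&self) -> u32 { self.dim }
    fn signature(&self) -> u64 { mix(2, &[self.matrix_blob_id, self.dim as u64]) }
    fn backend(&self) -> &'static str { "stub" }
}

/// Factory: dispatches on the rotation variant and rejects a dim mismatch.
pub fn build(rotation: &Rotation, dim: u32) -> Result<Box<dyn RotationKernel>, RotationError> {
    let kernel: Box<dyn RotationKernel> = match *rotation {
        Rotation::Identity => Box::new(IdentityRotation { dim }),
        Rotation::Opq { matrix_blob_id, dim: wire_dim } => {
            if wire_dim != dim {
                return Err(RotationError::DimMismatch { expected: dim, got: wire_dim as usize });
            }
            Box::new(OpqRotationStub { matrix_blob_id, dim })
        }
    };
    Ok(kernel)
}
```

Returning a boxed trait object keeps the caller independent of which rotation was configured, and the typed error lets the caller distinguish "wrong shape" from "matrix not loaded yet" without string matching.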
1 parent cf42a4a commit 5c37f0c

4 files changed

Lines changed: 463 additions & 1 deletion


.claude/board/EPIPHANIES.md

Lines changed: 76 additions & 0 deletions
@@ -65,6 +65,82 @@ stay as historical references.
 
 ## Entries (reverse chronological)
 
+## 2026-04-20 — Thinking styles ARE codecs over the semantic field (north star)
+
+**Status:** FINDING (forward-looking deposit — not a current work item; reference when Phase 5+ generalises)
+
+A codec compresses tensor content into fingerprints; a thinking style
+compresses reasoning trajectories into NARS-revised beliefs. Same
+underlying operation — structure-preserving compression on a binary
+Hamming substrate. Different input/output domains, same substrate
+guarantees (E-SUBSTRATE-1, I-SUBSTRATE-MARKOV), same compile-and-swap
+machinery.
+
+**The codec infrastructure IS the template for production-grade
+thinking tissue.** When Phase 5+ activates:
+
+| Codec (shipped D0.1–D1.2, D1.1b queued) | Thinking-style analog |
+|---|---|
+| `CodecParams` | `ThinkingStyleParams { style, modulation_7d, nars_priors, fallback_chain, sigma_priority, semiring_choice }` |
+| `kernel_signature()` — excludes runtime drift | `style_signature()` — excludes per-cycle modulation drift |
+| `CodecKernelCache<H>` | `ThinkingStyleKernelCache<H>` — same generic scaffold |
+| JIT kernel = Cranelift-compiled decode | JIT kernel = compiled scan-walk on 36-node topology (already shipped ndarray-side via `scan_jit.rs` + `ScanParams`) |
+| **Token agreement** (I11 cert gate) | **Conclusion agreement** — same NARS-revised conclusions as reference style? |
+| Sweep grid = N codec candidates | Sweep grid = N (style × modulation × NARS fallback) candidates |
+| `/v1/shader/calibrate` | `/v1/shader/think-calibrate` |
+| `[FORMAL-SCAFFOLD]` 5 pillars | **Same scaffold** — E-SUBSTRATE-1 covers any transition under bundle |
+
+**Generalisation isn't "port codec pattern to thinking"** — it's
+recognising thinking styles as a SPECIAL CASE of the codec pattern we
+just built. When Phase 5+ lands, `WireThinkCalibrate` +
+`ThinkingStyleKernelCache` + `conclusion_agreement` metric drop in
+alongside the codec versions. Same JIT engine, same tests, same
+board-hygiene discipline.
+
+**The phrase "production-grade thinking tissue"** names the telos
+cleanly: once codec infra is at Phase 3 token-agreement pass rates,
+cloning to thinking styles yields production-grade swappable
+reasoning — YAML-configured, JIT-compiled, sweep-certified. No
+rebuild per new style, no black box, signature-keyed reproducibility.
+
+**Cross-ref:** D0.6 `CodecParams` (the parameter-shape template);
+D1.1 `CodecKernelCache<H>` (the cache pattern — generic-over-H is the
+wedge for reuse); I5 (thinking IS an AdjacencyStore — already
+topologically unified with data graph); codec-sweep-via-lab-infra-v1.
+
+---
+
+## 2026-04-20 — D1.2 Hadamard is pure-Rust, not a JIT-necessary primitive
+
+**Status:** FINDING
+
+D1.2's HadamardRotation is implemented as a plain Rust in-place
+Sylvester butterfly (O(N log N) add/sub, no allocations). It does NOT
+need JIT compilation or Cranelift code emission because:
+
+1. **Fixed shape** — the butterfly structure is identical across all
+power-of-two dims. Rust's compiler (under `target-cpu=x86-64-v4`)
+already emits AVX-512 add/sub from the straight-line loop.
+2. **Not matmul** — Hadamard is a pattern of adds and subtracts,
+never a dot product. Per Rule C polyfill hierarchy, matmul-heavy
+paths benefit from AMX (Tier 1); add/sub stays at Tier 3 F32x16.
+AMX gives no speedup here — confirmed in plan Appendix §12 C.
+
+**Consequence for D1.1b (Cranelift wiring):** only OPQ rotation needs
+the JIT path — it's the one that's actually a learned matmul. The
+Cranelift integration scope narrows: we don't need to JIT-compile
+Identity (no-op) or Hadamard (butterfly); just OPQ (matmul) and the
+main codec decode loop (ADC distance with palette lookup).
+
+This reduces D1.1b scope by maybe 30-40% — fewer kernel shapes to
+emit, only the ones that actually benefit.
+
+Cross-ref: D1.2 `rotation_kernel.rs::HadamardRotation`; Rule C
+(polyfill hierarchy); plan Appendix B (CartanCascade harmonic
+compression ratios rely on real Hadamard, so this matters).
+
+---
+
 ## 2026-04-20 — CORRECTION to D1.1 scaffold: ndarray::hpc::jitson_cranelift already ships JitEngine
 
 **Status:** FINDING / CORRECTION
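The Hadamard entry in the hunk above rests on the butterfly being a fixed pattern of adds and subtracts over power-of-two lengths. A minimal sketch of such an in-place Sylvester butterfly follows. The function name `hadamard_in_place` and the `&'static str` error are hypothetical (the commit uses a typed `RotationError`), and whether rustc vectorises this exact loop shape under `target-cpu=x86-64-v4` is not checked here.

```rust
/// In-place, unnormalised Walsh-Hadamard transform in natural (Sylvester) order.
/// Performs n * log2(n) additions/subtractions and allocates nothing.
pub fn hadamard_in_place(data: &mut [f32]) -> Result<(), &'static str> {
    let n = data.len();
    if !n.is_power_of_two() {
        // Also rejects n == 0, since 0 is not a power of two.
        return Err("dimension must be a power of two");
    }
    let mut half = 1;
    while half < n {
        // Each block of 2 * half elements is split into a low and a high half;
        // every (low, high) pair becomes (low + high, low - high).
        for block in data.chunks_exact_mut(2 * half) {
            let (lo, hi) = block.split_at_mut(half);
            for (a, b) in lo.iter_mut().zip(hi.iter_mut()) {
                let (x, y) = (*a, *b);
                *a = x + y;
                *b = x - y;
            }
        }
        half *= 2;
    }
    Ok(())
}
```

Because the transform is unnormalised, applying it twice multiplies the input by n, and the squared norm grows by a factor of n per application. Those are the same properties the commit message lists for the Hadamard tests.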

.claude/board/STATUS_BOARD.md

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ afterwards is a JIT kernel, not a rebuild. Plan path:
 |---|---|---|---|
 | D1.1 | `CodecKernelCache` — structural cache layer (generic over handle) | **In PR** | branch — `CodecKernelCache<H>` + `StubKernel` + `get_or_compile` / `try_get_or_compile` with RwLock concurrent-safe double-check + compile/hit/ratio counters + 9 tests. Scaffold ships NOW; D1.1b Cranelift IR emission follows. |
 | D1.1b | Adapter: `CodecKernelEngine` wrapping `ndarray::hpc::jitson_cranelift::JitEngine` with two-phase BUILD/RUN lifecycle (Arc-freeze). CodecParams → CodecScanParams adapter + codec-specific IR emission in jitson_cranelift/scan_jit analog | **Queued** | target ~250 LOC; `JitEngine` already ships (`/home/user/ndarray/src/hpc/jitson_cranelift/engine.rs`); the work is the CodecParams adapter + codec-specific JITSON template |
-| D1.2 | Rotation primitives: Identity / Hadamard / OPQ as JIT kernels | **Queued** | target ~190 LOC |
+| D1.2 | Rotation primitives: Identity / Hadamard / OPQ as `RotationKernel` impls | **In PR** | branch — `RotationKernel` trait (Send+Sync+Debug, object-safe) + `IdentityRotation` (no-op) + `HadamardRotation` (real Sylvester butterfly, O(N log N) in-place, norm²-scaling verified) + `OpqRotationStub` (matrix-blob-id placeholder for D1.1b) + `build(&Rotation, dim)` factory + `RotationError` typed errors + 15 tests. Hadamard stays at Tier-3 F32x16 (add/sub, not matmul → no AMX benefit per Rule C). |
 | D1.3 | Residual PQ via JIT composition | **Queued** | target ~150 LOC |
 
 ### Phase 2 — Token-agreement harness (I11 cert gate) — Queued
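The D1.1 row in the hunk above mentions a cache that is generic over the kernel handle, with a `get_or_compile` that uses an RwLock double-check plus compile/hit/ratio counters. One way such a pattern can be sketched is shown below. The type name `SketchKernelCache`, the `u64` signature key, the `Arc<H>` return type, and the counter semantics are assumptions; the real `CodecKernelCache<H>` is not shown on this page and may differ.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical signature-keyed cache, generic over the kernel handle `H`.
pub struct SketchKernelCache<H> {
    kernels: RwLock<HashMap<u64, Arc<H>>>,
    compiles: AtomicU64,
    hits: AtomicU64,
}

impl<H> SketchKernelCache<H> {
    pub fn new() -> Self {
        Self {
            kernels: RwLock::new(HashMap::new()),
            compiles: AtomicU64::new(0),
            hits: AtomicU64::new(0),
        }
    }

    /// Returns the cached handle for `signature`, compiling it at most once.
    pub fn get_or_compile(&self, signature: u64, compile: impl FnOnce() -> H) -> Arc<H> {
        {
            // First check: shared read lock, so concurrent hits do not serialise.
            let map = self.kernels.read().unwrap();
            if let Some(kernel) = map.get(&signature) {
                self.hits.fetch_add(1, Ordering::Relaxed);
                return Arc::clone(kernel);
            }
        } // read guard dropped here, before the write lock is requested

        let mut map = self.kernels.write().unwrap();
        // Second check: another thread may have compiled while we waited.
        if let Some(kernel) = map.get(&signature) {
            self.hits.fetch_add(1, Ordering::Relaxed);
            return Arc::clone(kernel);
        }
        let kernel = Arc::new(compile());
        map.insert(signature, Arc::clone(&kernel));
        self.compiles.fetch_add(1, Ordering::Relaxed);
        kernel
    }

    pub fn compiles(&self) -> u64 { self.compiles.load(Ordering::Relaxed) }
    pub fn hits(&self) -> u64 { self.hits.load(Ordering::Relaxed) }

    /// Fraction of lookups served from the cache; 0.0 before any lookup.
    pub fn hit_ratio(&self) -> f64 {
        let (h, c) = (self.hits(), self.compiles());
        if h + c == 0 { 0.0 } else { h as f64 / (h + c) as f64 }
    }
}
```

In this sketch the compile closure runs while the write lock is held. That keeps the at-most-once guarantee simple, at the cost of blocking other lookups during a slow compile.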

crates/cognitive-shader-driver/src/lib.rs

Lines changed: 6 additions & 0 deletions
@@ -125,6 +125,12 @@ pub mod auto_detect;
 #[cfg(feature = "serve")]
 pub mod codec_kernel_cache;
 
+// D1.2 — rotation primitives (Identity / Hadamard / OPQ-stub). LAB-ONLY.
+// Hadamard is real (in-place butterfly); OPQ is stub pending D1.1b's
+// ndarray::hpc::jitson_cranelift::JitEngine adapter + matrix-blob loader.
+#[cfg(feature = "serve")]
+pub mod rotation_kernel;
+
 // Axum REST server. LAB-ONLY.
 #[cfg(feature = "serve")]
 pub mod serve;
