Commit 6bed7ae

Merge pull request #235 from AdaWorldAPI/claude/teleport-session-setup-wMZfb
D1.3 decode-kernel + residual composition (Phase 1 scaffold complete, 104/104 tests)
2 parents a3529ff + 3f58967 commit 6bed7ae

3 files changed

Lines changed: 347 additions & 1 deletion

.claude/board/STATUS_BOARD.md

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ afterwards is a JIT kernel, not a rebuild. Plan path:
 | D1.1 | `CodecKernelCache` — structural cache layer (generic over handle) | **In PR** | branch — `CodecKernelCache<H>` + `StubKernel` + `get_or_compile` / `try_get_or_compile` with RwLock concurrent-safe double-check + compile/hit/ratio counters + 9 tests. Scaffold ships NOW; D1.1b Cranelift IR emission follows. |
 | D1.1b | Adapter: `CodecKernelEngine` wrapping `ndarray::hpc::jitson_cranelift::JitEngine` with two-phase BUILD/RUN lifecycle (Arc-freeze). CodecParams → CodecScanParams adapter + codec-specific IR emission in jitson_cranelift/scan_jit analog | **Queued** | target ~250 LOC; `JitEngine` already ships (`/home/user/ndarray/src/hpc/jitson_cranelift/engine.rs`); the work is the CodecParams adapter + codec-specific JITSON template |
 | D1.2 | Rotation primitives: Identity / Hadamard / OPQ as `RotationKernel` impls | **In PR** | branch — `RotationKernel` trait (Send+Sync+Debug, object-safe) + `IdentityRotation` (no-op) + `HadamardRotation` (real Sylvester butterfly, O(N log N) in-place, norm²-scaling verified) + `OpqRotationStub` (matrix-blob-id placeholder for D1.1b) + `build(&Rotation, dim)` factory + `RotationError` typed errors + 15 tests. Hadamard stays at Tier-3 F32x16 (add/sub, not matmul → no AMX benefit per Rule C). |
-| D1.3 | Residual PQ via JIT composition | **Queued** | target ~150 LOC |
+| D1.3 | Residual PQ via decode-kernel composition | **In PR** | branch — `DecodeKernel` trait (Send+Sync+Debug, object-safe, encode/decode/signature/bytes_per_row/dim/backend) + `StubDecodeKernel` (byte-exact round-trip for testing) + `ResidualComposer` (base + residual with subtract/add; nests recursively for depth >1) + `DecodeError` typed errors + 9 tests. Scope clarified: hydration/calibration path, NOT cascade inference (cascade uses `p64_bridge::CognitiveShader` per `cognitive-shader-architecture.md` line 582). |

 ### Phase 2 — Token-agreement harness (I11 cert gate) — Queued

Lines changed: 339 additions & 0 deletions
@@ -0,0 +1,339 @@
+//! **LAB-ONLY.** D1.3 — residual PQ via decode-kernel composition.
+//!
+//! Scope correction (per `cognitive-shader-architecture.md`): this module
+//! sits on the **hydration / calibration path**, not the cascade inference
+//! path. The inference cascade uses `p64_bridge::CognitiveShader::cascade`
+//! at Layer 2 (line 582 of that doc); decode kernels here are for offline
+//! codec-candidate calibration. A codec that passes the token-agreement
+//! cert gate (D2.x) then runs at **weight hydration time** (GGUF → palette
+//! + Fingerprint<256> + holographic residual), never per-inference.
+//!
+//! This module defines:
+//!
+//! - [`DecodeKernel`] — the codec decode/encode trait, signature-keyed
+//! into `CodecKernelCache<H>` at the `H` slot where `H: DecodeKernel`.
+//! - [`StubDecodeKernel`] — deterministic reference for tests (byte-level
+//! round-trip, no quantization loss, matches Rule F "serialise once at
+//! edge" — the decode/encode IS the edge).
+//! - [`ResidualComposer`] — composes two decoders with subtract/add:
+//! `decode(enc) = base.decode(enc[..k]) + residual.decode(enc[k..])`
+//! `encode(v) = [base.encode(v); residual.encode(v - base.decode(base.encode(v)))]`
+//! Depth `d > 1` recurses: the residual field itself is a `ResidualComposer`.
+
+use std::collections::hash_map::DefaultHasher;
+use std::hash::{Hash, Hasher};
+
+/// Error from encode / decode — size mismatch or downstream kernel failure.
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub enum DecodeError {
+    /// Input slice size doesn't match the kernel's expected `bytes_per_row`
+    /// (for decode) or `dim` (for encode).
+    SizeMismatch { expected: usize, actual: usize },
+    /// A composed-kernel stage failed with a downstream reason.
+    Stage { stage: &'static str, detail: String },
+}
+
+impl std::fmt::Display for DecodeError {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        match self {
+            Self::SizeMismatch { expected, actual } => {
+                write!(f, "decode size mismatch: expected {expected}, got {actual}")
+            }
+            Self::Stage { stage, detail } => write!(f, "stage {stage} failed: {detail}"),
+        }
+    }
+}
+
+impl std::error::Error for DecodeError {}
+
+/// A codec decode/encode kernel.
+///
+/// Object-safe so `ResidualComposer` can hold `Box<dyn DecodeKernel>` for
+/// each stage; keyed by `signature()` into `CodecKernelCache<H>` at the
+/// `H = Box<dyn DecodeKernel>` slot.
+pub trait DecodeKernel: Send + Sync + std::fmt::Debug {
+    /// Decode `bytes` (exactly `bytes_per_row` long) into `dim` `f32` values.
+    fn decode(&self, bytes: &[u8]) -> Result<Vec<f32>, DecodeError>;
+    /// Encode `vec` (exactly `dim` `f32` values) into `bytes_per_row` bytes.
+    fn encode(&self, vec: &[f32]) -> Result<Vec<u8>, DecodeError>;
+    /// Bytes per encoded row — composers rely on this to split payloads.
+    fn bytes_per_row(&self) -> u32;
+    /// Dimension of the decoded `f32` vector.
+    fn dim(&self) -> u32;
+    /// Stable hash for JIT cache keying.
+    fn signature(&self) -> u64;
+    /// Backend tier ("avx512" | "amx" | "stub") — never "scalar" on SoA.
+    fn backend(&self) -> &'static str;
+}
+
+// ─── Stub decoder (byte-exact round-trip for testing) ────────────────────
+
+/// Deterministic byte-exact decoder. `dim` `f32` values ⇔ `dim * 4` bytes
+/// via native-endian reinterpret. No quantization, no compression — the
+/// round-trip is exact.
+///
+/// This IS NOT a real codec. It exists so the `ResidualComposer` and
+/// `CodecKernelCache` composition tests can verify the plumbing without
+/// depending on a trained palette.
+#[derive(Debug, Clone, Copy)]
+pub struct StubDecodeKernel {
+    pub dim: u32,
+    pub tag: u64,
+}
+
+impl StubDecodeKernel {
+    pub const fn new(dim: u32, tag: u64) -> Self { Self { dim, tag } }
+}
+
+impl DecodeKernel for StubDecodeKernel {
+    fn decode(&self, bytes: &[u8]) -> Result<Vec<f32>, DecodeError> {
+        let expected = self.bytes_per_row() as usize;
+        if bytes.len() != expected {
+            return Err(DecodeError::SizeMismatch { expected, actual: bytes.len() });
+        }
+        let mut out = Vec::with_capacity(self.dim as usize);
+        for chunk in bytes.chunks_exact(4) {
+            out.push(f32::from_ne_bytes(chunk.try_into().unwrap()));
+        }
+        Ok(out)
+    }
+
+    fn encode(&self, vec: &[f32]) -> Result<Vec<u8>, DecodeError> {
+        let expected = self.dim as usize;
+        if vec.len() != expected {
+            return Err(DecodeError::SizeMismatch { expected, actual: vec.len() });
+        }
+        let mut out = Vec::with_capacity(expected * 4);
+        for &v in vec {
+            out.extend_from_slice(&v.to_ne_bytes());
+        }
+        Ok(out)
+    }
+
+    fn bytes_per_row(&self) -> u32 { self.dim * 4 }
+    fn dim(&self) -> u32 { self.dim }
+
+    fn signature(&self) -> u64 {
+        let mut h = DefaultHasher::new();
+        "stub_decode".hash(&mut h);
+        self.dim.hash(&mut h);
+        self.tag.hash(&mut h);
+        h.finish()
+    }
+
+    fn backend(&self) -> &'static str { "stub" }
+}
+
+// ─── Residual composer ───────────────────────────────────────────────────
+
+/// Composes a `base` decoder with a `residual` decoder: the encoded payload
+/// is the concatenation of `base.encode(v)` + `residual.encode(v -
+/// base.decode(base.encode(v)))`; decode reverses by decoding each stage's
+/// own byte range and summing.
+///
+/// **Contract:** `base.dim() == residual.dim()`. Enforced by `new()`.
+///
+/// Deeper residual chains (depth > 1) nest `ResidualComposer`s — the
+/// `residual` slot itself is a `Box<dyn DecodeKernel>` which can be
+/// another `ResidualComposer`.
+#[derive(Debug)]
+pub struct ResidualComposer {
+    base: Box<dyn DecodeKernel>,
+    residual: Box<dyn DecodeKernel>,
+}
+
+impl ResidualComposer {
+    /// Build a two-stage residual composer.
+    ///
+    /// Returns `Err(SizeMismatch)` when `base.dim() != residual.dim()`.
+    pub fn new(
+        base: Box<dyn DecodeKernel>,
+        residual: Box<dyn DecodeKernel>,
+    ) -> Result<Self, DecodeError> {
+        if base.dim() != residual.dim() {
+            return Err(DecodeError::SizeMismatch {
+                expected: base.dim() as usize,
+                actual: residual.dim() as usize,
+            });
+        }
+        Ok(Self { base, residual })
+    }
+}
+
+impl DecodeKernel for ResidualComposer {
+    fn decode(&self, bytes: &[u8]) -> Result<Vec<f32>, DecodeError> {
+        let base_b = self.base.bytes_per_row() as usize;
+        let expected = self.bytes_per_row() as usize;
+        if bytes.len() != expected {
+            return Err(DecodeError::SizeMismatch { expected, actual: bytes.len() });
+        }
+        let base_v = self
+            .base
+            .decode(&bytes[..base_b])
+            .map_err(|e| DecodeError::Stage { stage: "base::decode", detail: e.to_string() })?;
+        let residual_v = self
+            .residual
+            .decode(&bytes[base_b..])
+            .map_err(|e| DecodeError::Stage { stage: "residual::decode", detail: e.to_string() })?;
+        let mut out = base_v;
+        for (dst, &r) in out.iter_mut().zip(&residual_v) {
+            *dst += r;
+        }
+        Ok(out)
+    }
+
+    fn encode(&self, vec: &[f32]) -> Result<Vec<u8>, DecodeError> {
+        let expected = self.dim() as usize;
+        if vec.len() != expected {
+            return Err(DecodeError::SizeMismatch { expected, actual: vec.len() });
+        }
+        // First-pass encode + its self-reconstruction.
+        let base_bytes = self
+            .base
+            .encode(vec)
+            .map_err(|e| DecodeError::Stage { stage: "base::encode", detail: e.to_string() })?;
+        let base_reconstructed = self
+            .base
+            .decode(&base_bytes)
+            .map_err(|e| DecodeError::Stage { stage: "base::decode", detail: e.to_string() })?;
+        // Residual = original − base.decode(base.encode(original)).
+        let residual_vec: Vec<f32> =
+            vec.iter().zip(&base_reconstructed).map(|(a, b)| a - b).collect();
+        let residual_bytes = self
+            .residual
+            .encode(&residual_vec)
+            .map_err(|e| DecodeError::Stage { stage: "residual::encode", detail: e.to_string() })?;
+        // Concat.
+        let mut out = Vec::with_capacity(base_bytes.len() + residual_bytes.len());
+        out.extend_from_slice(&base_bytes);
+        out.extend_from_slice(&residual_bytes);
+        Ok(out)
+    }
+
+    fn bytes_per_row(&self) -> u32 {
+        self.base.bytes_per_row() + self.residual.bytes_per_row()
+    }
+
+    fn dim(&self) -> u32 { self.base.dim() }
+
+    fn signature(&self) -> u64 {
+        let mut h = DefaultHasher::new();
+        "residual_compose".hash(&mut h);
+        self.base.signature().hash(&mut h);
+        self.residual.signature().hash(&mut h);
+        h.finish()
+    }
+
+    fn backend(&self) -> &'static str {
+        // Composer backend = the less-optimized stage's backend (weakest link
+        // drives actual latency). For the stub case this is "stub"; for real
+        // D1.1b kernels, the OPQ matmul stage dominates.
+        if self.base.backend() == "stub" || self.residual.backend() == "stub" {
+            "stub"
+        } else {
+            self.base.backend()
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn stub_round_trip_is_exact() {
+        let k = StubDecodeKernel::new(8, 1);
+        let v = vec![1.0_f32, -2.5, 3.25, 0.0, -7.875, 100.0, -0.125, 42.0];
+        let enc = k.encode(&v).unwrap();
+        assert_eq!(enc.len(), 32);
+        let dec = k.decode(&enc).unwrap();
+        assert_eq!(dec, v);
+    }
+
+    #[test]
+    fn stub_rejects_wrong_input_size() {
+        let k = StubDecodeKernel::new(4, 0);
+        let err = k.encode(&[1.0, 2.0, 3.0]).unwrap_err();
+        assert!(matches!(err, DecodeError::SizeMismatch { expected: 4, actual: 3 }));
+        let err = k.decode(&[0u8; 10]).unwrap_err();
+        assert!(matches!(err, DecodeError::SizeMismatch { expected: 16, actual: 10 }));
+    }
+
+    #[test]
+    fn residual_compose_round_trip_is_exact_when_both_stubs() {
+        let base = Box::new(StubDecodeKernel::new(4, 1));
+        let residual = Box::new(StubDecodeKernel::new(4, 2));
+        let comp = ResidualComposer::new(base, residual).unwrap();
+
+        let v = vec![1.5_f32, -2.0, 3.125, -0.5];
+        let enc = comp.encode(&v).unwrap();
+        assert_eq!(enc.len(), 32, "4 dim × 4 bytes × 2 stages = 32");
+        let dec = comp.decode(&enc).unwrap();
+        // Both stages byte-exact ⇒ residual is all zeros ⇒ decoded = base_reconstructed = v.
+        assert_eq!(dec, v);
+    }
+
+    #[test]
+    fn residual_compose_mismatched_dims_rejected() {
+        let base = Box::new(StubDecodeKernel::new(4, 0));
+        let residual = Box::new(StubDecodeKernel::new(8, 0));
+        let err = ResidualComposer::new(base, residual).unwrap_err();
+        assert!(matches!(err, DecodeError::SizeMismatch { expected: 4, actual: 8 }));
+    }
+
+    #[test]
+    fn residual_compose_bytes_per_row_sums_stages() {
+        let base = Box::new(StubDecodeKernel::new(6, 1)); // 24 bytes
+        let residual = Box::new(StubDecodeKernel::new(6, 2)); // 24 bytes
+        let comp = ResidualComposer::new(base, residual).unwrap();
+        assert_eq!(comp.bytes_per_row(), 48);
+        assert_eq!(comp.dim(), 6);
+    }
+
+    #[test]
+    fn residual_compose_nested_depth_two_round_trip() {
+        // depth=2: ResidualComposer whose `residual` slot is itself a ResidualComposer.
+        let inner_base = Box::new(StubDecodeKernel::new(4, 1));
+        let inner_residual = Box::new(StubDecodeKernel::new(4, 2));
+        let inner = Box::new(ResidualComposer::new(inner_base, inner_residual).unwrap());
+        let outer_base = Box::new(StubDecodeKernel::new(4, 3));
+        let outer = ResidualComposer::new(outer_base, inner).unwrap();
+
+        assert_eq!(outer.bytes_per_row(), 48, "4 dim × 4 bytes × 3 stages");
+        let v = vec![1.0_f32, 2.0, 3.0, 4.0];
+        let enc = outer.encode(&v).unwrap();
+        let dec = outer.decode(&enc).unwrap();
+        assert_eq!(dec, v);
+    }
+
+    #[test]
+    fn signatures_distinguish_composer_from_stages() {
+        let base = Box::new(StubDecodeKernel::new(4, 1));
+        let residual = Box::new(StubDecodeKernel::new(4, 2));
+        let s_base = base.signature();
+        let s_res = residual.signature();
+        let comp = ResidualComposer::new(base, residual).unwrap();
+        let s_comp = comp.signature();
+        assert_ne!(s_comp, s_base);
+        assert_ne!(s_comp, s_res);
+    }
+
+    #[test]
+    fn signature_depends_on_stage_order() {
+        let a = Box::new(StubDecodeKernel::new(4, 1));
+        let b = Box::new(StubDecodeKernel::new(4, 2));
+        let ab = ResidualComposer::new(a, b).unwrap();
+        let a2 = Box::new(StubDecodeKernel::new(4, 1));
+        let b2 = Box::new(StubDecodeKernel::new(4, 2));
+        let ba = ResidualComposer::new(b2, a2).unwrap();
+        assert_ne!(ab.signature(), ba.signature(), "base/residual order is part of identity");
+    }
+
+    #[test]
+    fn composer_backend_reports_stub_when_any_stage_is_stub() {
+        let base = Box::new(StubDecodeKernel::new(4, 1));
+        let residual = Box::new(StubDecodeKernel::new(4, 2));
+        let comp = ResidualComposer::new(base, residual).unwrap();
+        assert_eq!(comp.backend(), "stub");
+    }
+}
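
The tests in this file compose only byte-exact stubs, so the residual stage always encodes zeros. The sketch below shows the same subtract-on-encode, add-on-decode scheme with a lossy base stage, where the residual does real work. It is a sketch under stated assumptions, not code from this commit: `Kernel`, `StepQuantizer`, `encode_residual`, `decode_residual`, and `max_abs_err` are hypothetical names, and the trait is a cut-down stand-in for `DecodeKernel` (no `DecodeError`, no `signature`).

```rust
/// Hypothetical, simplified stand-in for `DecodeKernel` so the sketch
/// compiles on its own; the real trait also returns `Result` and exposes
/// `signature`, `dim`, `bytes_per_row`, and `backend`.
trait Kernel {
    fn encode(&self, v: &[f32]) -> Vec<u8>;
    fn decode(&self, bytes: &[u8]) -> Vec<f32>;
}

/// Hypothetical lossy kernel: one signed byte per value, uniform step size.
struct StepQuantizer {
    step: f32,
}

impl Kernel for StepQuantizer {
    fn encode(&self, v: &[f32]) -> Vec<u8> {
        v.iter()
            .map(|&x| ((x / self.step).round().clamp(-128.0, 127.0) as i8) as u8)
            .collect()
    }

    fn decode(&self, bytes: &[u8]) -> Vec<f32> {
        bytes.iter().map(|&b| (b as i8) as f32 * self.step).collect()
    }
}

/// Payload = base.encode(v) ++ residual.encode(v - base.decode(base.encode(v))).
fn encode_residual(base: &dyn Kernel, residual: &dyn Kernel, v: &[f32]) -> Vec<u8> {
    let mut out = base.encode(v);
    let reconstructed = base.decode(&out);
    let diff: Vec<f32> = v.iter().zip(&reconstructed).map(|(a, b)| a - b).collect();
    out.extend(residual.encode(&diff));
    out
}

/// Decode each stage's own byte range and sum the results element-wise.
fn decode_residual(
    base: &dyn Kernel,
    residual: &dyn Kernel,
    bytes: &[u8],
    base_len: usize,
) -> Vec<f32> {
    let (base_bytes, residual_bytes) = bytes.split_at(base_len);
    base.decode(base_bytes)
        .into_iter()
        .zip(residual.decode(residual_bytes))
        .map(|(b, r)| b + r)
        .collect()
}

/// Largest element-wise absolute difference between two vectors.
fn max_abs_err(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0_f32, f32::max)
}
```

With a coarse base (`step: 0.5`) and a finer residual (`step: 0.01`), the base alone can be off by up to half a coarse step, while the composed decode is limited by the finer step as long as the residual fits the residual kernel's range.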

crates/cognitive-shader-driver/src/lib.rs

Lines changed: 7 additions & 0 deletions
@@ -131,6 +131,13 @@ pub mod codec_kernel_cache;
 #[cfg(feature = "serve")]
 pub mod rotation_kernel;

+// D1.3 — decode-kernel trait + residual composition.
+// Hydration/calibration path (NOT cascade inference — that uses
+// p64_bridge::CognitiveShader per cognitive-shader-architecture.md
+// line 582). LAB-ONLY.
+#[cfg(feature = "serve")]
+pub mod decode_kernel;
+
 // Axum REST server. LAB-ONLY.
 #[cfg(feature = "serve")]
 pub mod serve;
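
The `DecodeKernel` docs describe `signature()` as the key into `CodecKernelCache<H>`, and the status board mentions `get_or_compile` with an RwLock double-check. The real cache's fields and signatures are not part of this commit, so the following is only a rough sketch of how signature-keyed lookup with a double-check might look. `SignatureCache` and its methods are hypothetical, and the compile/hit counters mentioned on the board are omitted.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for `CodecKernelCache<H>`: maps a kernel
/// signature (u64) to a shared handle.
struct SignatureCache<H> {
    map: RwLock<HashMap<u64, Arc<H>>>,
}

impl<H> SignatureCache<H> {
    fn new() -> Self {
        Self { map: RwLock::new(HashMap::new()) }
    }

    /// Return the handle cached under `signature`, building it at most once.
    fn get_or_compile(&self, signature: u64, compile: impl FnOnce() -> H) -> Arc<H> {
        // Fast path: shared read lock, released at the end of this scope.
        {
            let guard = self.map.read().unwrap();
            if let Some(hit) = guard.get(&signature) {
                return Arc::clone(hit);
            }
        }
        // Slow path: exclusive lock. `entry` checks again, because another
        // thread may have inserted between the two lock acquisitions.
        let mut guard = self.map.write().unwrap();
        Arc::clone(guard.entry(signature).or_insert_with(|| Arc::new(compile())))
    }

    /// Number of cached handles.
    fn len(&self) -> usize {
        self.map.read().unwrap().len()
    }
}
```

In this sketch the compile closure runs while the write lock is held, which keeps the logic short but serialises all compilations; a real JIT cache may prefer to compile outside the lock.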
