Commit b21c037

Merge pull request #259 from AdaWorldAPI/claude/hamming-content-cascade
feat(shader-driver): wire content-plane Hamming cascade — dispatch sees real similarity
2 parents cbddc8b + 3d960c4 commit b21c037

2 files changed

Lines changed: 221 additions & 1 deletion

File tree

.claude/board/AGENT_LOG.md

Lines changed: 27 additions & 1 deletion
@@ -266,9 +266,35 @@ newest-first.** A `BlackboardEntry` by any other transport.
266266
**Tests:** 12 pass
267267
**Outcome:** Shipped `lance-graph-archetype` crate scaffold: Component + Processor traits, World meta-state with tick/fork/at_tick stubs, CommandBroker FIFO queue, ArchetypeError. PR #254 merged.
268268

269+
## 2026-04-24T17:20 — Content Hamming cascade wire (opus, claude/hamming-content-cascade)
270+
271+
**D-ids:** Content-plane similarity pre-pass in ShaderDriver::run()
272+
**Commit:** `2cf36ad`
273+
**Tests:** 45 pass (43 lib + 2 e2e, 3 new: content_hamming_finds_similar_rows / _skips_dissimilar / _respects_style_threshold)
274+
**Outcome:** The glove is flying. Before: dispatch() on 3 encoded rows returned `hit_count:0, confidence:0.0, admit_ignorance:true` across every style — the PaletteSemiring cascade probed a synthetic Base17 table unrelated to the encoded text, and the content plane was only read for the cycle_fp XOR fold, never compared. After: content-plane Hamming pre-pass runs BEFORE the palette cascade. For each pair in `passed_rows`, popcount XOR of `content_row(i)` vs `content_row(j)`; if `resonance = 1 - Hamming/16384 >= style.resonance_threshold`, emit `ShaderHit{predicates:0x01}`. Guard: N² sweep skipped when `passed_rows.len() > 256`.
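Reviewer sketch: the pre-pass described in the Outcome can be reproduced outside the crate on plain arrays. This is a minimal standalone version, not the driver's code. `WORDS`, `MAX_ROWS`, and `content_pairs` are hypothetical names, and the 256-word width is taken from the "256 u64 words" comment in the diff.

```rust
/// Width of one content fingerprint in 64-bit words (256 * 64 = 16384 bits).
/// Assumption: mirrors the "256 u64 words" noted in the driver diff.
const WORDS: usize = 256;
const BITS: f32 = (WORDS * 64) as f32;
/// Same guard value as the log entry: skip the pairwise sweep above this.
const MAX_ROWS: usize = 256;

/// Hamming distance = popcount of the XOR, summed over every word.
fn hamming(a: &[u64; WORDS], b: &[u64; WORDS]) -> u32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Resonance = 1 - Hamming / total bits; 1.0 means identical rows.
fn resonance(a: &[u64; WORDS], b: &[u64; WORDS]) -> f32 {
    1.0 - hamming(a, b) as f32 / BITS
}

/// Returns (i, j, resonance) for every pair i < j that meets `threshold`.
fn content_pairs(rows: &[[u64; WORDS]], threshold: f32) -> Vec<(usize, usize, f32)> {
    let mut out = Vec::new();
    if rows.len() > MAX_ROWS {
        return out; // guard: the pairwise sweep is skipped for large windows
    }
    for i in 0..rows.len() {
        for j in (i + 1)..rows.len() {
            let r = resonance(&rows[i], &rows[j]);
            if r >= threshold {
                out.push((i, j, r));
            }
        }
    }
    out
}
```

Unlike the driver, which pushes one `ShaderHit` per direction, this sketch returns each pair once; doubling the result count gives the `hit_count` the log reports.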
275+
276+
**Live verification (encode 3 rows, dispatch 0..3):**
277+
- "Palantir develops surveillance systems" (row 0)
278+
- "Palantir Gotham is a surveillance platform" (row 1)
279+
- "Israel deploys military AI" (row 2)
280+
281+
| Style | Threshold | hit_count | top-1 row pair | resonance | confidence |
282+
|--------------------|-----------|-----------|----------------|-----------|------------|
283+
| Analytical (1) | 0.85 | 0 | n/a | n/a | 0.0 (admit_ignorance) |
284+
| Creative (4) | 0.35 | 6 | row 0 ↔ row 1 | 0.598 | 0.598 |
285+
| Peripheral (9) | 0.20 | 6 | row 0 ↔ row 1 | 0.598 | 0.598 |
286+
287+
The strongest signal (rows 0↔1, both Palantir) correctly ranks first. The 0↔2 pair (Palantir vs Israel AI) lands lowest at 0.496. Analytical's 0.85 threshold rejects all pairs, so style semantics are preserved.
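Reviewer sketch: the reported resonances can be turned back into approximate bit counts by inverting `resonance = 1 - Hamming/16384`. The function names below are hypothetical, and because the table values are rounded to three decimals the recovered distances are approximate: roughly 6,590 differing bits for the 0↔1 pair and roughly 8,260 for 0↔2.

```rust
/// Total bits per content fingerprint, per the resonance formula in the log.
const FP_BITS: f64 = 16384.0;

/// Invert resonance = 1 - Hamming / FP_BITS to recover the implied distance.
fn hamming_from_resonance(resonance: f64) -> f64 {
    (1.0 - resonance) * FP_BITS
}

/// True when a pair at `resonance` clears a style's threshold.
fn admits(resonance: f64, threshold: f64) -> bool {
    resonance >= threshold
}
```

With these, `admits(0.496, 0.35)` holds while `admits(0.598, 0.85)` does not, which matches the table: every pair passes Creative and Peripheral, none passes Analytical.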
288+
289+
**Key insight:** The Jirak 454-Hamming threshold calibrated in the 2026-04-24 EPIPHANY was for UNTILED DeepNSM encodes at density ≈ 0.016. The live encode path 32×-tiles 512-bit VSA → 16K content plane, pushing density to ≈ 0.48 and expected-random Hamming to ≈ 8000. Using an absolute bit threshold would have required per-density calibration; using `resonance >= style.resonance_threshold` is density-agnostic and reuses the existing style semantics. Style config IS the content-similarity threshold.
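Reviewer sketch: the "expected-random Hamming ≈ 8000" figure can be sanity-checked with a simple model in which every bit is set independently with probability equal to the density. That independence model is an assumption of this sketch, not something the log states, and the function names are hypothetical.

```rust
/// Expected Hamming distance between two independent random fingerprints
/// whose bits are each set with probability `density`.
/// Under that model a given bit differs with probability 2 * p * (1 - p).
fn expected_random_hamming(density: f64, bits: f64) -> f64 {
    2.0 * density * (1.0 - density) * bits
}

/// Resonance that two unrelated rows would land at under the same model.
fn expected_random_resonance(density: f64) -> f64 {
    1.0 - 2.0 * density * (1.0 - density)
}
```

At density 0.48 over 16384 bits this gives about 8,179 bits and a resonance of about 0.50, in line with the 0.496 measured for the unrelated 0↔2 pair.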
290+
291+
**Remaining gap:** palette cascade hits (synthetic Base17) still exist and can flood top-k when their resonance exceeds content-match resonance; see driver.rs:180 `hits.truncate(8)`. The test `content_hamming_respects_style_threshold` uses empty planes to isolate the content cascade; in production with meaningful planes, content hits will intermix with palette hits via the shared resonance sort. Option: promote content hits with a small resonance bonus if future tuning shows palette drowning content too aggressively.
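Reviewer sketch of the bonus option mentioned above: add a small constant to the sort key of content hits without altering the stored resonance. `Hit`, `ranking_key`, `top_k_with_content_bonus`, and the `0x02` palette predicate used in the usage note are hypothetical illustrations, not the driver's types or values.

```rust
/// Hypothetical stand-in for a shader hit; only the fields the sort needs.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    row: u32,
    predicates: u8,
    resonance: f32,
}

/// Content-match predicate bit, as in the diff (0x01).
const CONTENT_MATCH: u8 = 0x01;

/// Sort key: raw resonance, plus `bonus` when the content-match bit is set.
fn ranking_key(hit: &Hit, bonus: f32) -> f32 {
    if hit.predicates & CONTENT_MATCH != 0 {
        hit.resonance + bonus
    } else {
        hit.resonance
    }
}

/// Sort descending by boosted key, then keep the best `k`.
fn top_k_with_content_bonus(mut hits: Vec<Hit>, bonus: f32, k: usize) -> Vec<Hit> {
    hits.sort_by(|a, b| ranking_key(b, bonus).total_cmp(&ranking_key(a, bonus)));
    hits.truncate(k);
    hits
}
```

For example, a palette hit at resonance 0.60 (predicates `0x02`) outranks a content hit at 0.598 with no bonus, and a bonus of 0.01 reverses the order. Keeping the bonus in the key rather than in the `resonance` field leaves the reported confidence untouched.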
292+
293+
Cross-ref: EPIPHANIES 2026-04-24 "Jirak noise floor" + "dispatch wiring audit", I-NOISE-FLOOR-JIRAK iron rule, driver.rs:93-156.
294+
269295
## 2026-04-24T17:30 — Cypher → AriGraph bridge (opus, claude/cypher-to-arigraph-wire)
270296

271297
**D-ids:** CypherBridge, /v1/shader/route lg.cypher handling
272298
**Commit:** `45fc3a4`
273299
**Tests:** 7 pass (create, match, unsupported, non-cypher, missing-reasoning, lowercase, nd-reject)
274-
**Outcome:** Phase 1 stub landed — prefix classifier over step_type="lg.cypher". CREATE and MATCH → Completed (confidence 0.5), other cypher constructs → Skipped with "unsupported cypher construct, stub in place", non-`lg.cypher` → `Err(DomainUnavailable)` so route_handler falls through to planner. Phase 2 (real `lance_graph::parser::parse_cypher_query` + SPO commit + BindSpace label search) deferred: pulling lance-graph core (arrow + datafusion + lance) into cognitive-shader-driver would balloon build time for what today is a test-path transport. route_handler is now a three-stage chain: CodecResearchBridge (nd.*) → CypherBridge (lg.cypher) → planner_bridge. Live curl against localhost:3001/v1/shader/route verified all four paths: CREATE→completed+0.5, MATCH→completed+0.5, DROP INDEX→skipped, lg.plan→failed (planner not compiled in, unchanged from pre-PR).
300+
**Outcome:** Phase 1 stub landed — prefix classifier over step_type="lg.cypher". CREATE and MATCH → Completed (confidence 0.5), other cypher constructs → Skipped with "unsupported cypher construct, stub in place", non-`lg.cypher` → `Err(DomainUnavailable)` so route_handler falls through to planner. Phase 2 (real `lance_graph::parser::parse_cypher_query` + SPO commit + BindSpace label search) deferred: pulling lance-graph core (arrow + datafusion + lance) into cognitive-shader-driver would balloon build time for what today is a test-path transport. route_handler is now a three-stage chain: CodecResearchBridge (nd.*) → CypherBridge (lg.cypher) → planner_bridge. Live curl against localhost:3001/v1/shader/route verified all four paths: CREATE→completed+0.5, MATCH→completed+0.5, DROP INDEX→skipped, lg.plan→failed (planner not compiled in, unchanged from pre-PR). PR #258 merged.

crates/cognitive-shader-driver/src/driver.rs

Lines changed: 194 additions & 0 deletions
@@ -90,6 +90,70 @@ impl ShaderDriver {
9090
let max_dist = (self.semiring.k as f32) * (self.semiring.k as f32);
9191
let mut hits = Vec::<ShaderHit>::with_capacity(passed_rows.len().min(64));
9292

93+
// ═══════════════════════════════════════════════════════════════
94+
// Content-plane Hamming pre-pass (PR: hamming-content-cascade).
95+
// Compare content fingerprint of each passed row against every
96+
// other passed row. If Hamming-resonance exceeds the style's
97+
// resonance_threshold, emit a content-match hit. This is the
98+
// wire that lets dispatch() see real text similarity, not just
99+
// edge palette distance.
100+
//
101+
// Resonance model: resonance = 1 - Hamming/16384. Rows that
102+
// share content words land at higher resonance; fully disjoint
103+
// rows land near 0.5 (density ≈ 0.48 after 32× DeepNSM tiling).
104+
// Style thresholds (UNIFIED_STYLES):
105+
// analytical 0.85 (strict) focused 0.90 (strictest)
106+
// creative 0.35 (loose) peripheral 0.20 (loosest)
107+
// Jirak-calibrated 3σ reference: Hamming < 454 at density 0.016
108+
// (untiled). For tiled encodings (current DeepNSM path) the
109+
// density-dependent baseline shifts; resonance-over-threshold
110+
// is the density-agnostic reading. See EPIPHANIES 2026-04-24
111+
// "Jirak noise floor calibrated for DeepNSM-tiled 16K-bit
112+
// fingerprints".
113+
//
114+
// Guard: skip the N² sweep if passed_rows.len() > 256 — at
115+
// 4096 rows that is ~8.4M row pairs (N²/2) × 256 word popcounts each.
116+
// ═══════════════════════════════════════════════════════════════
117+
const CONTENT_MATCH_PREDICATE: u8 = 0x01;
118+
const MAX_CONTENT_PREPASS_ROWS: usize = 256;
119+
const FP_BITS: f32 = (WORDS_PER_FP * 64) as f32;
120+
if passed_rows.len() <= MAX_CONTENT_PREPASS_ROWS {
121+
let style_cfg = &crate::engine_bridge::UNIFIED_STYLES[(style_ord % 12) as usize];
122+
let min_resonance = style_cfg.resonance_threshold;
123+
124+
for (i, &row_i) in passed_rows.iter().enumerate() {
125+
let fp_i = self.bindspace.fingerprints.content_row(row_i as usize);
126+
for (j_off, &row_j) in passed_rows.iter().enumerate().skip(i + 1) {
127+
let fp_j = self.bindspace.fingerprints.content_row(row_j as usize);
128+
// Hamming = popcount of XOR across all 256 u64 words.
129+
let hamming: u32 = fp_i.iter().zip(fp_j.iter())
130+
.map(|(a, b)| (a ^ b).count_ones())
131+
.sum();
132+
// Resonance: normalized to full bit-width; higher = more similar.
133+
let resonance = 1.0 - (hamming as f32 / FP_BITS);
134+
if resonance >= min_resonance {
135+
// Record both directions so either row can surface via top-k.
136+
hits.push(ShaderHit {
137+
row: row_i,
138+
distance: hamming.min(u16::MAX as u32) as u16,
139+
predicates: CONTENT_MATCH_PREDICATE,
140+
_pad: 0,
141+
resonance,
142+
cycle_index: i as u32,
143+
});
144+
hits.push(ShaderHit {
145+
row: row_j,
146+
distance: hamming.min(u16::MAX as u32) as u16,
147+
predicates: CONTENT_MATCH_PREDICATE,
148+
_pad: 0,
149+
resonance,
150+
cycle_index: j_off as u32,
151+
});
152+
}
153+
}
154+
}
155+
}
156+
93157
for (cycle_idx, &row) in passed_rows.iter().enumerate() {
94158
if cycle_idx as u16 >= req.max_cycles.saturating_mul(4) { break; }
95159
// Use the SPO `s_idx` of the row's edge as the query palette index.
@@ -444,6 +508,136 @@ mod tests {
444508
assert!(crystal.bus.resonance.cycles_used <= 1);
445509
}
446510

511+
/// Build a BindSpace of `n` rows with caller-supplied content fingerprints.
512+
/// Meta confidence set to (200, 200) so everything passes the prefilter.
513+
fn bindspace_with_content(rows: &[[u64; WORDS_PER_FP]]) -> BindSpace {
514+
let q = [0.0f32; QUALIA_DIMS];
515+
let mut builder = BindSpaceBuilder::new(rows.len());
516+
for (idx, content) in rows.iter().enumerate() {
517+
let meta = MetaWord::new((idx as u8).wrapping_add(1), (idx as u8).wrapping_add(1), 200, 200, 5);
518+
builder = builder.push(content, meta, 0, &q, 0, 0);
519+
}
520+
builder.build()
521+
}
522+
523+
#[test]
524+
fn content_hamming_finds_similar_rows() {
525+
// Two rows with near-identical content (differ in only 4 bits)
526+
// → resonance ≈ 0.9998, well above any style threshold.
527+
let mut a = [0u64; WORDS_PER_FP];
528+
for i in 0..250 { a[i / 64] |= 1u64 << (i % 64); }
529+
let mut b = a;
530+
b[0] ^= 0xF; // 4-bit difference → Hamming = 4
531+
// A third row with disjoint set bits. Only 250 bits are set per row, so Hamming(a, c) = 500 and resonance is still ≈ 0.97.
532+
let mut c = [0u64; WORDS_PER_FP];
533+
for i in 8000..8250 { c[i / 64] |= 1u64 << (i % 64); }
534+
535+
let bs = Arc::new(bindspace_with_content(&[a, b, c]));
536+
let sr = Arc::new(demo_semiring());
537+
let driver = CognitiveShaderBuilder::new()
538+
.bindspace(bs).semiring(sr).planes(demo_planes()).build();
539+
540+
let req = ShaderDispatch {
541+
rows: ColumnWindow::new(0, 3),
542+
meta_prefilter: MetaFilter::ALL,
543+
layer_mask: 0xFF,
544+
radius: u16::MAX,
545+
style: StyleSelector::Ordinal(auto_style::ANALYTICAL),
546+
..Default::default()
547+
};
548+
let crystal = driver.dispatch(&req);
549+
// Top-k must contain at least one content-match hit (predicates=0x01).
550+
let content_hits: Vec<_> = crystal.bus.resonance.top_k.iter()
551+
.filter(|h| h.predicates & 0x01 != 0 && h.resonance > 0.0)
552+
.collect();
553+
assert!(!content_hits.is_empty(),
554+
"expected at least one content-match hit, got top_k={:?}",
555+
crystal.bus.resonance.top_k);
556+
// Similarity should be very high (differ in only 4/16384 bits).
557+
assert!(content_hits.iter().any(|h| h.resonance > 0.5),
558+
"content-match resonance should be > 0.5 for near-identical rows");
559+
}
560+
561+
#[test]
562+
fn content_hamming_skips_dissimilar() {
563+
// Two rows with ~10000 Hamming distance → resonance ≈ 0.39, which
564+
// is BELOW analytical threshold (0.85). Analytical must not emit
565+
// a content-match hit.
566+
let mut a = [0u64; WORDS_PER_FP];
567+
for i in 0..5000 { a[i / 64] |= 1u64 << (i % 64); }
568+
let mut b = [0u64; WORDS_PER_FP];
569+
for i in 8000..13000 { b[i / 64] |= 1u64 << (i % 64); }
570+
// Disjoint ranges → Hamming ≈ 10000.
571+
572+
let bs = Arc::new(bindspace_with_content(&[a, b]));
573+
let sr = Arc::new(demo_semiring());
574+
let driver = CognitiveShaderBuilder::new()
575+
.bindspace(bs).semiring(sr).planes(demo_planes()).build();
576+
577+
let req = ShaderDispatch {
578+
rows: ColumnWindow::new(0, 2),
579+
meta_prefilter: MetaFilter::ALL,
580+
layer_mask: 0xFF,
581+
radius: u16::MAX,
582+
style: StyleSelector::Ordinal(auto_style::ANALYTICAL),
583+
..Default::default()
584+
};
585+
let crystal = driver.dispatch(&req);
586+
let content_hits: Vec<_> = crystal.bus.resonance.top_k.iter()
587+
.filter(|h| h.predicates & 0x01 != 0 && h.resonance > 0.0)
588+
.collect();
589+
assert!(content_hits.is_empty(),
590+
"analytical style should not emit content hits when resonance < 0.85; got {:?}",
591+
content_hits);
592+
}
593+
594+
#[test]
595+
fn content_hamming_respects_style_threshold() {
596+
// Design Hamming ≈ 5000 so resonance ≈ 0.695:
597+
// * below analytical (0.85) → 0 content hits
598+
// * above creative (0.35) → ≥ 1 content hits
599+
// a = bits [0..5000), b = bits [2500..7500) → overlap 2500 bits,
600+
// disjoint 2500+2500 = 5000, Hamming ≈ 5000.
601+
let mut a = [0u64; WORDS_PER_FP];
602+
for i in 0..5000 { a[i / 64] |= 1u64 << (i % 64); }
603+
let mut b = [0u64; WORDS_PER_FP];
604+
for i in 2500..7500 { b[i / 64] |= 1u64 << (i % 64); }
605+
606+
// Use empty planes so the palette cascade produces no hits —
607+
// isolates the content pre-pass so it cannot be drowned out by
608+
// synthetic palette matches that dominate top-k truncate(8).
609+
let empty_planes = [[0u64; 64]; 8];
610+
let mk_driver = || {
611+
let bs = Arc::new(bindspace_with_content(&[a, b]));
612+
let sr = Arc::new(demo_semiring());
613+
CognitiveShaderBuilder::new()
614+
.bindspace(bs).semiring(sr).planes(empty_planes).build()
615+
};
616+
let mk_req = |style_ord: u8| ShaderDispatch {
617+
rows: ColumnWindow::new(0, 2),
618+
meta_prefilter: MetaFilter::ALL,
619+
layer_mask: 0xFF,
620+
radius: u16::MAX,
621+
style: StyleSelector::Ordinal(style_ord),
622+
..Default::default()
623+
};
624+
625+
let strict = mk_driver().dispatch(&mk_req(auto_style::ANALYTICAL));
626+
let loose = mk_driver().dispatch(&mk_req(auto_style::CREATIVE));
627+
let strict_hits = strict.bus.resonance.top_k.iter()
628+
.filter(|h| h.predicates & 0x01 != 0 && h.resonance > 0.0).count();
629+
let loose_hits = loose.bus.resonance.top_k.iter()
630+
.filter(|h| h.predicates & 0x01 != 0 && h.resonance > 0.0).count();
631+
// Monotonicity: loosening the style cannot reduce the set of
632+
// content-match hits. This is the load-bearing invariant.
633+
assert!(strict_hits <= loose_hits,
634+
"creative (loose) should emit >= analytical (strict) content hits: strict={} loose={}",
635+
strict_hits, loose_hits);
636+
assert!(loose_hits > 0,
637+
"creative (threshold 0.35) should emit content hits for resonance ≈ 0.695\nloose top_k: {:?}",
638+
loose.bus.resonance.top_k);
639+
}
640+
447641
#[test]
448642
fn sink_short_circuits_on_false() {
449643
struct Stop;
