Commit 175696a

deepnsm: address PR #479 review — coref-keyed cam64, expectation lifecycle, P64 dedup
Three findings from codex + CodeRabbit on PR #479:

1. (codex P2, reader_state.rs) Build Cam64 from the resolved subject. For "John … He …", effective_subject was the antecedent but Cam64 was built from the raw pronoun triple, so cam64/P64/CAM4096 keyed to the pronoun bucket while truth fields pointed at John; downstream locality lookup and basin matching used the wrong entity. Now constructs an effective triple (subject = resolved antecedent) before from_triple(). No-op when there is no coreference. The code now matches the comment that already promised "use the effective triple (post-coref)". Test: cam64_keyed_to_resolved_antecedent_not_pronoun.

2. (codex P2, window.rs) Clear consumed expectations. Forward-expectation slots were never drained, so across several relative/anaphora clauses they accumulated to MAX_EXPECTED (4), after which push_expected silently dropped newer antecedents and stale expectations could win over the confirmed ring. step() now clears the expectation buffer at the start of each sentence (expectations are single-step predictions), bounding it permanently. Test: expectations_do_not_accumulate_across_sentences.

3. (CodeRabbit, sentence_transformer64.rs) P64 duplicated P64MeaningField. Both were byte-identical 8-lane u64 meaning fields. Consolidated: P64 (the richer superset) is now canonical; signed_crystal re-exports `pub use crate::sentence_transformer64::P64 as P64MeaningField`. Removes drift risk. Behaviour-preserving (both from_cam64_and_nsm were identical).

Also removed three now-unused imports surfaced by the consolidation (NO_ROLE in signed_crystal; MorphFlags + SpoTriple in sentence_transformer64).

217 deepnsm tests pass (was 215; +2 new regression tests).

https://claude.ai/code/session_0147hSzjmWZDuy2MSQNrhEK5
1 parent 0c67aaf commit 175696a

4 files changed

Lines changed: 83 additions & 84 deletions
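
The first finding (key the locality code on the resolved antecedent, not the pronoun) can be sketched in isolation. This is a minimal illustration, not the crate's code: `Triple`, `entity_bucket`, `effective_triple`, and `locality_bucket` are hypothetical stand-ins, and the only detail taken from the commit is the `rank >> 5` bucketing asserted in the regression test.

```rust
/// Hypothetical stand-in for the crate's `SpoTriple`: three vocabulary ranks.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Triple {
    pub subject: u16,
    pub predicate: u16,
    pub object: u16,
}

/// Entity bucket as asserted in the regression test: vocabulary rank >> 5,
/// truncated to one byte.
pub fn entity_bucket(rank: u16) -> u8 {
    (rank >> 5) as u8
}

/// Replace the subject with the resolved antecedent when coreference fired.
/// With `None` the triple is returned unchanged (the "no-op" case).
pub fn effective_triple(raw: Triple, antecedent: Option<u16>) -> Triple {
    Triple {
        subject: antecedent.unwrap_or(raw.subject),
        ..raw
    }
}

/// Compute the bucket from the effective (post-coref) subject, so the key
/// and the truth fields refer to the same entity.
pub fn locality_bucket(raw: Triple, antecedent: Option<u16>) -> u8 {
    entity_bucket(effective_triple(raw, antecedent).subject)
}
```

With the ranks used in the test (pronoun 5, antecedent 70), keying on the raw subject would give bucket 0, while keying on the effective subject gives bucket 2.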

crates/deepnsm/src/reader_state.rs

Lines changed: 67 additions & 4 deletions
@@ -39,6 +39,7 @@ use crate::episodic_spo::{
 use crate::morphology::MorphFlags;
 use crate::parser::SentenceStructure;
 use crate::pos::PoS;
+use crate::spo::SpoTriple;
 use crate::window::{ExpectedReason, SentenceWindow, WindowEntry};

 // ── Left-corner trigger ───────────────────────────────────────────────────
@@ -226,6 +227,12 @@ impl ReadingState {
             return (frames, next);
         }

+        // Forward expectations are single-step predictions: each `step` rebuilds
+        // them fresh. Clearing here bounds the expectation buffer (it can never
+        // accumulate stale slots across sentences until `MAX_EXPECTED` fills and
+        // `push_expected` silently drops newer antecedents).
+        next.window.clear_expected();
+
         // Left-corner trigger from the first triple's features sets the frame.
         let first_feat = features.get(0);
         next.active_trigger = first_feat.left_corner_trigger;
@@ -286,11 +293,17 @@ impl ReadingState {
             };

             // ── Build Cam64 locality code ────────────────────────────────
-            // Use the effective triple (post-coref) for the entity bucket.
-            // The basin lane incorporates the left-corner trigger.
+            // Build from the EFFECTIVE triple (subject replaced by the resolved
+            // antecedent) so the locality key, P64, and CAM4096 are keyed to the
+            // real entity. Otherwise "John … He …" emits a frame whose truth
+            // fields point at John but whose cam64 is bucketed on the pronoun.
+            // When there is no coreference, effective_subject == triple.subject()
+            // so this is a no-op.
+            let effective_triple =
+                SpoTriple::new(effective_subject, triple.predicate(), triple.object());
             let stack_depth = next.entity_stack_len.min(127) as u8;
             let base_cam64 = Cam64::from_triple(
-                triple,
+                &effective_triple,
                 morph,
                 stack_depth,
                 coref_resolved,
@@ -423,7 +436,6 @@ impl ReadingState {
 #[cfg(test)]
 mod tests {
     use super::*;
-    use crate::spo::SpoTriple;

     fn sentence_one_triple(s: u16, p: u16, o: u16) -> SentenceStructure {
         SentenceStructure {
@@ -628,4 +640,55 @@ mod tests {
         assert_eq!(frames[0].refers_to_candidate_id, 50);
         assert_eq!(frames[0].subject_candidate_id, 50);
     }
+
+    #[test]
+    fn cam64_keyed_to_resolved_antecedent_not_pronoun() {
+        // "John(70) ... He(5) ...": the frame's cam64 entity lane must bucket
+        // the resolved antecedent (70), NOT the pronoun rank (5).
+        let rs = ReadingState::new(0);
+        let s1 = sentence_one_triple(50, 60, 70); // most-recent head = 70
+        let (_, rs2) = rs.step(&s1, &plain_features());
+
+        let s2 = sentence_one_triple(5, 80, 90); // subject is a pronoun
+        let feat = SentenceFeatures {
+            per_triple: vec![TripleFeatures {
+                subject_is_pronoun: true,
+                ..Default::default()
+            }],
+        };
+        let (frames, _) = rs2.step(&s2, &feat);
+        // Resolved antecedent is 70 (most recent prior head).
+        assert_eq!(frames[0].subject_candidate_id, 70);
+        // Entity lane must be bucketed on 70, not on the pronoun rank 5.
+        assert_eq!(frames[0].cam64.entity_state(), (70u16 >> 5) as u8);
+        assert_ne!(frames[0].cam64.entity_state(), (5u16 >> 5) as u8);
+    }
+
+    #[test]
+    fn expectations_do_not_accumulate_across_sentences() {
+        // Many consecutive Relative-trigger sentences must NOT fill the
+        // expectation buffer and silently drop newer antecedents: each step
+        // clears stale slots first, so the most recent antecedent always wins.
+        let mut rs = ReadingState::new(0);
+        // Prime an initial subject.
+        let (_, next) = rs.step(&sentence_one_triple(1000, 1, 2), &plain_features());
+        rs = next;
+
+        // Run 8 relative-trigger sentences (> MAX_EXPECTED = 4).
+        for i in 0..8u16 {
+            let subj = 2000 + i * 10;
+            let s = sentence_one_triple(subj, 1, 2);
+            let feat = SentenceFeatures {
+                per_triple: vec![TripleFeatures {
+                    left_corner_trigger: LeftCornerTrigger::Relative,
+                    ..Default::default()
+                }],
+            };
+            let (_, next) = rs.step(&s, &feat);
+            rs = next;
+            // After each step the buffer holds exactly this step's single
+            // expectation — never accumulating toward the MAX_EXPECTED drop point.
+            assert_eq!(rs.window.iter_expected().len(), 1);
+        }
+    }
 }
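
The second finding (a bounded expectation buffer that is cleared at the start of every step) can be sketched on its own. This is a simplified model under stated assumptions: `Window` and the free function `step` are hypothetical, antecedents are plain `u16` ranks, and `push_expected` returns a `bool` here only so the drop is observable (the commit message describes the real one as dropping silently). `MAX_EXPECTED = 4` comes from the commit message.

```rust
/// Capacity of the expectation buffer, as stated in the commit message.
pub const MAX_EXPECTED: usize = 4;

/// Hypothetical, simplified window holding only forward expectations.
#[derive(Default)]
pub struct Window {
    expected: Vec<u16>,
}

impl Window {
    /// Push a forward expectation. Returns `false` when the buffer is full
    /// and the new antecedent is dropped (an assumption for observability).
    pub fn push_expected(&mut self, antecedent: u16) -> bool {
        if self.expected.len() >= MAX_EXPECTED {
            return false;
        }
        self.expected.push(antecedent);
        true
    }

    /// Drop every pending expectation.
    pub fn clear_expected(&mut self) {
        self.expected.clear();
    }

    /// Read-only view of the pending expectations.
    pub fn expected(&self) -> &[u16] {
        &self.expected
    }
}

/// One reading step: clear stale expectations first, then record this
/// step's single prediction. The buffer therefore never holds more than
/// one entry after a step, however many sentences are read.
pub fn step(window: &mut Window, antecedent: u16) -> bool {
    window.clear_expected();
    window.push_expected(antecedent)
}
```

Without the clear, the fifth and later pushes are dropped and the buffer keeps only the four oldest antecedents; with it, the newest antecedent is always the one retained.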

crates/deepnsm/src/sentence_transformer64.rs

Lines changed: 1 addition & 2 deletions
@@ -80,8 +80,7 @@

 use crate::cam64::Cam64;
 use crate::episodic_spo::{DependencyRole, EpisodicSpoFrame};
-use crate::morphology::MorphFlags;
-use crate::spo::{SpoTriple, NO_ROLE};
+use crate::spo::NO_ROLE;

 // ── P64 ──────────────────────────────────────────────────────────────────────

crates/deepnsm/src/signed_crystal.rs

Lines changed: 12 additions & 75 deletions
@@ -58,7 +58,6 @@
 //! fused without an explicit projection step.

 use crate::cam64::Cam64;
-use crate::spo::NO_ROLE;

 // ── HorizonPolarity (v2 stub) ─────────────────────────────────────────────────

@@ -282,84 +281,22 @@ impl Crystal4096 {
     }
 }

-// ── P64MeaningField ───────────────────────────────────────────────────────────
+// ── P64MeaningField (alias of the canonical P64) ───────────────────────────────

 /// 8-lane grammar/semantic/discourse meaning field.
 ///
-/// Derived from `Cam64` + NSM prime mask, this is the *meaning-field projection*
-/// — the composite signal that summarises what this sentence is *about*, in the
-/// grammar of the reading state machine.
+/// **This is an alias of [`crate::sentence_transformer64::P64`]** — the single
+/// canonical meaning-field type. The two were introduced separately but are
+/// byte-identical (8-lane `u64`, `from_cam64_and_nsm` / `lane` / `bind` /
+/// `agreement` / `popcount` / `raw`); consolidating removes the drift risk of
+/// maintaining two copies.
 ///
-/// **`P64MeaningField` is NOT `Cam64`.** `Cam64` is a reading-state locality key
-/// (fast index, not truth). `P64MeaningField` is the output of the DeepNSM
-/// grammar lens onto the P64 meaning lattice — a different interpretive layer.
-///
-/// Lane layout (same physical encoding as Cam64 but different semantic contract):
-/// ```text
-/// byte 0 — primary entity bucket (vocabulary rank >> 5)
-/// byte 1 — predicate bucket
-/// byte 2 — object bucket (0 if absent)
-/// byte 3 — morphology / NSM prime composite (low byte)
-/// byte 4 — NSM prime composite (high byte, top-16 primes)
-/// byte 5 — discourse / coreference marker
-/// byte 6 — causal / temporal / episodic marker
-/// byte 7 — basin / novelty / wisdom / epiphany marker
-/// ```
-#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Hash)]
-#[repr(transparent)]
-pub struct P64MeaningField {
-    pub bits: u64,
-}
-
-impl P64MeaningField {
-    /// Construct from a `Cam64` locality code and the NSM prime mask.
-    ///
-    /// The NSM prime mask (64-bit, up to 63 primes) is folded into lanes 3-4
-    /// via XOR — this makes the meaning field sensitive to semantic prime
-    /// coverage without losing the grammar-lane signals.
-    #[inline]
-    pub fn from_cam64_and_nsm(cam: Cam64, nsm_prime_mask: u64) -> Self {
-        // Fold low 16 bits of NSM mask into lanes 3-4 (the morphology lanes).
-        let nsm_low = (nsm_prime_mask & 0xFF) as u64;
-        let nsm_high = ((nsm_prime_mask >> 8) & 0xFF) as u64;
-        let nsm_xor = nsm_low | (nsm_high << 8); // into bits 24-39
-
-        Self {
-            bits: cam.raw() ^ (nsm_xor << 24),
-        }
-    }
-
-    /// Extract one meaning-field lane (0-7).
-    #[inline]
-    pub fn lane(self, i: usize) -> u8 {
-        debug_assert!(i < 8);
-        (self.bits >> (i * 8)) as u8
-    }
-
-    /// XOR bind with another meaning field (VSA binding).
-    #[inline]
-    pub fn bind(self, other: P64MeaningField) -> P64MeaningField {
-        P64MeaningField { bits: self.bits ^ other.bits }
-    }
-
-    /// Popcount — number of active bits in the meaning field.
-    #[inline]
-    pub fn popcount(self) -> u32 {
-        self.bits.count_ones()
-    }
-
-    /// Shared bits with another field (XNOR popcount = agreement measure).
-    #[inline]
-    pub fn agreement(self, other: P64MeaningField) -> u32 {
-        64 - (self.bits ^ other.bits).count_ones()
-    }
-
-    /// Raw u64.
-    #[inline]
-    pub fn raw(self) -> u64 {
-        self.bits
-    }
-}
+/// The name `P64MeaningField` is retained here because the holograph-bridge
+/// framing in this module reads more clearly with the longer name: it is the
+/// *meaning-field projection* (what the sentence is *about*), distinct from
+/// `Cam64` (a reading-state locality key, not truth). Same bits, different
+/// interpretive contract.
+pub use crate::sentence_transformer64::P64 as P64MeaningField;

 // ── SignedSentenceCrystal ─────────────────────────────────────────────────────
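
The effect of the third finding (one canonical type, a second name for it) can be shown with a small stand-alone sketch. It is an assumption-laden illustration: to stay in a single file it uses a `type` alias rather than the commit's `pub use … as` re-export, and the `P64` below is a cut-down stand-in with only `bind` and `agreement`, copied from the removed implementation. In both forms the two names denote the same type, so there is a single set of methods to maintain.

```rust
/// Cut-down stand-in for the canonical meaning-field type.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct P64 {
    pub bits: u64,
}

impl P64 {
    /// XOR bind with another meaning field.
    pub fn bind(self, other: P64) -> P64 {
        P64 { bits: self.bits ^ other.bits }
    }

    /// Number of bit positions on which the two fields agree.
    pub fn agreement(self, other: P64) -> u32 {
        64 - (self.bits ^ other.bits).count_ones()
    }
}

/// Second name for the same type (a `type` alias here; the commit uses
/// `pub use crate::sentence_transformer64::P64 as P64MeaningField`).
pub type P64MeaningField = P64;

/// Code written against the long name accepts values built under the short
/// one, because both names resolve to the same type.
pub fn bind_fields(a: P64MeaningField, b: P64) -> P64MeaningField {
    a.bind(b)
}
```

Because there is no second struct, a change to `bind` or `agreement` cannot leave the two names with diverging behaviour.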

docs/architecture/deepnsm-reader-design.md

Lines changed: 3 additions & 3 deletions
@@ -259,12 +259,12 @@ FORBIDDEN INTERNAL PATH (absent by omission):
 | `cam64` | 13 (5 new: basin continuation) |
 | `episodic_spo` | 8 |
 | `window` | 11 (6 new: expectation slots) |
-| `reader_state` | 12 (2 new: trigger wiring) |
-| `signed_crystal` | 18 |
+| `reader_state` | 14 (4 new: trigger wiring + coref-keyed cam64 + no expectation accumulation) |
+| `signed_crystal` | 18 (`P64MeaningField` is now an alias of the canonical `P64`) |
 | `sentence_transformer64` | 26 |
 | `crystal_neighborhood` | 16 |
 | **Existing deepnsm tests** | 104 (unchanged) |
-| **Total** | **215** |
+| **Total** | **217** |

 ---
