Commit e1c37a5

docs(deepnsm): psychometric validation framework + vertical HHTL bundling spec
Two architectural concepts saved for dedicated implementation sessions:

1. Psychometric validation for DeepNSM measurement instrument:
   - Cronbach's α across 128 projections (2³ SPO × 2⁴ HHTL)
   - Split-half reliability: Strategy A vs Strategy B distance
   - IRT item parameters: per-word difficulty + discrimination
   - Factor analysis: do 74 primes factor into 16 NsmCategory?
   - Construct/convergent/discriminant validity across codec chain
   - Polysemy detection via α drop across projections
   - P-values with 128 independent measurements per pair

2. Vertical HHTL bundling (studio mixing analogy):
   - Leaves → bundle → Twigs → bundle → Branches → bundle → Hip
   - Each level = majority vote denoising (background noise removal)
   - Unbind bottom-up to verify reconstruction (information loss audit)
   - Combined SPO × HHTL = 128-way factorial decomposition
   - Cascade as psychometric filter: discrimination, factor analysis,
     composite reliability, SEM, residual analysis

Key insight: NARS confidence IS measurement reliability (formalized).
Every similarity judgment gets a confidence interval backed by
128 independent projection measurements.

https://claude.ai/code/session_01Y69Vnw751w75iVSBRws7o7
1 parent d76c631 commit e1c37a5

1 file changed

Lines changed: 59 additions & 0 deletions

src/hpc/deepnsm.rs

@@ -1316,3 +1316,62 @@ mod eval_tests {
 //
 // TODO: implement NsmDecompositionSoA with category-padded [f32; 256] storage
 // ============================================================================
+
+// ============================================================================
+// FUTURE CONCEPT: Psychometric validation framework for DeepNSM
+// ============================================================================
+//
+// The vocabulary IS a measurement instrument. Each word is a test item.
+// Each prime weight is a factor loading. Psychometric theory validates
+// whether the decomposition measures what it claims to measure.
+//
+// RELIABILITY:
+// - Test-retest: bundle → unbundle → re-bundle → compare (bit-reproducible)
+// - Cronbach's α: correlation across 2³ SPO × 2⁴ HHTL = 128 projections
+// High α (>0.7) = projections agree = construct is coherent
+// Low α (<0.5) = bundling destroys information at that level
+// - Split-half: Strategy A distance vs Strategy B distance for same pair
+// Pearson r between them = reliability of the dual encoding
+//
+// VALIDITY:
+// - Construct: do primes factor into 16 NsmCategory groups? (PCA/FA)
+// - Convergent: SpoBase17 ≈ CausalEdge64 ≈ VsaVec ≈ DeepNSM cosine
+// for same pair. All should rank similarly.
+// - Discriminant: "dog bites man" ≠ "man bites dog" across all encodings
+// - Criterion: OSINT extraction quality against known-true datasets
+//
+// ITEM RESPONSE THEORY (IRT):
+// - Per-word difficulty: how many primes cleanly decompose this word?
+// "think" = easy (2 primes), "justice" = hard (6+ primes)
+// - Per-word discrimination: does this word reliably separate concepts?
+// "good" = high (separates Evaluator), "the" = zero
+// - Per-prime reliability: does this prime contribute consistently?
+//
+// HHTL CASCADE AS PSYCHOMETRIC FILTER:
+// HEEL: drop items with discrimination < 0.3 (bad test items)
+// HIP: factor analyze → extract latent structure → compare with theory
+// BRANCH: composite reliability per factor (α per NsmCategory)
+// TWIG: structural equation model → path coefficients = causal relations
+// LEAF: residual variance → noise OR undiscovered factor → NARS abduction
+//
+// VERTICAL BUNDLING (studio mixing analogy):
+// Leaves → bundle → Twigs → bundle → Branches → bundle → Hip
+// Each level = majority vote denoising
+// Unbind bottom-up to verify reconstruction
+// Hamming(unbind(Hip, branch_role), actual_branch) = information loss
+// Combined with SPO: 2³ × 2⁴ = 128 projections, each an "item"
+// Cronbach's α across 128 items = total measurement reliability
+//
+// P-VALUES:
+// 128 independent measurements per pair → statistical power for p < 0.001
+// Every similarity judgment comes with a confidence interval
+// NARS confidence IS measurement reliability (formalized)
+//
+// POLYSEMY DETECTION:
+// Word with high α in context = disambiguated (reliable measurement)
+// Word with low α across projections = polysemous (unreliable item)
+// α drop localizes WHERE the ambiguity lives in the HHTL tree
+//
+// TODO: implement CronbachAlpha, SplitHalfReliability, FactorAnalysis,
+// ItemDifficulty, ItemDiscrimination, MeasurementInvariance
+// ============================================================================
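
The RELIABILITY notes in the hunk above name Cronbach's α but leave `CronbachAlpha` as a TODO. A minimal sketch of the textbook statistic, α = k/(k−1) · (1 − Σ item variances / variance of totals), might look like the following. The function names, the `f64` score type, and the row-per-projection layout are assumptions made for illustration and are not taken from `deepnsm.rs`.

```rust
/// Sample variance with an (n - 1) denominator.
/// Returns None when fewer than two values are given.
fn sample_variance(xs: &[f64]) -> Option<f64> {
    if xs.len() < 2 {
        return None;
    }
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let ss = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>();
    Some(ss / (n - 1.0))
}

/// Cronbach's alpha for a score matrix (hypothetical layout):
/// `scores[i][j]` is the score of item i (e.g. one projection)
/// on subject j (e.g. one word pair).
/// Returns None for fewer than two items or subjects, ragged rows,
/// or zero variance in the per-subject totals.
fn cronbach_alpha(scores: &[Vec<f64>]) -> Option<f64> {
    let k = scores.len();
    if k < 2 {
        return None;
    }
    let n = scores[0].len();
    if n < 2 || scores.iter().any(|row| row.len() != n) {
        return None;
    }

    // Sum of the per-item variances.
    let mut item_var_sum = 0.0;
    for row in scores {
        item_var_sum += sample_variance(row)?;
    }

    // Variance of the per-subject totals across all items.
    let totals: Vec<f64> = (0..n)
        .map(|j| scores.iter().map(|row| row[j]).sum::<f64>())
        .collect();
    let total_var = sample_variance(&totals)?;
    if total_var == 0.0 {
        return None;
    }

    let k = k as f64;
    Some(k / (k - 1.0) * (1.0 - item_var_sum / total_var))
}
```

Under this layout, the 128 projections would be the rows and the word pairs being compared would be the columns. Whether that matches the intended `CronbachAlpha` design is not stated in the commit.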

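The VERTICAL BUNDLING notes describe majority-vote bundling and a Hamming-distance audit after unbinding. The sketch below shows one common way to realize those operations on binary hypervectors, assuming XOR for bind/unbind and a fixed 256-bit width. The `Hv` type, the width, and all function names are hypothetical; the commit does not specify how DeepNSM represents or binds its vectors.

```rust
const WORDS: usize = 4; // 256-bit vectors; an arbitrary size for this sketch
type Hv = [u64; WORDS];

/// XOR binding with a role vector. XOR is its own inverse,
/// so the same function also serves as unbind.
fn bind(a: &Hv, role: &Hv) -> Hv {
    let mut out = [0u64; WORDS];
    for i in 0..WORDS {
        out[i] = a[i] ^ role[i];
    }
    out
}

/// Bitwise majority vote over the inputs.
/// Ties (possible with an even count) and empty input resolve to 0.
fn bundle(inputs: &[Hv]) -> Hv {
    let mut out = [0u64; WORDS];
    for w in 0..WORDS {
        for bit in 0..64 {
            let ones = inputs.iter().filter(|v| (v[w] >> bit) & 1 == 1).count();
            if 2 * ones > inputs.len() {
                out[w] |= 1u64 << bit;
            }
        }
    }
    out
}

/// Number of differing bits between two vectors.
fn hamming(a: &Hv, b: &Hv) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Information-loss audit for one child of a bundled parent:
/// Hamming(unbind(parent, role), actual_child).
fn reconstruction_loss(parent: &Hv, role: &Hv, actual_child: &Hv) -> u32 {
    hamming(&bind(parent, role), actual_child)
}
```

With this sketch, a parent built as `bundle(&[bind(&b1, &r1), bind(&b2, &r2), bind(&b3, &r3)])` can be audited per child via `reconstruction_loss(&parent, &r1, &b1)`; a loss of 0 means the child survived bundling bit for bit.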