| 1 | +# Paper Landscape — Grammar Parsing × VSA × Active Inference |
| 2 | + |
| 3 | +> **READ BY:** integration-lead, truth-architect, family-codec-smith, |
| 4 | +> any agent touching deepnsm, grammar, AriGraph, or the free-energy |
| 5 | +> resolution pipeline. |
| 6 | +> |
| 7 | +> **Created:** 2026-04-21 |
| 8 | +> **Scope:** Maps 14 recent papers onto the lance-graph grammar stack |
| 9 | +> (DeepNSM + RoleKey VSA + FreeEnergy active inference + AriGraph). |
| 10 | +> Each entry: citation, one-line finding, what it validates/challenges |
| 11 | +> in our architecture, and the specific code cross-reference. |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +## Tier 1 — Foundational (directly validates our algebraic substrate) |
| 16 | + |
| 17 | +### Shaw, Furlong, Anderson & Orchard (2501.05368) — VSA Category Theory Foundation |
| 18 | + |
| 19 | +**Finding:** Right Kan extensions prove that dimension-preserving |
| 20 | +binding/bundling MUST be element-wise operations. Division ring |
| 21 | +structure required for full reversibility. Co-presheaf generalization |
| 22 | +decouples index category (dimensional compression) from value category |
| 23 | +(ring structure). |
| 24 | + |
| 25 | +**Validates:** |
| 26 | +- `RoleKey::bind` (element-wise XOR on contiguous slices) is |
| 27 | + categorically optimal — not a design choice, a theorem consequence. |
| 28 | +- Each coordinate of GF(2)^d lives in GF(2), a field and hence a division ring, and XOR is self-inverse → full reversibility holds. |
| 29 | +- Our slice-addressing scheme (ℐ = disjoint intervals [0..2000), |
| 30 | + [2000..4000), ...) is an instance of their index category with |
| 31 | + monoidal product = disjoint union. |
| 32 | + |
| 33 | +**Key equations:** |
| 34 | +- Kan extension: `(Ran_e v⊗̄w)_i = ∫_{jk} ℐ(i,e(j,k)) ⋔ (v_j·w_k)` |
| 35 | +- Simplifies to element-wise: `v⊗w = ∫_i v_i · w_i` |
| 36 | +- Role-filler: `w = (first ⊗ v_1) ⊕ (second ⊗ v_2)` with |
| 37 | + recovery `v_1 ∼ first ⊘ w` — our RoleKey::bind + unbind. |
| 38 | +- Braiding ρ for sequences: `list(x_1,...,x_n) = x_1 ⊕ ρx_2 ⊕ ρρx_3 ⊕ ...` |
| 39 | + — this IS `vsa_permute` per position in the Markov bundler (D5). |
| 40 | +- Non-commutative binding needed for hierarchical structure — validates |
| 41 | + why we use DIFFERENT role keys for S/P/O. |
| 42 | + |
| 43 | +**Cross-ref:** `contract::grammar::role_keys::{RoleKey::bind, unbind, vsa_xor}`. |
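
A minimal sketch of the role-filler recovery described above, assuming each role owns a disjoint slice and that both binding and bundling are element-wise XOR. The `RoleKey` struct, its fields, and the `dim` parameter are hypothetical stand-ins for illustration, not the signatures in `contract::grammar::role_keys`.

```rust
/// Hypothetical stand-in for a role key: a slice offset into the carrier
/// vector plus a fixed key pattern of the slice's length.
pub struct RoleKey {
    pub start: usize,
    pub key: Vec<u8>, // one bit per element, stored as 0 or 1
}

impl RoleKey {
    /// Bind: write `content XOR key` into this role's slice of a zero vector.
    pub fn bind(&self, content: &[u8], dim: usize) -> Vec<u8> {
        assert_eq!(content.len(), self.key.len());
        let mut out = vec![0u8; dim];
        for (i, (c, k)) in content.iter().zip(&self.key).enumerate() {
            out[self.start + i] = c ^ k;
        }
        out
    }

    /// Unbind: XOR the key back out of this role's slice.
    pub fn unbind(&self, bundle: &[u8]) -> Vec<u8> {
        self.key
            .iter()
            .enumerate()
            .map(|(i, k)| bundle[self.start + i] ^ k)
            .collect()
    }
}

/// Bundle two carrier vectors by element-wise XOR.
pub fn vsa_xor(a: &[u8], b: &[u8]) -> Vec<u8> {
    a.iter().zip(b).map(|(x, y)| x ^ y).collect()
}
```

With one role on slice `[0..4)` and another on `[4..8)`, unbinding the XOR of the two bound vectors returns each filler exactly, because the slices never overlap and XOR is its own inverse.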
| 44 | + |
| 45 | +--- |
| 46 | + |
| 47 | +### Kleyko, Davies, Frady, Kanerva et al. (2106.05268) — VSA/HDC Survey Part II |
| 48 | + |
| 49 | +**Finding:** VSA's algebraic structure enables "computing in |
| 50 | +superposition" — efficient solutions to combinatorial search via |
| 51 | +high-dimensional distributed representations. Computational |
| 52 | +universality established. |
| 53 | + |
| 54 | +**Validates:** Our XOR-superposition of N role bindings (tested at |
| 55 | +5 simultaneous roles recovering at margin 1.0) IS computing in |
| 56 | +superposition. The combinatorial search problem they describe = |
| 57 | +our counterfactual hypothesis enumeration in `Resolution::from_ranked`. |
| 58 | + |
| 59 | +**Cross-ref:** `contract::grammar::role_keys::vsa_xor`, `free_energy::Resolution`. |
| 60 | + |
| 61 | +--- |
| 62 | + |
| 63 | +### Gallant & Okaywe (1501.07627) — MBAT: Objects, Relations, Sequences |
| 64 | + |
| 65 | +**Finding:** Matrix binding (MBAT) satisfies machine-learning |
| 66 | +constraints for VSA: similar structures → similar vectors. Phrases |
| 67 | +should be binding-sums. Three-stage learning: representation → |
| 68 | +association → inference. |
| 69 | + |
| 70 | +**Validates:** Our three-stage pipeline mirrors theirs: |
| 71 | +1. Representation = RoleKey::bind (content → role-indexed VSA) |
| 72 | +2. Association = Markov ±5 bundling (context accumulation) |
| 73 | +3. Inference = FreeEnergy resolution (hypothesis selection) |
| 74 | + |
| 75 | +Their "phrases as binding-sums" = our SPO triple as |
| 76 | +`SUBJECT_KEY.bind(s) ⊕ PREDICATE_KEY.bind(p) ⊕ OBJECT_KEY.bind(o)`. |
| 77 | + |
| 78 | +**Cross-ref:** Plan D5 `MarkovBundler`, `Trajectory`. |
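
Stage 2 (context accumulation over a ±5 window) can be sketched as permute-then-bundle. This is an illustration under stated assumptions, not the Plan D5 `MarkovBundler`: it uses bipolar vectors, cyclic rotation as the positional permutation, and integer summation as the bundling step. `random_bipolar`, `bundle_window`, and `dot` are hypothetical helpers, and the SplitMix64 generator is there only to avoid external crates.

```rust
/// SplitMix64 step: a small deterministic generator so the sketch needs no crates.
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Random bipolar (+1 / -1) vector of dimension `dim`.
pub fn random_bipolar(dim: usize, state: &mut u64) -> Vec<i32> {
    (0..dim)
        .map(|_| if next_u64(state) & 1 == 1 { 1 } else { -1 })
        .collect()
}

/// Positional permutation: cyclic rotation by `offset` places (may be negative).
pub fn vsa_permute(v: &[i32], offset: i64) -> Vec<i32> {
    let mut out = v.to_vec();
    if out.is_empty() {
        return out;
    }
    let shift = offset.rem_euclid(out.len() as i64) as usize;
    out.rotate_right(shift);
    out
}

/// Bundle a window of token vectors; `window[k]` sits at offset `k - radius`.
pub fn bundle_window(window: &[Vec<i32>], radius: i64) -> Vec<i32> {
    let dim = window[0].len();
    let mut acc = vec![0i32; dim];
    for (k, tok) in window.iter().enumerate() {
        let rotated = vsa_permute(tok, k as i64 - radius);
        for (a, r) in acc.iter_mut().zip(&rotated) {
            *a += r;
        }
    }
    acc
}

/// Dot product, used here as an unnormalized similarity.
pub fn dot(a: &[i32], b: &[i32]) -> i64 {
    a.iter().zip(b).map(|(x, y)| (*x as i64) * (*y as i64)).sum()
}
```

Probing the bundle with a token rotated to its own offset gives a dot product near the dimension, while an unrelated vector gives one near zero. That gap is what lets a flat bundle still answer "which token sat at offset +2?".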
| 79 | + |
| 80 | +--- |
| 81 | + |
| 82 | +## Tier 2 — Empirical validation of the grammar tier |
| 83 | + |
| 84 | +### Graichen, de-Dios-Flores & Boleda (2601.19926) — "Grammar of Transformers" (337-article systematic review) |
| 85 | + |
| 86 | +**Finding:** TLMs handle formal syntax well (agreement >85% BLiMP) |
| 87 | +but show weak, variable performance on syntax-semantics interface |
| 88 | +(<75% on binding, coreference, quantifier scope, island effects). |
| 89 | +Severe English dominance (69%). Mechanistic methods underutilized. |
| 90 | + |
| 91 | +**Validates:** Our tiered routing — DeepNSM handles the >85% formal |
| 92 | +syntax locally; FreeEnergy + counterfactual resolves the <75% |
| 93 | +syntax-semantics interface. Their call for "syntax-semantics interface |
| 94 | +investigation + mechanistic methods" = exactly what our active- |
| 95 | +inference stack provides. |
| 96 | + |
| 97 | +**Cross-ref:** `contract::grammar::ticket::FailureTicket` (escalation |
| 98 | +for the <75% tail), `free_energy::Resolution`. |
| 99 | + |
| 100 | +--- |
| 101 | + |
| 102 | +### Jian & Manning (2603.17475 / EACL 2026) — Abstraction-First Language Learning |
| 103 | + |
| 104 | +**Finding:** GPT-2 learns class-level verb behavior BEFORE item- |
| 105 | +specific behavior. Sequential emergence: syntactic subcategorization |
| 106 | +(t<100) → semantic argument structure (t>100) → non-local |
| 107 | +dependencies (t>1000). Count-based exemplar baseline is strictly |
| 108 | +worse. |
| 109 | + |
| 110 | +**Validates:** |
| 111 | +- `GrammarStyleConfig::nars.primary = Deduction` (class-level rules |
| 112 | + first) IS the abstraction-first policy. |
| 113 | +- Sequential emergence maps to Markov radius scaling: ±1 captures |
| 114 | + subcategorization, ±3 captures argument structure, ±5 captures |
| 115 | + non-local. WeightingKernel::MexicanHat emphasizes local first. |
| 116 | +- Their 4 verb classes (to-dative / motion / reciprocal / spray-load) |
| 117 | + = rows in our 144-verb taxonomy with characteristic TEKAMOLO priors. |
| 118 | +- Exemplar-first baseline fails = Markov bundling without role-key |
| 119 | + structure is class-blind. Role keys ARE the abstraction mechanism. |
| 120 | + |
| 121 | +**Cross-ref:** `contract::grammar::thinking_styles::NarsPriorityChain`, |
| 122 | +`context_chain::WeightingKernel::MexicanHat`. |
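
To make "emphasizes local first" concrete, here is a sketch of a Mexican-hat weighting over window offsets. It assumes the standard Ricker wavelet form; the actual formula and parameters behind `WeightingKernel::MexicanHat` are not given in this document, so `mexican_hat`, `kernel_weights`, and `sigma` are illustrative names only.

```rust
/// Ricker ("Mexican hat") weight for a token at `offset` from the focus token.
/// `sigma` sets the width of the positive central lobe.
pub fn mexican_hat(offset: i32, sigma: f64) -> f64 {
    let x2 = (offset as f64).powi(2) / (sigma * sigma);
    (1.0 - x2) * (-x2 / 2.0).exp()
}

/// Weights for every offset in `-radius..=radius`, in order.
pub fn kernel_weights(radius: i32, sigma: f64) -> Vec<f64> {
    (-radius..=radius).map(|o| mexican_hat(o, sigma)).collect()
}
```

With `radius = 5` and `sigma = 2.0` the focus token gets weight 1, its ±1 neighbours get a smaller positive weight, and offsets beyond ±2 get negative weights that decay toward zero. The local window dominates while more distant context acts as a mild contrast term.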
| 123 | + |
| 124 | +--- |
| 125 | + |
| 126 | +### Schulz, Mitropolsky & Poggio (2510.02524) — How LMs Learn CFGs |
| 127 | + |
| 128 | +**Finding:** KL divergence over PCFG decomposes as sum over |
| 129 | +subgrammar contributions (Theorem 4.3). Transformers learn all |
| 130 | +subgrammar levels in PARALLEL. Models FAIL on deep recursion |
| 131 | +despite handling long shallow contexts. |
| 132 | + |
| 133 | +**Validates:** |
| 134 | +- Our `FreeEnergy { likelihood, kl_divergence, total }` decomposition |
| 135 | + mirrors their KL-over-subgrammars. Each role-key slice IS a |
| 136 | + "subgrammar" in the VSA decomposition. |
| 137 | +- Recursion failure = why we use Markov ±5 contextual coherence |
| 138 | + instead of recursive parsing. Deep recursion becomes "does this |
| 139 | + nested structure cohere with ±5 context?" — a flat comparison. |
| 140 | +- Parallel subgrammar learning = our FSM handles all PoS categories |
| 141 | + simultaneously. |
| 142 | + |
| 143 | +**Cross-ref:** `contract::grammar::free_energy::FreeEnergy`. |
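
The additive reading of the KL term can be sketched as follows, assuming the role-key slices are treated as independent so that the KL divergence of the joint is the sum of per-slice divergences. The field names mirror the `FreeEnergy` struct named above, but the constructor `from_slices`, the use of a negative log-likelihood for `likelihood`, and the `total = likelihood + kl_divergence` convention are assumptions made for illustration.

```rust
/// KL(q || p) in nats for two discrete distributions over the same support.
/// Terms with q_i == 0 contribute nothing.
pub fn kl_divergence(q: &[f64], p: &[f64]) -> f64 {
    q.iter()
        .zip(p)
        .filter(|(qi, _)| **qi > 0.0)
        .map(|(qi, pi)| qi * (qi / pi).ln())
        .sum()
}

pub struct FreeEnergy {
    pub likelihood: f64,    // assumed here: negative log-likelihood of the observation
    pub kl_divergence: f64, // sum of per-slice KL terms
    pub total: f64,
}

impl FreeEnergy {
    /// `slices` pairs each per-slice posterior with its prior.
    pub fn from_slices(neg_log_likelihood: f64, slices: &[(Vec<f64>, Vec<f64>)]) -> Self {
        let kl: f64 = slices.iter().map(|(q, p)| kl_divergence(q, p)).sum();
        FreeEnergy {
            likelihood: neg_log_likelihood,
            kl_divergence: kl,
            total: neg_log_likelihood + kl,
        }
    }
}
```

A slice whose posterior equals its prior adds zero to the total, so only the slices that a hypothesis actually revises contribute to its free energy.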
| 144 | + |
| 145 | +--- |
| 146 | + |
| 147 | +### Alpay & Senturk (2603.05540) — Grammar-Constrained LLM Decoding |
| 148 | + |
| 149 | +**Finding:** Doob h-transform: `p_G(v|y<t) = p(v|y<t) · h(y<t v)/h(y<t)`. |
| 150 | +The grammar survival probability `h` modulates the base LLM distribution `p` into the grammar-constrained `p_G`. |
| 151 | +Structural Ambiguity Cost (SAC): right-recursive O(1)/token, |
| 152 | +concatenative Θ(t²)/token. Lower bound: Ω(t²) for parse-preserving |
| 153 | +engines. |
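
The reweighting in this Finding can be sketched for a single decoding step. The sketch assumes the survival probabilities of each one-token extension are already available as `h_next`, and that the prefix's own survival probability is their expectation under the base distribution (the harmonic property of `h`). The function name `h_transform` and both parameter names are hypothetical.

```rust
/// Reweight a base next-token distribution by grammar survival probabilities.
/// `base[v]` is the base model's probability of token v after the prefix;
/// `h_next[v]` is the survival probability of the prefix extended by v.
/// Returns the constrained distribution and h(prefix) = sum_v base[v] * h_next[v],
/// or `None` when no grammatical continuation exists.
pub fn h_transform(base: &[f64], h_next: &[f64]) -> Option<(Vec<f64>, f64)> {
    let h_prefix: f64 = base.iter().zip(h_next).map(|(p, h)| p * h).sum();
    if h_prefix <= 0.0 {
        return None;
    }
    let constrained = base
        .iter()
        .zip(h_next)
        .map(|(p, h)| p * h / h_prefix)
        .collect();
    Some((constrained, h_prefix))
}
```

Tokens that lead to a dead end (`h_next[v] == 0`) receive probability zero, and the remaining mass is renormalized by `h(prefix)`, so the output is again a probability distribution.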
| 154 | + |
| 155 | +**Validates:** |
| 156 | +- Their grammar-conditional is the dual of our free-energy: both |
| 157 | + are multiplicative modulations of a base distribution by structural |
| 158 | + constraint. |
| 159 | +- SAC = our counterfactual branch count. Pearl 2³ mask reduces SAC |
| 160 | + by committing causal bits from morphology. |
| 161 | +- Their Ω(t²) lower bound does NOT apply to us: we don't preserve |
| 162 | + the full parse forest. Active inference commits to argmin_F and |
| 163 | + discards (or marks epiphany). We trade parse-preservation for |
| 164 | + decision speed. |
| 165 | + |
| 166 | +**Cross-ref:** `contract::grammar::free_energy::Resolution` (commit |
| 167 | +discards losers), `EPIPHANY_MARGIN` (preserves runner-up only when |
| 168 | +margin is tight). |
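
The commit-and-discard policy can be sketched as below. The `Hypothesis` and `Resolution` shapes, the free function `resolve`, and the `0.05` threshold are placeholders chosen for illustration; this document does not give the real field layout or the value of `EPIPHANY_MARGIN`.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Hypothesis {
    pub label: String,
    pub free_energy: f64,
}

#[derive(Debug, PartialEq)]
pub struct Resolution {
    pub winner: Hypothesis,
    /// Runner-up, kept only when it is within the margin of the winner.
    pub epiphany: Option<Hypothesis>,
}

/// Placeholder threshold; the real value is not stated in this document.
pub const EPIPHANY_MARGIN: f64 = 0.05;

/// Commit to the hypothesis with minimal free energy and drop the rest,
/// except for a runner-up that lies within `EPIPHANY_MARGIN`.
pub fn resolve(mut candidates: Vec<Hypothesis>) -> Option<Resolution> {
    candidates.sort_by(|a, b| a.free_energy.total_cmp(&b.free_energy));
    let mut ranked = candidates.into_iter();
    let winner = ranked.next()?;
    let epiphany = ranked
        .next()
        .filter(|r| r.free_energy - winner.free_energy < EPIPHANY_MARGIN);
    Some(Resolution { winner, epiphany })
}
```

Because at most two hypotheses survive a step, the retained state does not grow with sentence length. This is the trade of parse preservation for decision speed described above.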
| 169 | + |
| 170 | +--- |
| 171 | + |
| 172 | +## Tier 3 — Supporting evidence for specific design choices |
| 173 | + |
| 174 | +### Starace et al. (2310.18696, EMNLP 2023) — Joint Encoding of Linguistic Categories |
| 175 | + |
| 176 | +**Finding:** Related grammatical categories share overlapping |
| 177 | +encodings in LLMs; pattern holds cross-lingually. |
| 178 | + |
| 179 | +**Validates:** Role-key slice adjacency for morphologically-related |
| 180 | +cases (Finnish Adessive and LOKAL_KEY map to overlapping TEKAMOLO |
| 181 | +slots). Cross-lingual bundling works because categories are shared |
| 182 | +at the representational level. |
| 183 | + |
| 184 | +**Cross-ref:** `contract::grammar::role_keys::FINNISH_SLICES`, |
| 185 | +`contract::grammar::role_keys::LOKAL_KEY`. |
| 186 | + |
| 187 | +--- |
| 188 | + |
| 189 | +### Tjuatja, Liu, Levin & Neubig (2305.18185) — Agentivity Probe |
| 190 | + |
| 191 | +**Finding:** Optionally transitive verbs test agent-vs-patient role |
| 192 | +assignment. GPT-3 outperforms corpus statistics. |
| 193 | + |
| 194 | +**Validates:** Pearl 2³ bit 0 = agency. Optionally transitive verbs |
| 195 | += exact Wechsel case ("The door opened" vs "John opened the door"). |
| 196 | +Their dataset = potential eval benchmark for `Resolution::resolve`. |
| 197 | + |
| 198 | +**Cross-ref:** `contract::grammar::ticket::CausalAmbiguity::plausible_mask`, |
| 199 | +`contract::grammar::free_energy::Hypothesis::causal_mask`. |
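
A sketch of how committing causal bits prunes the 2³ hypothesis space. Only bit 0 (agency) is named in this document; the meanings of bits 1 and 2 are left open here, and `plausible_masks` with its `committed` / `values` parameters is a hypothetical helper, not `CausalAmbiguity::plausible_mask`.

```rust
/// All 3-bit causal masks consistent with the bits already committed.
/// `committed` marks which bit positions are fixed (for example by morphology);
/// `values` carries the values of those fixed bits. Uncommitted bits stay free.
pub fn plausible_masks(committed: u8, values: u8) -> Vec<u8> {
    (0u8..8)
        .filter(|m| (m & committed) == (values & committed))
        .collect()
}
```

For "John opened the door", committing bit 0 to 1 (agentive) leaves four of the eight masks; each further committed bit halves the set again. For "The door opened", bit 0 would be committed to 0 instead.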
| 200 | + |
| 201 | +--- |
| 202 | + |
| 203 | +### Petit, Corro & Yvon (2310.14124) — Supertagging + ILP |
| 204 | + |
| 205 | +**Finding:** Supertagging (per-token category) + integer linear |
| 206 | +program for structural consistency = compositional generalization. |
| 207 | + |
| 208 | +**Validates:** Our PoS tagging (supertag) + `TekamoloPolicy::require_fillable` |
| 209 | +(structural consistency). ILP = our Markov ±5 coherence (both prevent |
| 210 | +locally-plausible but globally-inconsistent parses). |
| 211 | + |
| 212 | +**Cross-ref:** `contract::grammar::tekamolo::TekamoloSlots`, |
| 213 | +`thinking_styles::TekamoloPolicy`. |
| 214 | + |
| 215 | +--- |
| 216 | + |
| 217 | +### Sultana & Ahmed (2602.20749) — Grammar–Semantic Feature Fusion |
| 218 | + |
| 219 | +**Finding:** 11 explicit grammar features + frozen BERT = 2-15% |
| 220 | +improvement. Grammar as explicit inductive bias, not learnable module. |
| 221 | + |
| 222 | +**Validates:** Grammar-as-inductive-bias is the right framing. Their |
| 223 | +11 features are a shallow version of our TEKAMOLO slot-filling + |
| 224 | +SPO extraction. We expect full role-indexed VSA bundling to exceed |
| 225 | +their 2-15% improvement; this is a prediction, not yet a measurement. |
| 226 | + |
| 227 | +**Cross-ref:** `contract::grammar::tekamolo`, `role_keys`. |
| 228 | + |
| 229 | +--- |
| 230 | + |
| 231 | +### Shaikh, Ziems et al. (2306.02475, ACL 2023) — Cultural Codes |
| 232 | + |
| 233 | +**Finding:** Sociocultural background characteristics significantly |
| 234 | +improve pragmatic reference resolution. |
| 235 | + |
| 236 | +**Validates:** `GrammarStyleAwareness` as per-style empirical prior. |
| 237 | +Different thinking styles resolve the same ambiguity differently |
| 238 | +because their priors over signal-profile frequency differ — exactly |
| 239 | +the cultural-prior effect they measure. |
| 240 | + |
| 241 | +**Cross-ref:** `contract::grammar::thinking_styles::GrammarStyleConfig`. |
| 242 | + |
| 243 | +--- |
| 244 | + |
| 245 | +### Perez-Beltrachini et al. (2301.12217) — Conversational Semantic Parsing |
| 246 | + |
| 247 | +**Finding:** Multi-turn QA grounded to SPARQL over large-vocab KGs. |
| 248 | +Challenges: entity grounding, conversation context, generalization. |
| 249 | + |
| 250 | +**Validates:** AriGraph triplet-graph + ContextChain = our equivalent. |
| 251 | +Their "conversation context" = our ±5 Markov chain. We don't need |
| 252 | +SPARQL because SPO triples are queried directly via |
| 253 | +`TripletGraph::nodes_matching`. |
| 254 | + |
| 255 | +**Cross-ref:** `arigraph::triplet_graph`, `grammar::context_chain`. |
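
Direct SPO querying without a query language can be sketched as wildcard pattern matching over stored triples. The `Triple` struct and `triples_matching` function are hypothetical and do not reproduce the signature of `TripletGraph::nodes_matching`, which this document does not spell out.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Triple {
    pub s: String,
    pub p: String,
    pub o: String,
}

/// Return the triples matching a pattern; `None` in a slot is a wildcard.
pub fn triples_matching<'a>(
    graph: &'a [Triple],
    s: Option<&str>,
    p: Option<&str>,
    o: Option<&str>,
) -> Vec<&'a Triple> {
    graph
        .iter()
        .filter(|t| s.map_or(true, |x| x == t.s.as_str()))
        .filter(|t| p.map_or(true, |x| x == t.p.as_str()))
        .filter(|t| o.map_or(true, |x| x == t.o.as_str()))
        .collect()
}
```

A call such as `triples_matching(&graph, Some("John"), None, None)` plays the role of a `SELECT ?p ?o WHERE { :John ?p ?o }` pattern. A linear scan is used here only for brevity.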
| 256 | + |
| 257 | +--- |
| 258 | + |
| 259 | +### Hussein (2602.14238) — CFG/GPSG Parser |
| 260 | + |
| 261 | +**Finding:** CFG+GPSG parser producing dependency + constituency |
| 262 | +trees; handles noise; UAS 54.5%. |
| 263 | + |
| 264 | +**Validates:** Our baseline to beat. Their noise tolerance = |
| 265 | +our `PartialParse` + `FailureTicket`. We expect Markov coherence + |
| 266 | +role-key binding to exceed UAS 54.5%; this is a target, not a result. |
| 267 | + |
| 268 | +**Cross-ref:** `contract::grammar::ticket::PartialParse`. |
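
For comparing against this baseline, the unlabeled attachment score (UAS) is the fraction of tokens whose predicted head matches the gold head. A minimal sketch follows; the representation of a parse as a slice of head indices (with 0 for the root) is an assumption, and `uas` is a hypothetical helper.

```rust
/// Unlabeled attachment score: share of tokens whose predicted head index
/// equals the gold head index. Returns `None` for empty or mismatched input.
pub fn uas(predicted: &[usize], gold: &[usize]) -> Option<f64> {
    if predicted.len() != gold.len() || gold.is_empty() {
        return None;
    }
    let correct = predicted.iter().zip(gold).filter(|(p, g)| p == g).count();
    Some(correct as f64 / gold.len() as f64)
}
```

For a four-token sentence with one wrong attachment, `uas(&[2, 0, 2, 3], &[2, 0, 2, 2])` yields `Some(0.75)`.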
| 269 | + |
| 270 | +--- |
| 271 | + |
| 272 | +## The unclaimed intersection |
| 273 | + |
| 274 | +**No paper in this landscape combines:** |
| 275 | + |
| 276 | +1. Structural parsing (rule-based, not neural) |
| 277 | +2. Active-inference ambiguity resolution (free-energy, not attention) |
| 278 | +3. Role-indexed distributed representation (VSA with Kan-extension- |
| 279 | + justified element-wise ops) |
| 280 | +4. NARS-revised epistemic awareness (per-parse revision, not gradient) |
| 281 | + |
| 282 | +Shaw et al. provide the algebraic foundation (Tier 1). Graichen |
| 283 | +et al. identify the target (syntax-semantics interface, Tier 2). |
| 284 | +Jian & Manning validate the dispatch order (abstraction-first, Tier 2). |
| 285 | +Alpay & Senturk formalize the grammar-conditional dual (Tier 2). |
| 286 | + |
| 287 | +Our stack sits at the intersection. The closest prior art is |
| 288 | +Shaw's category-theoretic VSA + Petit's supertagging+ILP, but |
| 289 | +neither has the active-inference free-energy loop or the NARS- |
| 290 | +revised epistemic awareness layer. |
| 291 | + |
| 292 | +--- |
| 293 | + |
| 294 | +## Papers not yet fully retrieved |
| 295 | + |
| 296 | +- **biorxiv 2022.02.22.481380v3** — PDF too large for WebFetch. |
| 297 | + Likely a neuroscience paper on VSA / neural binding. |
| 298 | +- **ResearchGate VSA-for-CFGs (Mitropolsky?)** — 403 forbidden. |
| 299 | + This is likely the 2003.05171 paper already cited in the plan |
| 300 | + (VSA encoding of Chomsky-normal-form CFGs via Fock space). |