Commit e1269d7

transformerless_lm: split-brain mixer -> golden-weighted arithmetic
The v73 geometric mean, sqrt(p_math * p_lang), was over-conservative. It required both hemispheres to consent, so a valid spike from one hemisphere alone got cancelled.

v74 mixer: (phi * p_math + p_lang) / (phi + 1)

Math gets weight phi = 1.618 (older substrate foundation, primary signal). Lang gets weight 1.0 (modulator). Both contribute additively in probability space, so a high-confidence proposal from either hemisphere comes through without requiring agreement. The weights are substrate-canonical (golden ratio).
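A minimal sketch, not taken from the commit, of how the two mixers treat a one-sided spike. It uses plain Python lists instead of torch tensors. The toy distributions and the helper names normalize, mix_geometric and mix_golden are assumptions made for illustration only.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, about 1.618


def normalize(p, eps=1e-8):
    """Renormalize so the entries sum to (almost exactly) 1."""
    total = sum(p) + eps
    return [x / total for x in p]


def mix_geometric(p_math, p_lang):
    """v73-style mixer: renormalized sqrt(p_math * p_lang)."""
    return normalize([math.sqrt(a * b) for a, b in zip(p_math, p_lang)])


def mix_golden(p_math, p_lang, phi=PHI):
    """v74-style mixer: (phi * p_math + p_lang) / (phi + 1), renormalized."""
    return normalize([(phi * a + b) / (phi + 1.0) for a, b in zip(p_math, p_lang)])


# Made-up 3-token example: the language hemisphere spikes token 2,
# while the math hemisphere gives it almost no mass.
p_math = [0.60, 0.39, 0.01]
p_lang = [0.10, 0.10, 0.80]

geo = mix_geometric(p_math, p_lang)
gold = mix_golden(p_math, p_lang)

print("geometric:", [round(x, 3) for x in geo])
print("golden:   ", [round(x, 3) for x in gold])
```

In this toy case the spiked token ends up with about 0.17 of the mass under the geometric mean and about 0.31 under the golden-weighted mean, which matches the commit's claim that a one-sided proposal is no longer suppressed as strongly.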
1 parent 8d72769 commit e1269d7

1 file changed

Lines changed: 12 additions & 8 deletions

experiments/transformerless_lm/train_self_recursive.py

@@ -1327,29 +1327,33 @@ def _omniweight_apply(base_probs: torch.Tensor,
 def _omniweight_apply_split(base_probs: torch.Tensor,
                             math_delta: torch.Tensor,
                             lang_delta: torch.Tensor) -> torch.Tensor:
-    """SPLIT-BRAIN omniweight: two registers, geometric-mean mixer.
+    """SPLIT-BRAIN omniweight: two registers, golden-weighted mixer.
 
     Math hemisphere: bigram, recency, substrate sampling, anti-stag,
     bigram-saturation. Frequency / decay primitives.
 
     Language hemisphere: iambic, anaphora, need-fill, phonotactics,
-    rhyme, agreement, word-spacing, char-cascade, pronunciation,
+    rhyme, agreement, word-spacing, char-cascade, pronounceability,
     subject-threading, theme. Purpose / structure primitives.
 
     Each hemisphere builds its own fluid delta via tanh-scaled
-    substrate reserve. Final distribution = geometric mean of the
-    two -- a token survives only if both hemispheres consent.
-
-    Pure substrate (phi^pi reserve, sqrt mixing = Bayesian PoE).
+    substrate reserve. Final distribution = golden-weighted arithmetic
+    mean: (phi * p_math + p_lang) / (phi + 1).
+
+    Math gets phi=1.618 weight (older substrate foundation, primary
+    signal). Lang gets 1.0 weight (modulator). Both contribute --
+    high-confidence proposals from either come through. Less
+    restrictive than geometric mean which required both-consent
+    (v73 was over-conservative).
     """
     math_fluid = _OMNIWEIGHT_RESERVE * torch.tanh(math_delta / _OMNIWEIGHT_RESERVE)
     lang_fluid = _OMNIWEIGHT_RESERVE * torch.tanh(lang_delta / _OMNIWEIGHT_RESERVE)
     p_math = base_probs * torch.exp(math_fluid)
     p_lang = base_probs * torch.exp(lang_fluid)
     p_math = p_math / (p_math.sum() + 1e-8)
     p_lang = p_lang / (p_lang.sum() + 1e-8)
-    # Geometric mean (Bayesian product of experts).
-    p_final = torch.sqrt(p_math * p_lang)
+    phi = _PHI_FOR_SAMPLING
+    p_final = (phi * p_math + p_lang) / (phi + 1.0)
     return p_final / (p_final.sum() + 1e-8)
 
 
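
The "tanh-scaled substrate reserve" step in the function above can be sketched in isolation. This is a stdlib-only illustration, not the repository's code. RESERVE is assumed to be phi^pi, based on the "phi^pi reserve" wording in the removed docstring; the actual value of _OMNIWEIGHT_RESERVE is not shown in this diff. The names fluid and apply_delta are hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2
RESERVE = PHI ** math.pi  # assumption: stands in for _OMNIWEIGHT_RESERVE


def fluid(delta, reserve=RESERVE):
    """Soft-clip a raw log-space delta into the open range (-reserve, reserve)."""
    return reserve * math.tanh(delta / reserve)


def apply_delta(base_probs, deltas, reserve=RESERVE, eps=1e-8):
    """Scale each base probability by exp(soft-clipped delta), then renormalize."""
    scaled = [p * math.exp(fluid(d, reserve)) for p, d in zip(base_probs, deltas)]
    total = sum(scaled) + eps
    return [s / total for s in scaled]


# Small deltas pass through almost unchanged; huge ones saturate at the reserve.
small = fluid(0.01)
huge = fluid(1000.0)

# Even an extreme delta on one token leaves a valid distribution.
boosted = apply_delta([0.5, 0.3, 0.2], [0.0, 0.0, 1000.0])
```

Because tanh saturates, no single primitive can multiply a token's probability by more than exp(RESERVE) before renormalization, so each hemisphere's distribution stays well-defined no matter how large a raw delta gets.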
