|
1 | | -# Transformerless LM — first end-to-end measurement |
| 1 | +# Transformerless LM |
| 2 | + |
| 3 | +This directory began as a transformer **positional-encoding** experiment (CRT-PE, below) and grew into a |
| 4 | +fully **web-native addressed language model** — an LM with *no token-prediction model at all*. Both lines |
| 5 | +live here; the web-native LM is the current frontier, the CRT-PE result is its measured origin. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## The web-native addressed LM (current line) |
| 10 | + |
| 11 | +**Thesis:** *addressing is execution.* Concepts are addressed nodes in a knowledge web (passages + weighted |
| 12 | +co-occurrence edges); a sentence "executes" by traversing its concept-addresses, and a thought is *created* |
| 13 | +by recombining distant addresses across sources. No weights are trained for generation — the web is the model. |
| 14 | + |
| 15 | +Two oracles, both derived from the web and both empirically validated (held-out): |
| 16 | + |
| 17 | +| layer | question | how | measured | |
| 18 | +|---|---|---|---| |
| 19 | +| **WHAT** (`langexec.py`) | does a thought *resolve*? | traverse concept-addresses over **hub-damped PMI** edges | AUC **0.91–0.98** real-vs-salad (survives a common-word steelman) | |
| 20 | +| **HOW** (`fluency.py`) | does it read like the web speaks? | transition stats counted from the corpus (agnostic, trigram) | AUC **0.86** fluent-vs-scrambled, held-out | |
| 21 | + |
| 22 | +Built on those: |
| 23 | +- `thinkloop.py` — heal a faulting thought *up* the resolve gradient to a coherent fixpoint. |
| 24 | +- `realize.py` — resolved concepts → fluent grounded sentence (template / compose / **hybrid**). |
| 25 | +- `engine.py` + `agent.py` — a router (recall | relate | decline) and tool-use (exact char-counting, |
| 26 | + arithmetic, cross-source bridging). Frame-word detection is **agnostic** (interrogative-context, not a |
| 27 | + hand-coded keyword list). |
| 28 | +- `create.py` — recombine *distant* concepts across sources into a thought no single passage states, |
| 29 | + 3-gated (coherence + support + meaning) and graded WARRANTED/SUGGESTIVE/WEAK with its weakest link shown. |
| 30 | +- `selfimprove.py` — write-don't-train of self-*verified* thoughts (a MAPE-K loop); measured cold→warm |
| 31 | + improvement (mean confidence 0.48→0.79, instant-recalls 0→16/20 on a fixed probe set). |
| 32 | +- `webmind.py` — the unified mind. `python3 webmind.py --demo` (showcase) · `--report` · `--think "…"`. |
| 33 | + |
| 34 | +**Lossless 45% compression** (`compress_web.py` + `finalize_compressed.py`): the LM was asked how to |
| 35 | +compress *itself*; it surfaced dictionary-encoding + entropy-coding, which we applied — node-string |
| 36 | +**interning** (TEXT→int32) + **zlib** passages → **9.78 GB → 5.33 GB, verified lossless**. Presented back |
| 37 | +to all readers via SQLite **views** (`kdb.py` registers an `unzip` fn + detects the schema), so nothing |
| 38 | +else changed. The interned INT index fits in cache, which in turn unblocks fast corpus folding |
| 39 | +(`ghost_fold.py` reconnected 17.7k orphaned nodes; `se_fold_i.py` folds science into the compact DB). |
| 40 | + |
| 41 | +Honest limits: fluency is a trigram floor (a small neural model lifts it); recall can answer with a full |
| 42 | +passage span; coverage is corpus-bound (science is ~1% of the current web — folding more is the lever, not |
| 43 | +hand-tuning). Nothing here claims to rival a frontier LLM at open generation — the claim is a **grounded, |
| 44 | +hallucination-resistant, self-calibrating** knowledge-and-synthesis engine that runs on near-zero compute. |
| 45 | +See `MORNING.md` for an orientation written for a returning reader. |
| 46 | + |
| 47 | +--- |
| 48 | + |
| 49 | +## Origin: CRT-PE — first end-to-end measurement |
2 | 50 |
|
3 | 51 | **The headline:** the harmonic CRT-PE substitution beats the standard sinusoidal-PE transformer on a tiny char-level LM with **mean −19.9% validation loss across 5 seeds**, winning 4 of 5 seeds. This is the first end-to-end empirical evidence that the harmonic substrate substitutions identified by the experiments-0–12 series carry over to a real LM training task. |
4 | 52 |
|
|
0 commit comments