Commit bfbf691
transformerless_lm: scaled sample bench includes Subsim arch
Added subsim_K32 to the scaled-up sampler factories. The launched
run uses --archs dense_crt,subsim_K32 to focus the ~1h budget on the
two archs that matter for the "does substrate produce coherent text
at GPT-2-tiny scale?" question, dropping fibgen and composed (those
have been measured at small scale already and gain little from being
re-measured at scale on this hardware budget).
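A minimal sketch of how a comma-separated `--archs` value could select entries from a factory registry. The arch names are taken from the message above; `SAMPLER_FACTORIES`, `select_factories`, `parse_args`, and the placeholder builders are hypothetical names used for illustration, not the repository's actual code.

```python
import argparse


def _placeholder(name):
    """Return a stand-in builder; the real factories are not shown in this commit."""
    def build():
        return {"arch": name}
    return build


# Hypothetical registry keyed by arch name (names from the commit message).
SAMPLER_FACTORIES = {
    name: _placeholder(name)
    for name in ("dense_crt", "fibgen", "composed", "subsim_K32")
}


def select_factories(archs_arg, registry=SAMPLER_FACTORIES):
    """Keep only the archs named in a comma-separated string, in the order given."""
    names = [n.strip() for n in archs_arg.split(",") if n.strip()]
    unknown = [n for n in names if n not in registry]
    if unknown:
        raise ValueError(f"unknown archs: {unknown}")
    return {n: registry[n] for n in names}


def parse_args(argv=None):
    """Parse --archs; the default (assumed here) is every registered arch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--archs", default=",".join(SAMPLER_FACTORIES))
    return parser.parse_args(argv)
```

Calling `select_factories(parse_args(["--archs", "dense_crt,subsim_K32"]).archs)` would keep only the two archs this run targets; rejecting unknown names up front avoids spending the time budget on a mistyped arch.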
Subsim is the substrate-native operator candidate. At d=128 it
already narrowed the gap to dense from FibGen's +7.2% to +5.7% and
reached its best attractor 3x faster. The hypothesis at scale: if
substrate compression preserves the patterns that govern language,
Subsim will produce text within a noticeable-but-acceptable distance
of dense. If dense produces coherent Shakespeare and Subsim produces
gibberish, substrate compression breaks at scale and we need a
different basis or a different operator.
1 parent 5448da1 commit bfbf691
1 file changed
Lines changed: 6 additions & 0 deletions