Commit 0e85afb

transformerless_lm: StoFibDepth bench — 1.17x speedup, not "significantly faster"
arch                  params    best_val   wall     speedup
baseline_dense        801,664   2.5364     54.9s    1.00x
subsim_lazy_data       95,104   2.6402     82.3s    0.67x (slower)
subsim_stofib_depth    95,104   2.6439     46.7s    1.17x (faster)

The composed substrate stack (Subsim + Stochastic Fibonacci Depth + lazy data) is 17% faster than dense at d=128, with 8.4x fewer params, at the cost of +4.2% val loss. A real win on speed and parameter count, but not the "significantly faster" the user wants.

Diagnosis: at d=128, T=128, PyTorch overhead dominates the FLOP savings the substrate offers. The substrate's asymptotic wins (O(T·log_phi_pi T) attention, O(K^2) weight storage, O(d·K) compressed compute) only manifest at long sequences and large d_model. The transformerless thesis is correct architecturally, but the headroom at small scale is bounded by overhead.

For "significantly faster" at this scale we need a fundamentally different architecture, not a substrate-decorated transformer. Three candidates worth restarting with:

- Fibonacci State Model (RNN with 2-tap recurrence): most substrate-canonical; Fibonacci IS a recurrence relation
- Substrate state-space model (S4/Mamba-class): proven at scale; substrate is the parameterization
- Fibonacci-dilated 1D conv (WaveNet/TCN-class): parallel across time; captures hierarchy via dilated taps

The user's call on which to pursue next.
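The derived figures in the summary above (speedup, parameter ratio, val-loss delta) can be recomputed from the raw numbers. The following is a minimal sketch, not the benchmark's own code: the `runs` dict and the `compare` helper are hypothetical names, and the values are copied from the results file in this commit. Speedup is taken as baseline wall time divided by run wall time.

```python
# Raw figures copied from the results JSON in this commit.
# `runs` and `compare` are illustrative names, not part of the repo.
runs = {
    "baseline_dense": {
        "n_params": 801_664, "best_val": 2.5364165157079697, "wall": 54.88263440132141,
    },
    "subsim_lazy_data": {
        "n_params": 95_104, "best_val": 2.6402239203453064, "wall": 82.32190656661987,
    },
    "subsim_stofib_depth": {
        "n_params": 95_104, "best_val": 2.6439478993415833, "wall": 46.74405384063721,
    },
}


def compare(runs, baseline="baseline_dense"):
    """Return speedup, parameter ratio, and val-loss delta (%) relative to the baseline."""
    base = runs[baseline]
    summary = {}
    for name, run in runs.items():
        summary[name] = {
            # > 1.0 means the run finished faster than the baseline
            "speedup": base["wall"] / run["wall"],
            # how many times fewer parameters than the baseline
            "param_ratio": base["n_params"] / run["n_params"],
            # positive means worse (higher) validation loss than the baseline
            "val_delta_pct": 100.0 * (run["best_val"] / base["best_val"] - 1.0),
        }
    return summary


summary = compare(runs)
for name, s in summary.items():
    print(
        f"{name:20s} speedup={s['speedup']:.2f}x "
        f"params={s['param_ratio']:.1f}x fewer "
        f"val={s['val_delta_pct']:+.1f}%"
    )
```

Using the full-precision wall times matters here: the rounded figures 54.9s and 46.7s give a ratio that rounds to 1.18x, while the unrounded ones give the 1.17x quoted in the commit title.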
1 parent 30a78c8 commit 0e85afb

1 file changed

Lines changed: 194 additions & 0 deletions

@@ -0,0 +1,194 @@
+{
+  "baseline_dense": {
+    "name": "baseline_dense",
+    "n_params": 801664,
+    "best_val": 2.5364165157079697,
+    "best_step": 2499,
+    "wall": 54.88263440132141,
+    "val_history": [
+      [
+        0,
+        91.68861675262451,
+        0.12776565551757812
+      ],
+      [
+        250,
+        3.0086356550455093,
+        5.327399969100952
+      ],
+      [
+        500,
+        2.7797079533338547,
+        11.054746627807617
+      ],
+      [
+        750,
+        2.7068405002355576,
+        17.315090894699097
+      ],
+      [
+        1000,
+        2.6304962038993835,
+        22.860490798950195
+      ],
+      [
+        1250,
+        2.6018067598342896,
+        28.13729739189148
+      ],
+      [
+        1500,
+        2.609904170036316,
+        33.682857513427734
+      ],
+      [
+        1750,
+        2.5793307423591614,
+        39.146010398864746
+      ],
+      [
+        2000,
+        2.5525262653827667,
+        44.33018159866333
+      ],
+      [
+        2250,
+        2.5858452916145325,
+        49.40866661071777
+      ],
+      [
+        2499,
+        2.5364165157079697,
+        54.882566690444946
+      ]
+    ]
+  },
+  "subsim_lazy_data": {
+    "name": "subsim_lazy_data",
+    "n_params": 95104,
+    "best_val": 2.6402239203453064,
+    "best_step": 2499,
+    "wall": 82.32190656661987,
+    "val_history": [
+      [
+        0,
+        20.97078514099121,
+        0.1927039623260498
+      ],
+      [
+        250,
+        2.8688047528266907,
+        8.49709153175354
+      ],
+      [
+        500,
+        3.0512785762548447,
+        16.483516216278076
+      ],
+      [
+        750,
+        2.8864706307649612,
+        24.49618935585022
+      ],
+      [
+        1000,
+        2.8076884895563126,
+        32.289450883865356
+      ],
+      [
+        1250,
+        2.752319172024727,
+        40.01008701324463
+      ],
+      [
+        1500,
+        2.758972018957138,
+        47.99359464645386
+      ],
+      [
+        1750,
+        2.73345984518528,
+        56.64985752105713
+      ],
+      [
+        2000,
+        2.64628766477108,
+        65.49520945549011
+      ],
+      [
+        2250,
+        2.677911102771759,
+        74.04480528831482
+      ],
+      [
+        2499,
+        2.6402239203453064,
+        82.32185053825378
+      ]
+    ]
+  },
+  "subsim_stofib_depth": {
+    "name": "subsim_stofib_depth",
+    "n_params": 95104,
+    "best_val": 2.6439478993415833,
+    "best_step": 2499,
+    "wall": 46.74405384063721,
+    "val_history": [
+      [
+        0,
+        23.70769727230072,
+        0.16910028457641602
+      ],
+      [
+        250,
+        3.0002082884311676,
+        4.7339160442352295
+      ],
+      [
+        500,
+        2.8293994665145874,
+        9.360607624053955
+      ],
+      [
+        750,
+        2.776395872235298,
+        14.25643801689148
+      ],
+      [
+        1000,
+        2.7701563984155655,
+        19.052313804626465
+      ],
+      [
+        1250,
+        2.6902697682380676,
+        23.54913020133972
+      ],
+      [
+        1500,
+        2.7020885050296783,
+        28.110105514526367
+      ],
+      [
+        1750,
+        2.7105424255132675,
+        32.69742941856384
+      ],
+      [
+        2000,
+        2.6472649425268173,
+        37.552560329437256
+      ],
+      [
+        2250,
+        2.6584959775209427,
+        42.15896391868591
+      ],
+      [
+        2499,
+        2.6439478993415833,
+        46.74399542808533
+      ]
+    ]
+  }
+}
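Each `val_history` entry appears to be a `[step, validation loss, elapsed seconds]` triple. That reading is an assumption based on the values; the commit does not document the layout. Under that assumption, a small sketch like the one below separates per-step cost from time to reach a given loss. The helper names `time_to_loss` and `seconds_per_step` are hypothetical, and the inline data is a rounded copy of two of the three runs.

```python
def time_to_loss(val_history, target):
    """Return (step, elapsed_seconds) of the first logged checkpoint whose
    validation loss is <= target, or None if no checkpoint reaches it."""
    for step, val_loss, elapsed in val_history:
        if val_loss <= target:
            return step, elapsed
    return None


def seconds_per_step(val_history):
    """Average wall-clock seconds per training step between the first and
    last logged checkpoints (includes evaluation time, which is not split out)."""
    first_step, _, first_t = val_history[0]
    last_step, _, last_t = val_history[-1]
    return (last_t - first_t) / (last_step - first_step)


# Rounded copies of two val_history arrays from the results file above.
# Assumed layout: (step, validation loss, elapsed seconds).
history = {
    "baseline_dense": [
        (0, 91.6886, 0.13), (250, 3.0086, 5.33), (500, 2.7797, 11.05),
        (750, 2.7068, 17.32), (1000, 2.6305, 22.86), (1250, 2.6018, 28.14),
        (1500, 2.6099, 33.68), (1750, 2.5793, 39.15), (2000, 2.5525, 44.33),
        (2250, 2.5858, 49.41), (2499, 2.5364, 54.88),
    ],
    "subsim_stofib_depth": [
        (0, 23.7077, 0.17), (250, 3.0002, 4.73), (500, 2.8294, 9.36),
        (750, 2.7764, 14.26), (1000, 2.7702, 19.05), (1250, 2.6903, 23.55),
        (1500, 2.7021, 28.11), (1750, 2.7105, 32.70), (2000, 2.6473, 37.55),
        (2250, 2.6585, 42.16), (2499, 2.6439, 46.74),
    ],
}

for name, hist in history.items():
    print(name, seconds_per_step(hist), time_to_loss(hist, target=2.65))
```

Checkpoints are only logged every 250 steps, so `time_to_loss` is coarse: it reports the first logged checkpoint at or below the target, not the exact step at which the loss crossed it.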
