posts/2025-09-13-recursive-self-improvement-explosion-optimization.qmd
# Summary
Capabilities are growing rapidly, but they're hard to quantify.
: There's no consensus on a *scale* for AI capabilities. We do see fairly steady growth across many metrics: time horizon, average benchmark scores (ECI), effective compute, and predictive loss.
Algorithmic progress seems to be around 4X/year.
: We can quantify algorithmic progress with compute efficiency, i.e. the reduction in cost required to reach a given capability score.
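As a toy illustration of this definition (the cost numbers below are invented, not measurements from this post), compute efficiency can be annualized from the cost of reaching a fixed capability score at two points in time:

```python
# Sketch: annualized compute-efficiency gain, i.e. how much cheaper it
# got per year to reach the same capability score.
def efficiency_multiplier(cost_then: float, cost_now: float, years: float) -> float:
    return (cost_then / cost_now) ** (1 / years)

# Invented example: a fixed benchmark score cost 64 compute-units two
# years ago and 4 units today -> a 4X/year efficiency gain.
print(efficiency_multiplier(64, 4, 2))  # 4.0
```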
AI researchers have been making a very consistent series of discoveries, typically estimated at increasing compute efficiency by around 4X/year (with many qualifications, discussed below).
The 3-4X/year increase in algorithmic efficiency seems to be coming from a roughly 2X/year increase in researchers.
AI speedups will create a feedback loop, but it's unclear how strong it will be.

AI systems are now contributing to algorithmic progress.
: Until recently most AI R&D was done without help from LLMs, but we now see evidence for two channels:
1. _Augmenting AI researchers:_ AI researchers self-report big efficiency gains; e.g. @anthropic2025claude_work reports approximately 50% productivity gains.
2. _Automating AI research:_ e.g. AlphaEvolve, TTT-Discover, autoresearch.
Both of these effects are hard to measure, and we have a great deal of uncertainty.
# Summary (OLD)
1. **Baseline model:**
   - Frontier model capability is growing at 9X/year (measured in effective compute)
   - Frontier training compute is growing at 3X/year
   - Algorithmic efficiency is growing at 3X/year
   - R&D staff is growing at 2X/year.
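The baseline figures compose multiplicatively: 3X/year compute times 3X/year algorithmic efficiency gives the 9X/year effective-compute figure. A one-line sketch of that arithmetic:

```python
# Baseline-model arithmetic: effective compute grows as the product of
# physical compute growth and algorithmic-efficiency growth.
compute_growth = 3.0  # frontier training compute, X/year
algo_growth = 3.0     # algorithmic efficiency, X/year

effective_compute_growth = compute_growth * algo_growth
print(effective_compute_growth)  # 9.0, matching the 9X/year capability figure
```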
2. **Two ways we can get RSI:** (1) augmentation of AI R&D; (2) automation of AI R&D.
# Data on Compute Growth
Best estimates.

: 1. Training compute expenditure has been growing around 3X/year, but will gradually fall to 1.1X/year over 2026-2030.
  2. Training compute (FLOP) has been growing around 4X/year, but will fall to around 1.5X/year.
  3. Algorithmic efficiency has been growing around 3X/year; it is hard to forecast future trends.
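To get a feel for what the expenditure slowdown implies cumulatively, here is a sketch that interpolates the annual growth factor linearly from 3X down to 1.1X over 2026-2030. The linear glide path is my assumption; the post only gives the two endpoints.

```python
import math

# Assumed linear glide path for the annual expenditure-growth factor,
# from 3X/year in 2026 down to 1.1X/year by 2030.
years = range(2026, 2031)
factors = [3.0 + (1.1 - 3.0) * i / 4 for i in range(5)]

cumulative = math.prod(factors)
for y, f in zip(years, factors):
    print(f"{y}: {f:.2f}X/year")
print(f"cumulative 2026-2030 multiplier: ~{cumulative:.0f}X")
```

Under this assumption the five-year multiplier comes out around 27X, versus roughly 243X if 3X/year growth were sustained.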
The interactive graph currently shows about 4.6X/year growth in FLOPs of notable models over 2020 to July 2025 (the latest datapoint).
@you2025openaicomputespend "Most of OpenAI's 2024 compute went to experiments"
:
- only a minority of R&D compute appears to have gone to the final training runs of released models
- GPT-4.5 final training run was only a modest share of the total R&D bucket
# Data on Algorithmic Progress
Best estimates.

: - Around 4X/year, including the entire stack (GPU, pretraining, posttraining, elicitation).
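For readers who prefer doubling-time framings: an annual growth factor g corresponds to a doubling time of 12·ln(2)/ln(g) months. The conversion is standard arithmetic; only the sample inputs are chosen here:

```python
import math

# Convert an annual growth factor into a doubling time in months.
def doubling_time_months(growth_per_year: float) -> float:
    return 12 * math.log(2) / math.log(growth_per_year)

print(round(doubling_time_months(4.0), 1))  # 6.0 -- 4X/year doubles every 6 months
print(round(doubling_time_months(3.0), 1))  # 7.6
```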