docs/posts/2025-09-13-recursive-self-improvement-explosion-optimization.html (8 additions, 7 deletions)
@@ -409,15 +409,16 @@ <h1>Apple-Picking Model / Low Hanging</h1>
 <li><p>LLMs are suddenly able to optimize algorithms pretty well — maybe recursive self-improvement has just kicked off.</p></li>
 <li><p>But the critical question is whether the fruit it’s picking is low-hanging. If all the optimizations are routine, then it’ll run out pretty quickly.</p></li>
-<li>Agents are starting to autonomously optimize real-world functions, beating the human state of the art.</li>
-<li>In most models of RSI, agents only automate subtasks in R&D, but now they seem to be doing the whole thing autonomously.</li>
-<li>Intuitively they’re only doing <em>shallow</em> optimizations, not deep optimizations.</li>
-<li>We want a model so that we can define what it means to only do shallow optimizations, and test whether agents obey these predictions.</li>
-<li>Testable predictions: (A) agents asymptote to a lower value than humans; (B) agents are relatively more useful for problems which have had less human time spent on them; (C) agents will asymptote to higher values if they are given better human starting points.</li>
+<li><p>Agents are showing the ability to autonomously optimize AI training, beating the human state of the art.</p></li>
+<li><p>Where does this end? If you can pay some tokens to make the training function <span class="math inline">\(x\)</span>% better, then ….</p></li>
+<li><p>In most models of RSI, agents only automate subtasks in R&D, but now they seem to be doing the whole thing autonomously.</p></li>
+<li><p>Intuitively they’re only doing <em>shallow</em> optimizations, not deep optimizations.</p></li>
+<li><p>We want a model so that we can define what it means to only do shallow optimizations, and test whether agents obey these predictions.</p></li>
+<li><p>Testable predictions: (A) agents asymptote to a lower value than humans; (B) agents are relatively more useful for problems which have had less human time spent on them; (C) agents will asymptote to higher values if they are given better human starting points.</p></li>
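The "shallow optimization" predictions at the end of the hunk can be illustrated with a toy numerical model. This is purely a hypothetical sketch (the function names, the bounded-gain assumption, and all numbers are my own, not from the post): suppose a shallow optimizer can only improve a given starting artifact by a bounded amount, with diminishing returns on each step, while deep optimization can reach some higher ceiling.

```python
# Toy model (hypothetical; the bounded-gain assumption and numbers are
# illustrative, not from the post): a "shallow" optimizer improves a
# starting artifact by at most `max_gain`, so its asymptote tracks the
# starting point rather than the true optimum.

def shallow_optimize(start, steps=100, max_gain=20.0, rate=0.3):
    """Each step closes a fraction `rate` of the remaining gap to the
    shallow ceiling (start + max_gain), giving diminishing returns and
    convergence to that ceiling."""
    ceiling = start + max_gain
    perf = start
    for _ in range(steps):
        perf += rate * (ceiling - perf)
    return perf

HUMAN_DEEP_CEILING = 100.0  # assumed value reachable via deep optimization

# Prediction (A): from a typical starting point, the shallow optimizer
# asymptotes below what deep (human) optimization can reach.
assert shallow_optimize(50.0) < HUMAN_DEEP_CEILING

# Prediction (C): a better human starting point lifts the asymptote,
# because the shallow ceiling is relative to the starting artifact.
assert shallow_optimize(70.0) > shallow_optimize(50.0)
```

Under these assumptions prediction (B) also follows qualitatively: a problem with little prior human effort starts far from its shallow ceiling, so the same bounded gain represents a larger relative improvement.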