
Commit 7079195 ("apple picking")
1 parent f030251

4 files changed

Lines changed: 112 additions & 9 deletions

File tree

_freeze/posts/2025-09-13-recursive-self-improvement-explosion-optimization/execute-results/html.json

Lines changed: 2 additions & 2 deletions

docs/posts/2025-09-13-recursive-self-improvement-explosion-optimization.html

Lines changed: 8 additions & 7 deletions
@@ -409,15 +409,16 @@ <h1>Apple-Picking Model / Low Hanging</h1>
 <li><p>LLMs are suddenly able to optimize algorithms pretty well — maybe recursive-self-improvement has just kicked off.</p></li>
 <li><p>But the critical question is whether the fruit that it’s picking are low-hanging. If all the optimizations are routine, then it’ll run out pretty quickly.</p></li>
 </ol>
-<section id="apple-picking-model" class="level2">
-<h2 class="anchored" data-anchor-id="apple-picking-model">Apple-Picking Model</h2>
+<section id="bernoulli-apple-picking-model" class="level2">
+<h2 class="anchored" data-anchor-id="bernoulli-apple-picking-model">Bernoulli Apple-Picking Model</h2>
 <p><strong>Motivation:</strong></p>
 <ul>
-<li>Agents are starting to autonomously optimize real-world functions, beating the human state-of-the-art.</li>
-<li>In most models of RSI, agents only automate subtasks in R&amp;D, but now they seem to be doing the whole thing autonomously.</li>
-<li>Intuitively they’re only doing <em>shallow</em> optimizations, not deep optimizations.</li>
-<li>We want a model so that we can define what it means to only do shallow optimizations, and test whether agents obey these predictions.</li>
-<li>Testable predictions: (A) agents asymptote to a lower value than humans; (B) agents are relatively more useful for problems which have had less human time on them; (C) agents will asymptote to higher values if they are given better human starting points.</li>
+<li><p>Agents are showing the ability to autonomously optimize AI training, beating the human state-of-the-art.</p></li>
+<li><p>Where does this end? If you can pay some tokens to improve the training function by <span class="math inline">\(x\)</span>% better, then ….</p></li>
+<li><p>In most models of RSI, agents only automate subtasks in R&amp;D, but now they seem to be doing the whole thing autonomously.</p></li>
+<li><p>Intuitively they’re only doing <em>shallow</em> optimizations, not deep optimizations.</p></li>
+<li><p>We want a model so that we can define what it means to only do shallow optimizations, and test whether agents obey these predictions.</p></li>
+<li><p>Testable predictions: (A) agents asymptote to a lower value than humans; (B) agents are relatively more useful for problems which have had less human time on them; (C) agents will asymptote to higher values if they are given better human starting points.</p></li>
 </ul>
 <p><strong>Setup:</strong></p>
 <ul>
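The post's actual setup is cut off in this diff, so the following is only an illustrative sketch of what a "Bernoulli apple-picking" dynamic could look like. Every mechanic here is an assumption, not the post's model: each candidate optimization ("apple") gets a depth in [0, 1], and a picker with a given `reach` can only ever pick apples at or above that depth. Under those assumptions, a shallow-reach agent asymptotes to a lower total than a deep-reach human (prediction A), and gains more on a problem with less prior human picking (prediction B).

```python
import random

def simulate(num_apples=500, reach=0.3, steps=20000, already_picked=0.0, seed=0):
    """Toy Bernoulli apple-picking sketch (hypothetical mechanics, not the
    post's model). Each apple has a depth in [0, 1]; a picker with the given
    `reach` can only succeed on apples with depth <= reach. `already_picked`
    removes that fraction of the shallowest apples first, standing in for
    prior human effort on the problem. Returns the number of apples the
    picker gains."""
    rng = random.Random(seed)
    depths = [rng.random() for _ in range(num_apples)]

    # Mark the shallowest fraction as already picked (prior human effort).
    picked = set(sorted(range(num_apples), key=lambda i: depths[i])
                 [: int(already_picked * num_apples)])

    gained = 0
    for _ in range(steps):
        i = rng.randrange(num_apples)          # try a random apple
        if i not in picked and depths[i] <= reach:
            picked.add(i)                      # shallow enough: pick it
            gained += 1
    return gained
```

With enough steps, `gained` saturates at roughly the count of unpicked apples within reach, so `simulate(reach=0.3)` lands well below `simulate(reach=0.9)`, and `simulate(reach=0.3, already_picked=0.2)` lands below `simulate(reach=0.3)`.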
