Commit 93c3df9: "apple"

1 parent 3296c54 commit 93c3df9

File tree

6 files changed

+48
-33
lines changed


_freeze/posts/2026-03-13-apple-picking-ai/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.
Binary file not shown.
-1.22 KB

docs/posts/2026-03-13-apple-picking-ai.html

Lines changed: 21 additions & 14 deletions
Original file line number · Diff line number · Diff line change
@@ -272,14 +272,14 @@ <h1 class="title">An Apple-Plucking Model of Agents</h1>
272272
</style>
273273

274274
<div class="no-row-height column-margin column-container"><div class="">
275-
<p>Thanks to Nate Rush, Manish Shetty, Basil Halperin for helpful comments.</p>
275+
<p>Thanks to Nate Rush, Manish Shetty, Basil Halperin, &amp; Parker Whitfill for helpful comments.</p>
276276
</div></div><dl>
277277
<dt>An apple-picking model of AI work.</dt>
278278
<dd>
279279
<p>Here’s a simple model useful for thinking about AI’s contribution to solving problems.</p>
280280
<p>In short: an agent helping you with programming is like a robot helping you pick apples. It will take care of all the apples up to a certain height, and find apples you haven’t found, but there will still be apples out of its reach.</p>
281281
<p><img src="images/2026-03-12-11-32-34.png" class="img-fluid"></p>
282-
<p>The specific motivation was to help think through the implications of recent evidence that AI can push forward the frontier on various optimization and AI R&amp;D problems. If you can pay $10 to increase the efficiency of an algorithm by 1% then, on its surface, this looks like the path to self-improvement, and you can replace humans with AI agents. But realistically the 1% improvement is a <em>shallow</em> improvement, and the apple tree model tries to distinguish between shallow and deep improvements.</p>
282+
<p>The specific motivation was to help think through the implications of recent evidence that AI can push forward the frontier on various optimization and AI R&amp;D problems. If you can pay $10 to increase the efficiency of an algorithm by 1% then, on its surface, this looks like the path to self-improvement, and you can replace humans with AI agents. But realistically the 1% improvement is a <em>shallow</em> improvement, and the apple-picking model tries to distinguish between shallow and deep improvements.</p>
283283
</dd>
284284
<dt>Implications of the apple-picking model.</dt>
285285
<dd>
@@ -352,11 +352,12 @@ <h1>Basic Model</h1>
352352
</div>
353353
</div>
354354
</dd>
355-
<dt>Implication: agent value and asymptote depends on starting point.</dt>
355+
<dt>Implication: agent asymptote depends on the starting point.</dt>
356356
<dd>
357-
<p>The plot below shows agent trajectories, each starting after a different amount of human work.</p>
358-
<p>If you start an agent from scratch, it will have high marginal value, but a low asymptote.</p>
359-
<p>If you start an agent after some human optimization, it will have lower marginal value, but it will be able to achieve a higher asymptote (below, starting after 1 unit of human search raises the agent asymptote from 0.5 to 0.7).</p>
357+
<p>The plot below shows a variety of agent trajectories, each starting after a different amount of human work.</p>
358+
<p>You could interpret this as starting an agent at different points in the history of optimizing some algorithm, e.g.&nbsp;nanoGPT.</p>
359+
<p>The model implies that if you start an agent from the original unoptimized version of an algorithm it will quickly make high gains, but asymptote to a value well below the human state-of-the-art.</p>
360+
<p>If you start an agent after a small amount of human optimization, it will have smaller initial value (some of the apples have already been picked), but it will be able to achieve a higher asymptote.</p>
360361
<div class="cell">
361362
<div class="cell-output-display">
362363
<div>
@@ -368,19 +369,26 @@ <h1>Basic Model</h1>
368369
</div>
369370
</dd>
370371
</dl>
371-
<section id="closed-apple-picking-model" class="level2">
372-
<h2 class="anchored" data-anchor-id="closed-apple-picking-model">Closed Apple-Picking Model</h2>
373-
<p><strong>Setup:</strong></p>
372+
</section>
373+
<section id="closed-apple-picking-model" class="level1">
374+
<h1>Closed Apple-Picking Model</h1>
375+
<dl>
376+
<dt>Now let the robot’s height depend on apples harvested.</dt>
377+
<dd>
378+
<p>The previous model applied to agents working on an arbitrary problem. Now we focus on agents working on AI R&amp;D. We make two changes:</p>
374379
<ol type="1">
375-
<li>Apples sit at heights in <span class="math inline">\([0,\infty)\)</span>; human reach is normalized to 1. The agent has reach <span class="math inline">\(\lambda_t \ge 0\)</span> and picks everything below it.</li>
376-
<li>Humans pick in the band <span class="math inline">\((\lambda_t, 1]\)</span> at a rate governed by <span class="math inline">\(p\)</span>.</li>
377-
<li>Agent reach depends on cumulative apples harvested: they can only pick apples after some minimum threshold (<span class="math inline">\(\bar{a}\)</span>), and then linear in <span class="math inline">\(a\)</span>.</li>
380+
<li>We assume that the agent’s ability (<span class="math inline">\(\lambda\)</span>) is itself a function of AI R&amp;D progress (the robot is eating the apples and getting taller). It turns out that we can get a simple closed-form solution when this function is linear. To add a touch of realism we assume that agents have no meaningful ability until algorithmic progress passes some minimum threshold (<span class="math inline">\(\bar{a}\)</span>).</li>
381+
<li>We assume that agents pick <em>all</em> the apples available to them each period. This makes things easier to model (the state of the tree can be summarized with just two variables, <span class="math inline">\(\lambda\)</span> and <span class="math inline">\(a\)</span>), but it also seems a reasonable assumption: AI research labs will keep spending money on agent-optimizing their algorithms until they hit low returns.</li>
378382
</ol>
379-
<p><strong>Implications:</strong></p>
383+
</dd>
384+
<dt>Implications:</dt>
385+
<dd>
380386
<ol type="1">
381387
<li>Agents will get taller than humans iff <span class="math inline">\(\alpha + \beta(1-\bar{a}) &gt; 1\)</span></li>
382388
<li>Agent height will be explosive iff <span class="math inline">\(\beta &gt;1\)</span>, i.e.&nbsp;if eating all the apples in a 1-cm slice of tree causes you to grow 1cm higher. If not then you converge to a finite height <span class="math inline">\(\lambda^*\)</span>.</li>
383389
</ol>
390+
</dd>
391+
</dl>
384392
<section id="state-variables-and-dynamics" class="level3">
385393
<h3 class="anchored" data-anchor-id="state-variables-and-dynamics">1. State variables and dynamics</h3>
386394
<p>Normalize human reach to 1.</p>
@@ -417,7 +425,6 @@ <h3 class="anchored" data-anchor-id="crisp-conditions">2. Crisp conditions</h3>
417425

418426

419427

420-
</section>
421428
</section>
422429
</section>
423430

-1.22 KB

posts/2026-03-13-apple-picking-ai.qmd

Lines changed: 25 additions & 17 deletions
Original file line number · Diff line number · Diff line change
@@ -28,7 +28,7 @@ execute:
2828
<!-- https://tecunningham.github.io/posts/2026-03-13-apple-picking-ai.html -->
2929

3030
::: {.column-margin}
31-
Thanks to Nate Rush, Manish Shetty, Basil Halperin for helpful comments.
31+
Thanks to Nate Rush, Manish Shetty, Basil Halperin, & Parker Whitfill for helpful comments.
3232
:::
3333

3434

@@ -40,7 +40,7 @@ An apple-picking model of AI work.
4040

4141
![](images/2026-03-12-11-32-34.png)
4242

43-
The specific motivation was to help think through the implications of recent evidence that AI can push forward the frontier on various optimization and AI R&D problems. If you can pay $10 to increase the efficiency of an algorithm by 1% then, on its surface, this looks like the path to self-improvement, and you can replace humans with AI agents. But realistically the 1% improvement is a *shallow* improvement, and the apple tree model tries to distinguish between shallow and deep improvements.
43+
The specific motivation was to help think through the implications of recent evidence that AI can push forward the frontier on various optimization and AI R&D problems. If you can pay $10 to increase the efficiency of an algorithm by 1% then, on its surface, this looks like the path to self-improvement, and you can replace humans with AI agents. But realistically the 1% improvement is a *shallow* improvement, and the apple-picking model tries to distinguish between shallow and deep improvements.
4444

4545
Implications of the apple-picking model.
4646
:
@@ -164,18 +164,20 @@ Implication: agents can improve on human SoTA, but only by a limited amount.
164164
```
165165

166166

167-
Implication: agent value and asymptote depends on starting point.
167+
Implication: agent asymptote depends on the starting point.
168168
:
169-
The plot below shows agent trajectories, each starting after a different amount of human work.
169+
The plot below shows a variety of agent trajectories, each starting after a different amount of human work.
170+
171+
You could interpret this as starting an agent at different points in the history of optimizing some algorithm, e.g. nanoGPT.
170172

171-
If you start an agent from scratch, it will have high marginal value, but a low asymptote.
173+
The model implies that if you start an agent from the original unoptimized version of an algorithm it will quickly make high gains, but asymptote to a value well below the human state-of-the-art.
172174

173-
If you start an agent after some human optimization, it will have lower marginal value, but it will be able to achieve a higher asymptote (below, starting after 1 unit of human search raises the agent asymptote from 0.5 to 0.7).
175+
If you start an agent after a small amount of human optimization, it will have smaller initial value (some of the apples have already been picked), but it will be able to achieve a higher asymptote.
174176

175177
```{tikz}
176178
\begin{tikzpicture}[x=1.1cm, y=5cm]
177179
\def\rH{0.2} \def\rA{0.9} \def\lam{0.5}
178-
\draw[-] (0,0) -- (5.5,0) node[midway,below] {total search time}
180+
\draw[-] (0,0) -- (5.5,0) node[midway,below] {search expenditure}
179181
-- (5.5,1) -- (0,1) -- (0,0) node[midway,rotate=90,above] {share found};
180182
\node[above] at (2.75,1) {$r_H=0.2,\; r_A=0.9,\; \lambda=0.5$};
181183
\draw[teal, thick, domain=0:5.3, samples=120]
@@ -190,27 +192,33 @@ Implication: agent value and asymptote depends on starting point.
190192
plot (\x, {1 - (1-\lam)*exp(-3*\rH) - \lam*exp(-3*\rH - \rA*(\x-3))});
191193
\draw[orange, thick, loosely dashed, domain=4:5.3, samples=120]
192194
plot (\x, {1 - (1-\lam)*exp(-4*\rH) - \lam*exp(-4*\rH - \rA*(\x-4))});
193-
\node[teal, right] at (4.2, {1 - exp(-\rH*4.2) + 0.03}) {human-only};
194-
\node[orange, right] at (4.15, {1 - \lam*exp(-\rA*4.15) - (1-\lam) - 0.03}) {$t_H=0,1,2,3,4$};
195+
\node[teal, right] at (5.5, {1 - exp(-\rH*5)}) {human-only};
196+
%\node[orange, right] at (4.15, {1 - \lam*exp(-\rA*4.15) - (1-\lam) - 0.03}) {$t_H=0,1,2,3,4$};
195197
\end{tikzpicture}
196198
```
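The trajectories in the TikZ plot above follow a simple closed form. As a minimal Python sketch (assuming, as the plot does, that humans search the whole tree at rate $r_H$ for $t_H$ units, after which an agent searches only the reachable share $\lambda$ at rate $r_A$; the function names are illustrative, not from the post):

```python
import math

def share_found(t, t_H, r_H=0.2, r_A=0.9, lam=0.5):
    """Share of apples found at time t when humans search alone until
    t_H and an agent with reach lam takes over afterwards."""
    if t <= t_H:
        # human-only phase: exponential search over the whole tree
        return 1 - math.exp(-r_H * t)
    # after handoff: the unreachable share (1 - lam) stops improving;
    # the reachable share lam is searched at the faster agent rate r_A
    return (1 - (1 - lam) * math.exp(-r_H * t_H)
              - lam * math.exp(-r_H * t_H - r_A * (t - t_H)))

def agent_asymptote(t_H, r_H=0.2, lam=0.5):
    """Limit of share_found as t -> infinity: everything within the
    agent's reach eventually gets picked; apples above its reach stay
    at whatever level the human phase achieved."""
    return 1 - (1 - lam) * math.exp(-r_H * t_H)

# Starting the agent later lowers its initial marginal value
# but raises its asymptote, as in the plot (t_H = 0, 1, 2, 3, 4).
for t_H in [0, 1, 2, 3, 4]:
    print(t_H, round(agent_asymptote(t_H), 3))
```

With the plot's parameters, an agent started from scratch asymptotes at exactly $\lambda = 0.5$, and each additional unit of prior human search pushes the asymptote higher.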
197199

198200

199201

200202

201203

202-
## Closed Apple-Picking Model
204+
# Closed Apple-Picking Model
203205

204-
**Setup:**
206+
Now let the robot's height depend on apples harvested.
207+
:
208+
The previous model applied to agents working on an arbitrary problem. Now we focus on agents working on AI R&D. We make two changes:
205209

206-
1. Apples sit at heights in $[0,\infty)$; human reach is normalized to 1. The agent has reach $\lambda_t \ge 0$ and picks everything below it.
207-
2. Humans pick in the band $(\lambda_t, 1]$ at a rate governed by $p$.
208-
3. Agent reach depends on cumulative apples harvested: they can only pick apples after some minimum threshold ($\bar{a}$), and then linear in $a$.
210+
1. We assume that the agent's ability ($\lambda$) is itself a function of AI R&D progress (the robot is eating the apples and getting taller). It turns out that we can get a simple closed-form solution when this function is linear. To add a touch of realism we assume that agents have no meaningful ability until algorithmic progress passes some minimum threshold ($\bar{a}$).
211+
2. We assume that agents pick *all* the apples available to them each period. This makes things easier to model (the state of the tree can be summarized with just two variables, $\lambda$ and $a$), but it also seems a reasonable assumption: AI research labs will keep spending money on agent-optimizing their algorithms until they hit low returns.
212+
213+
Implications:
214+
:
215+
1. Agents will get taller than humans iff $\alpha + \beta(1-\bar{a}) > 1$
216+
2. Agent height will be explosive iff $\beta >1$, i.e. if eating all the apples in a 1-cm slice of tree causes you to grow 1cm higher. If not then you converge to a finite height $\lambda^*$.
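The two implications can be checked numerically. A sketch under one reading of the setup (these update rules are my assumptions, not spelled out in the post): apples are uniformly dense, so cumulative harvest $a$ equals the height cleared; humans have already cleared up to their reach ($a_0 = 1$); each period the agent's reach is $\lambda = \alpha + \beta(a - \bar{a})$ once $a > \bar{a}$, and it picks *all* apples below its reach:

```python
def simulate(alpha, beta, a_bar, a0=1.0, steps=100):
    """Closed apple-picking model under the reading that apples are
    uniformly dense, so cumulative harvest a equals height cleared.
    Each period: reach lam = alpha + beta*(a - a_bar) above the
    threshold a_bar (zero ability below it), and the agent picks
    everything below its reach, so a jumps to max(a, lam)."""
    a = a0
    path = [a]
    for _ in range(steps):
        lam = alpha + beta * (a - a_bar) if a > a_bar else 0.0
        a = max(a, lam)
        path.append(a)
    return path

# Agents exceed human reach (a0 = 1) on the first step
# iff alpha + beta*(1 - a_bar) > 1; here 0.5 + 0.8*0.8 = 1.14.
path = simulate(alpha=0.5, beta=0.8, a_bar=0.2)

# With beta < 1 the map is a contraction, so reach converges to the
# fixed point lam* = (alpha - beta*a_bar) / (1 - beta) = 1.7 here.
print(path[1], path[-1])
```

With $\beta > 1$ the same recursion diverges, matching implication 2; and if $\alpha + \beta(1-\bar{a}) \le 1$ the agent never clears above the human level, matching implication 1.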
209217

210-
**Implications:**
211218

212-
1. Agents will get taller than humans iff $\alpha + \beta(1-\bar{a}) > 1$
213-
2. Agent height will be explosive iff $\beta >1$, i.e. if eating all the apples in a 1-cm slice of tree causes you to grow 1cm higher. If not then you converge to a finite height $\lambda^*$.
219+
<!-- 1. Apples sit at heights in $[0,\infty)$; human reach is normalized to 1. The agent has reach $\lambda_t \ge 0$ and picks everything below it.
220+
2. Humans pick in the band $(\lambda_t, 1]$ at a rate governed by $p$.
221+
3. Agent reach depends on cumulative apples harvested: they can only pick apples after some minimum threshold ($\bar{a}$), and then linear in $a$. -->
214222

215223

216224
### 1. State variables and dynamics

0 commit comments