<p>Thanks to Nate Rush, Manish Shetty, Basil Halperin, & Parker Whitfill for helpful comments.</p>
</div></div><dl>
<dt>An apple-picking model of AI work.</dt>
<dd>
<p>Here’s a simple model useful for thinking about AI’s contribution to solving problems.</p>
<p>In short: an agent helping you with programming is like a robot helping you pick apples. It will take care of all the apples up to a certain height, and find apples you haven’t found, but there will still be apples out of its reach.</p>
<p>The specific motivation was to help think through the implications of recent evidence that AI can push forward the frontier on various optimization and AI R&D problems. If you can pay $10 to increase the efficiency of an algorithm by 1% then, on its surface, this looks like the path to self-improvement and to replacing humans with AI agents. But realistically the 1% improvement is a <em>shallow</em> improvement, and the apple-picking model tries to distinguish between shallow and deep improvements.</p>
</dd>
<dt>Implications of the apple-picking model.</dt>
<dd>
</div>
</div>
</dd>
<dt>Implication: agent asymptote depends on the starting point.</dt>
<dd>
<p>The plot below shows a variety of agent trajectories, each starting after a different amount of human work.</p>
<p>You could interpret this as starting an agent at different points in the history of optimizing some algorithm, e.g. nanoGPT.</p>
<p>The model implies that if you start an agent from the original, unoptimized version of an algorithm, it will quickly make large gains but asymptote to a value well below the human state of the art.</p>
360
+
<p>If you start an agent after a small amount of human optimization, it will have smaller initial value (some of the apples have already been picked), but it will be able to achieve a higher asymptote.</p>
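<p>To make the starting-point effect concrete, here is a minimal Monte Carlo sketch of these trajectories. The search process, rates, and reach value are illustrative assumptions, not the post’s exact specification: apples have uniform heights, humans can reach every apple, and the agent searches faster but only below a fixed reach.</p>

```python
import random

random.seed(0)

# Illustrative parameters, not the post's calibration.
N = 10_000          # apples with heights uniform on [0, 1]; human reach = 1
AGENT_REACH = 0.6   # agent reach (lambda), below human reach
HUMAN_RATE = 0.05   # per-period chance a human finds any given unpicked apple
AGENT_RATE = 0.30   # agents search faster, but only below their reach

def run(human_periods, agent_periods):
    heights = [random.random() for _ in range(N)]
    picked = [False] * N
    # Humans search first; every apple is within their reach.
    for _ in range(human_periods):
        for i in range(N):
            if not picked[i] and random.random() < HUMAN_RATE:
                picked[i] = True
    # The agent then picks quickly, but only below AGENT_REACH.
    for _ in range(agent_periods):
        for i in range(N):
            if not picked[i] and heights[i] <= AGENT_REACH and random.random() < AGENT_RATE:
                picked[i] = True
    return sum(picked) / N  # fraction of apples picked = solution quality

v_scratch = run(human_periods=0, agent_periods=50)     # agent alone
v_headstart = run(human_periods=20, agent_periods=50)  # human head start first
```

<p>In this sketch the from-scratch agent’s asymptote is capped near its reach, while a human head start leaves some apples above the agent’s reach already picked, so the combined asymptote is higher.</p>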
<dt>Now let the robot’s height depend on apples harvested.</dt>
<dd>
<p>The previous model applied to agents working on an arbitrary problem. Now we focus on agents working on AI R&D. We make two changes:</p>
<ol type="1">
<li>We assume that the agent’s ability (<span class="math inline">\(\lambda\)</span>) is itself a function of AI R&D progress (the robot is eating the apples and getting taller). It turns out that we can get a simple closed-form solution when this function is linear. To add a touch of realism, we assume that agents have no meaningful ability until algorithmic progress passes some minimum threshold (<span class="math inline">\(\bar{a}\)</span>).</li>
<li>We assume that agents pick <em>all</em> the apples available to them each period. This makes things easier to model (the state of the tree can be summarized with just two variables, <span class="math inline">\(\lambda\)</span> and <span class="math inline">\(a\)</span>), but it also seems a reasonable assumption: AI research labs will keep spending money on agent-optimizing their algorithms until they hit low returns.</li>
</ol>
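<p>Under one concrete reading of these assumptions (the linear rule, unit apple density, and the starting point <span class="math inline">\(a_0 = 1\)</span> are illustrative choices, not the post’s exact calibration), reach follows a simple recursion: <span class="math inline">\(\lambda_t = \alpha + \beta(a_t - \bar{a})\)</span> once cumulative harvest passes <span class="math inline">\(\bar{a}\)</span>, and picking everything below <span class="math inline">\(\lambda_t\)</span> lifts cumulative harvest up to <span class="math inline">\(\lambda_t\)</span>:</p>

```python
def trajectory(alpha, beta, a_bar, a0=1.0, steps=60):
    """Agent reach over time, starting from the human frontier a0 = 1.

    Assumed reading: reach is alpha + beta * (a - a_bar) once cumulative
    harvest a passes the threshold a_bar, and apples have unit density,
    so picking everything below reach lifts a to that reach.
    """
    a = a0
    reaches = []
    for _ in range(steps):
        reach = alpha + beta * (a - a_bar) if a >= a_bar else 0.0
        a = max(a, reach)  # the agent strips every apple below its reach
        reaches.append(reach)
    return reaches

# Stalled: alpha + beta * (1 - a_bar) = 0.98 < 1, so the agent never
# reaches past the human frontier and its height stays flat.
stall = trajectory(alpha=0.5, beta=0.8, a_bar=0.4)

# Convergent growth: the threshold condition holds but beta < 1, so reach
# rises to the finite fixed point (alpha - beta * a_bar) / (1 - beta) = 1.4.
conv = trajectory(alpha=0.6, beta=0.8, a_bar=0.4)

# Explosive: beta > 1, so each harvested slice raises reach by more than
# its own height, and reach grows without bound.
boom = trajectory(alpha=0.5, beta=1.2, a_bar=0.4)
```

<p>The two closed-form conditions drop out directly from the recursion: the agent passes human reach exactly when <span class="math inline">\(\alpha + \beta(1-\bar{a}) > 1\)</span>, and growth is explosive exactly when <span class="math inline">\(\beta > 1\)</span>.</p>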
</dd>
<dt>Implications:</dt>
<dd>
<ol type="1">
<li>Agents will get taller than humans iff <span class="math inline">\(\alpha + \beta(1-\bar{a}) > 1\)</span>.</li>
<li>Agent height will be explosive iff <span class="math inline">\(\beta > 1\)</span>, i.e. if eating all the apples in a 1 cm slice of the tree makes you grow by more than 1 cm. If not, then you converge to a finite height <span class="math inline">\(\lambda^*\)</span>.</li>