
Commit ea8632f

committed
apple plucking
1 parent 0ca83c6 commit ea8632f

File tree

5 files changed (+21 −14 lines)


_freeze/posts/2026-03-13-apple-picking-ai/execute-results/html.json

Lines changed: 2 additions & 2 deletions

docs/posts/2026-03-13-apple-picking-ai.html

Lines changed: 3 additions & 4 deletions
@@ -274,12 +274,11 @@ <h1 class="title">An Apple-Plucking Model of Agents</h1>
274274
<div class="no-row-height column-margin column-container"><div class="">
275275
<p>Thanks to Nate Rush, Manish Shetty, Basil Halperin, &amp; Parker Whitfill for helpful comments.</p>
276276
</div></div><dl>
277-
<dt>An apple-picking model of AI work.</dt>
277+
<dt>A simple model for AI R&amp;D.</dt>
278278
<dd>
279-
<p>Here’s a simple model useful for thinking about AI’s contribution to solving problems.</p>
280-
<p>In short: an agent helping you with programming is like a robot helping you pick apples. It will take care of all the apples up to a certain height, and find apples you haven’t found, but there will still be apples out of its reach.</p>
279+
<p>In short: an agent helping you optimize an algorithm is like a robot helping you pick apples. It will take care of all the apples up to a certain height, and find apples you haven’t found, but there will still be apples out of its reach.</p>
281280
<p><img src="images/2026-03-12-11-32-34.png" class="img-fluid"></p>
282-
<p>The specific motivation was to help think through the implications of recent evidence that AI can push forward the frontier on various optimization and AI R&amp;D problems. If you can pay $10 to increase the efficiency of an algorithm by 1% then, on its surface, this look like the path to self-improvement, and you can replace humans with AI agents. But realistically the 1% improvement is a <em>shallow</em> improvement, and the apple-picking model tries to distinguish between shallow and deep improvements.</p>
281+
<p>The specific motivation was to help think through the implications of recent evidence that AI can push forward the frontier on various optimization and AI R&amp;D problems. If you can pay $10 to increase the efficiency of an algorithm by 1% then, on its surface, this looks like the path to self-improvement, and you can replace humans with AI agents. But realistically the agents have been discovering <em>shallow</em> improvements to algorithms, and the apple-picking model tries to distinguish between shallow and deep improvements.</p>
283282
</dd>
284283
<dt>Implications of the apple-picking model.</dt>
285284
<dd>

docs/search.json

Lines changed: 7 additions & 0 deletions
@@ -236,5 +236,12 @@
236236
"title": "2026-02-10 | model of labs for tkwa",
237237
"section": "",
238238
"text": "1 Reuters: CFO says annualized revenue crossed $20B in late 2025. (link)\n2 TechCrunch: leaked docs; $8.65B inference payments to Azure in first 9 months of 2025. (link)\n3 Epoch AI: 2024 compute breakdown — ~$7B total (~$5B R&D, ~$2B inference); most R&D was experiments, not final training. (data insight, substack)\n5 Fortune: internal docs shared with investors; $22B spending vs $13B sales for full-year 2025; $9B net loss. (link)\n[6] The Information (via Reuters/ainvest): H1 2025 financials — $4.3B revenue, $6.7B R&D, $2B S&M, $2.5B SBC, $2.5B cash burn. (Reuters, ainvest)\n7 WSJ: OpenAI paying employees more than any major tech startup — average SBC ~$1.5M/employee; ~4,000 headcount. (link)\n8 Reuters/NYT: data licensing deals — News Corp $250M/5yr, Reddit $60M, plus AP, FT, Axel Springer. (Reuters, NYT)\n9 CoinCodex: compute margin reached ~70% by Oct 2025 (up from 52% end-2024); overall gross margin ~48%. (link)"
239+
},
240+
{
241+
"objectID": "posts/2026-03-13-apple-picking-ai.html",
242+
"href": "posts/2026-03-13-apple-picking-ai.html",
243+
"title": "An Apple-Plucking Model of Agents",
244+
"section": "",
245+
"text": "Thanks to Nate Rush, Manish Shetty, Basil Halperin, & Parker Whitfill for helpful comments.\n\nA simple model for AI R&D.\n\nIn short: an agent helping you with optimizing an algorithm is like a robot helping you pick apples. It will take care of all the apples up to a certain height, and find apples you haven’t found, but there will still be apples out of its reach.\n\nThe specific motivation was to help think through the implications of recent evidence that AI can push forward the frontier on various optimization and AI R&D problems. If you can pay $10 to increase the efficiency of an algorithm by 1% then, on its surface, this look like the path to self-improvement, and you can replace humans with AI agents. But realistically the agents have been discovering shalow improvements to algorithms, and the apple-picking model tries to distinguish between shallow and deep improvements.\n\nImplications of the apple-picking model.\n\n\nAgents can autonomously advance the state-of-the-art on an optimization problem, yet still not be a perfect substitute for human labor (they can find the low apples that haven’t been picked yet).\nAgent contribution to a problem, as you scale the expenditure, will be higher than human contribution, but then asymptote to a lower maximum value (they’ll pick a lot of apples, but will never be able to pick them all).\nAgents will have relatively bigger value, relative to humans, for problems that are not yet heavily optimized (robots are useful for trees that have never been picked).\nAgents will asymptote to higher points if they are given better human starting points (if a tree is partly picked).\nTo gauge the ability of agents we want to test not just for their ability to improve performance, but their reach (we want to benchmark robots not on how many apples they can pick, but the height of the highest apple they can reach).\n\n\nRelation to other literature.\n\nMost existing models of AI R&D assume that AI accelerates or multiples 
human reserachers, e.g. by automating some of their tasks. I believe this is roughly true for Aghion, Jones, and Jones (2019), Davidson (2021), Erdil et al. (2025), Davidson et al. (2026), Jones (2025), Kwa (2026). These models then calibrate the effect through (1) how much does AI accelerate R&D workers; (2) how much do R&D workers contribute to our stock of knowledge. But this is hard to reconcile with evidence that AI is already autonomously speeding-up AI research. A critical distinction is that these models all summarize the level of productivity with a scalar. This means that a 1% improvement in efficiency is equal, there is no distinction between a shallow and deep speedup.\nThere are two models I’m aware of that allow autonomous AI actions, but distinguish the quality of those actions.\n\nKokotajlo et al. (2025) model AI and human “research taste”, though I don’t have a clear idea exactly how taste is aggregated within a lab.\nIde and Talamas (2024) model AI with different levels of human ability.\n\n\nOther notes / things to add.\n\nRelation to time horizon. You can think of the high apples as long-time-horizon tasks.\nLandscape. A more general version would model the entire landscape. You can represent an optimization problem as \\(y=f(\\bm{x})\\), where you’re trying to choose an \\(\\bm{x}\\) to maximize \\(y\\), given some unknown \\(f(\\cdot)\\). (talk about non-additivity of optimizations; talk about conditions under which landscape is separable, and so each subspace is an independent apple).\nPath dependence.\nShape of the tree. (examples of trees with naturally more low-hanging fruit; examples of trees that are now pretty bare low-down).\nOther bottlenecks. – e.g. compute bottlenecks, experiment bottlenecks.\nExisting models of AI R&D.\nImplications for AI R&D.\n\nLLMs are suddenly able to optimize algorithms pretty well — maybe recursive-self-improvement has just kicked off. 
But the critical question is whether the fruit it’s picking is low-hanging. If all the optimizations are routine, then it’ll run out pretty quickly.\nLLM training is a big stack of algorithms, which we’ve been optimizing at perhaps 10X/year. [add some speculation about which parts of the stack have low-hanging fruit]\n\n\n\n\nBasic Model\n\nSetup.\n\nThere is a continuum of apples spread uniformly on the real line.\nA human can find apples over \\([0,1]\\), but an agent can only find apples over \\([0,\\lambda]\\), with \\(\\lambda&lt;1\\) (at least for now).\nHumans find apples at rate \\(r_H\\), agents find apples at rate \\(r_A\\), and we use \\(t_H\\) and \\(t_A\\) to represent the time humans and agents spend searching (you can also interpret \\(t_H\\) and \\(t_A\\) as expenditure on the problem).\n\n\nWe can then derive the share of apples found:\n\\[\\text{share apples found}= \\underbrace{\\lambda(1-e^{-r_Ht_H-r_At_A})}_{\\text{apples from bottom of tree}}+\\underbrace{(1-\\lambda)(1-e^{-r_Ht_H})}_{\\text{apples from top of tree}}.\\]\n\nImplication: agents asymptote to a lower level than humans.\n\nHere we illustrate agent-only and human-only search curves: the agent curve rises more quickly (\\(r_A&gt;r_H\\)), but asymptotes to a lower level (\\(\\lambda&lt;1\\)).\n\n\n\n\n\n\n\n\n\nThe shape of these curves is a good match for what we see across tasks, e.g. in METR (2024). 
For most tasks either (1) an agent can do it much cheaper than a human; or (2) an agent can’t do it at all.\n\n\nImplication: agents can improve on human SoTA, but only by a limited amount.\n\nWe can see that agents can improve on the humans’ SoTA performance, but in each case the value asymptotes.\n\n\n\n\n\n\n\n\n\n\nImplication: agent asymptote depends on the starting point.\n\nThe plot below shows a variety of agent trajectories, each starting after a different amount of human work.\nYou could interpret this as starting an agent at different points in the history of optimizing some algorithm, e.g. nanoGPT.\nThe model implies that if you start an agent from the original unoptimized version of an algorithm it will quickly make high gains, but asymptote to a value well below the human state-of-the-art.\nIf you start an agent after a small amount of human optimization, it will have smaller initial value (some of the apples have already been picked), but it will be able to achieve a higher asymptote.\n\n\n\n\n\n\n\n\n\n\n\n\n\nClosed Apple-Picking Model\n\nNow let the robot’s height depend on apples harvested.\n\nThe previous model applied to agents working on an arbitrary problem. Now we focus on agents working on AI R&D. We make two changes:\n\nWe assume that the agent’s ability (\\(\\lambda\\)) is itself a function of AI R&D progress (the robot is eating the apples and getting taller). It turns out that we can get a simple closed-form solution when this function is linear. To add a touch of realism we assume that agents have no meaningful ability until algorithmic progress passes some minimum threshold (\\(\\bar{a}\\)).\nWe assume that agents pick all the apples available to them each period. 
This makes things easier to model (the state of the tree can be summarized with just two variables, \\(\\lambda\\) and \\(a\\)), but it also seems a reasonable assumption: AI research labs will keep spending money on agent-optimizing their algorithms until they hit low returns.\n\n\nImplications:\n\n\nAgents will get taller than humans iff \\(\\alpha + \\beta(1-\\bar{a}) &gt; 1\\)\nAgent height will be explosive iff \\(\\beta &gt;1\\), i.e. if eating all the apples in a 1-cm slice of tree causes you to grow 1cm higher. If not then you converge to a finite height \\(\\lambda^*\\).\n\n\n\n\n1. State variables and dynamics\nNormalize human reach to 1.\n\n\\(\\lambda_t \\ge 0\\): agent reach (how high the AI can pick).\n\\(h_t \\in [0,1]\\): human coverage of the human-only band \\((\\lambda_t, 1]\\) (fraction of that band already picked by humans).\n\nHuman dynamics (one parameter \\(p \\in (0,1)\\)): per period, a fraction \\(1-p\\) of the remaining human-level band gets picked, so \\[h_{t+1} = 1 - p(1-h_t), \\qquad h_0 = 0.\\] (Equivalently \\(h_t = 1 - p^t\\); the recursion keeps the model closed and autonomous.)\nApples harvested (agent + humans, with clipping at 1): \\[a_t = \\lambda_t + (1-\\lambda_t)_+ \\, h_t, \\qquad (x)_+ \\equiv \\max\\{x,0\\}.\\] Agent gets everything up to \\(\\lambda_t\\); humans only contribute on the band of length \\((1-\\lambda_t)_+\\), of which fraction \\(h_t\\) is covered by time \\(t\\).\nSelf-improvement (activation threshold \\(\\bar{a}\\), then affine in \\(a_t\\)): \\[\\lambda_{t+1} = \\begin{cases} 0, & a_t &lt; \\bar{a} \\\\ \\alpha + \\beta(a_t - \\bar{a}), & a_t \\ge \\bar{a}. \\end{cases}\\]\nParameters: \\(p\\) (human speed), \\(\\bar{a}\\) (minimum progress to “turn on” the agent), \\(\\alpha\\) (baseline capability once on), \\(\\beta\\) (strength of recursive improvement). Initial condition \\(\\lambda_0\\) (typically 0). Four parameters plus \\(\\lambda_0\\).\n\n\n\n2. Crisp conditions\nA) Activation. 
With \\(\\lambda_t = 0\\), \\(a_t = h_t \\to 1\\). So the agent can ever turn on iff \\(\\bar{a} &lt; 1\\). If \\(\\bar{a} \\ge 1\\), \\(\\lambda_t \\equiv 0\\) forever. Activation-time approximation: \\(h_t = 1 - p^t \\ge \\bar{a}\\) \\(\\Leftrightarrow\\) \\(t \\ge \\ln(1-\\bar{a})/\\ln p\\); \\(p\\) mainly shifts when activation happens.\nB) Crossing human level. As \\(t \\to \\infty\\), \\(h_t \\to 1\\). If \\(\\lambda_t &lt; 1\\), \\(a_t \\to 1\\); if \\(\\lambda_t \\ge 1\\), \\(a_t = \\lambda_t\\). So asymptotically \\(\\lambda_{t+1} \\to f(1)\\) with \\(f(1) = 0\\) if \\(1 &lt; \\bar{a}\\), and \\(f(1) = \\alpha + \\beta(1-\\bar{a})\\) if \\(1 \\ge \\bar{a}\\). So takeoff past human level (eventually \\(\\lambda &gt; 1\\)) iff \\[\\boxed{\\alpha + \\beta(1-\\bar{a}) &gt; 1.}\\] Interpretation: “If the orchard were fully human-level (\\(a=1\\)), would the next agent be at least human-level?” If not, the system stays below 1. This condition is essentially independent of \\(p\\) (timing, not whether).\nC) Above human level: runaway vs saturation. For \\(\\lambda_t \\ge 1\\), \\(a_t = \\lambda_t\\) and \\[\\lambda_{t+1} = \\alpha + \\beta(\\lambda_t - \\bar{a}).\\]\n\nRunaway / hard takeoff iff \\(\\boxed{\\beta &gt; 1}\\) (roughly geometric growth in \\(\\lambda_t\\)).\nSoft takeoff / saturation iff \\(\\boxed{\\beta &lt; 1}\\): convergence to \\[\\lambda^* = \\frac{\\alpha - \\beta\\bar{a}}{1-\\beta}\\] (provided the system crosses 1 first).\nKnife-edge \\(\\beta = 1\\): linear growth.\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nReferences\n\nAghion, Philippe, Benjamin F. Jones, and Charles I. Jones. 2019. “Artificial Intelligence and Economic Growth.” In The Economics of Artificial Intelligence: An Agenda, edited by Ajay Agrawal, Joshua Gans, and Avi Goldfarb, 237–90. Chicago: University of Chicago Press. https://doi.org/10.7208/9780226613475-011.\n\n\nDavidson, Tom. 2021. “Could Advanced AI Drive Explosive Economic Growth.” Open Philanthropy 25. 
https://www.semanticscholar.org/search?q=Could%20advanced%20AI%20drive%20explosive%20economic%20growth.\n\n\nDavidson, Tom, Basil Halperin, Thomas Houlden, and Anton Korinek. 2026. “When Does Automating AI Research Produce Explosive Growth?” https://www.basilhalperin.com/papers/shs.pdf.\n\n\nErdil, Ege, Andrei Potlogea, Tamay Besiroglu, Edu Roldan, Anson Ho, Jaime Sevilla, Matthew Barnett, Matej Vrzla, and Robert Sandler. 2025. “GATE: An Integrated Assessment Model for AI Automation.” arXiv Preprint arXiv:2503.04941. https://arxiv.org/pdf/2503.04941.pdf.\n\n\nIde, Enrique, and Eduard Talamas. 2024. “Artificial Intelligence in the Knowledge Economy.” https://doi.org/10.1086/737233.\n\n\nJones, Benjamin F. 2025. “Artificial Intelligence in Research and Development.” NBER Working Paper 34312. National Bureau of Economic Research. https://doi.org/10.3386/w34312.\n\n\nKokotajlo, Daniel, Eli Lifland, Brendan Halstead, and Alex Kastner. 2025. “AI Futures Model: Dec 2025 Update.” https://blog.ai-futures.org/p/ai-futures-model-dec-2025-update.\n\n\nKwa, Thomas. 2026. “Research Note: A Simpler AI Timelines Model Predicts 99% AI r&d Automation in ~2032.” https://www.lesswrong.com/posts/uy6B5rEPvcwi55cBK/research-note-a-simpler-ai-timelines-model-predicts-99-ai-r.\n\n\nMETR. 2024. “An Update on Our General Capability Evaluations.” https://metr.org/blog/2024-08-06-update-on-evaluations/."
239246
}
240247
]
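The two models described in the post text above (the basic share-of-apples formula and the closed model with self-improvement) can be sketched numerically. The following is a minimal Python sketch, not part of the commit; all function names and parameter values are illustrative assumptions.

```python
import math

# Basic model: share of apples found after humans search for time t_h (rate r_h)
# and agents for time t_a (rate r_a), where agents can only reach height lam < 1.
# Default parameter values are illustrative, not from the post.
def share_found(t_h, t_a, r_h=1.0, r_a=5.0, lam=0.6):
    bottom = lam * (1 - math.exp(-(r_h * t_h + r_a * t_a)))  # both can reach
    top = (1 - lam) * (1 - math.exp(-r_h * t_h))             # humans only
    return bottom + top

# Closed model: agent reach lambda_t feeds back on apples harvested a_t.
#   h_{t+1}      = 1 - p * (1 - h_t)                         (human coverage)
#   a_t          = lambda_t + max(1 - lambda_t, 0) * h_t     (apples harvested)
#   lambda_{t+1} = alpha + beta * (a_t - a_bar) if a_t >= a_bar, else 0
def simulate_reach(p, a_bar, alpha, beta, steps=200):
    lam, h = 0.0, 0.0
    for _ in range(steps):
        a = lam + max(1.0 - lam, 0.0) * h
        lam = alpha + beta * (a - a_bar) if a >= a_bar else 0.0
        h = 1.0 - p * (1.0 - h)
    return lam

if __name__ == "__main__":
    # Agent-only search asymptotes to lam; human-only search asymptotes to 1.
    print(round(share_found(0.0, 100.0), 3))   # -> 0.6
    print(round(share_found(100.0, 0.0), 3))   # -> 1.0
    # alpha + beta*(1 - a_bar) = 0.5 + 0.8*0.8 = 1.14 > 1: crosses human level;
    # beta < 1: saturates at lambda* = (alpha - beta*a_bar)/(1 - beta) = 1.7.
    print(round(simulate_reach(p=0.5, a_bar=0.2, alpha=0.5, beta=0.8), 3))  # -> 1.7
    # alpha + beta*(1 - a_bar) = 0.3 + 0.5*0.8 = 0.7 < 1: stays below human level.
    print(round(simulate_reach(p=0.5, a_bar=0.2, alpha=0.3, beta=0.5), 3))  # -> 0.7
```

With these illustrative parameters the simulation reproduces the post's crisp conditions: takeoff past human reach iff α + β(1 − ā) > 1, and saturation at λ* = (α − βā)/(1 − β) when β < 1.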

docs/sitemap.xml

Lines changed: 4 additions & 0 deletions
@@ -28,4 +28,8 @@
2828
<loc>tecunningham.github.io/posts/2026-02-10-model-of-labs.html</loc>
2929
<lastmod>2026-03-10T17:19:54.348Z</lastmod>
3030
</url>
31+
<url>
32+
<loc>tecunningham.github.io/posts/2026-03-13-apple-picking-ai.html</loc>
33+
<lastmod>2026-03-13T16:36:38.188Z</lastmod>
34+
</url>
3135
</urlset>
