docs/index.xml
+26 −6 (26 additions & 6 deletions)
@@ -31,7 +31,7 @@
<p>Many people are talking about how AI is autonomously able to contribute to frontier R&D, yet it’s only picking low-hanging fruit: <a href="https://x.com/karpathy/status/2031135152349524125">Andrej Karpathy</a>, <a href="https://www.theatlantic.com/technology/2026/02/ai-math-terrance-tao/686107/">Terence Tao</a>, <a href="https://www.interconnects.ai/p/lossy-self-improvement">Nathan Lambert</a>, <a href="https://www.lesswrong.com/posts/dKpC6wHFqDrGZwnah/ais-can-now-often-do-massive-easy-to-verify-swe-tasks-and-i">Ryan Greenblatt</a>.</p>
<p>In this note we describe a simple apple-picking model of AI R&D, to help measure the contribution that autonomous agents are making.</p>
<p>The model takes the metaphor literally: an AI agent helping you to optimize an algorithm is like a robot helping you pick apples. It will take care of all the apples up to a certain height, and it may find apples you haven’t found yet, but there will still be apples out of its reach.</p>
- <p>The model implies that agents can push the frontier forward, but the returns will sharply diminish. It also implies we can meaure an agent’s ability with a human-equivalent time horizon, e.g. as of March 2026 agents seem to be able to find optimizations on frontier algorithms worth about 1 week of professional human labor, yet those effects are not additive.<sup>1</sup> You can’t get two weeks labor by running an agent twice.</p>
+ <p>The model implies that agents can push the frontier forward, but the returns will sharply diminish. It also implies we can measure an agent’s ability with a human-equivalent time horizon, e.g. as of March 2026 agents seem to be able to find optimizations on frontier algorithms worth about 1 week of professional human labor, yet those effects are not additive.<sup>1</sup> You can’t get two weeks’ labor by running an agent twice.</p>
<p>The basic ideas can all be seen in the drawing below. Here the human and robot are both picking trees. The robot is cheaper to run, but it can only reach the low apples. In the illustration they have both picked 4 apples, yet left the tree in a very different state, such that the robot isn’t ready to replace the human yet:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
@@ -74,7 +74,7 @@
<p>The model sits between a few different literatures in economics, but as far as I can tell doesn’t already exist:</p>
<ul>
<li>Models of tasks or knowledge hierarchies.<sup>2</sup> The apple-picking model differs in that tasks are performed <em>cumulatively</em>, instead of a fresh set of tasks each period.</li>
- <li>Models of discovery.<sup>3</sup> The apple-picking models the discovery landscape in a different way (as far as I’m aware) which gives very simple clear results.</li>
+ <li>Models of discovery.<sup>3</sup> The apple-picking model models the discovery landscape in a different way (as far as I’m aware), which gives very simple, clear results.</li>
<li>Models of knowledge accumulation.<sup>4</sup> The apple-picking model differs in that the state of knowledge is not represented by a single scalar (e.g. the stock of ideas); instead the state is the combination of low-hanging and high-hanging ideas.</li>
</ul>
</dd>
@@ -98,7 +98,7 @@
</dd>
<dt>Traditional models of AI-assisted R&D imply a permanent acceleration.</dt>
<dd>
- <p>There are two standard models of AI’s effect on R&D used in the RSI literature: (1) it multiplies the effectiveness of human researchers; (2) it replaces human researchers with computer researchers (see more discussion below). Both models imply that when AI capabilities get better it’ll an acceleration in the rate of progress, which persists:</p>
+ <p>There are two standard models of AI’s effect on R&D used in the RSI literature: (1) it multiplies the effectiveness of human researchers; (2) it replaces human researchers with computer researchers (see more discussion below). Both models imply that when AI capabilities get better it’ll lead to an acceleration in the rate of progress, which persists:</p>
<div class="cell" data-layout-align="center">
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
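A toy simulation of the contrast drawn above (all parameters below are hypothetical, not from the note): a constant multiplier on research effort raises the per-period rate of progress permanently, whereas in a saturating, apple-picking-style process the same boost produces a jump that fades as the tree empties.

```python
horizon, boost_at, m = 40, 20, 2.0

# (1) Multiplier model: per-period progress is proportional to effort,
# so doubling effort doubles the rate of progress forever.
effort = [m if t >= boost_at else 1.0 for t in range(horizon)]
multiplier_gain = list(effort)        # per-period gain never falls back

# (2) Apple-picking-style process: each period you find a fixed share of
# the *remaining* apples, so gains shrink whatever the effort level.
p = 0.9                               # per-unit-effort miss probability
apples, picked = 1.0, []
for t in range(horizon):
    rate = 1 - p ** effort[t]         # share of remaining apples found
    picked.append(apples * rate)
    apples -= apples * rate

print(multiplier_gain[-1], round(picked[-1], 5))
```

In the first series the boost shows up as a permanently higher rate; in the second it shows up as a one-period jump on an already-decaying path.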
@@ -170,7 +170,7 @@
<dd>
<ul>
<li><p><em>Uplift.</em> In practice AI will both (A) accelerate humans; (B) replace humans. The apple-picking model only focuses on the second. In the long run we expect the second effect to dominate, but it’s not clear where we are now.</p></li>
- <li><p><em>Directed search.</em> We assumed that the probability of finding an apple is independent of other apples already found. Realistically people have an ability to direct their attention, it’s not clear whether the implications would significantly changed.</p></li>
+ <li><p><em>Directed search.</em> We assumed that the probability of finding an apple is independent of other apples already found. Realistically people have an ability to direct their attention; it’s not clear whether the implications would significantly change.</p></li>
<li><p><em>Bottlenecks.</em> The discussion has assumed that progress is purely a function of thinking. In practice there are other bottlenecks, most concretely for AI R&D the reliance on using scarce compute to run experiments. It’s arguable how important compute scarcity is, see for example <span class="citation" data-cites="whitfill2025bottlenecks">Whitfill and Wu (2025)</span>.</p></li>
</ul>
</dd>
@@ -236,7 +236,7 @@
<dl>
<dt>Apple-picking is a simplification of a landscape-navigation problem.</dt>
<dd>
- <p>We think of the apple-picking model as a simplification of a more general landscape-navigation model, where you are trying to find a mininmum over a bumpy landscape. You can represent an optimization problem as <img src="https://latex.codecogs.com/png.latex?y=f(%5Cbm%7Bx%7D)">, where you’re trying to choose an <img src="https://latex.codecogs.com/png.latex?%5Cbm%7Bx%7D"> to minimize <img src="https://latex.codecogs.com/png.latex?y">, given some unknown <img src="https://latex.codecogs.com/png.latex?f(%5Ccdot)">. An immediate observation from landscape-navigation problem is that the current elevation is not a sufficient statistic for your state, unless you have a degenerate landscape, like the random landscapes in <span class="citation" data-cites="kortum1997research">Kortum (1997)</span>.</p>
+ <p>We think of the apple-picking model as a simplification of a more general landscape-navigation model, where you are trying to find a minimum over a bumpy landscape. You can represent an optimization problem as <img src="https://latex.codecogs.com/png.latex?y=f(%5Cbm%7Bx%7D)">, where you’re trying to choose an <img src="https://latex.codecogs.com/png.latex?%5Cbm%7Bx%7D"> to minimize <img src="https://latex.codecogs.com/png.latex?y">, given some unknown <img src="https://latex.codecogs.com/png.latex?f(%5Ccdot)">. An immediate observation from the landscape-navigation problem is that the current elevation is not a sufficient statistic for your state, unless you have a degenerate landscape, like the random landscapes in <span class="citation" data-cites="kortum1997research">Kortum (1997)</span>.</p>
<p>A reasonable conjecture seems to be that AI agents are good at <em>local</em> optimization, e.g. climbing small hills, worse at <em>global</em> optimization, i.e. finding distant hills.</p>
<p><a href="https://www.dwarkesh.com/p/terence-tao">Terence Tao has a similar landscape-navigation metaphor</a></p>
<blockquote class="blockquote">
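The local-versus-global conjecture can be illustrated with a throwaway one-dimensional landscape (the function and starting point below are made up for illustration): greedy descent settles into the nearest basin, while a brute-force scan of the whole landscape finds a much deeper one.

```python
import math

def f(x):
    # A bumpy 1-D landscape with several local minima (purely illustrative).
    return (x - 2.0) ** 2 + 2.0 * math.sin(5.0 * x)

def local_descent(x, step=0.01, iters=10_000):
    """Greedy local search: step downhill until no neighbour is lower."""
    for _ in range(iters):
        here, left, right = f(x), f(x - step), f(x + step)
        if left < here and left <= right:
            x -= step
        elif right < here:
            x += step
        else:
            break                      # stuck at a local minimum
    return x

x_local = local_descent(-1.0)          # "small hill climbing" from a poor start
x_global = min((i / 1000 for i in range(-5000, 5001)), key=f)   # global scan
print(f(x_local), f(x_global))         # greedy search ends in a worse basin
```

The greedy searcher's elevation tells you nothing about the deeper basin elsewhere, which is the "elevation is not a sufficient statistic" point above.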
@@ -394,7 +394,7 @@ Apples are distributed with cumulative distribution function <img src="https://l
<dt>Human-only labor.</dt>
<dd>
<p>With human labor only, apples will grow with diminishing returns: <img src="https://latex.codecogs.com/png.latex?a_t=(1-p%5Et)F(1)."></p>
- <p>This will characterize progress prior to the point at which we can build the first useful robot, <img src="https://latex.codecogs.com/png.latex?a_t%20%5Cgeq%20%5Cbar%7Ba%7D">, after which progress begings to accelerate.</p>
+ <p>This will characterize progress prior to the point at which we can build the first useful robot, <img src="https://latex.codecogs.com/png.latex?a_t%20%5Cgeq%20%5Cbar%7Ba%7D">, after which progress begins to accelerate.</p>
</dd>
<dt>Recursive self-improvement.</dt>
<dd>
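Under the human-only law a_t = (1 − p^t)F(1) quoted above, per-period gains decay geometrically. A short check (the values of p and F(1) are illustrative, not from the note):

```python
# Human-only progress: a_t = (1 - p**t) * F1, the note's human-only case.
# p (the per-period probability an apple stays unfound) and F1 (the total
# mass of apples) take illustrative values here.
p, F1 = 0.8, 1.0
a = [(1 - p ** t) * F1 for t in range(11)]
gains = [a[t + 1] - a[t] for t in range(10)]
ratios = [gains[t + 1] / gains[t] for t in range(9)]
print(ratios)   # each per-period gain is p times the previous one
```

Since the gain in period t is F(1)·p^t(1−p), consecutive gains shrink by the constant factor p: diminishing returns before the first useful robot arrives.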
@@ -410,6 +410,26 @@ With robot labor only (holding <img src="https://latex.codecogs.com/png.latex?h_
<h2 class="anchored" data-anchor-id="embedding-apple-picking-into-jones-model-rd">Embedding Apple-Picking into Jones-model R&D</h2>
+ <p>We can embed the apple-picking model into the workhorse R&D function from <span class="citation" data-cites="jones1995rd">C. I. Jones (1995)</span> as follows, with these implications:</p>
+ <ol type="1">
+ <li>Human-only effort has constant elasticity.</li>
+ <li>Agent effort has <em>declining</em> elasticity.</li>
+ <li>Human & agent effort are complements up to a point.</li>
<p>This can be rewritten in a cumulative form, where <img src="https://latex.codecogs.com/png.latex?%5Cbar%7BR%7D_t"> represents the total research (adjusted for congestion) up to time <img src="https://latex.codecogs.com/png.latex?t">, then (assuming <img src="https://latex.codecogs.com/png.latex?A_0"> is small):</p>
<p>We can compare this to our apple-picking function, with only human effort <img src="https://latex.codecogs.com/png.latex?x">: <img src="https://latex.codecogs.com/png.latex?%5Cunderset%7B%5Ctext%7Bapples%7D%7D%7Ba(x)%7D%20=%201-e%5E%7B-rx%7D"></p>
+ <p>These can be reconciled if we assume the stock of ideas is a nonlinear function of apples picked (<img src="https://latex.codecogs.com/png.latex?a">): <img src="https://latex.codecogs.com/png.latex?A(a)=%5Cleft%5B%5Cln%5Cleft(%5Cfrac%7B1%7D%7B1-a(x)%7D%5Cright)%5Cright%5D%5E%7B1/%5Cbeta%7D."></p>
+ <p>We can then consider progress in ideas with both humans and robots picking apples, <img src="https://latex.codecogs.com/png.latex?a(x_H,x_A)">: <img src="https://latex.codecogs.com/png.latex?A(a)=%5Cleft%5B%5Cln%5Cleft(%5Cfrac%7B1%7D%7B%5Clambda%20e%5E%7B-r_Hx_H-r_Ax_A%7D+(1-%5Clambda)e%5E%7B-r_Hx_H%7D%7D%5Cright)%5Cright%5D%5E%7B1/%5Cbeta%7D."></p>
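A quick numerical check of the reconciliation (r and β are arbitrary test values): composing a(x) = 1 − e^{−rx} with A(a) = [ln(1/(1 − a))]^{1/β} collapses to A = (rx)^{1/β}, i.e. constant elasticity in human-only effort.

```python
import math

# Checking the reconciliation numerically: with a(x) = 1 - exp(-r*x) and
# A(a) = (ln(1/(1-a)))**(1/beta), the composition is (r*x)**(1/beta).
# r and beta below are arbitrary test values, not from the note.
r, beta = 0.5, 2.0

def a(x):
    return 1 - math.exp(-r * x)

def A(apples):
    return math.log(1 / (1 - apples)) ** (1 / beta)

for x in (0.5, 1.0, 4.0):
    assert abs(A(a(x)) - (r * x) ** (1 / beta)) < 1e-9
print("A(a(x)) == (r*x)**(1/beta) for all tested x")
```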