Commit 5a9df18 (parent: 4acae7c). 97 files changed: 56,343 additions and 754 deletions.
---
title: An Apple-Picking Model of Agents
author:
  - name: Tom Cunningham
    affiliation: METR
bibliography: ai.bib
date: today
draft: true
engine: knitr
reference-location: document
citation-location: document
#toc: true
#toc-location: left-body
execute:
  echo: false
  warning: false
  error: false
  cache: true # caches chunk output
---

<style>
  h1 { border-bottom: 4px solid black; }
  h2 { border-bottom: 1px solid gray; padding-bottom: 0px; color: black; }
  dl {display: grid;}
  dt {grid-column-start: 1; width: 6cm;}
  dd {grid-column-start: 2; margin-left: 2em;}
</style>

An apple-picking model of AI work.
: 
    Here's a simple model which I find useful for thinking about agent vs human ability in solving problems.

    In short: an agent helping you do programming work is like having a robot help you pick apples. It will take care of all the apples up to a certain height, and find apples you haven't found, but there will still be apples out of its reach.

    ![](images/2026-03-12-11-32-34.png)

    The motivation was to help make sense of the recent ability of LLMs to autonomously push the frontier forward on various optimization and AI R&D problems. If they can make genuine discoveries themselves, at far lower cost than humans, then why aren't we seeing a complete displacement of human work? It must be because the agents' discoveries are in some sense "shallow", and indeed much commentary on agent optimization is that the discoveries are not yet truly novel.

Implications of the apple-picking model.
: 
    1. Agents will be able to advance the human state-of-the-art on various optimization problems.
    2. Agents will have greater value, relative to humans, on problems that are not yet heavily optimized.
    3. To gauge the ability of agents we want to test not just their ability to improve performance, but their *reach*.

Other notes.
: 
    *Landscape.* A more general version would model the entire *landscape*. You can represent an optimization problem as $y=f(\bm{x})$, where you're trying to choose an $\bm{x}$ to maximize $y$, given some unknown $f(\cdot)$.

    *Existing models of AI R&D.*

    *Implications for AI R&D.*

    - LLMs are suddenly able to optimize algorithms pretty well --- maybe recursive self-improvement has just kicked off. But the critical question is whether the fruit they're picking is low-hanging. If all the optimizations are routine, then they'll run out pretty quickly.

    - LLM training is a big stack of algorithms, which we've been optimizing at perhaps 10X/year.

| Mechanism                                 | Examples      |
| ----------------------------------------- | ------------- |
| AI replaces R&D workers.                  | Davidson      |
| AI automates a subset of general tasks.   | (some models) |
| AI automates a subset of R&D tasks.       | (most models) |
| AI increases productivity of R&D workers. | ??            |
| AI replaces *low quality* R&D workers.    | ??            |
| (other mechanisms?)                       |               |

--------------------------------------------

**Motivation:**

- Agents are showing the ability to autonomously optimize AI training, beating the human state-of-the-art.
- Where does this end? If you can pay some tokens to improve the training function by $x$%, then ....
- In most models of RSI, agents only automate subtasks in R&D, but now they seem to be doing the whole thing autonomously.
- Intuitively they're only doing *shallow* optimizations, not deep optimizations.
- We want a model so that we can define what it means to only do shallow optimizations, and test whether agents obey these predictions.
- Testable predictions: (A) agents asymptote to a lower value than humans; (B) agents are relatively more useful for problems which have had less human time spent on them; (C) agents will asymptote to higher values if they are given better human starting points.

# Basic Model

Setup.
: 
    There is a continuum of apples, spread uniformly by height.

    A human can find apples over $[0,1]$, but an agent can only find apples over $[0,\lambda]$, with $\lambda<1$ (at least for now).

    Humans find apples at rate $r_H$, agents find apples at rate $r_A$, and we use $t_H$ and $t_A$ for the time humans and agents spend searching (you can also interpret $t_H$ and $t_A$ as expenditure on the problem).

**We can then derive the share of apples found:**

$$\text{share of apples found}= \underbrace{\lambda(1-e^{-r_Ht_H-r_At_A})}_{\text{apples from bottom of tree}}+\underbrace{(1-\lambda)(1-e^{-r_Ht_H})}_{\text{apples from top of tree}}.$$

Implication: agents asymptote to a lower level than humans.
: 
    Here we illustrate agent-only and human-only search curves: the agent curve rises more quickly ($r_A>r_H$), but asymptotes to a lower level ($\lambda<1$).

    ::: {.cell}
    ::: {.cell-output-display}
    ![](2026-03-13-apple-picking-ai_files/figure-html/unnamed-chunk-1-1.png){width=672}
    :::
    :::

    The shape of these curves is a good match for what we see across tasks, e.g. in @metr2024capability.
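The share-found formula above is easy to sanity-check numerically. A minimal Python sketch (the function and argument names are mine, mirroring the symbols in the formula):

```python
import math

def share_found(lam, rH, tH, rA, tA):
    """Share of apples found in the basic model.

    lam: agent reach, rH/rA: human/agent search rates,
    tH/tA: human/agent search time (or expenditure).
    """
    bottom = lam * (1 - math.exp(-rH * tH - rA * tA))  # band [0, lam]: both search
    top = (1 - lam) * (1 - math.exp(-rH * tH))         # band (lam, 1]: humans only
    return bottom + top

# Agent-only search asymptotes to lam; human-only search asymptotes to 1.
agent_only = share_found(0.5, rH=1.0, tH=0.0, rA=5.0, tA=100.0)   # ≈ 0.5
human_only = share_found(0.5, rH=1.0, tH=100.0, rA=5.0, tA=0.0)   # ≈ 1.0
```

Sending agent time to infinity with no human time recovers the lower asymptote $\lambda$, while human-only search approaches 1, matching the two curves described above.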
For most tasks either (1) an agent can do it much cheaper than a human, or (2) an agent can't do it at all.

![](images/2026-03-13-05-35-22.png)

Implication: agents can improve on human SoTA, but only by a limited amount.
: 
    We can now draw agent search curves that start from the human state-of-the-art. We can see that agents can improve on the humans' SoTA performance, but in each case the value asymptotes.

    ::: {.cell}
    ::: {.cell-output-display}
    ![](2026-03-13-apple-picking-ai_files/figure-html/unnamed-chunk-3-1.png){width=672}
    :::
    :::

Implication: agent value and asymptote depend on the starting point.
: 
    The plot below shows agent trajectories, each starting after a different amount of human work.

    If you start an agent from scratch, it will have high marginal value, but a low asymptote.

    If you start an agent after some human optimization, it will have lower marginal value, but it will be able to achieve a higher asymptote (below, starting after 1 unit of human search raises the agent asymptote from 0.5 to 0.7).

    ::: {.cell}
    ::: {.cell-output-display}
    ![](2026-03-13-apple-picking-ai_files/figure-html/unnamed-chunk-4-1.png){width=672}
    :::
    :::

## Closed Apple-Picking Model

**Setup:**

1. Apples sit at heights in $[0,\infty)$; human reach is normalized to 1. The agent has reach $\lambda_t \ge 0$ and picks everything below it.
2. Humans pick in the band $(\lambda_t, 1]$ at a rate governed by $p$.
3. Agent reach depends on cumulative apples harvested $a$: reach is zero until the harvest passes a minimum threshold ($\bar{a}$), and is linear in $a$ thereafter.

**Implications:**

1. Agent reach will eventually exceed human reach iff $\alpha + \beta(1-\bar{a}) > 1$.
2. Agent height will be explosive iff $\beta > 1$, i.e. if eating all the apples in a 1-cm slice of tree causes you to grow more than 1 cm taller. If not, you converge to a finite height $\lambda^*$.

### 1. State variables and dynamics

Normalize human reach to 1.

- **$\lambda_t \ge 0$**: agent reach (how high the AI can pick).
- **$h_t \in [0,1]$**: human coverage of the human-only band $(\lambda_t, 1]$ (fraction of that band already picked by humans).

**Human dynamics** (one parameter $p \in (0,1)$): per period, a fraction $1-p$ of the remaining human-level band gets picked, so
$$h_{t+1} = 1 - p(1-h_t), \qquad h_0 = 0.$$
(Equivalently $h_t = 1 - p^t$; the recursion keeps the model closed and autonomous.)

**Apples harvested** (agent + humans, with clipping at 1):
$$a_t = \lambda_t + (1-\lambda_t)_+ \, h_t, \qquad (x)_+ \equiv \max\{x,0\}.$$
The agent gets everything up to $\lambda_t$; humans only contribute on the band of length $(1-\lambda_t)_+$, of which fraction $h_t$ is covered by time $t$.

**Self-improvement** (activation threshold $\bar{a}$, then affine in $a_t$):
$$\lambda_{t+1} = \begin{cases} 0, & a_t < \bar{a} \\ \alpha + \beta(a_t - \bar{a}), & a_t \ge \bar{a}. \end{cases}$$

Parameters: **$p$** (human speed), **$\bar{a}$** (minimum progress to "turn on" the agent), **$\alpha$** (baseline capability once on), **$\beta$** (strength of recursive improvement). Initial condition $\lambda_0$ (typically 0). Four parameters plus $\lambda_0$.

---

### 2. Crisp conditions

**A) Activation.** With $\lambda_t = 0$, $a_t = h_t \to 1$. So the agent can ever turn on **iff $\bar{a} < 1$**. If $\bar{a} \ge 1$, then $\lambda_t \equiv 0$ forever. Activation-time approximation: $h_t = 1 - p^t \ge \bar{a}$ $\Leftrightarrow$ $t \ge \ln(1-\bar{a})/\ln p$; so $p$ mainly shifts *when* activation happens.

**B) Crossing human level.** As $t \to \infty$, $h_t \to 1$. If $\lambda_t < 1$, then $a_t \to 1$; if $\lambda_t \ge 1$, then $a_t = \lambda_t$. So asymptotically $\lambda_{t+1} \to f(1)$, with $f(1) = 0$ if $1 < \bar{a}$, and $f(1) = \alpha + \beta(1-\bar{a})$ if $1 \ge \bar{a}$.
So **takeoff past human level** (eventually $\lambda > 1$) occurs **iff**
$$\boxed{\alpha + \beta(1-\bar{a}) > 1.}$$
Interpretation: "If the orchard were fully human-level ($a=1$), would the next agent be at least human-level?" If not, the system stays below 1. This condition is essentially independent of $p$ ($p$ affects the timing of takeoff, not whether it occurs).

**C) Above human level: runaway vs saturation.** For $\lambda_t \ge 1$, $a_t = \lambda_t$ and
$$\lambda_{t+1} = \alpha + \beta(\lambda_t - \bar{a}).$$

- **Runaway / hard takeoff** iff $\boxed{\beta > 1}$ (roughly geometric growth in $\lambda_t$).
- **Soft takeoff / saturation** iff $\boxed{\beta < 1}$: convergence to
$$\lambda^* = \frac{\alpha - \beta\bar{a}}{1-\beta}$$
(provided the system crosses 1 first).
- **Knife-edge** $\beta = 1$: linear growth.

---

### 3. Illustrations

The figures below implement this model: four canonical trajectories, a phase diagram in $(\alpha,\beta)$, cobweb plots of the asymptotic map, and sensitivity to $p$ and $\bar{a}$.

::: {.cell layout-align="center"}
::: {.cell-output-display}
![Four regimes: no activation (ā≥1), activated but stuck (f(1)<1), soft takeoff (f(1)>1, β<1), hard takeoff (f(1)>1, β>1)](2026-03-13-apple-picking-ai_files/figure-html/apple-rsi-four-regimes-1.png){fig-align='center' width=672}
:::
:::

::: {.cell}
::: {.cell-output-display}
![](2026-03-13-apple-picking-ai_files/figure-html/unnamed-chunk-7-1.png){width=672}
:::
:::

::: {.cell layout-align="center"}
::: {.cell-output-display}
![Phase diagram: f(1)=1 (cross human level) and β=1 (runaway vs saturation) at fixed ā = 0.3](2026-03-13-apple-picking-ai_files/figure-html/apple-rsi-phase-1.png){fig-align='center' width=528}
:::
:::

::: {.cell layout-align="center"}
::: {.cell-output-display}
![Cobweb: asymptotic map λ_{t+1} = f(max(λ_t,1)); soft (β<1) vs hard (β>1)](2026-03-13-apple-picking-ai_files/figure-html/apple-rsi-cobweb-1.png){fig-align='center' width=672}
:::
:::

::: {.cell layout-align="center"}
::: {.cell-output-display}
![Sensitivity: effect of p (left) and of ā (right) on the λ_t trajectory](2026-03-13-apple-picking-ai_files/figure-html/apple-rsi-sensitivity-1.png){fig-align='center' width=672}
:::
:::
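The dynamics in Section 1 ($h_{t+1}$, $a_t$, $\lambda_{t+1}$) can be iterated in a few lines. A minimal Python sketch; the function name, variable names, and the parameter values below are my own illustrative assumptions, not taken from the document's code:

```python
def simulate(p, a_bar, alpha, beta, lam0=0.0, T=60):
    """Iterate the closed apple-picking model.

    p: human speed, a_bar: activation threshold,
    alpha/beta: intercept/slope of the self-improvement map.
    Returns the path of agent reach lambda_t.
    """
    lam, h = lam0, 0.0
    path = [lam]
    for _ in range(T):
        h = 1 - p * (1 - h)                   # human coverage: h_{t+1} = 1 - p(1 - h_t)
        a = lam + max(1 - lam, 0.0) * h       # apples harvested a_t, clipped at 1
        lam = 0.0 if a < a_bar else alpha + beta * (a - a_bar)  # lambda_{t+1}
        path.append(lam)
    return path

# Illustrative soft takeoff: f(1) = 0.9 + 0.8*(1 - 0.3) = 1.46 > 1 and beta < 1,
# so reach should converge to lambda* = (alpha - beta*a_bar) / (1 - beta).
path = simulate(p=0.5, a_bar=0.3, alpha=0.9, beta=0.8)
lam_star = (0.9 - 0.8 * 0.3) / (1 - 0.8)  # = 3.3
```

The same function reproduces the other regimes: with $\bar{a} \ge 1$ the path stays at zero (no activation), and with $\beta > 1$ it grows without bound (hard takeoff).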