_freeze/posts/2026-05-06-paper-on-effects/execute-results/html.json
Lines changed: 2 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -1,8 +1,8 @@
1
1
{
2
-
"hash": "1e0e6d4835475f40e83bdac8024a15e8",
2
+
"hash": "4609f8c41aec2deebbb14be95a1b2671",
3
3
"result": {
4
4
"engine": "knitr",
5
-
"markdown": "---\ntitle: Paper on Effects of Efficiency\nauthor: Tom Cunningham\nbibliography: ai.bib\ndate: today\ndraft: true\nengine: knitr\nreference-location: document\ncitation-location: document\n#toc: true\n#toc-location: left-body\nexecute:\n echo: false\n warning: false\n error: false\n cache: true # caches chunk output\n# format:\n# pdf:\n# pdf-engine: pdflatex\n# include-in-header:\n# - text: |\n# \\usepackage[utf8]{inputenc}\n# \\usepackage{bm}\n# \\usepackage[all,2cell]{xy}\n# \\newcommand{\\utt}[3]{\\underbrace{#1}_{\\substack{\\text{#2}\\\\\\text{#3}}}}\n---\n\nQuestion: does automated AI R&D result in a fast takeoff?\n: \n Suppose we successfully automate AI R&D, so that we have an agent that can substitute for human AI researchers, what will be the effect on capabilities progress?\n \n What data from a lab would help answer this question?\n\nThe key data is the correlation between R&D investment and algorithmic efficiency.\n: \n If algorithmic progress is very sensitive to R&D effort then automating R&D would have a big effect, and vice versa. So the core useful data would be the following:\n\n - R&D investment (number of FTE researchers, maybe weighted by salary)\n - algorithmic efficiency\n \n\nA fuller model would split out different parts of the production.\n: \n We also want to account for:\n\n - Experiment compute\n - Data\n - (etc.)\n\nBasic data requests.\n: \n Data for each team and each quarter:\n\n - Inputs: researchers, compoute, data.\n - Outputs: algorithmic efficiency.\n \n We also want survey data giving best-estimates on substitutability. Ask them hypotheticals about how much progress they would get .\n\nOut of scope: estimating R&D automation.\n: \n There are lots of related questions about R&D automation:\n\n - How much uplift is AI giving to researchers?\n - How close substitutes are agentic and human workers?\n - Is agentic R&D less experiment-efficient than human R&D? 
(this is potentially in-scope).\n\n\n\n# Core Model\n\nCore model: R&D and algorithmic efficiency.\n: \n We start with this basic model:\n $$\\xymatrix@C=3em@R=1.4em{\n *++[F]{\\text{R\\&D}}\\ar[r]|(0.4)r\n & *++[F]{\\text{algorithmic}\\atop\\text{efficiency}}\n }\n $$\n\n A very simple condition:\n $$r=\\frac{\\Delta\\ln(\\text{algorithmic efficiency})}{\\Delta \\ln(\\text{R\\&D})}$$\n \n This would allow us to estimate $r$, and immediately know whether to expect a takeoff.\n\n\nLog data to collect.\n: \n For each research area (pretraining, midtraining, etc.) and in each quarter:\n \n - R&D investment (number of FTE researchers, maybe weighted by salary)\n - compute efficiency\n\n | quarter | researchers | researcher salaries | compute efficiency win |\n | ------- | :---------: | :-----------------: | :--------------------: |\n | 2025Q1 | 3 | $30M | 20% |\n | 2025Q2 | 5 | $50M | 20% |\n | ... | | | |\n \n The absolute value are sensitive, but we only need to know the relative numbers to estimate the relationship.\n\n Hypothetical scatter plot (each point is a quarter):\n\n\n ::: {.cell layout-align=\"center\"}\n ::: {.cell-output-display}\n {fig-align='center' width=384}\n :::\n :::\n\n\nSurvey data to collect.\n: \n There are many reasons why the log data will be imperfect, we can ask the following question:\n\n - Q: if you had 2X as many researchers last quarter, how much larger do you think your compute efficiency gains would be? (hold fixed experiment compute, training compute, data).\n\n\nComplication: scale-dependent efficiency.\n: \n @gundlach2025algorithmicprogressai argue that (1) many algorithms have different efficiency at different scales; (2) most algorithmic efficiency growth over the past 10 years was due to the Transformer. 
I don't think this is a big worry for us, because we're just looking at the last few years, and in fact model scale hasn't changed so enormously (though I'm not sure if the Gundlach paper's \"scale\" refers to parameters or to training compute).\n\nComplication: limits of compute efficiency.\n: \n Training compute efficiency can be an imperfect metric: (1) some algorithms shift the asymptote; (2) some algorithms change the inference-time efficiency.\n\n\n# Adding Experiment Compute\n\nWe can add experiment compute.\n: \n We want to know the relative importance of R&D labor and experiment compute. We can write this as follows, the $\\sigma$ refers to an elasticity of substitution.\n\n\n ::: {.cell}\n ::: {.cell-output-display}\n {width=672}\n :::\n :::\n\n\n If $\\sigma=0.5$, this means R&D and experiment-compute are strong complements, and having infinite R&D labor will only increase algorithmic efficiency by around 2X (assuming constant returns to scale).\n\nIt's hard to identify substitutability from historical data.\n: \n Suppose we had the following data:\n\n | quarter | researchers | experiment compute | |\n | ------- | ----------- | ----------------- | --- |\n | 2025Q1 | | | |\n | 2025Q2 | | | |\n | | | | |\n\n\n\n# Fuller Model of Capabilities\n\nA fuller model of AI R&D.\n: \n\n ::: {.cell}\n ::: {.cell-output-display}\n {width=672}\n :::\n :::\n\n\nSurvey data to collect.\n: \n What would your best-guess be at compute-equivalent gains in the following scenarios, over the last quarter:\n\n | researchers | experiment compute | training compute | data | **guess at gains?** |\n | :---------: | :----------------: | :--------------: | :-----: | :-----------------: |\n | 2X | (fixed) | (fixed) | (fixed) | ___ |\n | (fixed) | 2X | (fixed) | (fixed) | ___ |\n | (fixed) | (fixed) | 2X | (fixed) | ___ |\n | (fixed) | (fixed) | (fixed) | 2X | ___ |\n\n",
5
+
"markdown": "---\ntitle: Paper on Effects of Efficiency\nauthor: Tom Cunningham\nbibliography: ai.bib\ndate: today\ndraft: true\nengine: knitr\nreference-location: document\ncitation-location: document\n#toc: true\n#toc-location: left-body\nexecute:\n echo: false\n warning: false\n error: false\n cache: true # caches chunk output\n---\n\n<!-- https://tecunningham.github.io/posts/2026-05-06-paper-on-effects.html -->\n\nQuestion: does automated AI R&D result in a fast takeoff?\n: \n Suppose we successfully automate AI R&D, so that we have an agent that can substitute for human AI researchers, what will be the effect on capabilities progress?\n \n What data from a lab would help answer this question?\n\nThe key data is the correlation between R&D investment and algorithmic efficiency.\n: \n If algorithmic progress is very sensitive to R&D effort then automating R&D would have a big effect, and vice versa. So the core useful data would be the following:\n\n - R&D investment (number of FTE researchers, maybe weighted by salary)\n - algorithmic efficiency\n \n\nA fuller model would split out different parts of the production.\n: \n We also want to account for:\n\n - Experiment compute\n - Data\n - (etc.)\n\nBasic data requests.\n: \n Data for each team and each quarter:\n\n - Inputs: researchers, compoute, data.\n - Outputs: algorithmic efficiency.\n \n We also want survey data giving best-estimates on substitutability. Ask them hypotheticals about how much progress they would get .\n\nOut of scope: estimating R&D automation.\n: \n There are lots of related questions about R&D automation:\n\n - How much uplift is AI giving to researchers?\n - How close substitutes are agentic and human workers?\n - Is agentic R&D less experiment-efficient than human R&D? 
(this is potentially in-scope).\n\n\n\n# Core Model\n\nCore model: R&D and algorithmic efficiency.\n: \n We start with this basic model:\n $$\\xymatrix@C=3em@R=1.4em{\n *++[F]{\\text{R\\&D}}\\ar[r]|(0.4)r\n & *++[F]{\\text{algorithmic}\\atop\\text{efficiency}}\n }\n $$\n\n A very simple condition:\n $$r=\\frac{\\Delta\\ln(\\text{algorithmic efficiency})}{\\Delta \\ln(\\text{R\\&D})}$$\n \n This would allow us to estimate $r$, and immediately know whether to expect a takeoff.\n\n\nLog data to collect.\n: \n For each research area (pretraining, midtraining, etc.) and in each quarter:\n\n | quarter | researchers | researcher salaries | compute efficiency win |\n | ------- | :---------: | :-----------------: | :--------------------: |\n | 2025Q1 | 3 | $30M | +20% |\n | 2025Q2 | 5 | $50M | +30% |\n | ... | ... | ... | ... |\n\n \n The absolute value are sensitive, but we only need to know the relative numbers to estimate the relationship. Hypothetical scatter plot (each point is a quarter):\n\n\n ::: {.cell layout-align=\"center\"}\n ::: {.cell-output-display}\n {fig-align='center' width=384}\n :::\n :::\n\n\nSurvey data to collect.\n: \n There are many reasons why the log data will be imperfect, we can ask the following question:\n\n > If you had 2X as many researchers last quarter, how much larger do you think your compute efficiency gains would be? (hold fixed experiment compute, training compute, data).\n\n\nComplication: scale-dependent efficiency.\n: \n @gundlach2025algorithmicprogressai argue that (1) many algorithms have different efficiency at different scales; (2) most algorithmic efficiency growth over the past 10 years was due to the Transformer. 
I don't think this is a big worry for us, because we're just looking at the last few years, and in fact model scale hasn't changed so enormously (though I'm not sure if the Gundlach paper's \"scale\" refers to parameters or to training compute).\n\nComplication: limits of compute efficiency.\n: \n Training compute efficiency can be an imperfect metric: (1) some algorithms shift the asymptote; (2) some algorithms change the inference-time efficiency.\n\n\n# Adding Experiment Compute\n\nWe can add experiment compute.\n: \n We want to know the relative importance of R&D labor and experiment compute. We can write this as follows, the $\\sigma$ refers to an elasticity of substitution.\n\n\n ::: {.cell layout-align=\"center\"}\n ::: {.cell-output-display}\n {fig-align='center' width=288}\n :::\n :::\n\n\n If $\\sigma=0.5$, this means R&D and experiment-compute are strong complements, and having infinite R&D labor will only increase algorithmic efficiency by around 2X (assuming constant returns to scale).\n\nIt's harder to identify substitutability from historical data.\n: \n Suppose we had the following data:\n\n | quarter | researcher expenditure | experiment expenditure | compute efficiency |\n | ------- | :--------------------: | :--------------------: | :----------------: |\n | 2025Q1 | $50M | $50M | +20% |\n | 2025Q2 | $50M | $50M | +20% |\n | | | | |\n\n The fact that we're spending on both researchers and experiments tells us that they're complements, but doesn't tell us how strong the complementarity it.\n\n\n# Fuller Model of Capabilities\n\nA fuller model of AI R&D.\n: \n\n ::: {.cell}\n ::: {.cell-output-display}\n {width=672}\n :::\n :::\n\n\nSurvey data to collect.\n: \n What would your best-guess be at compute-equivalent gains in the following scenarios, over the last quarter:\n\n | researchers | experiment compute | training compute | data | **guess at gains?** |\n | :---------: | :----------------: | :--------------: | :-----: | :-----------------: |\n | 
2X | (fixed) | (fixed) | (fixed) | ___ |\n | (fixed) | 2X | (fixed) | (fixed) | ___ |\n | (fixed) | (fixed) | 2X | (fixed) | ___ |\n | (fixed) | (fixed) | (fixed) | 2X | ___ |\n\n",
docs/posts/2026-05-06-paper-on-effects.html
Lines changed: 38 additions & 30 deletions
@@ -311,11 +311,13 @@ <h1>Core Model</h1>
311
311
<dt>Log data to collect.</dt>
312
312
<dd>
313
313
<p>For each research area (pretraining, midtraining, etc.) and in each quarter:</p>
314
-
<ul>
315
-
<li>R&D investment (number of FTE researchers, maybe weighted by salary)</li>
316
-
<li>compute efficiency</li>
317
-
</ul>
318
314
<table class="caption-top table">
315
+
<colgroup>
316
+
<col style="width: 11%">
317
+
<col style="width: 18%">
318
+
<col style="width: 32%">
319
+
<col style="width: 37%">
320
+
</colgroup>
319
321
<thead>
320
322
<tr class="header">
321
323
<th>quarter</th>
@@ -329,24 +331,23 @@ <h1>Core Model</h1>
329
331
<td>2025Q1</td>
330
332
<td style="text-align: center;">3</td>
331
333
<td style="text-align: center;">$30M</td>
332
-
<td style="text-align: center;">20%</td>
334
+
<td style="text-align: center;">+20%</td>
333
335
</tr>
334
336
<tr class="even">
335
337
<td>2025Q2</td>
336
338
<td style="text-align: center;">5</td>
337
339
<td style="text-align: center;">$50M</td>
338
-
<td style="text-align: center;">20%</td>
340
+
<td style="text-align: center;">+30%</td>
339
341
</tr>
340
342
<tr class="odd">
341
343
<td>…</td>
342
-
<td style="text-align: center;"></td>
343
-
<td style="text-align: center;"></td>
344
-
<td style="text-align: center;"></td>
344
+
<td style="text-align: center;">…</td>
345
+
<td style="text-align: center;">…</td>
346
+
<td style="text-align: center;">…</td>
345
347
</tr>
346
348
</tbody>
347
349
</table>
348
-
<p>The absolute value are sensitive, but we only need to know the relative numbers to estimate the relationship.</p>
349
-
<p>Hypothetical scatter plot (each point is a quarter):</p>
350
+
<p>The absolute value are sensitive, but we only need to know the relative numbers to estimate the relationship. Hypothetical scatter plot (each point is a quarter):</p>
350
351
<div class="cell" data-layout-align="center">
351
352
<div class="cell-output-display">
352
353
<div class="quarto-figure quarto-figure-center">
@@ -360,9 +361,9 @@ <h1>Core Model</h1>
360
361
<dt>Survey data to collect.</dt>
361
362
<dd>
362
363
<p>There are many reasons why the log data will be imperfect, we can ask the following question:</p>
363
-
<ul>
364
-
<li>Q: if you had 2X as many researchers last quarter, how much larger do you think your compute efficiency gains would be? (hold fixed experiment compute, training compute, data).</li>
365
-
</ul>
364
+
<blockquote class="blockquote">
365
+
<p>If you had 2X as many researchers last quarter, how much larger do you think your compute efficiency gains would be? (hold fixed experiment compute, training compute, data).</p>
<p>We want to know the relative importance of R&D labor and experiment compute. We can write this as follows, the <span class="math inline">\(\sigma\)</span> refers to an elasticity of substitution.</p>
<p>If <span class="math inline">\(\sigma=0.5\)</span>, this means R&D and experiment-compute are strong complements, and having infinite R&D labor will only increase algorithmic efficiency by around 2X (assuming constant returns to scale).</p>
393
394
</dd>
394
-
<dt>It’s hard to identify substitutability from historical data.</dt>
395
+
<dt>It’s harder to identify substitutability from historical data.</dt>
<p>The fact that we’re spending on both researchers and experiments tells us that they’re complements, but doesn’t tell us how strong the complementarity it.</p>
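Review note: the post text in this diff claims that with σ=0.5, infinite R&D labor raises algorithmic efficiency by only around 2X. Below is a minimal sketch that checks this claim. It assumes a two-input CES production function with constant returns to scale and an equal share weight of 0.5 on each input. The functional form, the share weight, and the names `ces_output` and `labor_ceiling` are illustrative assumptions and are not taken from the post or its knitr chunks.

```python
import math


def ces_output(labor: float, compute: float, sigma: float, share: float = 0.5) -> float:
    """CES aggregate of R&D labor and experiment compute.

    Assumptions (not from the post): constant returns to scale and a
    share weight `share` on labor, 1 - share on compute.
    """
    if math.isclose(sigma, 1.0):
        # Cobb-Douglas limit of CES as sigma -> 1.
        return labor ** share * compute ** (1.0 - share)
    rho = (sigma - 1.0) / sigma  # substitution parameter
    return (share * labor ** rho + (1.0 - share) * compute ** rho) ** (1.0 / rho)


def labor_ceiling(sigma: float, share: float = 0.5) -> float:
    """Limit of output as labor -> infinity, relative to labor = compute = 1.

    Finite only when sigma < 1 (complements); unbounded otherwise.
    """
    if sigma >= 1.0:
        return math.inf
    rho = (sigma - 1.0) / sigma
    return (1.0 - share) ** (1.0 / rho)


# Baseline is labor = compute = 1, where output equals 1 for any sigma.
baseline = ces_output(1.0, 1.0, sigma=0.5)
for multiple in (2, 10, 1000):
    gain = ces_output(float(multiple), 1.0, sigma=0.5) / baseline
    print(f"{multiple}x labor -> {gain:.3f}x output")
print(f"ceiling at sigma=0.5: {labor_ceiling(0.5):.1f}x")
```

Under these assumptions the ceiling is exactly 2X at σ=0.5. The figure depends on the share weight as well as on σ, which may be why the post says "around 2X".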