_freeze/2026-02-22-ai-cost-curves/execute-results/html.json
Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
  {
- "hash": "da214efbafd5d90a693e6f00ae5f01ba",
+ "hash": "f38dc45ddf65490b3548aa1309e7dd09",
  "result": {
  "engine": "knitr",
- "markdown": "---\ntitle: AI cost curves\ndraft: true\nengine: knitr\nexecute:\n echo: false\n warning: false\n error: false\n cache: true # caches chunk output\n---\n\n\n\nIt's super useful to draw plots showing achievement vs expenditure for humans vs computers, like this:\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-1-1.png){width=672}\n:::\n:::\n\n\n\nYou can read the y-axis in a few ways: (1) score on a benchmark; (2) quality of the output; (3) score on some optimization problem.\n\nSome observations:\n\n1. Today, for a given levcomputers are either cheaper than humans or .\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-2-1.png){width=672}\n:::\n:::\n",
+ "markdown": "---\ntitle: AI cost curves\ndraft: true\nengine: knitr\nexecute:\n echo: false\n warning: false\n error: false\n cache: true # caches chunk output\n---\n\n\n\nThree spaces\n: \n - three\n \n - three\n\nFour spaces\n: \n - four\n \n - four\n\n\n\n\nObservatinons:\n\n1. Generally, agents are cheaper but less capable.\n2. Can simplify to say agents are free, without much loss.\n3. We can see three types of growth: (A) cheaper inference; (B) ; \n4. This theory maps exactly to time horizon. \n5. Can derive these curves from a theory (???).\n\n\n\nIt's super useful to draw plots showing achievement vs expenditure for humans vs computers, like this:\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-1-1.png){width=672}\n:::\n:::\n\n\n\nYou can read the y-axis in a few ways: (1) score on a benchmark; (2) quality of the output; (3) score on some optimization problem.\n\nSome observations:\n\n1. Today, for a given levcomputers are either cheaper than humans or .\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-2-1.png){width=672}\n:::\n:::\n",
docs/index.xml
Lines changed: 18 additions & 9 deletions
@@ -60,7 +60,7 @@
  <p>Some potential applications for knowledge-generating inference: (1) optimize every part of the AI R&D and serving stack; (2) do drug discovery; (3) discover new algorithms which can be used in new software (e.g. new codecs, new scheduling algorithms); (4) build better trading algorithms; (5) if you have a sufficiently high-quality verifier to human preferences, then build very high-quality cultural products, e.g. movies.</p>
  <p>Knowledge-creating LLMs will differ from knowledge-sharing LLMs in a number of ways:</p>
  <ul>
- <li><p>Knowledge-creating LLMs will have qualitatively different benchmarks: instead of seeing if they can answer questions which we already know the answer to (most existing benchmarks), we want them to answer <em>new</em> questions, e.g. solve an unsolved mathematical problem (<a href="https://epoch.ai/frontiermath/open-problems">FrontierMath Open Problems</a>) or set a new record on an optimization problem (e.g. GSO-bench, <span class="citation" data-cites="shetty2025gso">Shetty et al. (2025)</span>). We can use these new frontier benchmarks are indices for capability, but they are more challenging to interpret because the frontier is always moving.</p></li>
+ <li><p>Knowledge-creating LLMs will have qualitatively different benchmarks: instead of seeing if they can answer questions which we already know the answer to (most existing benchmarks), we want them to answer <em>new</em> questions, e.g. solve an unsolved mathematical problem (<a href="https://epoch.ai/frontiermath/open-problems">FrontierMath Open Problems</a>) or set a new record on an optimization problem (e.g. GSO-bench, <span class="citation" data-cites="shetty2025gso">Shetty et al. (2025)</span>). We can use these new frontier benchmarks as indices for capability, but they are more challenging to interpret because the frontier is always moving.</p></li>
  <li><p>Knowledge-creating LLMs have high returns to compute on individual problems, unlike knowledge-sharing LLMs for which returns asymptote quickly. It can be worth spending billions of tokens to solve a single problem if the solution is generally applicable.</p></li>
  </ul>
  <ul>
@@ -370,19 +370,19 @@ You have one friend who is full of new ideas, you have another friend who can te
<td><em>“By 2040, output is only 4% higher than it would have been without the growth acceleration, and by 2060 the gain is still only 19%. A key reason for the slow acceleration is the prominence of “weak links” (an elasticity of substitution among tasks less than one).”</em></td>
<td><em>“I can see a world where A.I. brings the developed world G.D.P. growth to something like 10, 15 percent. Five, 10, 15 — I mean there’s no science of calculating these numbers. It’s a totally unprecedented thing. But it could bring it to numbers that are outside the distribution of what we saw before.”</em></td>
+ </tr>
  </tbody>
  </table>
  </section>
@@ -725,6 +731,9 @@ Aghion, Philippe, Benjamin F. Jones, and Charles I. Jones. 2019. <span>“Artifi
  <div id="ref-bis2024impact" class="csl-entry">
  Aldasoro, Iñaki, Sebastian Doerr, Leonardo Gambacorta, and Divya Sharma. 2024. <span>“The Impact of AI on Output and Inflation.”</span> BIS Bulletin 85. Bank for International Settlements. <a href="https://www.bis.org/publ/work1179.pdf">https://www.bis.org/publ/work1179.pdf</a>.
Amodei, Dario. 2026. <span>“How to Build an a.i. Economy.”</span> February 12, 2026. <a href="https://www.nytimes.com/2026/02/12/opinion/artificial-intelligence-anthropic-amodei.html">https://www.nytimes.com/2026/02/12/opinion/artificial-intelligence-anthropic-amodei.html</a>.
Arnon, Alex. 2025. <span>“The Projected Impact of Generative AI on Future Productivity Growth.”</span> Brief. Penn Wharton Budget Model. <a href="https://budgetmodel.wharton.upenn.edu/issues/2025/9/8/projected-impact-of-generative-ai-on-future-productivity-growth">https://budgetmodel.wharton.upenn.edu/issues/2025/9/8/projected-impact-of-generative-ai-on-future-productivity-growth</a>.