Commit 4825c39

1 parent a640157 commit 4825c39

15 files changed

Lines changed: 659 additions & 248 deletions

2026-02-22-ai-cost-curves-tikz-helpers.tex

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 % normal-CDF approximation: Phi(z) ~= 1/(1+exp(-1.702 z))
 \newcommand{\AICostAxes}{%
 \draw[black] (0,0) rectangle (2,1);
-\draw[->] (0,0) -- node[midway,below] {cost} (2.2,0);
+\draw[->] (0,0) -- node[midway,below] {inference expenditure} (2.2,0);
 \draw[->] (0,0) -- node[above,midway,rotate=90] {achievement} (0,1.2);
 }
 % Args: {mu}{sigma}{asymptote}{color}{label}{yshift}
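
Reviewer sketch (not part of the commit): the helper file's comment relies on the logistic approximation Phi(z) ~= 1/(1+exp(-1.702 z)). The Python below compares it with the exact normal CDF on a grid. The function names are hypothetical, and `cost_curve` assumes that the `{mu}{sigma}{asymptote}` arguments describe a curve of the form asymptote * Phi((x - mu)/sigma), which the diff does not confirm.

```python
import math


def phi_exact(z: float) -> float:
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def phi_logistic(z: float) -> float:
    """Logistic approximation quoted in the helper-file comment."""
    return 1.0 / (1.0 + math.exp(-1.702 * z))


def max_abs_error(lo: float = -6.0, hi: float = 6.0, steps: int = 12001) -> float:
    """Largest absolute gap between the two CDFs on an evenly spaced grid."""
    grid = (lo + (hi - lo) * i / (steps - 1) for i in range(steps))
    return max(abs(phi_exact(z) - phi_logistic(z)) for z in grid)


def cost_curve(x: float, mu: float, sigma: float, asymptote: float) -> float:
    """Assumed curve shape: asymptote * Phi((x - mu) / sigma)."""
    return asymptote * phi_logistic((x - mu) / sigma)


if __name__ == "__main__":
    print(f"max |Phi - logistic| on [-6, 6]: {max_abs_error():.4f}")
```

On this grid the gap stays below 0.01, which is small next to the width of a plotted line, so the closed-form logistic is a reasonable stand-in when drawing the curves.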

2026-02-22-ai-cost-curves.qmd

Lines changed: 6 additions & 0 deletions
@@ -22,6 +22,12 @@ Observations:
 5. This observation is a nice fit for time horizon (more discussion required)
 6. Q: can you derive these curves from a theory of task complexity?
 
+TODO:
+
+1. Probably plot ln(expenditure).
+2. Show separate graphs for inference cost & training cost.
+3. Observation: for many things it doesn't matter if we don't have cardinal scale of quality, as long as we have *ordinal* scale. E.g. we can still talk about cost reduction, & something about scale effects.
+
 Basic plot:
 
 
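
Reviewer sketch (not part of the commit): TODO item 3 says an ordinal quality scale is enough to talk about cost reduction. One way to see this is that the expenditure needed to reach a given quality level does not change when the quality axis is relabelled by any strictly increasing function. The Python below illustrates that with two made-up curves; every parameter value, the `rescale` function, and all names are hypothetical and chosen only for illustration.

```python
import math


def logistic_curve(cost: float, mu: float, sigma: float, asymptote: float) -> float:
    """Hypothetical achievement curve that rises with cost toward `asymptote`."""
    return asymptote / (1.0 + math.exp(-1.702 * (cost - mu) / sigma))


def cost_to_reach(curve, target: float, lo: float = 0.0, hi: float = 10.0, iters: int = 200):
    """Bisection for the smallest cost with curve(cost) >= target.

    Assumes `curve` is increasing; returns None if the target lies above
    what the curve attains at `hi` (a capability ceiling).
    """
    if curve(hi) < target:
        return None
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if curve(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi


def human(cost: float) -> float:
    return logistic_curve(cost, mu=1.2, sigma=0.4, asymptote=1.0)


def agent(cost: float) -> float:
    return logistic_curve(cost, mu=0.4, sigma=0.2, asymptote=0.7)


def rescale(quality: float) -> float:
    """Arbitrary strictly increasing relabelling of the quality axis."""
    return quality ** 3


def cost_ratio(target: float, transform=lambda q: q) -> float:
    """Agent cost divided by human cost at the same (possibly relabelled) quality."""
    agent_cost = cost_to_reach(lambda c: transform(agent(c)), transform(target))
    human_cost = cost_to_reach(lambda c: transform(human(c)), transform(target))
    return agent_cost / human_cost


if __name__ == "__main__":
    print(cost_ratio(0.5), cost_ratio(0.5, rescale))
    print(cost_to_reach(agent, 0.9))  # None: above the agent's ceiling
```

The two ratios agree up to floating-point noise, and the capability ceiling (a target the agent never reaches) is likewise unaffected by the relabelling.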

_freeze/2026-02-22-ai-cost-curves/execute-results/html.json

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 {
-"hash": "c8be30058109d0999d2e5f6ffb7fa512",
+"hash": "6f21e0650490739210a8c03810644f30",
 "result": {
 "engine": "knitr",
-"markdown": "---\ntitle: AI cost curves\ndraft: true\nengine: knitr\nexecute:\n echo: false\n warning: false\n error: false\n cache: true # caches chunk output\n---\n\n\n\nIt's useful to draw plots showing achievement vs expenditure, comparing humans & agents.\n\nYou can read the y-axis in a few ways: (1) score on a benchmark; (2) quality of the output; (3) score on an optimization problem.\n\nObservations:\n\n1. In most cases agents are cheaper than humans but hit a ceiling in capability.\n2. Can simplify to say agents are free, without much loss.\n3. We can see three types of agent growth: (A) cheaper inference; (B) expanded capabilities; (C) test-time growth. \n4. Distillation shifts cost curves left.\n5. This observation is a nice fit for time horizon (more discussion required)\n6. Q: can you derive these curves from a theory of task complexity?\n\nBasic plot:\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-1-1.png){width=672}\n:::\n:::\n\n\n\n\nThree types of agent change:\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-2-1.png){width=672}\n:::\n:::\n\n\n\n# additional plots\n\nExtra plots:\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-3-1.png){width=672}\n:::\n:::\n",
+"markdown": "---\ntitle: AI cost curves\ndraft: true\nengine: knitr\nexecute:\n echo: false\n warning: false\n error: false\n cache: true # caches chunk output\n---\n\n\n\nIt's useful to draw plots showing achievement vs expenditure, comparing humans & agents.\n\nYou can read the y-axis in a few ways: (1) score on a benchmark; (2) quality of the output; (3) score on an optimization problem.\n\nObservations:\n\n1. In most cases agents are cheaper than humans but hit a ceiling in capability.\n2. Can simplify to say agents are free, without much loss.\n3. We can see three types of agent growth: (A) cheaper inference; (B) expanded capabilities; (C) test-time growth. \n4. Distillation shifts cost curves left.\n5. This observation is a nice fit for time horizon (more discussion required)\n6. Q: can you derive these curves from a theory of task complexity?\n\nTODO:\n\n1. Probably plot ln(expenditure).\n2. Show separate graphs for inference cost & training cost.\n3. Observation: for many things it doesn't matter if we don't have cardinal scale of quality, as long as we have *ordinal* scale. E.g. we can still talk about cost reduction, & something about scale effects.\n\nBasic plot:\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-1-1.png){width=672}\n:::\n:::\n\n\n\n\nThree types of agent change:\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-2-1.png){width=672}\n:::\n:::\n\n\n\n# additional plots\n\nExtra plots:\n\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](2026-02-22-ai-cost-curves_files/figure-html/unnamed-chunk-3-1.png){width=672}\n:::\n:::\n",
 "supporting": [],
 "filters": [
 "rmarkdown/pagebreak.lua"

_freeze/posts/2025-09-13-recursive-self-improvement-explosion/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.
Binary file not shown.
87.5 KB

_freeze/posts/2026-02-26-hallucinations-and-alignment/execute-results/html.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.
Binary file not shown.

docs/2026-02-22-ai-cost-curves.html

Lines changed: 6 additions & 0 deletions
@@ -176,6 +176,12 @@ <h1 class="title">AI cost curves</h1>
 <li>This observation is a nice fit for time horizon (more discussion required)</li>
 <li>Q: can you derive these curves from a theory of task complexity?</li>
 </ol>
+<p>TODO:</p>
+<ol type="1">
+<li>Probably plot ln(expenditure).</li>
+<li>Show separate graphs for inference cost &amp; training cost.</li>
+<li>Observation: for many things it doesn’t matter if we don’t have cardinal scale of quality, as long as we have <em>ordinal</em> scale. E.g. we can still talk about cost reduction, &amp; something about scale effects.</li>
+</ol>
 <p>Basic plot:</p>
 <div class="cell">
 <div class="cell-output-display">
