
Commit f895494

committed: small update
1 parent 834fe4a · commit f895494

File tree

81 files changed: +754 −4398 lines


_freeze/posts/2026-01-29-knowledge-creating-llms/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

docs/index.xml

Lines changed: 3 additions & 3 deletions
@@ -151,7 +151,7 @@ You have one friend who is full of new ideas, you have another friend who can te
 
 
 
-<div id="0c7fab1e" class="cell" data-results="asis" data-execution_count="2">
+<div id="8965c9a5" class="cell" data-results="asis" data-execution_count="1">
 <div class="cell-output cell-output-display cell-output-markdown">
 <details class="validation-checklist">
 <summary>
@@ -163,12 +163,12 @@ Validation Checks
 <li>✅ [23/23] Table rows have required fields (programmatic)</li>
 <li>✅ [23/23] QMD quotes match <code>posts/ai.bib</code> (programmatic)</li>
 <li>✅ [23/23] QMD growth values match <code>posts/ai.bib</code> (programmatic)</li>
-<li>✅ [33/33] Abstracts present for all cited sources (programmatic)</li>
+<li>⚠️ [31/33] Abstracts present for all cited sources (programmatic)</li>
 <li>❌ [15/18] Bib quotes present in local fulltext version (programmatic)</li>
 <li>⏳ Growth values consistent with quoted text (LLM-assisted, test not implemented)</li>
 <li>⏳ Coverage check against recent sources (manual + web search, test not implemented)</li>
 </ul>
-Last checked: 2026-01-26
+Last checked: 2026-02-05
 </details>
 </div>
 </div>

docs/posts/2020-08-06-bayesian-interpretation-of-experiments.html

Lines changed: 51 additions & 3 deletions
@@ -7,7 +7,7 @@
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
 
 <meta name="author" content="Tom Cunningham">
-<meta name="dcterms.date" content="2026-01-29">
+<meta name="dcterms.date" content="2026-02-05">
 <meta name="description" content="Tom Cunningham blog">
 
 <title>The Bayesian Interpretation of Experiments | Tom Cunningham – Tom Cunningham</title>
@@ -139,7 +139,10 @@
 <link rel="stylesheet" href="../styles.css">
 <meta name="twitter:title" content="The Bayesian Interpretation of Experiments | Tom Cunningham">
 <meta name="twitter:description" content="Tom Cunningham blog">
-<meta name="twitter:card" content="summary">
+<meta name="twitter:image" content="tecunningham.github.io/posts/2020-08-06-bayesian-interpretation-of-experiments_files/figure-html/unnamed-chunk-1-1.png">
+<meta name="twitter:image-height" content="883">
+<meta name="twitter:image-width" content="1105">
+<meta name="twitter:card" content="summary_large_image">
 </head>
 
 <body class="nav-fixed quarto-light">
@@ -212,7 +215,7 @@ <h1 class="title">The Bayesian Interpretation of Experiments</h1>
 <div>
 <div class="quarto-title-meta-heading">Published</div>
 <div class="quarto-title-meta-contents">
-<p class="date">January 29, 2026</p>
+<p class="date">February 5, 2026</p>
 </div>
 </div>
 
@@ -254,10 +257,55 @@ <h1>Bayesian vs Classical Inference</h1>
 <li><strong>Bayesian.</strong> You set <code>posterior</code> to be a weighted average of <code>outcome</code> and of <code>prior</code>. The prior represents your best estimate of the effect before the experiment ran. The position of the posterior between those two points depends on the relative tightness of the two distributions: if the confidence intervals from your experiment are tight relative to the uncertainty in your prior then the posterior will be closer to the outcome; if instead the confidence intervals are wide relative to your prior then the posterior will end up closer to your prior.</li>
 </ol>
 <p>Graphically we can show the three distributions:</p>
+<div class="cell" data-layout-align="center">
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="2020-08-06-bayesian-interpretation-of-experiments_files/figure-html/unnamed-chunk-1-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="384"></p>
+</figure>
+</div>
+</div>
+</div>
 </section>
 <section id="classical-inference-as-constrained-prior" class="level1">
 <h1>Classical Inference as Constrained Prior</h1>
 <p>Suppose our prior was a spike at zero and otherwise uniform. This prior will cause Bayesian inference to behave similarly to classical inference: when the outcome is small then the posterior will be heavily influenced by the spike, and so will shrink to be very near zero. When the outcome becomes larger then at some point it will escape the gravity of the central spike, and we’ll have <code>posterior~=outcome</code>.</p>
+<div class="cell" data-layout-align="center">
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="2020-08-06-bayesian-interpretation-of-experiments_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="192"></p>
+</figure>
+</div>
+</div>
+</div>
+<div class="cell" data-layout-align="center">
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="2020-08-06-bayesian-interpretation-of-experiments_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="192"></p>
+</figure>
+</div>
+</div>
+</div>
+<div class="cell" data-layout-align="center">
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="2020-08-06-bayesian-interpretation-of-experiments_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="192"></p>
+</figure>
+</div>
+</div>
+</div>
+<div class="cell" data-layout-align="center">
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="2020-08-06-bayesian-interpretation-of-experiments_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" width="192"></p>
+</figure>
+</div>
+</div>
+</div>
 <p><strong>The point:</strong> the two graphs at the bottom of the figure are similar: i.e., using a “stat-sig rule” is a not-too-bad approximation of Bayesian inference when you have a fat-tailed prior (and in most cases your prior should be fat-tailed).</p>
 </section>
 <section id="applications" class="level1">
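The weighted-average behavior described in this hunk (the posterior sits between prior and outcome, with its position set by the relative tightness of the two distributions) can be sketched numerically. This is an illustrative snippet with made-up numbers, not code from the post:

```python
def posterior(prior_mean, prior_sd, outcome, outcome_se):
    """Normal-normal update: the posterior mean is a precision-weighted
    average of the prior mean and the experimental outcome."""
    w = (1 / outcome_se**2) / (1 / outcome_se**2 + 1 / prior_sd**2)
    mean = w * outcome + (1 - w) * prior_mean
    sd = (1 / outcome_se**2 + 1 / prior_sd**2) ** -0.5
    return mean, sd

# Tight confidence interval relative to the prior: posterior near the outcome.
tight, _ = posterior(prior_mean=0.0, prior_sd=2.0, outcome=1.0, outcome_se=0.5)
# Wide confidence interval relative to the prior: posterior near the prior.
wide, _ = posterior(prior_mean=0.0, prior_sd=0.5, outcome=1.0, outcome_se=2.0)
```

The two calls reproduce the two cases in the prose: the same outcome is pulled toward the outcome when the experiment is precise and toward the prior when it is noisy.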

docs/posts/2022-02-26-inference-with-experimental-data.html

Lines changed: 98 additions & 29 deletions
@@ -167,9 +167,9 @@
 <link rel="stylesheet" href="../styles.css">
 <meta name="twitter:title" content="Inference on Experiments | Tom Cunningham">
 <meta name="twitter:description" content="Tom Cunningham blog">
-<meta name="twitter:image" content="tecunningham.github.io/posts/images/2022-04-08-09-34-41.png">
-<meta name="twitter:image-height" content="1386">
-<meta name="twitter:image-width" content="1800">
+<meta name="twitter:image" content="tecunningham.github.io/posts/2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-1-1.png">
+<meta name="twitter:image-height" content="787">
+<meta name="twitter:image-width" content="989">
 <meta name="twitter:card" content="summary_large_image">
 </head>
 
@@ -386,10 +386,16 @@ <h1>Philosophical Remark: Confounding and Noise</h1>
 </section>
 <section id="setup" class="level1 page-columns page-full">
 <h1>Setup</h1>
+<div class="cell page-columns page-full">
 
-<div class="no-row-height column-margin column-container"><div class="cell">
-
-</div></div><p><strong>A single metric.</strong> We observe <span class="math inline">\(\hat{t}\)</span> which is the true treatment effect plus noise, <span class="math inline">\(\hat{t}=t+e\)</span>. Then for any given <span class="math inline">\(t\)</span> our posterior probability will be:</p>
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-1-1.png" class="img-fluid figure-img" width="288"></p>
+</figure>
+</div>
+</div></div></div>
+<p><strong>A single metric.</strong> We observe <span class="math inline">\(\hat{t}\)</span> which is the true treatment effect plus noise, <span class="math inline">\(\hat{t}=t+e\)</span>. Then for any given <span class="math inline">\(t\)</span> our posterior probability will be:</p>
 <p><span class="math display">\[\begin{aligned}
 \ut{f(t|\hat{t})}{posterior}=\ut{f(t)}{prior}\times \ut{f(\hat{t}|t)}{likelihood}
 \end{aligned}
@@ -402,33 +408,75 @@ <h1>Setup</h1>
 <div class="no-row-height column-margin column-container"><div id="fn1"><p><sup>1</sup>&nbsp;Throughout we assume that the treatments change only the mean, not the variance, of outcomes.</p></div></div><p>The variances and covariances of <span class="math inline">\(t_1\)</span> and <span class="math inline">\(t_2\)</span> represent the experimenter’s priors, and so are often difficult to quantify. If we are willing to identify priors with some set of previously-run experiments, i.e.&nbsp;an “empirical-Bayes” technique, we can recover them from the data using this relationship between covariance matrices:</p>
 <p><span class="math display">\[\Sigma_{\hat{t}}= \Sigma_t + \frac{2}{N}\Sigma_u,\]</span></p>
 <p>where <span class="math inline">\(\Sigma_u\)</span> is the unit-level covariance matrix. The following graph illustrates a case with negatively-correlated treatment effects, positively correlated noise, and uncorrelated outcomes.</p>
+<div class="cell page-columns page-full">
 
-<div class="no-row-height column-margin column-container"><div class="cell">
-
-</div><div class="cell">
-
-</div><div class="cell">
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="192"></p>
+</figure>
+</div>
+</div></div></div>
+<div class="cell page-columns page-full">
 
-</div></div>
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="192"></p>
+</figure>
+</div>
+</div></div></div>
+<div class="cell page-columns page-full">
 
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="192"></p>
+</figure>
+</div>
+</div></div></div>
 <p><strong>Many metrics.</strong> If we assume that everything has a normal distribution, we have a crisp expression for how the posterior expectations depend on the observed outcomes. For an arbitrary number of outcomes we can write this as:</p>
 <p><span class="math display">\[\mathbb{E}[t|y]=\mu_t+\Sigma_t(\Sigma_t+\frac{1}{N}\Sigma_u)^{-1}(y-\mu_t).\]</span></p>
 </section>
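The many-metrics formula in this hunk, \(\mathbb{E}[t|y]=\mu_t+\Sigma_t(\Sigma_t+\frac{1}{N}\Sigma_u)^{-1}(y-\mu_t)\), translates directly into NumPy. This is an illustrative sketch with made-up matrices, not code from the post:

```python
import numpy as np

def posterior_mean(y, mu_t, Sigma_t, Sigma_u, N):
    # E[t|y] = mu_t + Sigma_t (Sigma_t + Sigma_u / N)^{-1} (y - mu_t)
    gain = Sigma_t @ np.linalg.inv(Sigma_t + Sigma_u / N)
    return mu_t + gain @ (y - mu_t)

# When the prior covariance equals the noise covariance, every observed
# outcome is shrunk halfway toward the prior mean.
y = np.array([1.0, -0.5])
est = posterior_mean(y, mu_t=np.zeros(2), Sigma_t=np.eye(2), Sigma_u=np.eye(2), N=1)
```

With correlated metrics (off-diagonal entries in `Sigma_t` or `Sigma_u`) the gain matrix mixes the metrics, so one metric's outcome moves the posterior for the other.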
 <section id="univariate-shrinkage-estimators" class="level1 page-columns page-full">
 <h1>Univariate Shrinkage Estimators</h1>
 <p>See survey of methods in <span class="citation" data-cites="montiel2019">Azevedo et al. (<a href="#ref-montiel2019" role="doc-biblioref">2019</a>)</span>. They also document the fat-tailed distribution of effect-sizes.</p>
+<div class="cell page-columns page-full">
 
-<div class="no-row-height column-margin column-container"><div class="cell">
-
-</div><div class="cell">
-
-</div><div class="cell">
-
-</div><div class="cell">
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="288"></p>
+</figure>
+</div>
+</div></div></div>
+<div class="cell page-columns page-full">
 
-</div></div>
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="288"></p>
+</figure>
+</div>
+</div></div></div>
+<div class="cell page-columns page-full">
 
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="288"></p>
+</figure>
+</div>
+</div></div></div>
+<div class="cell page-columns page-full">
 
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div>
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="288"></p>
+</figure>
+</div>
+</div></div></div>
 <ol type="1">
 <li><p><strong>James-Stein.</strong> With a Normal prior, <span class="math inline">\(t\sim N(\mu_t,\sigma_t^2)\)</span>, we get: <span class="math display">\[\mathbb{E}[t|y]=\mu_t+\utt{\frac{\sigma_t^2}{\sigma_t^2+\sigma_e^2}}{shrinkage}{factor}(y-\mu_t).\]</span> We can also use the share of experiments that are significant, <span class="math inline">\(p\)</span>: <span class="math display">\[\begin{aligned}
 E[t|y]=&amp; [ 1-(\frac{1}{1.96}\Phi^{-1}(\frac{p}{2}))^2 ]y.
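The James-Stein shrinkage factor in this hunk is a one-liner in code. This is a minimal sketch assuming a Normal prior with known \(\sigma_t\) and \(\sigma_e\); the function name is mine, not the post's:

```python
def shrink(y, mu_t, sigma_t, sigma_e):
    # Shrink the raw estimate y toward the prior mean mu_t by the
    # factor sigma_t^2 / (sigma_t^2 + sigma_e^2).
    factor = sigma_t**2 / (sigma_t**2 + sigma_e**2)
    return mu_t + factor * (y - mu_t)

# Equal prior and noise variance: the estimate is shrunk halfway to the prior.
halfway = shrink(1.0, 0.0, 1.0, 1.0)
```

As the noise variance shrinks relative to the prior variance the factor approaches 1 and the raw estimate is kept almost unchanged.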
@@ -727,22 +775,43 @@ <h1>Launch Rules</h1>
 </ol>
 <div class="no-row-height column-margin column-container"><div id="fn7"><p><sup>7</sup>&nbsp;Precisely: if the distribution of effect-sizes is Normal with zero mean then having a statistically-significant effect in 50% of your experiments implies a shrinkage rate of just 10%.</p></div></div><section id="comparing-launch-rules" class="level2 page-columns page-full">
 <h2 class="anchored" data-anchor-id="comparing-launch-rules">Comparing Launch Rules</h2>
+<div class="cell page-columns page-full">
 
-<div class="no-row-height column-margin column-container"><div class="cell">
-
-</div></div><p>Suppose we have two metrics, 1 and 2, and we care about them equally much: <span class="math display">\[U(t_1,t_2)=t_1+t_2.\]</span></p>
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-10-1.png" class="img-fluid figure-img" width="288"></p>
+<figcaption>Ship if either stat-sig positive and neither stat-sig negative.</figcaption>
+</figure>
+</div>
+</div></div></div>
+<p>Suppose we have two metrics, 1 and 2, and we care about them equally much: <span class="math display">\[U(t_1,t_2)=t_1+t_2.\]</span></p>
 <p>But we only observe noisy estimates <span class="math inline">\(\hat{t}_1,\hat{t}_2\)</span>.</p>
 <p>A stat-sig shipping rule (either stat-sig positive, neither stat-sig negative) has some strange consequences: it will recommend shipping things even with <em>negative</em> face-value utility (<span class="math inline">\(U(\hat{t}_1,\hat{t}_2)&lt;0\)</span>), when there’s a negative outcome on the relatively noisier metric. This will still hold if we evaluate utility with shrunk estimates, when there’s equal proportional shrinkage on the two metrics, but if there’s greater shrinkage on the noisier metric it will not hold.</p>
 <p>Kohavi, Tang &amp; Xu (2020) <em>Trustworthy Online Controlled Experiments</em> recommends a stat-sig shipping rule (p105): “(1) If no metrics are positive-significant then do not ship; (2) if some are positive-significant and none are negative-significant then ship; (3) if some are positive-significant and some are negative-significant then “decide based on the tradeoffs.”” I think this is bad advice: the statistical significance of an estimate is only loosely related to the informativeness of that estimate. The decision should be made based on your best estimates of the impact on the overall goal metrics.</p>
 <p><br><br><br><br></p>
+<div class="cell page-columns page-full">
 
-<div class="no-row-height column-margin column-container"><div class="cell">
-
-</div></div><p><br><br><br><br></p>
-
-<div class="no-row-height column-margin column-container"><div class="cell">
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-11-1.png" class="img-fluid figure-img" width="288"></p>
+<figcaption>Ship if sum is positive</figcaption>
+</figure>
+</div>
+</div></div></div>
+<p><br><br><br><br></p>
+<div class="cell page-columns page-full">
 
-</div></div></section>
+<div class="no-row-height column-margin column-container"><div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="2022-02-26-inference-with-experimental-data_files/figure-html/unnamed-chunk-12-1.png" class="img-fluid figure-img" width="288"></p>
+<figcaption>Ship if sum stat-sig positive. I have drawn it assuming that <span class="math inline">\(cov(\hat{t}_1,\hat{t}_2)=0\)</span>. With a positive covariance the threshold would be higher.</figcaption>
+</figure>
+</div>
+</div></div></div>
+</section>
 </section>
 <section id="network-and-dynamic-effects" class="level1">
 <h1>Network and Dynamic Effects</h1>
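The contrast this hunk draws between the stat-sig shipping rule and the face-value utility rule can be sketched as follows. The helper names and the numbers are hypothetical, chosen only to reproduce the "precise small win plus noisy large loss" case from the prose:

```python
def ship_statsig(t1, se1, t2, se2, z=1.96):
    """Stat-sig rule: ship if either metric is stat-sig positive
    and neither is stat-sig negative."""
    z1, z2 = t1 / se1, t2 / se2
    return (z1 > z or z2 > z) and not (z1 < -z or z2 < -z)

def ship_face_value(t1, t2):
    """Rule under U(t1, t2) = t1 + t2: ship when the face-value
    sum of the estimates is positive."""
    return t1 + t2 > 0

# A precisely-measured small win plus a noisy large loss: the stat-sig rule
# recommends shipping even though face-value utility is negative.
ship_a = ship_statsig(1.0, 0.1, -2.0, 2.0)  # metric 2 is not stat-sig negative
ship_b = ship_face_value(1.0, -2.0)
```

Here metric 1 has z-score 10 (stat-sig positive) and metric 2 has z-score −1 (not stat-sig negative), so the stat-sig rule says ship while the sum of estimates is −1.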
