Commit d56b781

docs: clarify IV example labels (#499) (#907)
The third panel showed the posterior of chol_cov_corr[0, 1], a model-level residual correlation, but was labelled 'Correlation between Outcome and Treatment', which a reader will read as raw Y vs raw T. Rename the title and axis to 'Modelled correlation between outcome Y and instrumented treatment X-hat' / 'Posterior correlation', add a footnote naming the parameter, and rewrite the section intro so the distinction with the raw Y ~ X panel is explicit. Also fix the OlS -> OLS typo in the legend.

Co-authored-by: Benjamin T. Vincent <inferencelab@gmail.com>
1 parent 390129f commit d56b781

1 file changed

Lines changed: 17 additions & 7 deletions

docs/source/notebooks/iv_pymc.ipynb

@@ -757,9 +757,9 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "### Multivariate Outcomes and Measures of Correlation\n",
+ "### Multivariate outcomes and modelled correlation\n",
  "\n",
- "As we stated above, one of the benefits of the Bayesian approach is that we directly measure the bivariate relationship between the instrument and the treatment. We can see (in two dimensions) a representation of how the difference in the estimated treatment coefficients skews the expected outcomes. "
+ "One benefit of the Bayesian formulation is that we directly estimate the correlation between the outcome and the modelled (instrumented) treatment. A non-zero posterior for this correlation is the formal signal of endogeneity that motivates IV in the first place: unobserved factors driving the treatment also drive the outcome, so naive OLS is biased. The figure below puts that modelled correlation next to the OLS vs IV fits taken on the raw treatment scale, which is why the signs in the two panels need not agree: they describe different objects (raw `X` vs modelled `X̂`).\n"
  ]
  },
  {
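The rewritten intro in this hunk claims that a non-zero modelled correlation signals endogeneity and that naive OLS is then biased. A minimal standard-library sketch of that claim, on simulated data rather than the notebook's dataset, might look like the following. Every name and value here (`simulate`, `beta`, `rho`, `gamma`, the seed) is an assumption made for illustration, and the IV estimate is the simple single-instrument ratio estimator, not the Bayesian model the notebook fits.

```python
import math
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def simulate(n=20_000, beta=1.0, rho=0.6, gamma=0.8, seed=0):
    """Simulate an endogenous treatment; return (OLS slope, IV slope, residual correlation)."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)  # instrument: affects x, not y directly
        ex = rng.gauss(0, 1)  # error in the treatment equation
        # Outcome error correlated with ex (correlation rho): the endogeneity.
        ey = rho * ex + math.sqrt(1 - rho**2) * rng.gauss(0, 1)
        xi = gamma * zi + ex
        z.append(zi)
        x.append(xi)
        y.append(beta * xi + ey)

    ols = cov(x, y) / cov(x, x)  # naive regression of y on raw x
    iv = cov(z, y) / cov(z, x)   # ratio (Wald) IV estimator

    # Residuals of the two equations, then their correlation.
    first_stage = cov(z, x) / cov(z, z)
    rx = [xi - first_stage * zi for xi, zi in zip(x, z)]
    ry = [yi - iv * xi for yi, xi in zip(y, x)]
    resid_corr = cov(rx, ry) / math.sqrt(cov(rx, rx) * cov(ry, ry))
    return ols, iv, resid_corr


ols, iv, resid_corr = simulate()
```

With `rho > 0` in this setup, the OLS slope should land above `beta`, the IV slope close to `beta`, and the residual correlation near `rho`. The slope comparison is on the raw `x` scale while the residual correlation describes the error terms, which is the distinction the relabelled panels are meant to make visible.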
@@ -921,8 +921,8 @@
  "n_samples = min(500, len(uncertainty))\n",
  "uncertainty.sample(n_samples).T.plot(legend=False, color=\"orange\", alpha=0.4, ax=axs[1])\n",
  "axs[1].plot(x, ols, color=\"black\", label=\"OLS fit\")\n",
- "axs[1].set_title(\"OLS versus Instrumental Regression Fits\", fontsize=20)\n",
- "axs[1].legend(custom_lines, [\"IV fits\", \"OlS fit\"])\n",
+ "axs[1].set_title(\"OLS vs IV regression fits (Y on raw X)\", fontsize=20)\n",
+ "axs[1].legend(custom_lines, [\"IV fits\", \"OLS fit\"])\n",
  "axs[1].set_xlabel(\"Treatment Scale/ Risk\")\n",
  "axs[1].set_ylabel(\"Outcome Scale/ Log GDP\")\n",
  "\n",
@@ -931,9 +931,19 @@
  ")\n",
  "\n",
  "corr = az.extract(data=iv.model.idata, var_names=[\"chol_cov_corr\"])[0, 1, :]\n",
- "axs[2].hist(corr, bins=30, ec=\"black\", color=\"C2\", label=\"correlation\")\n",
- "axs[2].set_xlabel(\"Correlation Measure\")\n",
- "axs[2].set_title(\"Correlation between \\n Outcome and Treatment\", fontsize=20);"
+ "axs[2].hist(corr, bins=30, ec=\"black\", color=\"C2\", label=\"posterior\")\n",
+ "axs[2].set_xlabel(\"Posterior correlation\")\n",
+ "axs[2].set_title(\n",
+ "    \"Modelled correlation between \\n outcome Y and instrumented treatment X̂\",\n",
+ "    fontsize=20,\n",
+ ");"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The right panel is the posterior of `chol_cov_corr[0, 1]`, the residual correlation in the bivariate normal likelihood that links the outcome and treatment equations.\n"
  ]
  },
  {
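For readers unsure what the `[0, 1]` entry of `chol_cov_corr` refers to: assuming the parameter is the correlation matrix implied by a Cholesky-factorised 2×2 covariance (the name suggests this, but the model definition is not part of this diff), the off-diagonal entry can be recovered from a single draw of the factor as sketched below. The helper `corr_from_cholesky` and the factor values are hypothetical.

```python
import math


def corr_from_cholesky(chol):
    """Return the correlation matrix implied by a lower-triangular Cholesky factor."""
    n = len(chol)
    # Covariance = chol @ chol^T
    cov = [
        [sum(chol[i][k] * chol[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
    # Standard deviations are the square roots of the diagonal.
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


# Hypothetical Cholesky factor for one posterior draw (values made up for illustration).
chol = [
    [2.0, 0.0],
    [-0.9, 1.2],
]
corr = corr_from_cholesky(chol)
residual_correlation = corr[0][1]  # same quantity as corr[1][0]
```

For these made-up values the implied covariance is [[4, -1.8], [-1.8, 2.25]], so the off-diagonal correlation is -0.6. The histogram in the right panel would then be the distribution of this single number across posterior draws.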
