2,813 changes: 1,580 additions & 1,233 deletions examples/causal_inference/difference_in_differences.ipynb

Large diffs are not rendered by default.

11 changes: 11 additions & 0 deletions examples/causal_inference/difference_in_differences.myst.md
@@ -49,6 +49,16 @@ Otherwise there are likely better suited approaches you could use.

Note that our desire to estimate the causal impact of a treatment involves [counterfactual thinking](https://en.wikipedia.org/wiki/Counterfactual_thinking). This is because we are asking "What would the post-treatment outcome of the treatment group be _if_ treatment had not been administered?" but we can never observe this.

:::{admonition} A note on "counterfactual" terminology
:class: note

This notebook uses "counterfactual" in the **potential outcomes** (Rubin) sense {cite:p}`rubin1974estimating, imbens2015causal`. The parallel trends assumption lets us use the control group's trajectory as a proxy for the treated group's unobserved potential outcome $Y(0)$ — what *would have happened* without treatment. This is standard counterfactual reasoning in the quasi-experimental literature.

This differs from Pearl's **Level 3** (unit-level) counterfactuals {cite:p}`pearl2009causality`, which require *abduction* — inferring unit-specific exogenous variables from observed data and then reasoning about what would have happened to *that particular unit* under a different action. The difference-in-differences approach operates at Level 2 (interventional) in Pearl's causal hierarchy, making "counterfactual" in the Rubin sense the appropriate term.

For a detailed discussion of the distinction between interventional (L2) and counterfactual (L3) reasoning, see the {ref}`interventional_what_if_do_operator` notebook.
:::
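
As a minimal sketch of the parallel trends logic described in the note above, the snippet below builds the treated group's unobserved $Y(0)$ from the control group's change over time. The four group means are hypothetical numbers chosen for illustration, not values from this notebook's dataset, and the calculation is a plain 2x2 point estimate rather than the Bayesian model used later.

```python
# Hypothetical group means (illustrative only, not the notebook's data)
control_pre, control_post = 1.0, 1.5
treated_pre, treated_post = 2.0, 3.0

# Parallel trends: without treatment, the treated group is assumed to
# change over time by the same amount as the control group.
control_change = control_post - control_pre
treated_counterfactual_post = treated_pre + control_change  # proxy for Y(0)

# Difference-in-differences estimate: observed minus counterfactual
did_estimate = treated_post - treated_counterfactual_post
print(did_estimate)
```

The same number results from subtracting the control group's change from the treated group's change, which is where the name "difference in differences" comes from.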

+++

### Example
@@ -438,6 +448,7 @@ Of course, when using the difference in differences approach for real applicatio
## Authors
- Authored by [Benjamin T. Vincent](https://github.com/drbenvincent) in Sept 2022 ([#424](https://github.com/pymc-devs/pymc-examples/pull/424)).
- Updated by Benjamin T. Vincent in February 2023 to run on PyMC v5
- Updated by [Benjamin T. Vincent](https://github.com/drbenvincent) in March 2026

+++

3,138 changes: 1,709 additions & 1,429 deletions examples/causal_inference/excess_deaths.ipynb

Large diffs are not rendered by default.

13 changes: 12 additions & 1 deletion examples/causal_inference/excess_deaths.myst.md
@@ -31,6 +31,16 @@ $$

Making a claim about excess deaths requires causal/counterfactual reasoning. While the reported number of deaths is nothing but a (maybe noisy and/or lagged) measure of a real observable fact in the world, _expected deaths_ is unmeasurable because these are never realised in our timeline. That is, the expected deaths is a counterfactual thought experiment where we can ask "What would/will happen if?"

:::{admonition} A note on "counterfactual" terminology
:class: note

This notebook uses "counterfactual" in the **potential outcomes** (Rubin) sense {cite:p}`rubin1974estimating, imbens2015causal`. The counterfactual here is a *forecast* from a model trained on pre-COVID data, predicting expected deaths *if nothing had changed* — the unobserved potential outcome $Y(0)$. This is the same group-level counterfactual prediction used in {ref}`interrupted time series analysis <interrupted_time_series>`.

This differs from Pearl's **Level 3** (unit-level) counterfactuals {cite:p}`pearl2009causality`, which require *abduction* — inferring unit-specific exogenous variables from observed data and then reasoning about what would have happened to *that particular unit* under a different action. The forecasting approach used here operates at Level 2 (interventional) in Pearl's causal hierarchy, making "counterfactual" in the Rubin sense the appropriate term.

For a detailed discussion of the distinction between interventional (L2) and counterfactual (L3) reasoning, see the {ref}`interventional_what_if_do_operator` notebook.
:::

+++

## Overall strategy
@@ -42,7 +52,7 @@ How do we go about this, practically? We will follow this strategy:
2. Split into `pre` and `post` covid datasets. This is an important step. We want to come up with a model based upon what we know _before_ COVID-19 so that we can construct our counterfactual predictions based on data before COVID-19 had any impact.
3. Estimate model parameters based on the `pre` dataset.
4. [Retrodict](https://en.wikipedia.org/wiki/Retrodiction) the number of deaths expected by the model in the pre COVID-19 period. This is not a counterfactual, but acts to tell us how capable the model is at accounting for the already observed data.
5. Counterfactual inference - we use our model to construct a counterfactual forecast. What would we expect to see in the future if there was no COVID-19? This can be achieved by using the famous do-operator. Practically, we do this with posterior prediction on out-of-sample data.
5. Counterfactual inference: we use our model to construct a counterfactual forecast. What would we expect to see in the future if there was no COVID-19? Practically, we do this with posterior prediction on out-of-sample data.
6. Calculate the excess deaths by comparing the reported deaths with our counterfactual (expected number of deaths).
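
The strategy above can be sketched end to end on synthetic data. This is only an illustration under stated assumptions: the series, the intervention index, and the size of the shock are all made up, and an ordinary least-squares trend from `np.polyfit` stands in for the notebook's Bayesian model and its posterior predictions.

```python
import numpy as np

# Synthetic monthly series (hypothetical, not the notebook's data):
# a linear trend with a constant shock added after the intervention.
t = np.arange(48, dtype=float)
deaths = 100.0 + 0.5 * t
intervention = 36  # assumed index of the first affected month
deaths[intervention:] += 20.0

# Step 2: split into pre and post periods
pre = t < intervention
post = ~pre

# Step 3: estimate parameters from the pre period only
# (least squares is a stand-in for the notebook's PyMC model)
slope, intercept = np.polyfit(t[pre], deaths[pre], deg=1)

# Step 4: retrodict the pre period to check the fit
retrodiction = intercept + slope * t[pre]

# Step 5: counterfactual forecast for the post period
counterfactual = intercept + slope * t[post]

# Step 6: excess deaths = reported minus expected
excess = deaths[post] - counterfactual
cumulative_excess = excess.sum()
```

Because the model never sees the post period, the forecast in step 5 carries the pre-period trend forward, and the gap in step 6 recovers the simulated shock.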

+++
@@ -488,6 +498,7 @@ The bad news of course, is that as of the last data point (May 2022) the number
## Authors
- Authored by [Benjamin T. Vincent](https://github.com/drbenvincent) in July 2022.
- Updated by Benjamin T. Vincent in February 2023 to run on PyMC v5
- Updated by [Benjamin T. Vincent](https://github.com/drbenvincent) in March 2026

+++

2,457 changes: 1,400 additions & 1,057 deletions examples/causal_inference/interrupted_time_series.ipynb

Large diffs are not rendered by default.

11 changes: 11 additions & 0 deletions examples/causal_inference/interrupted_time_series.myst.md
@@ -27,6 +27,16 @@ For example, if a change to a website was made and you want to know the causal i

However, if the website change was rolled out to _all_ users of the website then you do not have a control group. In this case you do not have a direct measurement of the counterfactual, what _would have happened if_ the website change was not made. In this case, if you have data over a 'good' number of time points, then you may be able to make use of the interrupted time series approach.

:::{admonition} A note on "counterfactual" terminology
:class: note

This notebook uses "counterfactual" in the **potential outcomes** (Rubin) sense {cite:p}`rubin1974estimating, imbens2015causal` — the counterfactual here is a *forecast*. We extrapolate pre-intervention trends to predict the unobserved potential outcome $Y(0)$: what *would have happened* without the intervention. This is a group-level counterfactual prediction, standard in the quasi-experimental literature.

This differs from Pearl's **Level 3** (unit-level) counterfactuals {cite:p}`pearl2009causality`, which require *abduction* — inferring unit-specific exogenous variables from observed data and then reasoning about what would have happened to *that particular unit* under a different action. The forecasting approach used here operates at Level 2 (interventional) in Pearl's causal hierarchy, making "counterfactual" in the Rubin sense the appropriate term.

For a detailed discussion of the distinction between interventional (L2) and counterfactual (L3) reasoning, see the {ref}`interventional_what_if_do_operator` notebook.
:::
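
One way to write down the quantity described in the note above is shown here. The symbols are assumptions introduced for illustration: $T_0$ is the intervention time, $Y_t$ is the observed post-intervention outcome, and $\hat{Y}_t(0)$ is the forecast of the unobserved potential outcome from a model fitted to pre-intervention data only.

$$
\hat{\tau}_t = Y_t - \hat{Y}_t(0), \qquad t \ge T_0
$$

Summing $\hat{\tau}_t$ over the post-intervention period gives a cumulative impact, again at the group level rather than for any particular unit.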

Interested readers are directed to the excellent textbook [The Effect](https://theeffectbook.net/) {cite:p}`huntington2021effect`. Chapter 17 covers 'event studies' which the author prefers to the interrupted time series terminology.

+++
@@ -347,6 +357,7 @@ There are of course many ways that the interrupted time series approach could be
## Authors
- Authored by [Benjamin T. Vincent](https://github.com/drbenvincent) in October 2022.
- Updated by Benjamin T. Vincent in February 2023 to run on PyMC v5
- Updated by [Benjamin T. Vincent](https://github.com/drbenvincent) in March 2026

+++
