Comparison notebook: Why flexibility matters? #703
Add a new Jupyter notebook to docs demonstrating a comparison between Interrupted Time Series (ITS) approaches and Synthetic Control methods. The notebook loads CausalPy's built-in `sc` dataset, sets up plotting and RNG seed, and walks through applying CausalImpact and CausalPy to the same data (treated unit 'actual', controls a–g, treatment at time 73) to highlight when synthetic control is the more appropriate method.
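The contrast the notebook draws can be sketched without either library. The example below is a minimal illustration on simulated data, not the notebook's code: the series, the shock at time 80, the `true_effect` of -2.0, and the unconstrained least-squares weights (standing in for a proper synthetic-control fit) are all assumptions made for this sketch. Only the layout (one treated unit, seven controls, treatment at time 73) mirrors the `sc` dataset described above.

```python
import numpy as np

rng = np.random.default_rng(42)

n_time, treatment_time, n_controls = 100, 73, 7
true_effect = -2.0  # hypothetical effect, chosen for this sketch only
t = np.arange(n_time)
post = t >= treatment_time

# Shared driver: a linear trend plus a shock at t=80 that hits every unit,
# treated and controls alike (assumed, to show what ITS cannot see).
common = 0.05 * t + 3.0 * (t >= 80)

# Seven control series that load on the shared driver.
intercepts = rng.uniform(0.0, 1.0, n_controls)
loadings = rng.uniform(0.5, 1.5, n_controls)
controls = (
    intercepts
    + np.outer(common, loadings)
    + rng.normal(0.0, 0.2, (n_time, n_controls))
)

# The treated unit's untreated path is a convex combination of the controls.
weights = rng.dirichlet(np.ones(n_controls))
counterfactual = controls @ weights + rng.normal(0.0, 0.2, n_time)
treated = counterfactual + true_effect * post

# ITS-style estimate: extrapolate the treated unit's own pre-period trend.
slope, intercept = np.polyfit(t[~post], treated[~post], deg=1)
its_prediction = intercept + slope * t[post]
its_effect = float(np.mean(treated[post] - its_prediction))

# Synthetic-control-style estimate: regress treated on controls pre-treatment,
# then predict the post-period counterfactual from the observed controls.
design = np.column_stack([np.ones(n_time), controls])
coef, *_ = np.linalg.lstsq(design[~post], treated[~post], rcond=None)
sc_prediction = design[post] @ coef
sc_effect = float(np.mean(treated[post] - sc_prediction))

print(f"true effect:        {true_effect:.2f}")
print(f"ITS estimate:       {its_effect:.2f}")
print(f"control-based est.: {sc_effect:.2f}")
```

Because the shock at time 80 also moves the controls, the control-based prediction absorbs it, while the trend extrapolation attributes it to the treatment. That is the sense in which the donor pool carries information the treated unit's own history does not.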
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.
PR Summary: Low Risk. Written by Cursor Bugbot for commit a1d4a88.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage diff:

| Metric | main | #703 |
|---|---|---|
| Coverage | 95.32% | 95.32% |
| Files | 96 | 96 |
| Lines | 15221 | 15221 |
| Branches | 878 | 878 |
| Hits | 14510 | 14510 |
| Misses | 502 | 502 |
| Partials | 209 | 209 |
Thanks Carlos, this is a useful notebook, and the core demonstration (same data, three methods, one ground truth) is exactly the kind of side-by-side comparison that teaches people what to reach for. The numbers work out, the data is well chosen, and the final three-panel summary is a strong close.

The comments below are a mix of pedagogy and figure-quality notes. I've tried to prioritise: the first group changes how the notebook feels as teaching material, the second tightens technical correctness and Sphinx/MyST rendering, and the third is polish.

1. Reframe the narrative: "match method to data" rather than "tool X is bad"

The current framing reads like a head-to-head between libraries ("CausalImpact gets it wrong", "CausalPy wins"). The more durable lesson is the one already implicit in the notebook: when you have a donor pool of unaffected controls, modelling cross-unit structure beats extrapolating the treated unit's own history. CausalImpact isn't "wrong" here so much as being asked to answer a question with one-seventh of the available information. Reframing this way lands better pedagogically (acknowledge the surprising result, reassure the reader, then connect it to the theory that predicts it), and it's also more defensible, since CausalImpact-with-controls is a legitimate CITS analysis, just not the one we run here. Concrete suggestions:

2. Add a reflection prompt at the end

Most of CausalPy's tutorial notebooks close with a takeaway that ties back to the reader's own work. The current close is a marketing bullet list for CausalPy. Suggest adding, after "What We've Learned", something like:
3. Fix the
Reframes the notebook from a library head-to-head to a method-selection
walkthrough, fixes several correctness and rendering issues flagged in
review, and aligns with CausalPy's glossary/citation conventions.
Correctness and rendering (blocking):
- Cell 18: wrap "The key difference" in :::{important} so the orphan
closing fence no longer renders literally in the Sphinx build.
- Cell 29: compute true 95% HDIs via az.hdi instead of quantile(0.025/0.975)
so the label matches the computation; stack panels 3x1 at figsize=(10, 10)
with sharey=True so magnitudes compare directly; add zorder=10 on the
true-effect reference line; use named semantic colour constants.
- Cells 14, 23: drop non-portable nutpie/jax/FAST_COMPILE sampler kwargs
and progressbar=True; use default sampler with draws=500, tune=500,
target_accept=0.95 for a docs-friendly notebook.
- Cells 9, 13, 14, 23: remove Quarto `# | warning: false` pragmas that
don't do anything in the nbsphinx build.
- Cells 29, 31: relabel "CausalPy (BSTS State Space)" as "CausalPy ITS
(structural state space)" to stop conflating with CausalImpact's BSTS.
- index.md: move the notebook out of "Comparative Interrupted Time Series"
into "Synthetic Control" since SC is the method the notebook endorses.
Pedagogy:
- Retitle to "Choosing the right counterfactual: ITS vs. synthetic control
on the same data" and replace the strawman opening with an examples-first
framing.
- Soften cells 12 and 16 so the CausalImpact and CausalPy-ITS results are
presented as the expected failure mode of ITS when cross-sectional
information is unused, rather than as gotchas.
- In cell 32, disentangle "method gap" from "library gap" and add a
reflection prompt with three diagnostic questions and a worked example.
Polish:
- Cell 2: define COLOR_CI/COLOR_ITS/COLOR_SC tied to method family.
- Cell 6: label the control lines collectively as "Controls (a–g)".
- Cell 7: note that -1.85 is the simulation's ATT over this post-period,
not a fundamental property.
- Cell 12: caveat the arbitrary monthly calendar and seasonal_length=12.
- Cell 28: one-sentence technical caption before the three-panel figure.
Glossary and citations:
- Link Interrupted Time Series, Comparative ITS, Synthetic Control,
Counterfactual, and ATT on first mention via {term} roles.
- Cite Brodersen et al. (2015) for CausalImpact/BSTS (new bib entry),
Abadie, Diamond & Hainmueller (2010) for synthetic control, and
Lopez Bernal et al. (2018) for CITS, using {cite:p} roles — no
per-notebook bibliography block (PR #834 central references).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Regenerates outputs (MAE table, three-panel figure, sampling logs) under the portable default sampler so the rendered docs page matches the updated code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@drbenvincent I've applied the recommendations. Issues that will be resolved after merge: #826
…act_v_causalpy Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # docs/source/notebooks/index.md
Move warnings.filterwarnings below imports and remove the unused seaborn import so the notebook passes the post-merge ruff (E402/F401) checks pulled in from main. Co-authored-by: Cursor <cursoragent@cursor.com>
daimon-pymclabs
left a comment
Automated review round (requested via Daimon) — checking the current state of the branch against the earlier review checklist.
Carlos, you've worked through the large majority of the prior feedback and the notebook reads much better for it. Confirmed as addressed: the narrative reframe to "match method to data" (cells 0, 12, 16, 32), the reflection prompt at the end, moving the toctree entry under Synthetic Control in index.md, the orphan ::: now a proper :::{important} callout (cell 18), BSTS-vs-structural-state-space relabelling (cells 12/17/29/31), the arbitrary-calendar caveat (cells 12/13), the ground-truth/ATT framing (cells 7, 30), removal of the # | warning: false Quarto pragmas, the three-panel figure (now stacked 3×1 at width 10, sharey=True, zorder=10 on the true-effect line, named COLOR_* constants), the Controls (a–g) legend label, glossary {term} links on first mention, inline {cite:p} citations with no per-notebook bibliography block. Nicely done.
Two items from the "should fix before merge" list still look open:
1. Sampler portability + likely kwarg typo (cell 14) — please address before merge.
sampler_kwargs = {
...
"nut_sampler": "nutpie",
"nuts_sampler_kwargs": {"backend": "jax", "gradient_backend": "jax"},
}
ssts_model = cp.pymc_models.StateSpaceTimeSeries(..., mode="FAST_COMPILE")

- `docs/source/conf.py` sets `nb_execution_mode = "off"`, so the docs build will not execute this and won't catch a runtime failure, but a reader running the notebook locally will. `nutpie` + the `jax`/`gradient_backend` path + `FAST_COMPILE` aren't in the default install and will error for most readers. Consistent with AGENTS.md asking docs/tests to minimise MCMC load, the default sampler with modest `sample_kwargs` (e.g. `draws=500, tune=500, target_accept=0.95`) would be safer here.
- Separately, the key is spelled `nut_sampler`, but the PyMC `pm.sample` argument is `nuts_sampler`, and the sibling notebook `iv_vs_priors.ipynb` uses `"nuts_sampler": "numpyro"`. As written, `nut_sampler` is silently ignored rather than selecting nutpie. Worth confirming/fixing whichever sampler you settle on.
2. "95% HDI" label on the CausalImpact panel (cells 28/29). The ITS and SC bands now compute a genuine az.hdi(...) — good. But the CausalImpact band uses point_effects_lower/point_effects_upper from tfp-causalimpact, which are equal-tailed quantile intervals, while the shared legend and the cell-28 caption both call all three "95% HDI". Either relabel CausalImpact's band (e.g. "95% interval") or note that only the CausalPy bands are HDIs, so the three-panel legend isn't mislabelling one of them.
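To make the labelling point concrete, here is a minimal sketch of how an equal-tailed interval and a highest-density interval can differ on the same skewed draws. The gamma samples are made up for illustration, and `narrowest_interval` is a hand-rolled stand-in for `az.hdi` (assumed here so the example needs only NumPy); it is not what the notebook or tfp-causalimpact runs.

```python
import numpy as np


def equal_tailed_interval(samples, prob=0.95):
    """Interval that cuts (1 - prob) / 2 of the mass from each tail."""
    tail = (1.0 - prob) / 2.0
    lo, hi = np.quantile(samples, [tail, 1.0 - tail])
    return float(lo), float(hi)


def narrowest_interval(samples, prob=0.95):
    """Shortest window of sorted samples spanning a `prob` share of the draws.

    A simple unimodal-HDI approximation; az.hdi is the library equivalent.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.floor(prob * n))       # index gap covered by each candidate window
    widths = x[k:] - x[: n - k]       # width of every candidate window
    i = int(np.argmin(widths))        # start of the narrowest one
    return float(x[i]), float(x[i + k])


# Right-skewed draws (hypothetical posterior samples for illustration).
rng = np.random.default_rng(0)
samples = rng.gamma(shape=2.0, scale=1.0, size=20_000)

eti_lo, eti_hi = equal_tailed_interval(samples)
hdi_lo, hdi_hi = narrowest_interval(samples)

print(f"equal-tailed 95%: [{eti_lo:.2f}, {eti_hi:.2f}]")
print(f"narrowest 95%:    [{hdi_lo:.2f}, {hdi_hi:.2f}]")
```

For a symmetric posterior the two intervals nearly coincide, so the mislabel may be invisible in the figure. The difference shows up when the distribution is skewed, which is why the legend should name each band for what it is.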
Everything else from the prior round looks resolved. Once the sampler kwargs in cell 14 are sorted (the portability concern is the main blocker), this is in good shape to merge.
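One cheap guard against a misspelled sampler key is to check the dictionary's keys before passing it on. The sketch below is only an illustration: `KNOWN_SAMPLE_KWARGS` is a hand-picked, non-exhaustive subset of names accepted by `pm.sample`, `find_unknown_kwargs` is a hypothetical helper (not part of CausalPy or PyMC), and the first dictionary contains only the two keys quoted from cell 14, with the elided entries left out.

```python
from difflib import get_close_matches

# Illustrative subset of keyword names accepted by pm.sample; not exhaustive.
# target_accept is not a named parameter but is forwarded to the NUTS step.
KNOWN_SAMPLE_KWARGS = sorted(
    {
        "draws",
        "tune",
        "chains",
        "cores",
        "random_seed",
        "progressbar",
        "nuts_sampler",
        "nuts_sampler_kwargs",
        "target_accept",
    }
)


def find_unknown_kwargs(sample_kwargs):
    """Map each unrecognised key to its closest known name (or None)."""
    suggestions = {}
    for key in sample_kwargs:
        if key in KNOWN_SAMPLE_KWARGS:
            continue
        close = get_close_matches(key, KNOWN_SAMPLE_KWARGS, n=1, cutoff=0.8)
        suggestions[key] = close[0] if close else None
    return suggestions


# The two keys quoted from cell 14 (other entries were elided in the review).
cell_14_kwargs = {
    "nut_sampler": "nutpie",
    "nuts_sampler_kwargs": {"backend": "jax", "gradient_backend": "jax"},
}

# The portable defaults suggested in the review.
portable_kwargs = {"draws": 500, "tune": 500, "target_accept": 0.95}

print(find_unknown_kwargs(cell_14_kwargs))
print(find_unknown_kwargs(portable_kwargs))
```

A check like this would only flag the spelling. Whether `nutpie` itself is installed for a given reader is a separate portability question.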
@cetagostini is this useful? #975
its_causalpy_vs_causalimpact.ipynb imports the third-party causalimpact package, which is not a CausalPy dependency and is absent from the notebook test environment. Add it to skip_notebooks.yml alongside the JAX-dependent notebooks so the notebooks CI job passes. Co-authored-by: Cursor <cursoragent@cursor.com>
Drop the non-portable jax backend and FAST_COMPILE from the ITS sampler config, split the data preview into its own cell, and clarify that the CausalImpact panel's band is its own quantile-based interval rather than an HDI. Co-authored-by: Cursor <cursoragent@cursor.com>
Yes, good guide! All solved: "nutpie" will stay, but the jax backend and the mode setting are gone. As for the HDI from CausalImpact, I added a disclaimer. Everything should be good now! @juanitorduz
@daimon-pymclabs I solved the issues. |
Confirmed on my end — |