Commit 9841092

update PSIS ref + link to Nabiximols study for Jacobian correction
1 parent: 3b6faf8

2 files changed: 18 additions & 8 deletions

vignettes/online-only/faq.Rmd (7 additions, 7 deletions)
@@ -129,7 +129,7 @@ In the papers and `loo` package, following notations have been used
 - elpd_loo: The Bayesian LOO estimate of the expected log pointwise predictive density (Eq 4 in @Vehtari+etal:PSIS-LOO:2017).
 - elpd_lfo: The Bayesian LFO estimate of the expected log pointwise predictive density (see, e.g., @Burkner+Gabry+Vehtari:LFO-CV:2020).
 - LOOIC: -2*elpd_loo. See later for discussion of multiplier -2.
-- p_loo: This is not utility/loss as the others, but an estimate of effective complexity of the model, which can be used for diagnostics. See Vignette [LOO Glossary](https://mc-stan.org/loo/reference/loo-glossary.html) for interpreting p_loo when Pareto k is large.
+- p_loo: This is not utility/loss as the others, but an estimate of effective complexity of the model, which can be used for diagnostics. See Vignette [LOO Glossary](https://mc-stan.org/loo/reference/loo-glossary.html) for interpreting p_loo when Pareto-$\hat{k}$ is large.
 
 Similarly we can use the similar notation for other data divisions,
 and utility and loss functions. For example, when using LOO data
@@ -147,7 +147,7 @@ The choice of partitions to leave out or metric of model performance is independent
 
 - $K$-fold-CV: Each cross-validation fold uses the same inference as is used for the full data. For example, if MCMC is used then MCMC inference needs to be run $K$ times.
 - LOO with $K$-fold-CV: If $K=N$, where $N$ is the number of observations, then $K$-fold-CV is LOO. Sometimes this is called exact, naive, or brute-force LOO. This can be time consuming as the inference needs to be repeated $N$ times. Sometimes, efficient parallelization can make the wall clock time close to the time needed for one model fit [@Cooper+etal:2023:parallelCV].
-- PSIS-LOO: Pareto smoothed importance sampling leave-one-out cross-validation. Pareto smoothed importance sampling (PSIS, @Vehtari+etal:PSIS-LOO:2017, @Vehtari+etal:PSIS:2019) is used to estimate leave-one-out predictive densities or probabilities.
+- PSIS-LOO: Pareto smoothed importance sampling leave-one-out cross-validation. Pareto smoothed importance sampling (PSIS, @Vehtari+etal:PSIS-LOO:2017, @Vehtari+etal:PSIS:2024) is used to estimate leave-one-out predictive densities or probabilities.
 - PSIS: Richard McElreath shortens PSIS-LOO as PSIS in Statistical Rethinking, 2nd ed.
 - MM-LOO: Moment matching importance sampling leave-one-out cross-validation [@Paananen+etal:2021:implicit], which works better than PSIS-LOO in challenging cases, but is still faster than $K$-fold-CV with $K=N$.
 - RE-LOO: Run exact LOO (see LOO with $K$-fold-CV) for those observations for which PSIS diagnostics indicate PSIS-LOO is not accurate (that is, re-fit the model for those leave-one-out cases).
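The importance-sampling idea behind PSIS-LOO can be sketched numerically. The toy normal model, data, and draw counts below are hypothetical, and the sketch uses plain self-normalized importance sampling; real PSIS additionally stabilizes the largest importance ratios by fitting a generalized Pareto distribution to their tail (which this sketch omits):

```python
import math
import random

random.seed(1)

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def normal_lpdf(x, mu, sigma=1.0):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2

# Toy data and posterior draws for a normal(mu, 1) model with a flat prior on mu;
# the exact posterior for mu is then normal(mean(y), 1/sqrt(N)).
y = [random.gauss(0.5, 1.0) for _ in range(40)]
N, S = len(y), 4000
post_mu = [random.gauss(sum(y) / N, 1.0 / math.sqrt(N)) for _ in range(S)]

elpd_loo, lpd_insample = 0.0, 0.0
for y_i in y:
    log_lik = [normal_lpdf(y_i, mu) for mu in post_mu]
    # Importance ratios for leaving out y_i are r_s ∝ 1/p(y_i | theta_s), so the
    # self-normalized IS estimate of log p(y_i | y_{-i}) simplifies to
    # log S - logsumexp(-log_lik). PSIS would first Pareto-smooth the largest r_s.
    elpd_loo += math.log(S) - logsumexp([-ll for ll in log_lik])
    # In-sample log pointwise predictive density for comparison.
    lpd_insample += logsumexp(log_lik) - math.log(S)

print(elpd_loo < lpd_insample)  # prints True: LOO is more pessimistic than in-sample lpd
```

The final comparison holds by Jensen's inequality (the IS-LOO estimate is a log harmonic mean, the in-sample lpd a log arithmetic mean), which is why elpd_loo penalizes complexity while the in-sample lpd does not.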
@@ -200,7 +200,7 @@ Thus if there are a very large number of models to be compared, either methods t
 See more in tutorial videos on using cross-validation for model selection
 
 - Bayesian data analysis lectures
-[8.2](https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=456afda7-0e6d-4903-b0df-b0ab00da8f1e), [9.1](https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=a4961b5a-7e42-4603-8aaf-b0b200ca6295), [9.2](https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=a4796c79-eab2-436e-b55f-b0b200dac7ce).
+[8.2](https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=456afda7-0e6d-4903-b0df-b0ab00da8f1e), [9.1](https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=a4961b5a-7e42-4603-8aaf-b0b200ca6295), [9.2](https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=a4796c79-eab2-436e-b55f-b0b200dac7ce).
 , and [11.1](https://aalto.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=7ef70bc8-122b-4e86-80fa-b0c000cb5511).
 
 

@@ -305,10 +305,10 @@ See also [How to interpret in Standard error (SE) of elpd difference (elpd_diff)
 
 # Can cross-validation be used to compare different observation models / response distributions / likelihoods? {#differentmodels}
 
-Short answer is "Yes". First, to make the terms more clear, $p(y \mid \theta)$ as a function of $y$ is an observation model and $p(y \mid \theta)$ as a function of $\theta$ is a likelihood. It is better to ask ``Can cross-validation be used to compare different observation models?``
+Short answer is "Yes". First, to make the terms more clear, $p(y \mid \theta)$ as a function of $y$ is an observation model and $p(y \mid \theta)$ as a function of $\theta$ is a likelihood. It is better to ask "Can cross-validation be used to compare different observation models?"
 
 - You can compare models given different discrete observation models, and it's also allowed to have different transformations of $y$ as long as the mapping is bijective (the probabilities will then stay the same).
-- You can't compare densities and probabilities directly. Thus you can't compare models given continuous and discrete observation models, unless you compute probabilities in intervals from the continuous model (also known as discretising the continuous model).
+- You can't compare densities and probabilities directly. Thus you can't compare models given continuous and discrete observation models, unless you compute probabilities in intervals from the continuous model (also known as discretising the continuous model). The [Nabiximols case study](https://users.aalto.fi/~ave/casestudies/Nabiximols/nabiximols.html) includes an illustration of how this discretisation can be easy for count data.
 - You can compare models given different continuous observation models if you have exactly the same $y$ (loo functions in `rstanarm` and `brms` check that the hash of $y$ is the same). If $y$ is transformed, then the Jacobian of that transformation needs to be included. There is an example of this in the [mesquite case study](https://avehtari.github.io/ROS-Examples/Mesquite/mesquite.html).
 - Transformations of variables are briefly discussed in BDA3 p. 21 [@BDA3] and
 in [Stan Reference Manual Chapter 10](https://mc-stan.org/docs/reference-manual/variable-transforms-chapter.html).
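The two bookkeeping steps in the hunk above, discretising a continuous model for count data and applying a Jacobian correction for a transformed $y$, can be sketched numerically. The normal model, interval width, and the specific values of `k`, `mu`, `sigma`, and `y_obs` below are hypothetical and chosen only for illustration:

```python
import math

def normal_lpdf(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# 1) Discretising a continuous model for count data: replace the density value
#    with the probability of the interval [k - 0.5, k + 0.5], which is then
#    comparable with a discrete observation model's pmf.
k, mu, sigma = 3, 2.5, 1.2
log_prob_discretised = math.log(
    normal_cdf(k + 0.5, mu, sigma) - normal_cdf(k - 0.5, mu, sigma)
)

# 2) Jacobian correction: a normal model for z = log(y) implies a density for y,
#    p(y) = p_z(log y) * |dz/dy| = p_z(log y) / y, so on the log scale
#    log p(y) = normal_lpdf(log y) - log(y). With this correction the model on z
#    and the implied lognormal model on y give the same log score.
y_obs = 4.0
log_p_y = normal_lpdf(math.log(y_obs), mu, sigma) - math.log(y_obs)

# Direct lognormal log-density for comparison (same formula, written out):
log_p_lognormal = (
    -math.log(y_obs) - 0.5 * math.log(2 * math.pi * sigma ** 2)
    - 0.5 * ((math.log(y_obs) - mu) / sigma) ** 2
)
print(abs(log_p_y - log_p_lognormal) < 1e-12)  # prints True
```

Without the `- math.log(y_obs)` term, the log scores of the model on $\log y$ and a model on $y$ would be on different scales and their comparison meaningless, which is the point of the Jacobian bullet above.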
@@ -394,8 +394,8 @@ The number of high Pareto $\hat{k}$'s can be reduced by
 For more information see
 
 - Vehtari, A., Gelman, A., and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. _Statistics and Computing_, 27(5), 1413--1432. doi:10.1007/s11222-016-9696-4. [Online](http://link.springer.com/article/10.1007\%2Fs11222-016-9696-4).
-- Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2022).
-Pareto smoothed importance sampling. [arXiv preprint arXiv:1507.02646](http://arxiv.org/abs/1507.02646).
+- Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2024).
+Pareto smoothed importance sampling. _Journal of Machine Learning Research_, 25(72):1--58. [Online](https://jmlr.org/papers/v25/19-556.html).
 - Video [Pareto-$\hat{k}$ as practical pre-asymptotic diagnostic of Monte Carlo estimates](https://www.youtube.com/watch?v=U_EbJMMVdAU&t=278s) (34min)
 - [Practical pre-asymptotic diagnostic of Monte Carlo estimates in Bayesian inference and machine learning](https://www.youtube.com/watch?v=uIojz7lOz9w&list=PLBqnAso5Dy7PCUJbWHO7z3bdeizDdgOhY&index=2) (50min)
401401

vignettes/online-only/faq.bib (11 additions, 1 deletion)
@@ -627,4 +627,14 @@ @article{Vehtari+etal:2019:limitations
   volume={2},
   pages={22--27},
   year={2019}
-}
+}
+
+@article{Vehtari+etal:PSIS:2024,
+  title={Pareto smoothed importance sampling},
+  author={Vehtari, Aki and Simpson, Daniel and Gelman, Andrew and Yao, Yuling and Gabry, Jonah},
+  journal={Journal of Machine Learning Research},
+  year={2024},
+  volume={25},
+  number={72},
+  pages={1--58}
+}
