Commit de47a37

docs: update vignette
Author: Florence Bockting
1 parent 3798c2b commit de47a37

1 file changed

Lines changed: 45 additions & 69 deletions

vignettes/loo-pit-correlated-tests.Rmd

@@ -28,6 +28,7 @@ bayesplot_theme_set()
 set.seed(2026)
 ```
 
+## Setup
 ```{r, eval=FALSE}
 library(bayesplot)
 library(ggplot2)
@@ -49,23 +50,6 @@ Following the work of Tesso & Vehtari ([2026](#Tesso2026)), `bayesplot` now offe
 
 This vignette focuses specifically on the changes introduced by this new correlation-aware method. For background information on graphical uniformity tests using PIT, see Säilynoja et al. ([2022](#Säilynoja2022)). For a more general discussion on the use of Leave-One-Out Cross-Validation (LOO-CV), see Vehtari et al. ([2017](#Vehtari2017), [2024](#Vehtari2024)), among others.
 
-## The `method` argument
-### `method = "independent"` (superseded)
-When `method = "independent"` is selected, simultaneous confidence bands for the ECDF are constructed under the assumption that the PIT values are both independent and uniform (Säilynoja et al., [2022](#Säilynoja2022)). However, if this independence assumption is violated, the resulting bands can be too wide, which reduces the test's sensitivity to actual miscalibration (Tesso & Vehtari, [2026](#Tesso2026)).
-
-**Deprecation and Compatibility**
-
-As of `bayesplot vX.X.X`, the `"independent"` method is officially superseded. To maintain backward compatibility, `"independent"` remains the current default; however, using it will now trigger a message informing the user:
-```
-"The 'independent' method is superseded by the 'correlated' method."
-```
-This is intended to encourage a transition to the `"correlated"` method, which will become the default in a future release.
-
-### `method = "correlated"` (new, recommended)
-This method employs one of three dependence-aware uniformity tests (selected via the `test` argument) to compute a global p-value for the null hypothesis of uniformity. Unlike the independent method, it accounts for the correlation among PIT values (Tesso & Vehtari, [2026](#Tesso2026)).
-
-Instead of drawing traditional confidence bands, the plot highlights ECDF regions in red where the pointwise contribution to the test statistic is largest. This visualization makes it easier to diagnose the *type* and *location* of miscalibration.
-
 ## Reading the plots for different (mis)calibration scenarios
 The shape of the ECDF curve provides direct insight into *how* a predictive distribution is miscalibrated. To illustrate this, the following examples utilize simulated scenarios where "observed" values (`y`) are drawn from a `normal(0, sd)` distribution, while "replicated" values (`yrep`) are generated from a non-central t-distribution. By varying the degrees of freedom (`df`) and non-centrality parameter (`ncp`), we can simulate and visualize several distinct types of miscalibration.
 
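The data-generating step described in the paragraph above can be sketched in base R. This is only an illustration: the helper name `simulate_pit()` and the sizes `n` and `n_draws` are assumptions, and the vignette's own `plot_scenarios()` (only partly visible in this diff) may be implemented differently.

```r
# Hypothetical helper (not part of bayesplot or the vignette): simulate one
# scenario and compute empirical PIT values.
simulate_pit <- function(df, ncp = 0, sd = 1, n = 300, n_draws = 1000) {
  # "Observed" values drawn from normal(0, sd)
  y <- rnorm(n, mean = 0, sd = sd)
  # "Replicated" values from a non-central t-distribution,
  # arranged as a draws x observations matrix
  yrep <- matrix(rt(n_draws * n, df = df, ncp = ncp), nrow = n_draws, ncol = n)
  # Empirical PIT: share of replicated draws at or below each observation
  pit <- colMeans(sweep(yrep, 2, y, FUN = "<="))
  list(y = y, yrep = yrep, pit = pit)
}

set.seed(2026)
sim <- simulate_pit(df = 100, ncp = -1)  # predictions shifted below the data
mean(sim$pit)                            # PIT values concentrate above 0.5
```

With `ncp = -1` the replicated values sit systematically below the observations, so the PIT values pile up toward 1 instead of being spread uniformly over the unit interval.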
@@ -104,10 +88,9 @@ plot_scenarios <- function(
 legend.position = "top",
 legend.direction = "horizontal"
 )
-p2 <- ppc_pit_ecdf(y = y, yrep = yrep, method = "correlated", prob = 0.95) +
-labs(subtitle = "method = 'correlated'")
-p3 <- ppc_pit_ecdf(y = y, yrep = yrep, method = "independent", prob = 0.95) +
-labs(subtitle = "method = 'independent'")
+p2 <- ppc_pit_ecdf(y = y, yrep = yrep, method = "correlated", prob = 0.95)
+p3 <- ppc_pit_ecdf(y = y, yrep = yrep, method = "correlated",
+plot_diff = TRUE, prob = 0.95)
 
 bayesplot_grid(
 p1, p2, p3,
@@ -166,13 +149,48 @@ When the model's predictions are systematically shifted relative to the data, ob
 
 $$
 \begin{aligned}
-y &\sim \text{Normal}(1, 1) \\
-y_{rep} &\sim \text{Student}_{\nu = 100}(0, 1) \quad (\approx \text{Normal}(0, 1))
+y &\sim \text{Normal}(0, 1) \\
+y_{rep} &\sim \text{Student}_{\nu = 100}(-1, 1)
 \end{aligned}
 $$
 
 ```{r biased, fig.width=7, fig.height=3, fig.asp=NULL, out.width="80%", fig.align="center", warning=FALSE, message=FALSE}
-plot_scenarios(df = 100, ncp = 1)
+plot_scenarios(df = 100, ncp = -1)
+```
+
+## The `method` argument
+### `method = "independent"` (superseded)
+When `method = "independent"` is selected, simultaneous confidence bands for the ECDF are constructed under the assumption that the PIT values are both independent and uniform (Säilynoja et al., [2022](#Säilynoja2022)). However, if this independence assumption is violated, the resulting bands can be too wide, which reduces the test's sensitivity to actual miscalibration (Tesso & Vehtari, [2026](#Tesso2026)).
+
+**Deprecation and Compatibility**
+
+As of `bayesplot vX.X.X`, the `"independent"` method is officially superseded. To maintain backward compatibility, `"independent"` remains the current default; however, using it will now trigger a message informing the user:
+```
+"The 'independent' method is superseded by the 'correlated' method."
+```
+This is intended to encourage a transition to the `"correlated"` method, which will become the default in a future release.
+
+### `method = "correlated"` (new, recommended)
+This method employs one of three dependence-aware uniformity tests (selected via the `test` argument) to compute a global p-value for the null hypothesis of uniformity. Unlike the independent method, it accounts for the correlation among PIT values (Tesso & Vehtari, [2026](#Tesso2026)).
+
+Instead of drawing traditional confidence bands, the plot highlights ECDF regions in red where the pointwise contribution to the test statistic is largest. This visualization makes it easier to diagnose the *type* and *location* of miscalibration.
+
+```{r comparison, fig.width=7, fig.height=3, fig.asp=NULL, out.width="80%", fig.align="center"}
+set.seed(2026)
+pit <- rbeta(300, 1, 1.2)
+
+p1 <- ppc_loo_pit_ecdf(
+pit = pit,
+method = "independent"
+) +
+labs(subtitle = "method = 'independent'")
+
+p2 <- ppc_loo_pit_ecdf(
+pit = pit, method = "correlated"
+) +
+labs(subtitle = "method = 'correlated'")
+
+p1 | p2
 ```
 
 ## The three uniformity-tests within `method = "correlated"`
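As a rough illustration of the independence assumption behind `method = "independent"` (described in the hunk above), the sketch below builds a pointwise envelope for the ECDF of independent uniform PIT values in base R. It is a simplification, not bayesplot's implementation: the bands of Säilynoja et al. (2022) are simultaneous and therefore wider than these pointwise intervals. The evaluation grid `z` and the names `lower`, `upper`, and `outside` are assumptions made for this example.

```r
# Pointwise (NOT simultaneous) 95% envelope for the ECDF of n independent
# uniform PIT values, using the fact that n * ECDF(z) ~ Binomial(n, z).
set.seed(2026)
n <- 300
pit <- rbeta(n, 1, 1.2)            # same toy PIT values as the comparison chunk

z <- seq(0.01, 0.99, by = 0.01)    # evaluation grid on the unit interval
ecdf_vals <- ecdf(pit)(z)          # observed ECDF on the grid

lower <- qbinom(0.025, size = n, prob = z) / n
upper <- qbinom(0.975, size = n, prob = z) / n

# Grid points where the observed ECDF leaves the pointwise envelope
outside <- ecdf_vals < lower | ecdf_vals > upper
mean(outside)
```

The binomial envelope is exact only when the PIT values are independent. When they are correlated, as the text above notes, independence-based bands are no longer calibrated, which is the motivation for the `"correlated"` method.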
@@ -222,30 +240,13 @@ fit_normal <- brms::add_criterion(fit_normal, criterion="loo", save_psis=TRUE)
 
 ### `ppc_pit_ecdf()`
 
-```{r ppc-pit-example-1, fig.width=7, fig.height=3, fig.asp=NULL, out.width="80%", warning=FALSE, message=FALSE}
+```{r ppc-pit-example, fig.width=7, fig.height=3, fig.asp=NULL, out.width="80%", warning=FALSE, message=FALSE}
 p1 <- ppc_pit_ecdf(
-y = cu_df$cu,
-yrep = brms::posterior_predict(fit_normal),
-method = "independent"
-)
-
-p2 <- ppc_pit_ecdf(
 y = cu_df$cu,
 yrep = brms::posterior_predict(fit_normal),
 method = "correlated"
 )
 
-p1 | p2
-```
-
-```{r ppc-pit-example-2, fig.width=7, fig.height=3, fig.asp=NULL, out.width="80%", warning=FALSE, message=FALSE}
-p1 <- ppc_pit_ecdf(
-y = cu_df$cu,
-yrep = brms::posterior_predict(fit_normal),
-method = "independent",
-plot_diff = TRUE
-)
-
 p2 <- ppc_pit_ecdf(
 y = cu_df$cu,
 yrep = brms::posterior_predict(fit_normal),
@@ -258,33 +259,14 @@ p1 | p2
 
 ### `ppc_loo_pit_ecdf()`
 
-```{r loo-pit-example-1, fig.width=7, fig.height=3, fig.asp=NULL, out.width="80%", warning=FALSE, message=FALSE}
+```{r ppc-loo-pit-example, fig.width=7, fig.height=3, fig.asp=NULL, out.width="80%", warning=FALSE, message=FALSE}
 p1 <- ppc_loo_pit_ecdf(
-y = cu_df$cu,
-yrep = brms::posterior_predict(fit_normal),
-method = "independent",
-lw = weights(loo(fit_normal)$psis_object)
-)
-
-p2 <- ppc_loo_pit_ecdf(
 y = cu_df$cu,
 yrep = brms::posterior_predict(fit_normal),
 method = "correlated",
 lw = weights(loo(fit_normal)$psis_object)
 )
 
-p1 | p2
-```
-
-```{r loo-pit-example-2, fig.width=7, fig.height=3, fig.asp=NULL, out.width="80%", warning=FALSE, message=FALSE}
-p1 <- ppc_loo_pit_ecdf(
-y = cu_df$cu,
-yrep = brms::posterior_predict(fit_normal),
-method = "independent",
-lw = weights(loo(fit_normal)$psis_object),
-plot_diff = TRUE
-)
-
 p2 <- ppc_loo_pit_ecdf(
 y = cu_df$cu,
 yrep = brms::posterior_predict(fit_normal),
@@ -310,20 +292,14 @@ ppc_pit_ecdf_grouped(
 
 ## Using `brms::pp_check()`
 It is also possible to use `brms::pp_check()` with `type = "loo_pit_ecdf"` to perform the same testing and plotting procedure as `ppc_loo_pit_ecdf()`. The following example demonstrates this using the same fitted model as above with `method = "correlated"`.
-```{r pp-check-example-1, fig.show="hold", out.width="80%", warning=FALSE, message=FALSE}
+```{r pp-check-example, fig.show="hold", out.width="80%", warning=FALSE, message=FALSE}
 brms::pp_check(
 fit_normal,
 type = "loo_pit_ecdf",
 method = "correlated"
 )
 ```
 
-Or for `method = "independent"`:
-
-```{r pp-check-example-2, fig.show="hold", out.width="80%", warning=FALSE, message=FALSE}
-brms::pp_check(fit_normal, method = "independent", type = "pit_ecdf")
-```
-
 ## Additional arguments
 With the introduction of the `method = "correlated"` option, the three functions now have additional arguments that control the appearance and behavior of the plot when using correlated testing procedures. These arguments are:
 