|
18 | 18 | #' [`elpd_loo`][loo-glossary] or `elpd_waic` (or multiplied by \eqn{-2}, if |
19 | 19 | #' desired, to be on the deviance scale). |
20 | 20 | #' |
21 | | -#' When using `loo_compare()`, the returned matrix will have one row per model |
22 | | -#' and several columns of estimates. The values in the |
23 | | -#' [`elpd_diff`][loo-glossary] and [`se_diff`][loo-glossary] columns of the |
24 | | -#' returned matrix are computed by making pairwise comparisons between each |
25 | | -#' model and the model with the largest ELPD (the model in the first row). For |
26 | | -#' this reason the `elpd_diff` column will always have the value `0` in the |
27 | | -#' first row (i.e., the difference between the preferred model and itself) and |
28 | | -#' negative values in subsequent rows for the remaining models. |
| 21 | +#' ## `elpd_diff` and `se_diff` |
| 22 | +#' When using `loo_compare()`, the returned data frame will have one row per |
| 23 | +#' model and several columns of estimates. The values of |
| 24 | +#' [`elpd_diff`][loo-glossary] and [`se_diff`][loo-glossary] are computed by |
| 25 | +#' making pairwise comparisons between each model and the model with the |
| 26 | +#' largest ELPD (the model listed first). Therefore, the first `elpd_diff` |
| 27 | +#' value will always be `0` (i.e., the difference between the preferred model |
| 28 | +#' and itself) and the rest of the values will be negative. |
29 | 29 | #' |
30 | 30 | #' To compute the standard error of the difference in [ELPD][loo-glossary] --- |
31 | 31 | #' which should not be expected to equal the difference of the standard errors |
|
39 | 39 | #' distribution, a practice derived for Gaussian linear models or |
40 | 40 | #' asymptotically, and which only applies to nested models in any case. |
41 | 41 | #' |
42 | | -#' The values in the `p_worse` column show the probabilities for models |
43 | | -#' having worse ELPD than the best model. These probabilities are |
44 | | -#' computed using the normal approximation and values from the |
45 | | -#' columns `elpd_diff` and `se_diff`. Sivula et al. (2025) present |
46 | | -#' the conditions when the normal approximation used for SE and |
47 | | -#' `se_diff` is good, and the column `diag_pnorm` contains possible |
48 | | -#' diagnostic messages: 1) small data (N < 100), 2) similar |
49 | | -#' predictions (|elpd_diff| < 4), or 3) possible outliers (khat > 0.5). |
50 | | -#' If any of these diagnostic messages is shown, the normal |
51 | | -#' approximation is not well calibrated and the shown probabilities |
52 | | -#' can be too large (small data or similar predictions) or too small |
53 | | -#' (outliers). |
| 42 | +#' ## `p_worse` and `diag_pnorm` |
| 43 | +#' The values in the `p_worse` column show the probability of each model |
| 44 | +#' having worse ELPD than the best model. These probabilities are computed |
| 45 | +#' with a normal approximation using the values from `elpd_diff` and |
| 46 | +#' `se_diff`. Sivula et al. (2025) present the conditions under which the |
| 47 | +#' normal approximation used for SE and `se_diff` is good, and the column |
| 48 | +#' `diag_pnorm` contains possible diagnostic messages: |
54 | 49 | #' |
| 50 | +#' * small data (`N < 100`) |
| 51 | +#' * similar predictions (`|elpd_diff| < 4`) |
| 52 | +#' * possible outliers (`khat > 0.5`) |
| 53 | +#' |
| 54 | +#' If any of these diagnostic messages is shown, the normal approximation is |
| 55 | +#' not well calibrated and the probabilities can be too large (small data or |
| 56 | +#' similar predictions) or too small (outliers). |
| 57 | +#' |
| 58 | +#' ## Warnings for many model comparisons |
55 | 59 | #' If more than \eqn{11} models are compared, we internally recompute the model |
56 | 60 | #' differences using the median model by ELPD as the baseline model. We then |
57 | 61 | #' estimate whether the differences in predictive performance are potentially |
|