Change default almost_fair_crps_alpha from 0.95 to 1.0 #1139
Adds FiniteDifferenceCRPSLoss which computes CRPS on spatial finite differences, with optional multi-level coarsening via avg_pool2d. Integrates into EnsembleLoss via finite_difference_crps_weight and finite_difference_crps_levels parameters. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This is motivated by (a) never having shown that afCRPS helps in our use case, (b) the theoretical motivation for afCRPS breaking down once the loss includes an energy score or finite difference CRPS term, and (c) the original theoretical motivation for afCRPS not being fully convincing, with the paper not making clear whether it was introduced to fix an issue the authors actually encountered or just chosen for theoretical reasons.

Regarding (b), afCRPS exists for the case where one ensemble member exactly equals the target, so that the CRPS doesn't constrain the other member at all. However, we never use CRPS alone in practice, and for the other loss terms (energy score and finite difference CRPS) to degenerate in the same way, many outputs would have to exactly equal the target, which is not a realistic risk.

Regarding (c), the failure mode really just means that the sample won't contribute to gradient updates. The behavior for epsilon differences from the target is smooth, with small gradients/updates that go to zero as one of the two ensemble values approaches the target.

Finally, I'm a little worried that our SSR metrics are consistently a bit uncalibrated. It would be nice to remove this as a potential source of that miscalibration.
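To make the degenerate case in (b) concrete, here is a minimal sketch of a scalar almost-fair CRPS. It assumes the usual definition as a blend `alpha * fair_CRPS + (1 - alpha) * CRPS`; the function name `almost_fair_crps` is hypothetical and this is not the `fme` implementation, which operates on tensors.

```python
from itertools import combinations


def almost_fair_crps(members, target, alpha=1.0):
    """Scalar almost-fair CRPS for one ensemble (illustrative helper only).

    Assumes afCRPS = alpha * fair CRPS + (1 - alpha) * standard CRPS, so
    alpha=1.0 gives the fair CRPS and alpha=0.0 the standard ensemble CRPS.
    """
    m = len(members)
    if m < 2:
        raise ValueError("need at least two ensemble members")
    # Skill term: mean absolute error of the members against the target.
    skill = sum(abs(x - target) for x in members) / m
    # Spread term: sum of |x_j - x_k| over unordered member pairs.
    pair_sum = sum(abs(a - b) for a, b in combinations(members, 2))
    # eps shrinks the spread term slightly whenever alpha < 1.
    eps = (1.0 - alpha) / m
    spread = (1.0 - eps) * pair_sum / (m * (m - 1))
    return skill - spread


# Two members, the first exactly on the target.
fair = almost_fair_crps([1.0, 3.0], target=1.0, alpha=1.0)
almost_fair = almost_fair_crps([1.0, 3.0], target=1.0, alpha=0.95)
print(fair, almost_fair)
```

With two members and one of them on the target, the fair score is zero wherever the second member sits, while alpha=0.95 leaves a small residual penalty proportional to its distance from the target. That residual is the only thing the almost-fair modification adds in this regime.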
  finite_difference_crps_weight: float = 0.0,
  finite_difference_crps_levels: int = 1,
- almost_fair_crps_alpha: float = 0.95,
+ almost_fair_crps_alpha: float = 1.0,
If we do decide this is the better approach, I lean toward making this "opt-in" and adding it to the baseline configs, rather than changing the pre-existing default behavior.
My hesitance with that option is that in the past when Troy and I have had that scenario (we have a new config set we agree to use, and we update the baseline configs) we invariably forget to add it to several experiments. I think we're still missing affine_norms: true in a lot of our experimental configs.
It would be nice to just launch a quick experiment to verify this doesn't lead to any noticeable degradation.
Yeah we can hold off for this - it's a one-line PR that isn't likely to develop merge conflicts.
Ran benchmark on 4-degree daily era5-only no-co2 training:
alpha=0.95: https://wandb.ai/ai2cm/ace/runs/injiirnf
alpha=1.0: https://wandb.ai/ai2cm/ace/runs/ozsyoxtz
The alpha=1.0 run shows slightly lower CRPS across many variables while having slightly higher RMSE, and SSRs are slightly higher (which is good, as most variables generally go low-biased on signal later in training). A stronger sign that this is a good change is that long-inference power spectral biases are improved for many variables, despite this not being directly optimized by the change.
… default of 1.0 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Arcomano1234 left a comment
Yeah I tend to agree that the "best" values for these types of parameters should be the default because we will inevitably be copying and pasting old configs and forget to explicitly set this. It also seems to not hurt most metrics and in some cases as Jeremy pointed out it improves them (marginally).




Changes the default almost_fair_crps_alpha in EnsembleLoss from 0.95 (almost-fair CRPS) to 1.0 (fair CRPS). The almost-fair modification was originally motivated by avoiding unconstrained ensemble members when one member exactly matches the target, but in practice the loss is smooth in this regime and the gradient signal is sufficient without the modification.

Changes:
- fme.core.loss.EnsembleLoss: change almost_fair_crps_alpha default from 0.95 to 1.0
- Tests added
- If dependencies changed, "deps only" image rebuilt and "latest_deps_only_image.txt" file updated
Depends on #1138
Ran benchmark on 4-degree daily era5-only no-co2 training:
alpha=0.95: https://wandb.ai/ai2cm/ace/runs/injiirnf
alpha=1.0: https://wandb.ai/ai2cm/ace/runs/ozsyoxtz