105 commits
6301adb
first attempt at implementing aggregate loss
mc4117 Apr 23, 2026
471f7f3
Merge branch 'main' into feat/aggregate_loss
mc4117 Apr 23, 2026
b83e921
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 23, 2026
6377c1a
tidy up doc string
mc4117 Apr 23, 2026
3b8769b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 23, 2026
eef75dd
fix tests
mc4117 Apr 23, 2026
97e5af7
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 Apr 23, 2026
febab85
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 23, 2026
ebd75f9
update schema
mc4117 Apr 23, 2026
a5168d5
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 Apr 23, 2026
63a0ff7
update name
mc4117 Apr 23, 2026
af56b79
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 23, 2026
d743272
Rename AggregateLossWrapper to TimeAggregateLossWrapper
mc4117 Apr 23, 2026
04ba026
Merge branch 'main' into feat/aggregate_loss
mc4117 Apr 23, 2026
aa68344
fix schema
mc4117 Apr 23, 2026
eb43110
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 Apr 23, 2026
3746fb3
fix schema and add restriction
mc4117 Apr 23, 2026
fd94837
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 23, 2026
e137d26
fix configs
mc4117 Apr 23, 2026
913bad3
rm unnecessary test
mc4117 Apr 23, 2026
399d142
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 Apr 23, 2026
7583dff
fix failing tests
mc4117 Apr 23, 2026
e795398
fix integration test
mc4117 Apr 23, 2026
3cb9aee
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 23, 2026
0110498
Remove extra Config class from training schema
mc4117 Apr 23, 2026
8a8a788
different approach
mc4117 Apr 23, 2026
d1cb040
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 23, 2026
459d117
fix tests
mc4117 Apr 24, 2026
516fd0c
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 Apr 24, 2026
e043045
update tests
mc4117 Apr 24, 2026
5756e88
fix schema
mc4117 Apr 24, 2026
d12f446
update schema
mc4117 Apr 24, 2026
cb276f4
making time aggregate losses work for multi datasets
mc4117 Apr 24, 2026
e9e8476
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 24, 2026
e89565c
update losses in config
mc4117 Apr 24, 2026
6e5d9f0
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 Apr 24, 2026
927d9c9
Merge branch 'main' into feat/aggregate_loss
mc4117 Apr 27, 2026
f445b21
different approach adding loss folders
mc4117 Apr 27, 2026
d7a0ff6
fix merge
mc4117 Apr 27, 2026
59c0e11
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 27, 2026
6cb6e87
rename losses
mc4117 Apr 27, 2026
bc392ea
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 Apr 27, 2026
fd08cc4
rm folder
mc4117 Apr 27, 2026
c98565e
update configs
mc4117 Apr 27, 2026
c9d7748
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 27, 2026
9d5cc64
Clean up single.yaml by removing loss functions comments
mc4117 Apr 27, 2026
6390053
Remove commented-out loss function section
mc4117 Apr 27, 2026
3ad6f9e
ensemle loss
mc4117 Apr 27, 2026
dadbc78
merge
mc4117 Apr 27, 2026
ba4aeee
fix integration
mc4117 Apr 28, 2026
3d23170
fix pre commit
mc4117 Apr 28, 2026
3b0d59e
fix integration tests
mc4117 Apr 28, 2026
426891e
revert change to base loss
mc4117 Apr 28, 2026
9384302
fix failing test
mc4117 Apr 28, 2026
8981866
update documentation
mc4117 Apr 29, 2026
6df1df5
Merge branch 'main' into feat/aggregate_loss
mc4117 Apr 30, 2026
009e07f
feat: add BaseLossWrapper as transparent loss wrapper base class (#1082)
VeraChristina May 1, 2026
d246cec
preserve inner loss squash mode when defaults are used ; use amin amax
ssmmnn11 May 1, 2026
8b67732
change loss calculation
mc4117 May 8, 2026
0aaf41c
revert change mod
mc4117 May 8, 2026
b84c7ca
fix merge conflict
mc4117 May 8, 2026
c0e88d5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 8, 2026
3f7dab5
rm configs
mc4117 May 8, 2026
c8d24a1
merge
mc4117 May 8, 2026
b0e7a7c
fix failing schema
mc4117 May 8, 2026
99497a0
fix failing tests
mc4117 May 8, 2026
e726b5c
fix pre commit
mc4117 May 8, 2026
b72f0f9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 8, 2026
b83c8aa
change scalings
mc4117 May 9, 2026
f2d26c9
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 May 9, 2026
79f690d
fix failing tests
mc4117 May 9, 2026
a4c5df3
fix schema
mc4117 May 11, 2026
043a14b
fix ensemble crps
mc4117 May 11, 2026
341c1c6
Merge branch 'main' into feat/aggregate_loss
mc4117 May 11, 2026
75e9373
fix failing tests
mc4117 May 11, 2026
08f70ac
fix tests
mc4117 May 11, 2026
570f638
rename to composite loss
mc4117 May 11, 2026
abb61f5
update docs
mc4117 May 11, 2026
061c01c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 11, 2026
9d4b4ef
rename
mc4117 May 11, 2026
f1ff222
fix failing test
mc4117 May 11, 2026
db62241
Merge branch 'main' into feat/aggregate_loss
mc4117 May 11, 2026
ff44365
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 11, 2026
421e6c0
fix scalers
mc4117 May 12, 2026
e95d0e8
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 May 12, 2026
503436d
update combined loss
mc4117 May 13, 2026
eb6a41c
update tests
mc4117 May 13, 2026
0d36005
fix merge conflict
mc4117 May 13, 2026
4a62954
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 13, 2026
aaff297
fix failing tests
mc4117 May 13, 2026
de95899
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 May 13, 2026
f0643ac
pre commit hook
mc4117 May 13, 2026
5c409a3
update behaviour
mc4117 May 14, 2026
4427846
fix pre commit
mc4117 May 14, 2026
df2098e
fix tests
mc4117 May 14, 2026
edf05b4
merge conflict
mc4117 May 14, 2026
056d047
add per timestep callback
mc4117 May 14, 2026
e550145
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 14, 2026
de82ddf
fix unit tests
mc4117 May 14, 2026
5ae450f
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 May 14, 2026
b04654c
tests
mc4117 May 14, 2026
210f5d1
Merge branch 'main' into feat/aggregate_loss
mc4117 May 14, 2026
e7db74f
tests
mc4117 May 14, 2026
b2b51b3
Merge branch 'feat/aggregate_loss' of https://github.com/ecmwf/anemoi…
mc4117 May 14, 2026
31ee3a0
update config
mc4117 May 14, 2026
Binary file added (binary file not shown).
54 changes: 54 additions & 0 deletions training/docs/modules/losses.rst
@@ -91,6 +91,60 @@ deterministic:

*****************************
Time Aggregate Loss Functions
*****************************

These loss functions encourage the model to produce **temporally
consistent** outputs, i.e. output sequences that are internally coherent
over time, not just accurate at each individual step.

:class:`~anemoi.training.losses.aggregate.TimeAggregateLossWrapper`
addresses this by applying a base loss function to *time-aggregated*
versions of the prediction and target, rather than step-by-step. The
following aggregations are supported:

.. list-table::
   :widths: 15 85
   :header-rows: 1

   - - Aggregation
     - Description

   - - ``mean``
     - Mean over the output time window — penalises bias in the
       temporal average.

   - - ``max``
     - Maximum over the output time window — penalises errors in peak
       values.

   - - ``min``
     - Minimum over the output time window — penalises errors in
       minimum values.

   - - ``diff``
     - Consecutive step-to-step differences
       (``pred[:, 1:] - pred[:, :-1]``) — penalises unrealistic
       temporal transitions and discontinuities.

The wrapper evaluates the specified loss function on each aggregation in
turn and returns the sum. Because temporal aggregation collapses the time
dimension, the ``time_steps`` scaler is intentionally excluded from the
inner ``loss_fn``; only spatial and variable scalers should be listed
there.
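
Concretely, with :math:`\mathcal{A}` the set of configured aggregations
and :math:`\ell` the inner ``loss_fn``, the wrapper returns

.. math::

   \mathcal{L}(\hat{y}, y) = \sum_{a \in \mathcal{A}}
   \ell\bigl(\operatorname{agg}_a(\hat{y}),\, \operatorname{agg}_a(y)\bigr),
   \qquad \mathcal{A} \subseteq \{\text{mean}, \text{max}, \text{min}, \text{diff}\},

where each :math:`\operatorname{agg}_a` acts along the output time
dimension.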

.. note::

   ``TimeAggregateLossWrapper`` requires an output time dimension greater
   than one; it is not meaningful for single-step tasks.

We strongly recommend using the time aggregate loss when training any
temporal downscaler. The pre-built config variants ``single_MSE_aggregation``
and ``ensemble_multiscale_aggregation`` combine it with the primary loss inside a
:class:`~anemoi.training.losses.combined.CombinedLoss`.
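
For illustration, a standalone wrapper configuration might look as
follows (a sketch mirroring the ``single_MSE_aggregation`` variant added
in this pull request; note that ``time_steps`` is deliberately absent
from the scaler list):

.. code-block:: yaml

   _target_: anemoi.training.losses.aggregate.TimeAggregateLossWrapper
   scalers: ['pressure_level', 'general_variable', 'node_weights']
   time_aggregation_types: [mean, max, min, diff]
   loss_fn:
     _target_: anemoi.training.losses.MSELoss
     ignore_nans: False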

.. _multiscale-loss-functions:

***************************
Multiscale Loss Functions
***************************
3 changes: 3 additions & 0 deletions training/docs/modules/tasks.rst
@@ -155,6 +155,9 @@ Example: ``input_timestep="6H"``, ``output_timestep="3H"``,
``output_left_boundary=True`` produces output offsets
``[0H, 3H]`` and input offsets ``[0H, 6H]``.

By default, temporal downscalers are trained with the time aggregate
loss.

.. automodule:: anemoi.training.tasks.temporal_downscaling
:members:
:no-undoc-members:
@@ -7,6 +7,7 @@ defaults:
- model: graphtransformer
- task: temporal_downscaler
- training: single
- override training/training_loss: single_MSE_aggregation
- _self_

config_validation: True
@@ -7,6 +7,7 @@ defaults:
- model: graphtransformer_ens
- task: temporal_downscaler
- training: ensemble
- override training/training_loss: ensemble_multiscale_aggregation
- _self_

config_validation: True
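Switching an experiment config between loss variants is thus a one-line
Hydra override in its defaults list. A minimal sketch, following the two
hunks above (the group names come from the new training_loss configs
later in this diff):

defaults:
  - training: single
  - override training/training_loss: single_MSE_aggregation
  - _self_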
@@ -1,6 +1,7 @@
---
defaults:
- scalers: global
- training_loss: single
- optimization: default
- weight_averaging: null

28 changes: 1 addition & 27 deletions training/src/anemoi/training/config/training/ensemble.yaml
@@ -1,6 +1,7 @@
---
defaults:
- scalers: global
- training_loss: ensemble
- optimization: default
- weight_averaging: null

@@ -51,33 +52,6 @@ strategy:
loss_gradient_scaling: False


# loss function for the model
# To train without multiscale loss, set it to the desired loss directly
training_loss:
  datasets:
    data: # user-defined key in data
      # loss class to initialise, can be anything subclassing torch.nn.Module
      _target_: anemoi.training.losses.MultiscaleLossWrapper
      # Disk mode: multiscale_config: {loss_matrices_path: /path, loss_matrices: ["file.npz", null]}
      # On-the-fly: multiscale_config: {num_scales: 4, base_num_nearest_neighbours: 16, base_sigma: 0.01570}
      multiscale_config: null # null = single scale, no smoothing
      weights: [1.0]

      per_scale_loss:
        _target_: anemoi.training.losses.CRPS
        scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights', 'time_steps']

        # Scalers to include in loss calculation
        # A selection of available scalers are listed in training/scalers.
        # '*' is a valid entry to use all `scalers` given, if a scaler is to be excluded
        # add `!scaler_name`, i.e. ['*', '!scaler_1'], and `scaler_1` will not be added.
        # scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights']
        ignore_nans: False
        no_autocast: True
        alpha: 0.95



# Validation metrics calculation,
# This may be a list, in which case all metrics will be calculated
# and logged according to their name.
14 changes: 1 addition & 13 deletions training/src/anemoi/training/config/training/lam.yaml
@@ -1,6 +1,7 @@
---
defaults:
- scalers: lam
- training_loss: single
- optimization: default
- weight_averaging: null

@@ -48,19 +49,6 @@ strategy:
# don't enable this by default until it's been tested and proven beneficial
loss_gradient_scaling: False

# loss function for the model
training_loss:
  datasets:
    data: # user-defined key in data
      # loss class to initialise
      _target_: anemoi.training.losses.MSELoss
      # Scalers to include in loss calculation
      # A selection of available scalers are listed in training/scalers/scalers.yaml
      # '*' is a valid entry to use all `scalers` given, if a scaler is to be excluded
      # add `!scaler_name`, i.e. ['*', '!scaler_1'], and `scaler_1` will not be added.
      scalers: ['pressure_level', 'general_variable', 'node_weights', 'time_steps']
      ignore_nans: False

# Validation metrics calculation,
# This may be a list, in which case all metrics will be calculated
# and logged according to their name.
2 changes: 1 addition & 1 deletion training/src/anemoi/training/config/training/multi.yaml
@@ -1,6 +1,7 @@
---
defaults:
- scalers: multi
- training_loss: single
- optimization: default
- weight_averaging: null

@@ -56,7 +57,6 @@ max_steps: 150000


submodules_to_freeze: []

# Dataset-specific loss and metrics configuration
training_loss:
  datasets:
16 changes: 1 addition & 15 deletions training/src/anemoi/training/config/training/single.yaml
@@ -1,6 +1,7 @@
---
defaults:
- scalers: global
- training_loss: single
- optimization: default
- weight_averaging: null

@@ -39,26 +40,11 @@ strategy:
num_gpus_per_model: ${system.hardware.num_gpus_per_model}
read_group_size: ${dataloader.read_group_size}

# loss functions

# dynamic rescaling of the loss gradient
# see https://arxiv.org/pdf/2306.06079.pdf, section 4.3.2
# don't enable this by default until it's been tested and proven beneficial
loss_gradient_scaling: False

# loss function for the model
training_loss:
  datasets:
    data: # user-defined key in data
      # loss class to initialise
      _target_: anemoi.training.losses.MSELoss
      # Scalers to include in loss calculation
      # A selection of available scalers are listed in training/scalers.
      # '*' is a valid entry to use all `scalers` given, if a scaler is to be excluded
      # add `!scaler_name`, i.e. ['*', '!scaler_1'], and `scaler_1` will not be added.
      scalers: ['pressure_level', 'general_variable', 'node_weights', 'time_steps']
      ignore_nans: False

# Validation metrics calculation,
# This may be a list, in which case all metrics will be calculated
# and logged according to their name.
14 changes: 1 addition & 13 deletions training/src/anemoi/training/config/training/stretched.yaml
@@ -1,6 +1,7 @@
---
defaults:
- scalers: stretched
- training_loss: single
- optimization: default
- weight_averaging: null

@@ -49,19 +50,6 @@ strategy:
# don't enable this by default until it's been tested and proven beneficial
loss_gradient_scaling: False

# loss function for the model
training_loss:
  datasets:
    data: # user-defined key in data
      # loss class to initialise
      _target_: anemoi.training.losses.MSELoss
      # Scalers to include in loss calculation
      # A selection of available scalers are listed in training/scalers/scalers.yaml
      # '*' is a valid entry to use all `scalers` given, if a scaler is to be excluded
      # add `!scaler_name`, i.e. ['*', '!scaler_1'], and `scaler_1` will not be added.
      scalers: ['pressure_level', 'general_variable', 'node_weights', 'time_steps']
      ignore_nans: False

# Validation metrics calculation,
# This may be a list, in which case all metrics will be calculated
# and logged according to their name.
@@ -0,0 +1,22 @@
# loss function for the model
# To train without multiscale loss, set it to the desired loss directly
datasets:
  data: # user-defined key in data
    # loss class to initialise, can be anything subclassing torch.nn.Module
    _target_: anemoi.training.losses.MultiscaleLossWrapper
    # Disk mode: multiscale_config: {loss_matrices_path: /path, loss_matrices: ["file.npz", null]}
    # On-the-fly: multiscale_config: {num_scales: 4, base_num_nearest_neighbours: 16, base_sigma: 0.01570}
    multiscale_config: null # null = single scale, no smoothing
    weights: [1.0]
    per_scale_loss:
      _target_: anemoi.training.losses.CRPS
      scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights', 'time_steps']

      # Scalers to include in loss calculation
      # A selection of available scalers are listed in training/scalers.
      # '*' is a valid entry to use all `scalers` given, if a scaler is to be excluded
      # add `!scaler_name`, i.e. ['*', '!scaler_1'], and `scaler_1` will not be added.
      # scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights']
      ignore_nans: False
      no_autocast: True
      alpha: 0.95
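As the header comment notes, training without the multiscale wrapper
means pointing ``_target_`` at the desired loss directly. A sketch
reusing this file's CRPS settings (nesting assumed; not part of this
diff):

datasets:
  data:
    _target_: anemoi.training.losses.CRPS
    scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights', 'time_steps']
    ignore_nans: False
    no_autocast: True
    alpha: 0.95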
@@ -0,0 +1,27 @@
datasets:
  data:
    _target_: anemoi.training.losses.combined.CombinedLoss
    ignore_nans: False
    # loss_weights: [n_timesteps / (n_timesteps + n_agg_ops), n_agg_ops / (n_timesteps + n_agg_ops)]
    # Each sub-loss averages internally (raw over timesteps, aggregate over agg ops).
    # These weights re-scale so the total matches: sum_all / (n_timesteps + n_agg_ops).
    # Example for 6 timesteps and 4 agg ops: [0.6, 0.4]
    loss_weights: [0.6, 0.4]
    losses:
      - _target_: anemoi.training.losses.MultiscaleLossWrapper
        multiscale_config: null # null = single scale, no smoothing
        weights: [1.0]
        per_scale_loss:
          _target_: anemoi.training.losses.CRPS
          scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights', 'time_steps']
          ignore_nans: False
          no_autocast: True
          alpha: 0.95
      - _target_: anemoi.training.losses.aggregate.TimeAggregateLossWrapper
        scalers: ['pressure_level', 'general_variable', 'nan_mask_weights', 'node_weights']
        time_aggregation_types: [mean, max, min, diff]
        loss_fn:
          _target_: anemoi.training.losses.CRPS
          ignore_nans: False
          no_autocast: True
          alpha: 0.95
@@ -0,0 +1,9 @@
datasets:
  data:
    _target_: anemoi.training.losses.MSELoss
    # Scalers to include in loss calculation
    # A selection of available scalers are listed in training/scalers.
    # '*' is a valid entry to use all `scalers` given, if a scaler is to be excluded
    # add `!scaler_name`, i.e. ['*', '!scaler_1'], and `scaler_1` will not be added.
    scalers: ['pressure_level', 'general_variable', 'node_weights', 'time_steps']
    ignore_nans: False
@@ -0,0 +1,19 @@
datasets:
  data:
    _target_: anemoi.training.losses.combined.CombinedLoss
    ignore_nans: False
    # loss_weights: [n_timesteps / (n_timesteps + n_agg_ops), n_agg_ops / (n_timesteps + n_agg_ops)]
    # Each sub-loss averages internally (raw over timesteps, aggregate over agg ops).
    # These weights re-scale so the total matches: sum_all / (n_timesteps + n_agg_ops).
    # Example for 6 timesteps and 4 agg ops: [0.6, 0.4]
    loss_weights: [0.6, 0.4]
    losses:
      - _target_: anemoi.training.losses.MSELoss
        scalers: ['pressure_level', 'general_variable', 'node_weights', 'time_steps']
        ignore_nans: False
      - _target_: anemoi.training.losses.aggregate.TimeAggregateLossWrapper
        scalers: ['pressure_level', 'general_variable', 'node_weights']
        time_aggregation_types: [mean, max, min, diff]
        loss_fn:
          _target_: anemoi.training.losses.MSELoss
          ignore_nans: False