feat(training, models)!: add transport diffusion and stochastic interpolant by ssmmnn11 · Pull Request #1096 · ecmwf/anemoi-core

ssmmnn11 · 2026-05-06T13:04:46Z

Introduce a transport model that covers stochastic interpolants (SI) and EDM diffusion.

Stochastic interpolants provide a general framework for learning continuous paths between probability distributions, with Flow Matching being a special case. This makes them a flexible framework for forecasting, time-interpolation and downscaling.

training config
    training.transport_objective: diffusion | stochastic_interpolant
    training.prediction_mode:     state | tendency
    model.model.transport.objective: diffusion | stochastic_interpolant
          |
          v
  TransportTraining
    - owns the training step
    - selects PredictionMode
    - selects TransportObjective
          |
          +-----------------------------+
          |                             |
          v                             v
  PredictionMode                  TransportObjective
  state / tendency                diffusion / stochastic_interpolant
    - builds clean target           - builds source/noise
    - decides target space          - corrupts target
    - reconstructs metrics          - defines loss target
          |                             |
          +-------------+---------------+
                        v
  PreparedTransportObjective
    conditioned_target  -> corrupted/noised target passed to model
    condition           -> sigma or SI time
    loss_target         -> clean target or SI drift
    weights             -> EDM weights or None
    aux                 -> SI source/interpolant/time etc.
                        |
                        v
           model.forward(
             x,
             conditioned_target,
             condition,
           )
                        |
                        v
           loss / validation
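
As a rough illustration of the container in the middle of the diagram, the fields of PreparedTransportObjective could be carried in a dataclass like the one below. This is a minimal sketch, not the code in this PR: the field names come from the diagram, while the types (plain lists of floats standing in for tensors) and the `model_inputs` helper are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

Tensor = list[float]  # stand-in for a real tensor type (assumption)


@dataclass
class PreparedTransportObjective:
    """Hypothetical container mirroring the fields listed in the diagram."""

    conditioned_target: Tensor        # corrupted/noised target passed to the model
    condition: float                  # sigma (diffusion) or SI time
    loss_target: Tensor               # clean target or SI drift
    weights: Optional[Tensor] = None  # EDM weights, or None
    aux: dict[str, Any] = field(default_factory=dict)  # SI source/interpolant/time etc.

    def model_inputs(self, x: Tensor) -> tuple[Tensor, Tensor, float]:
        """Hypothetical helper: arguments in the order used by model.forward."""
        return x, self.conditioned_target, self.condition


prepared = PreparedTransportObjective(
    conditioned_target=[0.9, 2.1],
    condition=0.5,
    loss_target=[1.0, 2.0],
)
print(prepared.model_inputs([0.0, 0.0]))
```

Keeping `weights` and `aux` optional lets one container serve both objectives: the diffusion path would fill `weights`, and the SI path would fill `aux`.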

PredictionMode decides what to predict:
- state
- tendency
TransportObjective decides how to train:
- diffusion: corrupts the target as target + sigma * source, predicts the clean endpoint, and uses EDM weighting.
- stochastic_interpolant: builds the bridge state and trains the model to predict the bridge drift.
TransportSourceBuilder decides the source / anchor field:
- gaussian
- zero
- reference_state
- plus scale and additional additive Gaussian noise
SI bridge-noise schedules are now:
- brownian_bridge default
Flow-matching-like training is SI with:
- Gaussian source
- linear alpha/beta
- si_noise_scale: 0.0
- deterministic sampler such as heun.
During inference, the model-side TransportModelObjective dispatches to the sampler:
- diffusion -> EDM samplers like heun, dpmpp_2m
- SI -> euler/heun for deterministic vector field, or euler_maruyama for noisy sampling.
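
The two training objectives listed above can be sketched on scalars. This is an illustration under stated assumptions, not the implementation in this PR. The function names, `sigma_data`, the Brownian-bridge form gamma(t) = s * sqrt(t(1-t)), and the choice of the interpolant's time derivative as the SI loss target are all assumptions. The corruption `target + sigma * source`, the linear alpha/beta, and `si_noise_scale` come from the description. The weight is the standard EDM loss weighting.

```python
import math


def prepare_diffusion(target: float, source: float, sigma: float, sigma_data: float = 1.0):
    """EDM-style step: corrupt the target, regress the clean endpoint, weight by sigma."""
    conditioned_target = target + sigma * source
    loss_target = target
    weight = (sigma**2 + sigma_data**2) / (sigma * sigma_data) ** 2
    return conditioned_target, sigma, loss_target, weight


def prepare_si(target: float, source: float, z: float, t: float, si_noise_scale: float = 0.0):
    """SI step with linear alpha/beta and an assumed Brownian-bridge noise term."""
    alpha, beta = 1.0 - t, t
    gamma = si_noise_scale * math.sqrt(t * (1.0 - t))
    interpolant = alpha * source + beta * target + gamma * z
    # Assumed loss target: d/dt of the interpolant. Needs 0 < t < 1 when noise is on.
    drift = target - source
    if si_noise_scale != 0.0:
        drift += si_noise_scale * (1.0 - 2.0 * t) / (2.0 * math.sqrt(t * (1.0 - t))) * z
    return interpolant, t, drift, None


print(prepare_diffusion(target=1.0, source=0.5, sigma=2.0))
print(prepare_si(target=2.0, source=0.0, z=0.3, t=0.25))
```

With `si_noise_scale=0.0` the drift reduces to `target - source`, which corresponds to the flow-matching-like setting in the list above.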

📚 Documentation preview 📚: https://anemoi-training--1096.org.readthedocs.build/en/1096/

📚 Documentation preview 📚: https://anemoi-graphs--1096.org.readthedocs.build/en/1096/

📚 Documentation preview 📚: https://anemoi-models--1096.org.readthedocs.build/en/1096/

…fusion # Conflicts: # models/src/anemoi/models/models/transport_encoder_processor_decoder.py

ssmmnn11 · 2026-05-06T20:22:59Z

tests

…fusion # Conflicts: # models/src/anemoi/models/models/transport_encoder_processor_decoder.py # models/src/anemoi/models/samplers/transport_samplers.py # training/src/anemoi/training/train/methods/diffusion.py

…o feat/transport-si-diffusion

…fusion # Conflicts: # training/src/anemoi/training/diagnostics/callbacks/plot.py # training/src/anemoi/training/diagnostics/plots.py # training/src/anemoi/training/train/methods/diffusion.py # training/tests/unit/diagnostics/test_plotting_callbacks.py

mchantry · 2026-05-13T12:32:40Z

@icedoom888 for awareness.

JoffreyDumontLeBrazidec · 2026-05-13T17:58:28Z

For downscaling, a clean solution would be to add "residuals" as a new prediction objective (it could also be folded into "tendency", but that would probably end up too hacky).

JoffreyDumontLeBrazidec

Very nice work, Simon. It's great to be able to try out the different options.

Some comments:

1/ Some knob combinations are allowed but surprising:
- diffusion + source.kind: zero
- diffusion + source.kind: reference_state
We should either disallow clearly degenerate combinations, or emit warnings and document them as experimental.
For example, source.kind: reference_state with objective: diffusion is potentially useful for downscaling, where the source might be an upsampled low-resolution (LR) state, but I feel it could be confusing for users. Maybe document that Gaussian is the recommended/default source for EDM diffusion.

2/ Why not heun_maruyama for SI sampling? This is what we use in the diffusion path.
Since SI transport in this PR is very general, I would consider decoupling heun/euler from maruyama/none in the config:
sampler: heun/euler
stochastic: true/false

3/ For the SI stochastic sampler, once we add noise during sampling, the noise can move samples away from the bridge. In the SI SDE formulation, this is usually corrected by a score term that pulls the displaced samples back toward the bridge marginal. In this PR, it looks like the SI part only estimates the drift/velocity. It would be good to state in the docs that noisy sampling with euler_maruyama is a heuristic noise injection for diversity.

4/ The PR exposes diffusion and stochastic_interpolant as two separate options. SI is a general framework that could, in theory, recover edm_diffusion mathematically. What we call diffusion, on the other hand, is actually edm_diffusion: a very specific kind of diffusion that works very well empirically. I would rename diffusion to edm_diffusion to emphasize this and avoid suggesting it is a generic diffusion objective.
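
One way to act on point 1/ would be a small validation step that warns on the surprising combinations. The sketch below is hypothetical: `TransportKnobs` and its field names are invented for illustration and do not reflect the actual config schema in this PR. Only the option values (diffusion, stochastic_interpolant, gaussian, zero, reference_state) come from the PR description.

```python
import warnings
from dataclasses import dataclass

OBJECTIVES = {"diffusion", "stochastic_interpolant"}
SOURCE_KINDS = {"gaussian", "zero", "reference_state"}


@dataclass(frozen=True)
class TransportKnobs:
    """Hypothetical config view; the real schema may differ."""

    objective: str
    source_kind: str = "gaussian"

    def __post_init__(self) -> None:
        if self.objective not in OBJECTIVES:
            raise ValueError(f"unknown objective: {self.objective!r}")
        if self.source_kind not in SOURCE_KINDS:
            raise ValueError(f"unknown source kind: {self.source_kind!r}")
        if self.objective == "diffusion" and self.source_kind != "gaussian":
            # With a zero source and no extra additive noise,
            # target + sigma * source is just the target.
            warnings.warn(
                f"diffusion with source.kind={self.source_kind} is experimental; "
                "gaussian is the recommended source for EDM diffusion",
                UserWarning,
                stacklevel=2,
            )


TransportKnobs("diffusion")                                 # silent
TransportKnobs("stochastic_interpolant", "reference_state")  # silent
```

Warning rather than raising keeps the downscaling use case (reference_state source with diffusion) available while flagging it as non-default.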


WeiPanMaths · 2026-05-14T12:52:29Z

Hi Simon, I think this is a really nice and elegant framework. Currently, if I were to pick something out, I feel the VectorFieldEulerSampler, VectorFieldHeunSampler, and StochasticInterpolantEulerMaruyamaSampler could be modularised a bit further. Right now each class mixes the numerical integration scheme with the specific SDE/ODE being solved, which means adding a new sampler requires a full new implementation.

An alternative design could be to separate the two concerns — a generic numerical solver class (e.g. a RungeKuttaSolver) that takes the integration scheme as a parameter, and a separate equation/dynamics class that defines the drift and diffusion terms. New samplers would then just be parameter choices rather than new classes. Something like a SDESolver(scheme="euler_maruyama", dynamics=StochasticInterpolantDynamics()).

That said, I appreciate this may be intentional for clarity and simplicity at this stage — just something to consider if the sampler zoo grows. Overall really cool!
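
The separation suggested above could look roughly like the sketch below. It is a toy on scalars with hypothetical names: `Dynamics`, `ConstantVelocityDynamics`, and this `SDESolver` are invented for illustration and are not classes from this PR. The stochastic branch simply adds an Euler-Maruyama noise increment to whichever drift step is selected. Combined with the Heun step this is a heuristic, not a proper stochastic Heun scheme.

```python
import math
import random
from typing import Protocol


class Dynamics(Protocol):
    def drift(self, x: float, t: float) -> float: ...
    def diffusion(self, x: float, t: float) -> float: ...


class ConstantVelocityDynamics:
    """Toy dynamics standing in for a learned vector field (assumption)."""

    def __init__(self, velocity: float, noise: float = 0.0) -> None:
        self.velocity, self.noise = velocity, noise

    def drift(self, x: float, t: float) -> float:
        return self.velocity

    def diffusion(self, x: float, t: float) -> float:
        return self.noise


def euler_step(dyn: Dynamics, x: float, t: float, dt: float) -> float:
    return x + dt * dyn.drift(x, t)


def heun_step(dyn: Dynamics, x: float, t: float, dt: float) -> float:
    k1 = dyn.drift(x, t)
    k2 = dyn.drift(x + dt * k1, t + dt)
    return x + 0.5 * dt * (k1 + k2)


SCHEMES = {"euler": euler_step, "heun": heun_step}


class SDESolver:
    """Integrates from t=0 to t=1; scheme and dynamics are independent choices."""

    def __init__(self, scheme: str, dynamics: Dynamics, stochastic: bool = False, seed: int = 0):
        self.step = SCHEMES[scheme]
        self.dynamics = dynamics
        self.stochastic = stochastic
        self.rng = random.Random(seed)

    def solve(self, x0: float, n_steps: int) -> float:
        x, dt = x0, 1.0 / n_steps
        for i in range(n_steps):
            t = i * dt
            noise = 0.0
            if self.stochastic:
                # Noise increment evaluated at the start of the step.
                noise = self.dynamics.diffusion(x, t) * math.sqrt(dt) * self.rng.gauss(0.0, 1.0)
            x = self.step(self.dynamics, x, t, dt) + noise
        return x


print(SDESolver("heun", ConstantVelocityDynamics(2.0)).solve(1.0, n_steps=4))
```

In this layout a new integration scheme is one more entry in `SCHEMES`, and a new equation is one more class with `drift` and `diffusion`, so neither requires a new sampler class.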

ssmmnn11 · 2026-05-18T13:22:52Z

@WeiPanMaths Very good point! I will put it on our to-do list and leave this as a follow-up for now.

ssmmnn11 · 2026-05-18T13:23:42Z

@JoffreyDumontLeBrazidec Very true. I revised the PR according to your suggestions; we will support only deterministic sampling for now.

…fusion # Conflicts: # training/src/anemoi/training/config/training/diffusion.yaml

JPXKQX · 2026-05-20T08:06:32Z

+``prediction_mode: tendency`` for tendency-space targets. The model must
+use :class:`AnemoiTransportModelEncProcDec` or
+:class:`AnemoiTransportTendModelEncProcDec`; the plain GNN model is not
+supported.


Would it make sense to move this to a note or a warning to highlight it?

.. warning:: The plain GNN model is not supported.

…fusion

Add transport diffusion and stochastic interpolant

7bef402

ssmmnn11 requested a review from JPXKQX May 6, 2026 13:04

ssmmnn11 self-assigned this May 6, 2026

github-project-automation Bot added this to Anemoi-dev May 6, 2026

github-project-automation Bot moved this to To be triaged in Anemoi-dev May 6, 2026

github-actions Bot added training models and removed training models labels May 6, 2026

ssmmnn11 changed the title ~~Add transport diffusion and stochastic interpolant~~ feat: add transport diffusion and stochastic interpolant May 6, 2026

ssmmnn11 added training models ATS Approval Needed Approval needed by ATS labels May 6, 2026

ssmmnn11 added 3 commits May 6, 2026 15:05

fixes

dfba7b4

test fix

9161041

Merge remote-tracking branch 'origin/main' into feat/transport-si-dif…

8350f9b

…fusion # Conflicts: # models/src/anemoi/models/models/transport_encoder_processor_decoder.py

Merge branch 'main' into feat/transport-si-diffusion

230ade4

ssmmnn11 changed the title ~~feat: add transport diffusion and stochastic interpolant~~ feat(training, models): add transport diffusion and stochastic interpolant May 7, 2026

ssmmnn11 added 3 commits May 7, 2026 12:58

Merge remote-tracking branch 'origin/main' into feat/transport-si-dif…

f985fd3

…fusion # Conflicts: # models/src/anemoi/models/models/transport_encoder_processor_decoder.py # models/src/anemoi/models/samplers/transport_samplers.py # training/src/anemoi/training/train/methods/diffusion.py

Merge remote-tracking branch 'origin/feat/transport-si-diffusion' int…

1f940bc

…o feat/transport-si-diffusion

ssmmnn11 requested a review from JoffreyDumontLeBrazidec May 9, 2026 22:47

mchantry changed the title ~~feat(training, models): add transport diffusion and stochastic interpolant~~ feat(training, models)!: add transport diffusion and stochastic interpolant May 13, 2026

mchantry added ATS Approved Approved by ATS and removed ATS Approval Needed Approval needed by ATS labels May 13, 2026

Fix compact transport conditioning

37cf6f7

JoffreyDumontLeBrazidec approved these changes May 14, 2026


JoffreyDumontLeBrazidec previously approved these changes May 14, 2026


github-project-automation Bot moved this from To be triaged to For merging in Anemoi-dev May 14, 2026

ssmmnn11 added 2 commits May 18, 2026 13:12

incoporate reviewer feedback

665436b

fix docu according to reviewer feedback

d4f5cec

ssmmnn11 dismissed JoffreyDumontLeBrazidec’s stale review via d4f5cec May 18, 2026 13:20

ssmmnn11 added 2 commits May 18, 2026 13:26

Merge remote-tracking branch 'origin/main' into feat/transport-si-dif…

8f5ea69

…fusion # Conflicts: # training/src/anemoi/training/config/training/diffusion.yaml

test clean-up

af2ad75

JoffreyDumontLeBrazidec self-requested a review May 18, 2026 14:19

JoffreyDumontLeBrazidec previously approved these changes May 18, 2026


ssmmnn11 added 3 commits May 18, 2026 15:28

Merge branch 'main' into feat/transport-si-diffusion

caa37d6

Merge remote-tracking branch 'origin/main' into feat/transport-si-dif…

2bd3b4a

…fusion # Conflicts: # training/src/anemoi/training/config/training/diffusion.yaml

config fix

5e1db67

JPXKQX reviewed May 20, 2026


ssmmnn11 added 2 commits May 20, 2026 14:19

clean-up

3f08741

rename and transport source factory cleanup

ed220bf

ssmmnn11 dismissed JoffreyDumontLeBrazidec’s stale review via ed220bf May 20, 2026 16:01

ssmmnn11 added 3 commits May 21, 2026 15:34

defaults clean-up

c3a971a

Merge remote-tracking branch 'origin/main' into feat/transport-si-dif…

405ec0f

…fusion

Merge branch 'main' into feat/transport-si-diffusion

ad5f90d


feat(training, models)!: add transport diffusion and stochastic interpolant #1096
ssmmnn11 wants to merge 21 commits into main from feat/transport-si-diffusion


5 participants
