Skip to content

Simplify Relaxation: drop ContinuousForcing dependency, add transform and Field targets#5620

Open
glwagner wants to merge 12 commits into
mainfrom
glw/simplify-relaxation
Open

Simplify Relaxation: drop ContinuousForcing dependency, add transform and Field targets#5620
glwagner wants to merge 12 commits into
mainfrom
glw/simplify-relaxation

Conversation

@glwagner
Copy link
Copy Markdown
Member

Summary

Eliminates the two parallel materialize paths Relaxation had after #5575 — Number/function targets went through ContinuousForcing, while FieldTimeSeries targets used a bespoke FieldTimeSeriesTarget wrapper with a type-parameter-as-index trick and its own discrete-form callable. The asymmetry forced ~80 lines of duplicated machinery just because the FTS path couldn't fit through ContinuousForcing's parameter slot.

Now there's one Relaxation{R,M,T,F,L,Tr} that carries the forced field directly, one kernel callable, and one evaluate_target dispatch family handling all four target types: Number, function, AbstractField, FieldTimeSeries. As discussed in #5575 (#5575 (comment)...), storing the forced field directly is the underlying simplification that unlocks both the unified machinery and the new transform feature.

What changed

Storage of FTS grid (real fix, not workaround)

GPUAdaptedFieldTimeSeries now carries its grid (previously dropped to Nothing). The kernel can call interpolate(X, t, fts, loc, fts.grid) post-adapt without any wrapper. This deletes the entire FieldTimeSeriesTarget machinery (struct + type-param index trick + helper + custom Adapt) from #5575 in favor of a one-field change in OutputReaders.

transform kwarg

# Relax tracer to its own horizontal average:
Relaxation(rate=1/τ, transform=:horizontal_average)

# Or supply your own closure:
Relaxation(rate=1/τ, transform = f -> Field(Average(f, dims=(1, 2))))

transform builds a lazy target from the forced field at materialize time. Refreshed each step via the new compute_forcing! hook, which is wired into update_state! for all three model types.

Field targets

Relaxation(rate=1/τ, target=my_field) now works for any AbstractField sharing the forced field's grid + location. Validation throws clear errors on mismatch.

Type aliases

FieldRelaxation and FieldTimeSeriesRelaxation for tests and user-side type assertions.

Deletions

  • FieldTimeSeriesTarget{L,F,G,I} and _field_index helper
  • materialize_forcing(::Relaxation, …) ContinuousForcing wrapper
  • Six (::Relaxation)(x,y,z,t,field) / (x₁,x₂,t,field) / (x₁,t,field) continuous-form callable overloads
  • using …: ContinuousForcing dep from relaxation.jl

Files

  • src/OutputReaders/field_time_series.jl — +1 grid field on GPUAdaptedFieldTimeSeries
  • src/Forcings/relaxation.jl — rewrite (185 lines changed, net -10)
  • src/Forcings/compute_forcing.jl — new (15 lines): default no-op + Tuple/NamedTuple/MultipleForcings recursion + Relaxation method
  • src/Forcings/Forcings.jl — include + export compute_forcing!
  • src/Models/{Nonhydrostatic,HydrostaticFreeSurface,ShallowWater}Models/update_*_state.jl — call compute_forcing!(model.forcing) after update_model_field_time_series!
  • test/test_forcings.jl — updated FTS testset assertions, new Field-target testset, new transform=:horizontal_average testset (analytical decay verified)
  • docs/src/models/forcing_functions.md — doctest updated

Test plan

  • test_forcings.jl on CPU: 1215/1215 pass (+9 new tests covering Field target validation, location/grid mismatch errors, transform decay tracking, transform+FTS rejection, closure-form equivalence)
  • test_output_readers.jl on CPU: 11540/11540 pass — confirms GPUAdaptedFTS grid addition is non-breaking
  • Analytical verification: transform=:horizontal_average decay factor matches exp(-t/τ) to 8 decimal places, confirming compute_forcing! fires each step
  • Doctests in relaxation.jl and forcing_functions.md verified manually
  • GPU run via CI

Follow-ups

  • transform-shortcuts beyond :horizontal_average (e.g. :vertical_average, :domain_average) — trivial 1-line apply_transform methods, add on demand
  • Cross-grid AbstractField targets — currently FTS-only (validator enforces matching grid+location for plain Field targets); the kernel path already supports it via evaluate_target(::AbstractArray, …), just needs a different validator

🤖 Generated with Claude Code

… and Field targets

Eliminates the two parallel materialize paths Relaxation had after #5575
(Number/function → ContinuousForcing wrap, FTS → bespoke FieldTimeSeriesTarget
with type-parameter-as-index trick). Now a single Relaxation{R,M,T,F,L,Tr}
carries the forced field directly, with one kernel and one evaluate_target
dispatch family covering Number, function, AbstractField, and FieldTimeSeries
targets.

- src/OutputReaders/field_time_series.jl: GPUAdaptedFieldTimeSeries now
  carries its grid (was dropped to Nothing before), so the kernel can call
  interpolate(X, t, fts, loc, fts.grid) post-adapt without any wrapper.
- src/Forcings/relaxation.jl: rewrite around a 6-param Relaxation; delete
  FieldTimeSeriesTarget, the type-parameter index trick, and 6 continuous-form
  callable overloads that only existed to feed ContinuousForcing.
- Add `transform` kwarg: `transform = :horizontal_average` or any
  `f -> Field(...)` closure builds a lazy target from the forced field,
  refreshed each step via the new compute_forcing! hook.
- Add `FieldRelaxation` and `FieldTimeSeriesRelaxation` aliases for tests and
  user-side type assertions.
- src/Forcings/compute_forcing.jl: new compute_forcing! hook (default no-op,
  Tuple/NamedTuple/MultipleForcings recursion, Relaxation method that
  compute!'s a transformed target).
- Wire compute_forcing!(model.forcing) into update_state! for
  Nonhydrostatic, HydrostaticFreeSurface, and ShallowWater models.

Tests: 1215/1215 Forcings pass on CPU (incl. new Field-target and
transform=:horizontal_average testsets — analytical decay matches exp(-t/τ)
to 8 decimal places, confirming compute_forcing! fires each step).
11540/11540 OutputReaders pass, confirming the GPUAdaptedFTS grid addition
is non-breaking.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@glwagner glwagner requested review from ewquon and simone-silvestri and removed request for simone-silvestri May 21, 2026 21:08
@glwagner
Copy link
Copy Markdown
Member Author

@simone-silvestri @ewquon this is the refactor that we discussed on #5575

@ewquon
Copy link
Copy Markdown
Collaborator

ewquon commented May 21, 2026

Thanks for taking the lead on this @glwagner

Comment thread src/Forcings/relaxation.jl Outdated
Comment thread src/Forcings/relaxation.jl Outdated
run_distributed_simulation in test/distributed_tests_utils.jl drives the
serial-vs-distributed correctness checks in test_mpi_tripolar.jl by running
100 time steps of a HydrostaticFreeSurfaceModel with SplitExplicitFreeSurface
(20 substeps) on a 40x40 tripolar grid, then comparing serial and distributed
final states. This runs four times per fold topology (serial reference + slab
+ pencil + large-pencil) for each of two fold topologies, dominating the
~90-minute Distributed MPI Tripolar Grid CI job.

These are partition-equivalence checks, not stability checks. Boundary
exchanges and the fold are exercised every step, so 30 steps is enough to
catch any mismatch in the slab/pencil halo plumbing without buying long-term
integration time we are not testing.

(Pushed onto #5620 per request despite scope mismatch with the Relaxation
refactor; intent is to ship it on whichever PR merges first.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/Forcings/relaxation.jl
Comment thread src/Forcings/relaxation.jl
Co-authored-by: Eliot Quon <eliot@aeolus.earth>
@ewquon
Copy link
Copy Markdown
Collaborator

ewquon commented May 21, 2026

This should be good to go after addressing the forcing.target issue #5620 (comment).

glwagner and others added 5 commits May 21, 2026 17:31
…gets

- `transform` now produces the *quantity being damped* (the kernel LHS), and
  the user-supplied `target` is the RHS the transform is pulled toward. The
  tendency becomes `rate * mask * (target(X, t) - transform(c)[i, j, k])`,
  so `transform = :horizontal_average, target = c_target` nudges only the
  horizontal mean toward `c_target` while leaving fluctuations alone.
  `materialize_forcing` stores the lazy `transform(field)` in `r.field`, and
  `compute_forcing!(r)` refreshes it via `compute!(r.field)`.

- `evaluate_target` learns to interpolate `AbstractField` targets whose grid
  or location differs from the forced field's. `materialize_target` wraps
  such targets in a new `InterpolatedFieldTarget` that carries `loc` and
  `grid` separately so `Adapt`ing through the kernel doesn't lose the
  metadata that `interpolate(X, field, loc, grid)` needs. Same-grid +
  same-location (or reduced) Field targets stay unwrapped on the
  direct-indexing fast path.

- Right-align the `show` properties at the colon (`├──   rate`,
  `├──   mask`, `└── target`) and surface `transform` when set.
- Drop the `@test_throws ArgumentError` cases for location- and
  grid-mismatched Field targets; assert they auto-wrap in
  `InterpolatedFieldTarget` and that a uniform target still drives the
  tracer to `c_ref` after one step.
- Cover the new `transform`-relaxes-field semantics: with
  `transform = :horizontal_average, target = c_target` the horizontal
  mean follows `<c>(t) = c_target (1 - exp(-t/τ))` while fluctuations
  are preserved; also test a `LinearTarget{:z}` profile and the closure
  form.
Add a "Relaxing a reduction of the field toward a target" section explaining
the new `transform` semantics: `transform` produces the lazy reduction that
is damped, the user-supplied `target` is the RHS, and the tendency is
`rate * mask * (target - transform(c)[i, j, k])`. Two jldoctests cover the
`:horizontal_average` shortcut (with a `LinearTarget{:z}` profile) and a
custom `xz`-averaging callable.
Refactor the Relaxation `show` and `summary` methods to print any
non-`nothing` extras after `target`, with the tree connectors
(`├──`/`└──`) derived from position. Currently `location` and
`transform` both qualify, and the sponge-layer doctests in
`forcing_functions.md` are updated to include the new `location` line.
@simone-silvestri
Copy link
Copy Markdown
Collaborator

This refactor adds the grid as an field to the GPUAdaptedFieldTimeSeries. Will this create performance or parameter space issues?

@glwagner
Copy link
Copy Markdown
Member Author

This refactor adds the grid as an field to the GPUAdaptedFieldTimeSeries. Will this create performance or parameter space issues?

It could. Another option is to have a special adaptor for Relaxation (we need the grid for interpolation)

…bstractArray`

* Rename the field that holds the kernel LHS from `field` to `relaxed` (it
  is the forced field when `transform === nothing`, otherwise the lazy
  `transform(forced_field)`). All references in `relaxation.jl`,
  `compute_forcing.jl`, `test_forcings.jl`, and the docstring/docs follow.

* Widen the kernel callable's dispatch constraint from `F<:AbstractField`
  to `F<:AbstractArray`. `Adapt.adapt_structure(to, f::Field) = adapt(to,
  f.data)` strips the wrapping `Field` on the device, so `r.relaxed`
  arrives at the GPU kernel as an `OffsetArray{…, CuDeviceArray}` —
  triggering `MethodError`/`InvalidIRError` in
  `gpu_compute_Gu!`/`gpu_compute_Gc!` for every Relaxation testset under
  `time_stepping_2`. Both `Field` and `OffsetArray` are `<:AbstractArray`,
  so the relaxed bound matches pre- and post-`Adapt`.

* Rework `show`/`summary`: render `Relaxation{$FT}` for a freshly-
  constructed Relaxation and `Relaxation{$FT, $LX, $LY, $LZ}` once
  materialized; surface the `relaxed` field via `prettysummary` and
  `transform` when set. Doctests in `relaxation.jl` and the sponge-layer
  examples in `forcing_functions.md` are updated to match.
@simone-silvestri
Copy link
Copy Markdown
Collaborator

right, I guess this is the same scheme adopted by DatasetBackend. However, I like the idea to add the adapt to relaxation so we don't change the behavior for other code paths. With DatasetBackend sometimes I saw the registers explode not still sure why.

@glwagner
Copy link
Copy Markdown
Member Author

right, I guess this is the same scheme adopted by DatasetBackend. However, I like the idea to add the adapt to relaxation so we don't change the behavior for other code paths. With DatasetBackend sometimes I saw the registers explode not still sure why.

ok, i'll make that change

glwagner and others added 2 commits May 22, 2026 09:14
…meSeriesTarget

The previous approach added a `grid::G` field on `GPUAdaptedFieldTimeSeries`
so the relaxation kernel could call `interpolate(X, Time(t), fts, loc, fts.grid)`
on the device. That changed the GPU-adapted FTS from `AbstractField{...,Nothing,...}`
to `AbstractField{...,G,...}`, which is invasive for code that expects FTS to
have no grid on the GPU.

Instead, introduce a lean `FieldTimeSeriesTarget{F, G}` wrapper that holds the
FTS together with its grid (cached at materialize time so it survives Adapt).
`materialize_forcing` now wraps `forcing.target` in this struct for the FTS
path, and `evaluate_target(::FieldTimeSeriesTarget, ...)` reads the grid from
the wrapper.

- Revert `GPUAdaptedFieldTimeSeries` to its original 8-param form
- Add `FieldTimeSeriesTarget` with `Adapt.adapt_structure` and `summary`
- Widen `FieldTimeSeriesRelaxation` alias to match both `FlavorOfFTS` and
  `FieldTimeSeriesTarget` so it covers pre- and post-materialization
- Update FTS testset: `rm.target` is now a `FieldTimeSeriesTarget`, so
  assert `rm.target.field_time_series === fts` and `rm.target.grid === fts.grid`;
  drop the now-disjoint `rm isa FieldRelaxation` assertion

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Type parameters reordered from `{R, M, T, F, L, Tr}` to `{R, F, M, T, L, Tr}`
so the kernel-dispatch parameter (`F<:AbstractArray` for the `relaxed`
quantity) sits second. The struct field order and `Adapt.adapt_structure`
follow suit. Show output now mirrors Field's style: `Relaxation{FT}` for
freshly-constructed forcings and `Relaxation{FT} at (LX, LY, LZ)` once
materialized, with `show_location` reused from Fields and properties
right-aligned under the header. The `relaxed` property is placed second,
right after `rate`, matching its new type-parameter position.

- Simplify aliases: `FieldRelaxation = Relaxation{<:Any, <:Any, <:Any, <:AbstractField}`
  (and `FieldTimeSeriesRelaxation` likewise) — no need to bind every
  type variable explicitly when only the constraint matters
- Left-align `FieldTimeSeriesTarget`'s `grid` field
- Update jldoctest expected output in `relaxation.jl` and the three
  sponge-layer doctests in `forcing_functions.md`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ditionOperation

Three doctests were comparing against outputs that drifted with library
updates: two NetCDF file-size strings (32.8→32.7 KiB and 34.4→34.3 KiB)
and a floating-point reduction mean (1.0842e-19→0.0) where the new
reduction order yields exact zero. Updated the expected text to match
the current evaluated output so `Documenter.doctest` is green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants