Simplify Relaxation: drop ContinuousForcing dependency, add transform and Field targets#5620
Simplify Relaxation: drop ContinuousForcing dependency, add transform and Field targets#5620glwagner wants to merge 12 commits into
Conversation
… and Field targets Eliminates the two parallel materialize paths Relaxation had after #5575 (Number/function → ContinuousForcing wrap, FTS → bespoke FieldTimeSeriesTarget with type-parameter-as-index trick). Now a single Relaxation{R,M,T,F,L,Tr} carries the forced field directly, with one kernel and one evaluate_target dispatch family covering Number, function, AbstractField, and FieldTimeSeries targets. - src/OutputReaders/field_time_series.jl: GPUAdaptedFieldTimeSeries now carries its grid (was dropped to Nothing before), so the kernel can call interpolate(X, t, fts, loc, fts.grid) post-adapt without any wrapper. - src/Forcings/relaxation.jl: rewrite around a 6-param Relaxation; delete FieldTimeSeriesTarget, the type-parameter index trick, and 6 continuous-form callable overloads that only existed to feed ContinuousForcing. - Add `transform` kwarg: `transform = :horizontal_average` or any `f -> Field(...)` closure builds a lazy target from the forced field, refreshed each step via the new compute_forcing! hook. - Add `FieldRelaxation` and `FieldTimeSeriesRelaxation` aliases for tests and user-side type assertions. - src/Forcings/compute_forcing.jl: new compute_forcing! hook (default no-op, Tuple/NamedTuple/MultipleForcings recursion, Relaxation method that compute!'s a transformed target). - Wire compute_forcing!(model.forcing) into update_state! for Nonhydrostatic, HydrostaticFreeSurface, and ShallowWater models. Tests: 1215/1215 Forcings pass on CPU (incl. new Field-target and transform=:horizontal_average testsets — analytical decay matches exp(-t/τ) to 8 decimal places, confirming compute_forcing! fires each step). 11540/11540 OutputReaders pass, confirming the GPUAdaptedFTS grid addition is non-breaking. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@simone-silvestri @ewquon this is the refactor that we discussed on #5575 |
|
Thanks for taking the lead on this @glwagner |
run_distributed_simulation in test/distributed_tests_utils.jl drives the serial-vs-distributed correctness checks in test_mpi_tripolar.jl by running 100 time steps of a HydrostaticFreeSurfaceModel with SplitExplicitFreeSurface (20 substeps) on a 40x40 tripolar grid, then comparing serial and distributed final states. This runs four times per fold topology (serial reference + slab + pencil + large-pencil) for each of two fold topologies, dominating the ~90-minute Distributed MPI Tripolar Grid CI job. These are partition-equivalence checks, not stability checks. Boundary exchanges and the fold are exercised every step, so 30 steps is enough to catch any mismatch in the slab/pencil halo plumbing without buying long-term integration time we are not testing. (Pushed onto #5620 per request despite scope mismatch with the Relaxation refactor; intent is to ship it on whichever PR merges first.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Eliot Quon <eliot@aeolus.earth>
|
This should be good to go after addressing the |
…gets - `transform` now produces the *quantity being damped* (the kernel LHS), and the user-supplied `target` is the RHS the transform is pulled toward. The tendency becomes `rate * mask * (target(X, t) - transform(c)[i, j, k])`, so `transform = :horizontal_average, target = c_target` nudges only the horizontal mean toward `c_target` while leaving fluctuations alone. `materialize_forcing` stores the lazy `transform(field)` in `r.field`, and `compute_forcing!(r)` refreshes it via `compute!(r.field)`. - `evaluate_target` learns to interpolate `AbstractField` targets whose grid or location differs from the forced field's. `materialize_target` wraps such targets in a new `InterpolatedFieldTarget` that carries `loc` and `grid` separately so `Adapt`ing through the kernel doesn't lose the metadata that `interpolate(X, field, loc, grid)` needs. Same-grid + same-location (or reduced) Field targets stay unwrapped on the direct-indexing fast path. - Right-align the `show` properties at the colon (`├── rate`, `├── mask`, `└── target`) and surface `transform` when set.
- Drop the `@test_throws ArgumentError` cases for location- and
grid-mismatched Field targets; assert they auto-wrap in
`InterpolatedFieldTarget` and that a uniform target still drives the
tracer to `c_ref` after one step.
- Cover the new `transform`-relaxes-field semantics: with
`transform = :horizontal_average, target = c_target` the horizontal
mean follows `<c>(t) = c_target (1 - exp(-t/τ))` while fluctuations
are preserved; also test a `LinearTarget{:z}` profile and the closure
form.
Add a "Relaxing a reduction of the field toward a target" section explaining
the new `transform` semantics: `transform` produces the lazy reduction that
is damped, the user-supplied `target` is the RHS, and the tendency is
`rate * mask * (target - transform(c)[i, j, k])`. Two jldoctests cover the
`:horizontal_average` shortcut (with a `LinearTarget{:z}` profile) and a
custom `xz`-averaging callable.
Refactor the Relaxation `show` and `summary` methods to print any non-`nothing` extras after `target`, with the tree connectors (`├──`/`└──`) derived from position. Currently `location` and `transform` both qualify, and the sponge-layer doctests in `forcing_functions.md` are updated to include the new `location` line.
|
This refactor adds the grid as an field to the GPUAdaptedFieldTimeSeries. Will this create performance or parameter space issues? |
It could. Another option is to have a special adaptor for |
…bstractArray`
* Rename the field that holds the kernel LHS from `field` to `relaxed` (it
is the forced field when `transform === nothing`, otherwise the lazy
`transform(forced_field)`). All references in `relaxation.jl`,
`compute_forcing.jl`, `test_forcings.jl`, and the docstring/docs follow.
* Widen the kernel callable's dispatch constraint from `F<:AbstractField`
to `F<:AbstractArray`. `Adapt.adapt_structure(to, f::Field) = adapt(to,
f.data)` strips the wrapping `Field` on the device, so `r.relaxed`
arrives at the GPU kernel as an `OffsetArray{…, CuDeviceArray}` —
triggering `MethodError`/`InvalidIRError` in
`gpu_compute_Gu!`/`gpu_compute_Gc!` for every Relaxation testset under
`time_stepping_2`. Both `Field` and `OffsetArray` are `<:AbstractArray`,
so the relaxed bound matches pre- and post-`Adapt`.
* Rework `show`/`summary`: render `Relaxation{$FT}` for a freshly-
constructed Relaxation and `Relaxation{$FT, $LX, $LY, $LZ}` once
materialized; surface the `relaxed` field via `prettysummary` and
`transform` when set. Doctests in `relaxation.jl` and the sponge-layer
examples in `forcing_functions.md` are updated to match.
|
right, I guess this is the same scheme adopted by |
ok, i'll make that change |
…meSeriesTarget
The previous approach added a `grid::G` field on `GPUAdaptedFieldTimeSeries`
so the relaxation kernel could call `interpolate(X, Time(t), fts, loc, fts.grid)`
on the device. That changed the GPU-adapted FTS from `AbstractField{...,Nothing,...}`
to `AbstractField{...,G,...}`, which is invasive for code that expects FTS to
have no grid on the GPU.
Instead, introduce a lean `FieldTimeSeriesTarget{F, G}` wrapper that holds the
FTS together with its grid (cached at materialize time so it survives Adapt).
`materialize_forcing` now wraps `forcing.target` in this struct for the FTS
path, and `evaluate_target(::FieldTimeSeriesTarget, ...)` reads the grid from
the wrapper.
- Revert `GPUAdaptedFieldTimeSeries` to its original 8-param form
- Add `FieldTimeSeriesTarget` with `Adapt.adapt_structure` and `summary`
- Widen `FieldTimeSeriesRelaxation` alias to match both `FlavorOfFTS` and
`FieldTimeSeriesTarget` so it covers pre- and post-materialization
- Update FTS testset: `rm.target` is now a `FieldTimeSeriesTarget`, so
assert `rm.target.field_time_series === fts` and `rm.target.grid === fts.grid`;
drop the now-disjoint `rm isa FieldRelaxation` assertion
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Type parameters reordered from `{R, M, T, F, L, Tr}` to `{R, F, M, T, L, Tr}`
so the kernel-dispatch parameter (`F<:AbstractArray` for the `relaxed`
quantity) sits second. The struct field order and `Adapt.adapt_structure`
follow suit. Show output now mirrors Field's style: `Relaxation{FT}` for
freshly-constructed forcings and `Relaxation{FT} at (LX, LY, LZ)` once
materialized, with `show_location` reused from Fields and properties
right-aligned under the header. The `relaxed` property is placed second,
right after `rate`, matching its new type-parameter position.
- Simplify aliases: `FieldRelaxation = Relaxation{<:Any, <:Any, <:Any, <:AbstractField}`
(and `FieldTimeSeriesRelaxation` likewise) — no need to bind every
type variable explicitly when only the constraint matters
- Left-align `FieldTimeSeriesTarget`'s `grid` field
- Update jldoctest expected output in `relaxation.jl` and the three
sponge-layer doctests in `forcing_functions.md`
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ditionOperation Three doctests were comparing against outputs that drifted with library updates: two NetCDF file-size strings (32.8→32.7 KiB and 34.4→34.3 KiB) and a floating-point reduction mean (1.0842e-19→0.0) where the new reduction order yields exact zero. Updated the expected text to match the current evaluated output so `Documenter.doctest` is green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Eliminates the two parallel materialize paths
Relaxationhad after #5575 — Number/function targets went throughContinuousForcing, whileFieldTimeSeriestargets used a bespokeFieldTimeSeriesTargetwrapper with a type-parameter-as-index trick and its own discrete-form callable. The asymmetry forced ~80 lines of duplicated machinery just because the FTS path couldn't fit throughContinuousForcing's parameter slot.Now there's one
Relaxation{R,M,T,F,L,Tr}that carries the forced field directly, one kernel callable, and oneevaluate_targetdispatch family handling all four target types:Number, function,AbstractField,FieldTimeSeries. As discussed in #5575 (#5575 (comment)...), storing the forced field directly is the underlying simplification that unlocks both the unified machinery and the newtransformfeature.What changed
Storage of FTS grid (real fix, not workaround)
GPUAdaptedFieldTimeSeriesnow carries its grid (previously dropped toNothing). The kernel can callinterpolate(X, t, fts, loc, fts.grid)post-adapt without any wrapper. This deletes the entireFieldTimeSeriesTargetmachinery (struct + type-param index trick + helper + custom Adapt) from #5575 in favor of a one-field change inOutputReaders.transformkwargtransformbuilds a lazy target from the forced field at materialize time. Refreshed each step via the newcompute_forcing!hook, which is wired intoupdate_state!for all three model types.Field targets
Relaxation(rate=1/τ, target=my_field)now works for anyAbstractFieldsharing the forced field's grid + location. Validation throws clear errors on mismatch.Type aliases
FieldRelaxationandFieldTimeSeriesRelaxationfor tests and user-side type assertions.Deletions
FieldTimeSeriesTarget{L,F,G,I}and_field_indexhelpermaterialize_forcing(::Relaxation, …)ContinuousForcing wrapper(::Relaxation)(x,y,z,t,field)/(x₁,x₂,t,field)/(x₁,t,field)continuous-form callable overloadsusing …: ContinuousForcingdep fromrelaxation.jlFiles
src/OutputReaders/field_time_series.jl— +1 grid field onGPUAdaptedFieldTimeSeriessrc/Forcings/relaxation.jl— rewrite (185 lines changed, net -10)src/Forcings/compute_forcing.jl— new (15 lines): default no-op + Tuple/NamedTuple/MultipleForcings recursion + Relaxation methodsrc/Forcings/Forcings.jl— include + exportcompute_forcing!src/Models/{Nonhydrostatic,HydrostaticFreeSurface,ShallowWater}Models/update_*_state.jl— callcompute_forcing!(model.forcing)afterupdate_model_field_time_series!test/test_forcings.jl— updated FTS testset assertions, new Field-target testset, newtransform=:horizontal_averagetestset (analytical decay verified)docs/src/models/forcing_functions.md— doctest updatedTest plan
test_forcings.jlon CPU: 1215/1215 pass (+9 new tests covering Field target validation, location/grid mismatch errors, transform decay tracking, transform+FTS rejection, closure-form equivalence)test_output_readers.jlon CPU: 11540/11540 pass — confirmsGPUAdaptedFTSgrid addition is non-breakingtransform=:horizontal_averagedecay factor matchesexp(-t/τ)to 8 decimal places, confirmingcompute_forcing!fires each steprelaxation.jlandforcing_functions.mdverified manuallyFollow-ups
transform-shortcuts beyond:horizontal_average(e.g.:vertical_average,:domain_average) — trivial 1-lineapply_transformmethods, add on demandAbstractFieldtargets — currently FTS-only (validator enforces matching grid+location for plain Field targets); the kernel path already supports it viaevaluate_target(::AbstractArray, …), just needs a different validator🤖 Generated with Claude Code