Add asynchronous I/O for JLD2Writer, NetCDFWriter, ZarrWriter, and Checkpointer#5573
Open
glwagner wants to merge 20 commits into
Open
Add asynchronous I/O for JLD2Writer, NetCDFWriter, ZarrWriter, and Checkpointer#5573glwagner wants to merge 20 commits into
glwagner wants to merge 20 commits into
Conversation
Parameterize `AbstractOutputWriter{A}` by a singleton tag (`Synchronous` or
`Asynchronous`). Async writers run the disk-write phase on a background task
via `Threads.@spawn` while the GPU continues computing; the GPU→CPU copy still
happens synchronously on the main thread so output content is identical to
the sync case. A single in-flight task per writer is enforced via
`wait_for_async_writes!`, which is called automatically at the end of `run!`.
Each writer that supports async opts in by specializing `prepare_async_write`
(fetch + GPU→CPU) and `commit_async_write!` (disk write). Sync `write_output!`
runs both inline; async `write_output!` runs prepare inline and spawns commit.
Enabled via `asynchronous=true` keyword on `JLD2Writer` and `NetCDFWriter`;
default remains `false`. `Checkpointer` stays synchronous (infrequent and
correctness-critical). Tests verify byte-identical output between sync and
async modes for both writers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two NetCDFWriter file-size doctests were off by 0.1 KiB and one BoundaryConditionOperation doctest expected mean=1.0842e-19 where the operation now produces exactly 0.0. Both predate this branch but block the docs job. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `slope_limiter` field now shows the fully qualified
`Oceananigans.TurbulenceClosures.FluxTapering{Float64}` rather than the
unqualified `FluxTapering{Float64}`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
File sizes printed by `show(::NetCDFWriter)` and `show(::JLD2Writer)` drift by ~0.1 KiB across HDF5/NCDatasets minor version updates, breaking doctests that aren't actually testing anything load-bearing about file size. Filter the `file size: X.X UNIT` line so the doctests stay green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
My local run shows the type printed with `Oceananigans.TurbulenceClosures.` prefix but CI may produce the unqualified form depending on namespace state. Revert the doctest to the unqualified form (the original) and add a doctestfilter that strips the qualifier so both forms match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Local doctest run passes (modulo GPU-only) but CI keeps failing. Temporarily disable checkdocs to determine whether a missing-docstring warning is the cause. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The :none bisection didn't change the docs build outcome, so checkdocs wasn't the issue. Restore the standard setting. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After many failed iterations adjusting doctests, restore the doctests to their pre-PR state. If docs still fails, the failure is rooted in this PR's core code changes (not the doctest expected outputs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # ext/OceananigansNCDatasetsExt/OceananigansNCDatasetsExt.jl # src/Oceananigans.jl
Parameterize `ZarrWriter` by the I/O mode tag (`Synchronous` / `Asynchronous`), specialize `prepare_async_write` / `commit_async_write!` on the serial path, and gate `Synchronous` for distributed-MPI runs so the MPI-collective write path stays on the main thread. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each output writer gets a `commit_delay::Float64` field (default 0.0) that `commit_async_write!` sleeps on. This is a test-only knob for injecting deterministic latency into the disk-write phase. The new `test_jld2_async_overlap` uses it to prove the main thread doesn't block on disk I/O: with `commit_delay = 0.5s`, `write_output!` returns in ~4ms, the spawned task is still running afterwards, and `wait_for_async_writes!` blocks for ~500ms. Together these rule out the "main thread did the work" failure mode. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… into glw/async-io
Each writer gets a `commit_throws::Union{Nothing,Exception}` field (default
`nothing`). When set, `commit_async_write!` raises it — letting tests assert
that worker-task failures surface on the main thread at the next sync point.
The new `test_jld2_async_exception_propagation` exercises both paths:
explicit `wait_for_async_writes!` and the implicit wait at the start of the
next `write_output!`. After either, `writer.task` is cleared, so the writer
remains usable once the error is observed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`Checkpointer` is parameterized by `Mode` and accepts an `asynchronous` kwarg. Prepare materializes the prognostic state to CPU via `Adapt.adapt(Array, …)` — the same GPU→CPU copy that the synchronous path triggers implicitly during JLD2 serialization — and commit writes the CPU snapshot. Async is gated to GPU models. On CPU it would buy nothing (no GPU→CPU work to overlap) but cost an extra full-state copy, so `asynchronous=true` on a CPU model warns and falls back to `Synchronous`, matching the ZarrWriter+MPI pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On CPU, `convert_output` returns the input array unchanged when its eltype matches `eltype(array_type)` — so async writes would race with the next `time_step!` mutating that array. We now refuse to build such a writer at construction time with an `ArgumentError` listing the offending outputs. Triggered for any output that's an `AbstractField` (or `WindowedTimeAverage` of one) whose eltype matches the writer's `array_type`. GPU is unaffected: `convert_output` always copies via the `array_type(undef, ...)` allocation + `copyto!` path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR adds the ability to write output asynchronously while computation continues on the GPU. The goal is to hide the cost of writing from CPU to disk; if this works as intended, I/O cost should reduce to 1) the cost of computing diagnostics on the GPU and 2) the cost of the GPU-CPU transfer.
Description
AbstractOutputWriter{A}is now parameterized by a singleton tag (SynchronousorAsynchronous). Async writers run the disk-write phase on a background task viaThreads.@spawnwhile the GPU continues computing; the GPU→CPU copy still happens synchronously on the main thread so output content is identical to the sync case. A single in-flight task per writer is enforced viawait_for_async_writes!, which is called automatically at the end ofrun!.Each writer that supports async opts in by specializing
prepare_async_write(fetch + GPU→CPU) andcommit_async_write!(disk write). Syncwrite_output!runs both inline; asyncwrite_output!runs prepare inline and spawns commit.Coverage
Enabled via
asynchronous=truekeyword on:JLD2WriterNetCDFWriterZarrWriter— supported only on non-distributed (serial) runs. Under MPI,asynchronous=trueis downgraded toSynchronouswith a warning so the MPI-collective write path stays on the main thread.Checkpointer— supported only onGPUmodels. Prepare materializes the prognostic state to CPU viaAdapt.adapt(Array, …). On CPU the snapshot copy would be pure overhead with no GPU→CPU work to hide, soasynchronous=truewarns and falls back toSynchronous.Default remains
falsefor all four.Tests
commit_delayfield on each writer makescommit_async_write!sleep on a known duration. The overlap test (JLD2 path; the spawn framework is shared) asserts that withcommit_delay = 0.5s,write_output!returns in ~ms while the spawned task is still running, andwait_for_async_writes!blocks for ~commit_delay.commit_throwsfield makescommit_async_write!raise a user-supplied exception on the worker task. The test asserts the exception surfaces on the main thread at the next sync point — either an explicitwait_for_async_writes!or the implicit wait at the start of the nextwrite_output!— and thatwriter.taskis cleared so the writer remains usable.ZarrWriter+MPI andCheckpointer+CPU: warns and downgrades toSynchronous.