Refactor MIRK k_discrete from Matrix to Vector{Vector} storage #458
Open
ChrisRackauckas-Claude wants to merge 4 commits into SciML:master from
Conversation
Three key changes to reduce allocations in the hot path (loss function
and Jacobian evaluation, called every Newton iteration); a combined sketch
follows the list:
1. Remove Logging.with_logger(NullLogger()) wrapper from get_tmp
- The wrapper allocated a closure + task-local context on EVERY call
- get_tmp is called 50+ times per loss evaluation
- The warning it suppressed only occurs during adaptive cache resize
- This single change reduced total solve allocations by ~60%
2. Eliminate array comprehensions in MIRK loss functions
- Replaced `[get_tmp(r, u) for r in residual]` with direct DiffCache
passing to Φ! and recursive_flatten!
- Added _maybe_get_tmp helper for Φ! to handle both DiffCache and
plain array residuals
- Added recursive_flatten!/recursive_flatten_twopoint! overloads for
AbstractVector{<:DiffCache}
3. Fix NoDiffCacheNeeded Φ! to reuse fᵢ_cache instead of similar()
- `similar(fᵢ_cache)` allocated a new vector every call
- fᵢ_cache is a scratch buffer that can be used directly
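A combined minimal sketch of the three changes (helper and field names follow the description above; the exact code in the PR may differ):

```julia
using PreallocationTools: DiffCache, get_tmp

# (1) Call get_tmp directly; the old wrapper built a closure and task-local
#     logger context on every call:
#         Logging.with_logger(Logging.NullLogger()) do
#             get_tmp(dc, u)
#         end
#     becomes just get_tmp(dc, u).

# (2) Instead of materializing `[get_tmp(r, u) for r in residual]`, pass the
#     Vector{<:DiffCache} through and unwrap each entry where it is used:
_maybe_get_tmp(dc::DiffCache, u) = get_tmp(dc, u)  # unwrap a DiffCache buffer
_maybe_get_tmp(x, _) = x                           # plain arrays pass through unchanged

# (3) The NoDiffCacheNeeded Φ! reuses the existing scratch buffer:
#         tmp = similar(fᵢ_cache)   # allocated a fresh vector every call
#     becomes
#         tmp = fᵢ_cache            # fᵢ_cache is scratch and can be written in place
```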
Results (dt=0.1, N=2, MIRK4, non-adaptive):
- Total solve: 3,290 → 1,180 allocations (64% reduction)
- Loss per call: 13,552 → 4,080 bytes (70% reduction)
- Jacobian per call: 20,448 → 8,784 bytes (57% reduction)
Adds allocation regression tests to verify loss function allocations
stay bounded.
Reference: https://discourse.julialang.org/t/boundaryvaluediffeq-jl-reducing-allocations/136255
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…IRKN
The `EvalSol.u[1:end] .= x` and `EvalSol.cache.k_discrete[1:end] .=` patterns
create unnecessary SubArray views under @views. Using copyto! avoids the view
allocation entirely. Applied to MIRK, FIRK, and MIRKN loss functions.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
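A self-contained illustration of the pattern described in this commit (the array names here are made up for the example):

```julia
dest = zeros(5)
src  = ones(5)

# Under @views, the sliced broadcast first wraps `dest[1:end]` in a SubArray
# before broadcasting into it; copyto! writes into the parent array directly.
@views dest[1:end] .= src   # goes through a SubArray view of dest
copyto!(dest, src)          # same result without constructing the view
```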
… test
Add a y_cache field to MIRKCache that pre-allocates the Vector{Vector{T}}
needed by recursive_unflatten! for the primal (Float64) path. This
eliminates the last 144 bytes/call from the `get_tmp.` broadcast allocation.
For the Dual path (ForwardDiff Jacobian computation), it falls back to the
existing broadcast since the element types differ.
Loss function is now fully allocation-free at runtime (0 bytes/call
verified via @timed over 100k calls).
Add zero-allocation QA test in the qa test group.
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
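A minimal sketch of the primal-path idea, assuming a pre-allocated `y_cache::Vector{Vector{Float64}}` field; the function name and layout below are illustrative, not the PR's exact code:

```julia
# Copy the flat state vector u into pre-allocated per-point buffers in place.
# Nothing new is allocated: the outer Vector and every inner Vector already
# exist, which is what removes the remaining per-call broadcast allocation.
function unflatten_primal!(y_cache::Vector{Vector{Float64}}, u::Vector{Float64})
    offset = 0
    for yi in y_cache
        n = length(yi)
        copyto!(yi, 1, u, offset + 1, n)
        offset += n
    end
    return y_cache
end
```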
Change k_discrete from Vector{DiffCache{Matrix{T}}} (N×stage matrix per
mesh interval) to Vector{Vector{DiffCache{Vector{T}}}} (vector of
per-stage DiffCache vectors per mesh interval).
This eliminates all SubArray view allocations from K[:, r] column access
in the collocation inner loop. Stage vectors are now accessed directly as
k_discrete[i][r] instead of via matrix column views.
Changes:
- mirk.jl: k_discrete allocation uses nested vector comprehension
- collocation.jl: All 5 Φ!/Φ methods rewritten with inline matmul loops
using per-stage get_tmp(k_discrete[i][r], u) access
- adaptivity.jl: interp_setup! and sum_stages! use VoV patterns
- interpolation.jl: sum_stages! and EvalSol use VoV patterns
- utils.jl: Add __resize! for Vector{Vector{DiffCache}} and
__maybe_matmul! for AbstractVector{<:AbstractVector} (a plausible shape is
sketched after this list)
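One plausible shape of the inline accumulation behind the rewritten Φ! loops and the vector-of-vectors `__maybe_matmul!` overload; the name and signature below are illustrative, not the PR's exact code:

```julia
# z .= α * Σ_r K[r] * b[r] + β * z, where K[r] is the r-th stage vector on a
# mesh interval (the role previously played by the r-th column of the matrix).
function vov_matmul!(z::AbstractVector, K::AbstractVector{<:AbstractVector},
        b::AbstractVector, α = true, β = false)
    @inbounds for m in eachindex(z)
        acc = zero(eltype(z))
        for r in eachindex(K)
            acc += K[r][m] * b[r]
        end
        z[m] = α * acc + β * z[m]
    end
    return z
end
```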
Loss function is fully allocation-free (0 bytes/call verified via @timed
over 100k calls).
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benchmark Results
Summary
Refactors `k_discrete` storage from `Vector{DiffCache{Matrix{T}}}` (N×stage matrix per mesh interval) to `Vector{Vector{DiffCache{Vector{T}}}}` (vector of per-stage DiffCache vectors). This eliminates all SubArray view allocations from `K[:, r]` column access in the hot collocation inner loop.
Depends on #453 (allocation fixes); this branch includes those commits.
Before (matrix columns)
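A minimal sketch of the old matrix-column access, with dimensions and indices assumed for illustration:

```julia
using PreallocationTools: DiffCache, get_tmp

M, stage, Nmesh, chunk = 2, 3, 10, 4   # assumed sizes for the sketch
u = zeros(M)                            # primal input selects the Float64 buffer
i, r = 1, 2

# One DiffCache-wrapped (M × stage) matrix per mesh interval
k_discrete = [DiffCache(zeros(M, stage), chunk) for _ in 1:Nmesh]

K  = get_tmp(k_discrete[i], u)
kr = @view K[:, r]                      # SubArray column view built on each access
```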
After (vector of vectors)
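And the corresponding vector-of-vectors access, under the same assumed sizes:

```julia
using PreallocationTools: DiffCache, get_tmp

M, stage, Nmesh, chunk = 2, 3, 10, 4
u = zeros(M)
i, r = 1, 2

# A vector of per-stage DiffCache vectors per mesh interval
k_discrete = [[DiffCache(zeros(M), chunk) for _ in 1:stage] for _ in 1:Nmesh]

kr = get_tmp(k_discrete[i][r], u)       # plain Vector; no column view is created
```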
Changes
- `k_discrete` allocation uses nested vector comprehension
- `interp_setup!` and `sum_stages!` use VoV patterns
- `sum_stages!` and `EvalSol` use VoV patterns
- `__resize!` for `Vector{Vector{DiffCache}}` and `__maybe_matmul!` for `AbstractVector{<:AbstractVector}`
Results
- Loss function is fully allocation-free (0 bytes/call verified via `@timed` over 100k calls)
Test plan
🤖 Generated with Claude Code