Optimize AkimaInterpolation constructor (3.6-4.1x) and add sorted-batch evaluator (8-9x)#526
Merged
ChrisRackauckas merged 4 commits intoMay 15, 2026
Conversation
Adds a `modified::Bool = false` keyword argument to `AkimaInterpolation` that switches the slope computation at the knots to the modified Akima (makima) formulation introduced by MathWorks (https://blogs.mathworks.com/cleve/2019/04/29/makima-piecewise-cubic-interpolation/). This is the same data as PR SciML#524 proposed adding via a separate set of `makima_*` helper functions, but integrated through the existing `AkimaInterpolation` type instead. The struct, evaluation, derivative, and integral paths are unchanged — only the formula for the slope coefficients `b` differs when `modified = true`. A simple-average fallback parallel to the original Akima protects the rare case where all four neighboring slopes vanish. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
The CI's spell checker flags the abbreviated `A_mak` as a likely typo (suggests `make`/`mask`). Use the full word to make the test names explicit and pass the check. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
The MathWorks blog returns 403 to the Documenter linkchecker (bot filtering), even though the URL is reachable in a browser. Add it to linkcheck_ignore alongside the existing moler/interp.pdf entry so the docs build doesn't fail on this PR's added docstring reference. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Two AkimaInterpolation-specific performance improvements that together capture the speed wins from PR SciML#524, applied to both the standard Akima and the modified (makima) formula: 1. Scalar in-place constructor kernel `_akima_init!`. Replaces the vectorized `diff`, `abs.+`, `findall`, and broadcasted-divide intermediates with a single pass over a length-(n+3) divided-difference buffer, dropping the constructor from ~36 allocations to 7 and giving a 3.6-4.1x wall-time speedup across n = 32..4096. 2. Sorted-batch callable `(A::AkimaInterpolation)(out::AbstractVector, tt::AbstractVector)`. When `tt` is sorted and the extrapolation modes are simple (None, Constant, Linear, Extension), walks the knots and queries in lockstep in O(n+m) instead of running a per-query binary search via the iguesser, giving an 8-9x speedup on batched evaluation. Falls back to `map!` when `tt` is unsorted or the extrapolation requires Periodic/Reflective wrapping. The struct layout, scalar `_interpolate`, derivative, integral, AD, and plotting paths are unchanged. Coefficients agree with the previous implementation to floating-point round-off. Benchmarks (Julia 1.10, n knots, m sorted queries): Constructor before after speedup n=32 2.95us 36a 0.81us 7a 3.6x n=256 14.97us 37a 3.63us 7a 4.1x n=4096 136.23us 66a 37.50us 11a 3.6x Batched eval before after speedup n=32, m=512 15.84us 1.75us 9.0x n=32, m=4096 119.32us 13.50us 8.8x n=256, m=512 18.41us 2.30us 8.0x n=256, m=4096 124.18us 14.36us 8.6x Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
20f9ad2 to
b92c4d4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two AkimaInterpolation-specific perf improvements that together capture the speed wins from #524, applied to both standard Akima and the modified (makima) formula added in #525:
_akima_init!. Replaces the vectorizeddiff,abs.+,findall, and broadcasted-divide intermediates with a single pass over a length-(n+3) divided-difference buffer. The struct layout is unchanged.(A::AkimaInterpolation)(out::AbstractVector, tt::AbstractVector). Whenttis sorted and the extrapolation modes are simple (None,Constant,Linear,Extension), walks the knots and queries in lockstep in O(n+m) instead of running a per-query binary search. Falls back tomap!whenttis unsorted or the extrapolation requiresPeriodic/Reflectivewrapping.The scalar
_interpolate, derivative, integral, AD, and plotting paths are unchanged. Coefficients agree with the previous implementation to floating-point round-off (covered by the existing tests).Stacking
This PR is stacked on top of #525. It branches off
makima-via-akima-kwargand the optimized constructor handles both the standard Akima andmodified = truepaths. If #525 is reviewed first I'll rebase this onto master once it lands. If reviewers prefer this lands first, that's also fine — I can drop themodifiedbranch in_akima_init!and the change to #525 becomes a trivial follow-up.Benchmarks
Julia 1.10, BenchmarkTools
minimum:Memory reduction in the constructor: ~5-9x smaller working set (9.4 KB → 1.3 KB at n=32; 934 KB → 128 KB at n=4096).
These don't quite reach the 12-16x in #524's benchmark figure — the remaining gap is from #524 also fusing the divided-difference buffer into a sliding window (no
mallocation), which I left as a buffer for clarity. The arithmetic cost of recomputing slopes twice (once forwmax, once forb) was a worse trade than keeping the single allocation.What is in this PR
src/interpolation_caches.jl: replaces the body of theAkimaInterpolationconstructor with a call to_akima_init!(b, c, d, u, t, Val(modified)). The new helper does a two-pass scalar sweep: first pass computes the maximum weight to set the small-weight cutoff (preserves the existingf12 > 1e-9 * maximum(f12)semantics exactly), second pass writes theb,c,darrays in place.src/interpolation_methods.jl: adds the(A::AkimaInterpolation{<:AbstractVector})(out, tt)callable, a_akima_eval_fast_applicablepredicate, and the inner_akima_eval_sorted!that handles each extrapolation type explicitly in three regions (left ext, interior, right ext).test/interpolation_tests.jl: new "Sorted-batch evaluator" sub-testset covering both akima and makima paths, all four fast-path extrapolation modes paired left × right, the Periodic/Reflective fallback, theNonethrowing behavior on out-of-range sorted queries, and the unsorted-input fallback.Test plan
Pkg.test()all groups pass locally on Julia 1.10 (Core 1701 / Methods 42151 / Extensions 13178 / Misc 11)Please ignore until reviewed by @ChrisRackauckas.
🤖 Generated with Claude Code