Optimize AkimaInterpolation constructor (3.6-4.1x) and add sorted-batch evaluator (8-9x) by ChrisRackauckas-Claude · Pull Request #526 · SciML/DataInterpolations.jl

ChrisRackauckas-Claude · 2026-05-15T10:49:41Z

Summary

Two AkimaInterpolation-specific perf improvements that together capture the speed wins from #524, applied to both standard Akima and the modified (makima) formula added in #525:

Scalar in-place constructor kernel _akima_init!. Replaces the vectorized diff, abs.+, findall, and broadcasted-divide intermediates with a single pass over a length-(n+3) divided-difference buffer. The struct layout is unchanged.
Sorted-batch callable (A::AkimaInterpolation)(out::AbstractVector, tt::AbstractVector). When tt is sorted and the extrapolation modes are simple (None, Constant, Linear, Extension), walks the knots and queries in lockstep in O(n+m) instead of running a per-query binary search. Falls back to map! when tt is unsorted or the extrapolation requires Periodic/Reflective wrapping.

The scalar _interpolate, derivative, integral, AD, and plotting paths are unchanged. Coefficients agree with the previous implementation to floating-point round-off (covered by the existing tests).

Stacking

This PR is stacked on top of #525. It branches off makima-via-akima-kwarg and the optimized constructor handles both the standard Akima and modified = true paths. If #525 is reviewed first I'll rebase this onto master once it lands. If reviewers prefer this lands first, that's also fine — I can drop the modified branch in _akima_init! and the change to #525 becomes a trivial follow-up.

Benchmarks

Julia 1.10, BenchmarkTools minimum:

Constructor	before	after	speedup	allocs
n=32	2.95 μs	0.81 μs	3.6x	36 → 7
n=256	14.97 μs	3.63 μs	4.1x	37 → 7
n=4096	136 μs	37.5 μs	3.6x	66 → 11

Batched eval (sorted)	before	after	speedup
n=32, m=512	15.84 μs	1.75 μs	9.0x
n=32, m=4096	119 μs	13.5 μs	8.8x
n=256, m=512	18.41 μs	2.30 μs	8.0x
n=256, m=4096	124 μs	14.36 μs	8.6x

Memory reduction in the constructor: ~5-9x smaller working set (9.4 KB → 1.3 KB at n=32; 934 KB → 128 KB at n=4096).

These don't quite reach the 12-16x in #524's benchmark figure — the remaining gap is from #524 also fusing the divided-difference buffer into a sliding window (no m allocation), which I left as a buffer for clarity. The arithmetic cost of recomputing slopes twice (once for wmax, once for b) was a worse trade than keeping the single allocation.

What is in this PR

src/interpolation_caches.jl: replaces the body of the AkimaInterpolation constructor with a call to _akima_init!(b, c, d, u, t, Val(modified)). The new helper does a two-pass scalar sweep: first pass computes the maximum weight to set the small-weight cutoff (preserves the existing f12 > 1e-9 * maximum(f12) semantics exactly), second pass writes the b, c, d arrays in place.
src/interpolation_methods.jl: adds the (A::AkimaInterpolation{<:AbstractVector})(out, tt) callable, a _akima_eval_fast_applicable predicate, and the inner _akima_eval_sorted! that handles each extrapolation type explicitly in three regions (left ext, interior, right ext).
test/interpolation_tests.jl: new "Sorted-batch evaluator" sub-testset covering both akima and makima paths, all four fast-path extrapolation modes paired left × right, the Periodic/Reflective fallback, the None throwing behavior on out-of-range sorted queries, and the unsorted-input fallback.

Test plan

CI passes (full matrix: 1 × lts × pre, Core/Methods/Extensions/Misc/QA)
Pkg.test() all groups pass locally on Julia 1.10 (Core 1701 / Methods 42151 / Extensions 13178 / Misc 11)
Constructor coefficients match the prior vectorized implementation on the existing Akima testset
Sorted-batch output matches per-point output on randomized sorted/unsorted query sets
Runic-formatted

Please ignore until reviewed by @ChrisRackauckas.

🤖 Generated with Claude Code

Adds a `modified::Bool = false` keyword argument to `AkimaInterpolation` that switches the slope computation at the knots to the modified Akima (makima) formulation introduced by MathWorks (https://blogs.mathworks.com/cleve/2019/04/29/makima-piecewise-cubic-interpolation/). This is the same data as PR SciML#524 proposed adding via a separate set of `makima_*` helper functions, but integrated through the existing `AkimaInterpolation` type instead. The struct, evaluation, derivative, and integral paths are unchanged — only the formula for the slope coefficients `b` differs when `modified = true`. A simple-average fallback parallel to the original Akima protects the rare case where all four neighboring slopes vanish. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>

The CI's spell checker flags the abbreviated `A_mak` as a likely typo (suggests `make`/`mask`). Use the full word to make the test names explicit and pass the check. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>

The MathWorks blog returns 403 to the Documenter linkchecker (bot filtering), even though the URL is reachable in a browser. Add it to linkcheck_ignore alongside the existing moler/interp.pdf entry so the docs build doesn't fail on this PR's added docstring reference. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>

Two AkimaInterpolation-specific performance improvements that together capture the speed wins from PR SciML#524, applied to both the standard Akima and the modified (makima) formula: 1. Scalar in-place constructor kernel `_akima_init!`. Replaces the vectorized `diff`, `abs.+`, `findall`, and broadcasted-divide intermediates with a single pass over a length-(n+3) divided-difference buffer, dropping the constructor from ~36 allocations to 7 and giving a 3.6-4.1x wall-time speedup across n = 32..4096. 2. Sorted-batch callable `(A::AkimaInterpolation)(out::AbstractVector, tt::AbstractVector)`. When `tt` is sorted and the extrapolation modes are simple (None, Constant, Linear, Extension), walks the knots and queries in lockstep in O(n+m) instead of running a per-query binary search via the iguesser, giving an 8-9x speedup on batched evaluation. Falls back to `map!` when `tt` is unsorted or the extrapolation requires Periodic/Reflective wrapping. The struct layout, scalar `_interpolate`, derivative, integral, AD, and plotting paths are unchanged. Coefficients agree with the previous implementation to floating-point round-off. Benchmarks (Julia 1.10, n knots, m sorted queries): Constructor before after speedup n=32 2.95us 36a 0.81us 7a 3.6x n=256 14.97us 37a 3.63us 7a 4.1x n=4096 136.23us 66a 37.50us 11a 3.6x Batched eval before after speedup n=32, m=512 15.84us 1.75us 9.0x n=32, m=4096 119.32us 13.50us 8.8x n=256, m=512 18.41us 2.30us 8.0x n=256, m=4096 124.18us 14.36us 8.6x Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>

ChrisRackauckas added 4 commits May 15, 2026 05:49

Rename A_mak -> A_makima to satisfy typo spell check

3467498

The CI's spell checker flags the abbreviated `A_mak` as a likely typo (suggests `make`/`mask`). Use the full word to make the test names explicit and pass the check. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>

ChrisRackauckas-Claude force-pushed the akima-perf-opts branch from 20f9ad2 to b92c4d4 Compare May 15, 2026 11:24

ChrisRackauckas marked this pull request as ready for review May 15, 2026 13:38

ChrisRackauckas merged commit 96ab7c8 into SciML:master May 15, 2026
15 of 18 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Optimize AkimaInterpolation constructor (3.6-4.1x) and add sorted-batch evaluator (8-9x)#526

Optimize AkimaInterpolation constructor (3.6-4.1x) and add sorted-batch evaluator (8-9x)#526
ChrisRackauckas merged 4 commits into
SciML:masterfrom
ChrisRackauckas-Claude:akima-perf-opts

ChrisRackauckas-Claude commented May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

ChrisRackauckas-Claude commented May 15, 2026

Summary

Stacking

Benchmarks

What is in this PR

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants