WIP: reduced precision for lanczos_opencv by bjarthur · Pull Request #635 · JuliaMath/Interpolations.jl

bjarthur · 2025-10-24T21:21:13Z

using Float32 for Lanczos is ~1.7x faster (median 21.3 ms vs 12.7 ms in the benchmarks below) and uses half as much memory (61.05 MiB vs 30.53 MiB) as the current Float64.

this PR currently performs the internal calculations in whatever precision the input has. i could also imagine specifying the precision of the internal computations as a type parameter (e.g. struct Lanczos4OpenCV{T} <: AbstractLanczos end) to separate it from the input type.

i'm also curious whether there is a more clever way to cast l4_2d_cs at compile time so as not to incur runtime penalties.
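One possible shape for both ideas (a type parameter for the internal precision, and constants converted ahead of time) is sketched below. It is only an illustration: `KernelSketch`, `TABLE64`, `TABLE32`, `table`, and `weights` are hypothetical names, and the table values are placeholders rather than the real contents of `l4_2d_cs`.

```julia
# Hypothetical sketch; none of these names come from Interpolations.jl.
struct KernelSketch{T<:AbstractFloat} end
KernelSketch() = KernelSketch{Float64}()   # default keeps Float64 internals

# Stand-in constants in Float64 (placeholder values, not the real table).
const TABLE64 = ((1.0, 0.0), (-0.5, 0.5), (0.0, 1.0))

# Convert once, when the file is loaded, instead of on every call.
const TABLE32 = map(row -> Float32.(row), TABLE64)

# Dispatch selects a pre-converted table. The generic method is a fallback
# that converts at call time; whether the compiler folds it is not guaranteed.
table(::Type{Float64}) = TABLE64
table(::Type{Float32}) = TABLE32
table(::Type{T}) where {T<:AbstractFloat} = map(row -> T.(row), TABLE64)

# The internal precision T comes from the kernel type, not from the input.
# `_weights` acts as a function barrier: it is compiled per table type.
weights(::KernelSketch{T}, δx::Real) where {T} = _weights(table(T), T(δx))
_weights(tbl, δx) = map(row -> row[1] + row[2] * δx, tbl)
```

With this layout, `weights(KernelSketch{Float32}(), 0.5)` computes in Float32 even though the input is a Float64, and `weights(KernelSketch(), 0.5f0)` computes in Float64.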

let me know what you think and i'll add some tests and docs.

julia> using Interpolations, BenchmarkTools

julia> x=rand(1_000_000);

julia> @benchmark Interpolations._lanczos4_opencv.(x)
BenchmarkTools.Trial: 231 samples with 1 evaluation per sample.
 Range (min … max):  19.061 ms … 83.561 ms  ┊ GC (min … max): 5.34% … 74.95%
 Time  (median):     21.303 ms              ┊ GC (median):    2.90%
 Time  (mean ± σ):   21.654 ms ±  4.135 ms  ┊ GC (mean ± σ):  4.35% ±  5.10%

                       ▅█▄▃                                    
  ▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▃▃▃▃▅▇█████▄▄▄▃▃▃▃▂▂▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▂ ▃
  19.1 ms         Histogram: frequency by time        24.8 ms <

 Memory estimate: 61.05 MiB, allocs estimate: 4.

julia> x=rand(Float32, 1_000_000);

julia> @benchmark Interpolations._lanczos4_opencv.(x)
BenchmarkTools.Trial: 387 samples with 1 evaluation per sample.
 Range (min … max):  12.078 ms … 76.608 ms  ┊ GC (min … max): 0.00% … 83.87%
 Time  (median):     12.695 ms              ┊ GC (median):    3.05%
 Time  (mean ± σ):   12.928 ms ±  3.290 ms  ┊ GC (mean ± σ):  4.56% ±  4.80%

     ▂       ▄▅▇██▇▅▂                                          
  ▆▇▇██▄▆▆▄▇██████████▄█▄▇▄▆▁▄▁▇▄▄▁▄▄▄▇▄▆▁▁▄▁▁▁▁▆▁▁▁▁▁▁▄▄▁▁▁▄ ▇
  12.1 ms      Histogram: log(frequency) by time      14.5 ms <

 Memory estimate: 30.53 MiB, allocs estimate: 4.

codecov · 2025-10-24T21:29:38Z

Codecov Report

❌ Patch coverage is 94.11765% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 88.15%. Comparing base (7a2d581) to head (9ba58a0).
⚠️ Report is 8 commits behind head on master.

Files with missing lines	Patch %	Lines
src/lanczos/lanczos_opencv.jl	94.11%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #635      +/-   ##
==========================================
+ Coverage   88.10%   88.15%   +0.05%     
==========================================
  Files          29       29              
  Lines        1908     1925      +17     
==========================================
+ Hits         1681     1697      +16     
- Misses        227      228       +1


bjarthur · 2025-10-24T21:42:26Z

second commit makes it slightly faster:

julia> x=rand(Float32, 1_000_000);

julia> @benchmark Interpolations.value_weights.(Ref(Lanczos4OpenCV()), x)
BenchmarkTools.Trial: 429 samples with 1 evaluation per sample.
 Range (min … max):  10.796 ms … 75.222 ms  ┊ GC (min … max): 0.00% … 85.30%
 Time  (median):     11.479 ms              ┊ GC (median):    3.84%
 Time  (mean ± σ):   11.664 ms ±  3.107 ms  ┊ GC (mean ± σ):  5.43% ±  4.66%

               ▂█▆▃▄▄▅▇▃▁                                      
  ▃▃▃▃▃▃▁▃▃▃▄▅▆██████████▆▃▃▃▄▃▁▁▂▃▁▁▃▃▂▁▁▃▁▂▁▃▁▂▂▁▁▁▁▁▁▁▁▂▂▂ ▃
  10.8 ms         Histogram: frequency by time          13 ms <

 Memory estimate: 30.53 MiB, allocs estimate: 5.

bjarthur · 2025-10-27T22:25:34Z

the output is qualitatively different for Float32 compared to Float64. :(

specifically, when δx is 0f0 (a 32-bit float) in _lanczos4_opencv, s0 and c0 are exactly identical, so cs[4] becomes 0f0 because its numerator is zero. this is not a problem when δx is 0.0 (a 64-bit float): at the higher precision s0 and c0 somehow end up off by 1 eps, so the numerator of cs[4] is not zero.
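A quick way to inspect this cancellation in isolation is sketched below. It assumes the angle is formed as `-(δx + 3) * π/4` entirely in precision `T`, which may not match how the PR builds it; `sincos_gap` is a hypothetical helper. No expected output is shown, since the exact values depend on how the angle rounds in each precision.

```julia
# Hypothetical helper, not from the PR: difference between sin and cos at
# y0 = -(δx + 3)π/4 with δx = 0, evaluated entirely in precision T.
function sincos_gap(::Type{T}) where {T<:AbstractFloat}
    y0 = -(zero(T) + 3) * (T(π) / 4)
    s0, c0 = sincos(y0)
    return s0 - c0
end

# Print the gap in absolute terms and in units of eps(T).
for T in (Float32, Float64)
    gap = sincos_gap(T)
    println(T, ": s0 - c0 = ", gap, " (", gap / eps(T), " eps)")
end
```

A gap of exactly zero means any weight whose numerator is proportional to `s0 - c0` vanishes, which is the failure described above.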

i don't know enough about lanczos resampling to fix this in the right way. what i do know though is that this is not a problem if opencv's lanczos implementation is followed more closely (see the last commit of this PR). specifically, iszero is replaced with abs(y) <= 1e-6.

so i'm putting this on the back burner for now. below is the output without the last commit.

@mkitti @timholy @mileslucas

julia> Interpolations.value_weights(Lanczos4OpenCV{Float64}(), 0.0)
(8.837979241208245e-19, -2.8122277740499295e-18, 7.95418131708742e-18, 1.0, -7.95418131708742e-18, 2.8122277740499295e-18, -8.837979241208245e-19, -7.80550006310146e-35)

julia> Interpolations.value_weights(Lanczos4OpenCV{Float32}(), 0f0)
(8.547569f6, -2.7198194f7, 7.692811f7, -0.0f0, -7.692811f7, 2.7198194f7, -8.547569f6, -0.0f0)
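For comparison, here is a minimal reference that evaluates the textbook Lanczos kernel directly instead of using the sin/cos table. It is a swapped-in technique, not OpenCV's formula or the PR's code, and it is not expected to match OpenCV bit for bit; `lanczos4` and `reference_weights` are hypothetical names. Because Julia's `sinc` returns exactly 1 at 0, the δx = 0 case needs no special guard in any precision.

```julia
# Textbook Lanczos kernel with a = 4. `sinc` in Julia is the normalized
# sinc, sin(πx)/(πx), and is defined to be 1 at x = 0.
lanczos4(x::T) where {T<:AbstractFloat} =
    abs(x) < 4 ? sinc(x) * sinc(x / 4) : zero(T)

# Hypothetical helper: eight normalized weights for taps at offsets -3:4
# relative to the sample, evaluated in the precision of δx.
function reference_weights(δx::T) where {T<:AbstractFloat}
    w = ntuple(i -> lanczos4(δx + 4 - i), Val(8))
    return w ./ sum(w)
end

println(reference_weights(0f0))
println(reference_weights(0.0))
```

Such a reference could serve as a sanity check in tests: at δx = 0 the fourth weight should be 1 and the rest 0, regardless of element type.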

bjarthur · 2026-05-08T20:27:48Z

latest commit preserves the original implementation of lanczos_opencv, and adds a new version which can use Float32 internally to mirror OpenCV identically.

bjarthur · 2026-05-08T20:28:28Z

@mkitti can you please review?

mkitti

I have a few questions on how general this is and where the constants are coming from.

mkitti · 2026-05-09T10:12:53Z

+        _lanczos4_opencv_faithful(float(T), float(T).(l4_2d_cs), δx)
+
+# main differences with `_lanczos4_opencv` are (1) the criterion for preventing a
+# division by zero is `< pi/4 * 1e-6` (instead of `iszero`) and (2) the resulting


Can you justify where these new constants are coming from? Is it going to work for Float16?

I wonder if we should use isapprox?
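To make the Float16 question concrete, the sketch below prints how a fixed `1e-6` compares with `eps(T)` for each type and shows one way to phrase the guard with `isapprox`. The choice of `atol = eps(T)` is an assumption made for illustration, not a value taken from the PR or from OpenCV, and `near_zero` is a hypothetical helper. Note that `isapprox(y, 0)` with default tolerances is only true when `y == 0`, so an explicit `atol` is needed.

```julia
# Hypothetical helper: absolute tolerance that scales with the element type.
near_zero(y::T) where {T<:AbstractFloat} = isapprox(y, zero(T); atol = eps(T))

# Compare the type's eps with what a fixed 1e-6 becomes after conversion.
for T in (Float16, Float32, Float64)
    println(T, ": eps = ", eps(T), ", T(1e-6) = ", T(1e-6))
end

# Default isapprox against zero uses only a relative tolerance.
println(isapprox(1e-10, 0.0))            # false
println(isapprox(1e-10, 0.0; atol=1e-6)) # true
```

For Float16, `eps` is roughly three orders of magnitude larger than `1e-6`, so a fixed threshold of that size sits well below the spacing of values near 1.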

reduced precision for lanczos_opencv

a127d7d

function barrier for float(F)

5f69ead

bjarthur changed the title ~~reduced precision for lanczos_opencv~~ WIP: reduced precision for lanczos_opencv Oct 27, 2025

bjarthur marked this pull request as draft October 27, 2025 22:09

bjarthur added 2 commits October 27, 2025 18:20

specify lanczos precision in type parameter

6dfc814

more faithfully follow opencv's lanczos code

740fa93

preserve former lanczos opencv code

04dc3a8

bjarthur marked this pull request as ready for review May 8, 2026 20:27

increase test coverage

9ba58a0

mkitti reviewed May 9, 2026

bjarthur wants to merge 6 commits into JuliaMath:master from bjarthur:bja/lanczosT
