Skip to content

A new backend for Lomb Scargle Fourier Transform based on Low Rank Approximation #925

@pupperemeritus

Description

@pupperemeritus

The paper introducing the method

The method seems to provide much closer accuracy to the slow method, while maintaining the same time complexity (just scaled up by a constant). I suggest we parallellize the FFTs in the LRA optionally using numba. This method is currently in development at astropy. Its a much smaller hassle to implement in Stingray.jl as FastTransforms.jl has an implementation. But nonetheless that is not really my forte someone else can take it up.

TLDR

I propose implementing a newer, high-performance algorithm for the Lomb-Scargle periodogram based on the Non-Uniform Fast Fourier Transform (NUFFT) via Low-Rank Approximation (LRA). This method offers significant advantages over the traditional fast method (Press & Rybicki) in terms of parallelism, accuracy control, and memory efficiency.

This method is slowly being adopted already in the FOSS community, with a reference implementation in development for astropy (astropy/astropy#17842).

The Method

The core idea is to re-frame the non uniform discrete fourier transform problem. Instead of "spreading" non-uniform data onto a uniform grid (like Press & Rybicki), this method decomposes the non-uniform transform matrix itself into a sum of a small, fixed number (K) of diagonally-scaled, uniform DFTs.

Paper: A Nonuniform Fast Fourier Transform based on Low Rank Approximation

Key Advantages

  • Parallelism: The algorithm consists of K (typically ~16 for double precision) completely independent FFTs. This is a perfect fit for modern multi-core CPUs and can be parallelized trivially using tools like numba, potentially leading to a massive speedup over the mostly-sequential Press & Rybicki method.
  • Tunable and Guaranteed Accuracy: The accuracy is explicitly controlled by the parameter K, which is chosen based on a desired working precision ε. Unlike the spreading method, where accuracy is indirectly tied to the oversampling factor, this method provides a direct and a priori error bound. This means we can offer users a much more predictable trade-off between speed and precision. It is less costly than the oversampling in the Press & Rybicki method.
  • Memory Efficiency: This method avoids creating a large, oversampled grid for the FFT. It operates on vectors of the original data size N, making it more memory-friendly, especially for very large datasets where the oversampling grid of other methods might become a bottleneck.

Implementation Pointers

  • The K independent FFTs are an ideal use case for numba.njit(parallel=True). A prange loop could compute all K FFTs concurrently.
  • The Astropy PR has an implementation we can refer to in order to modify our lsft_fast in fourier.py

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions