Add the TaylorFast SarBp ComputeType and documentation by tbensonatl · Pull Request #1192 · NVIDIA/MatX

tbensonatl · 2026-05-27T19:37:19Z

Adds a new SarBp ComputeType, TaylorFast. This variant leverages a Taylor expansion of a differential range function to compute per-pixel per-pulse differential ranges. After algebraic manipulation, the range calculation is low computational cost. TaylorFast currently requires that the PhaseLUTOptimization be set. TaylorFast has slightly lower accuracy relative to Double than FloatFloat, but it is the fastest option on most hardware. In many cases, TaylorFast is faster than the current Float at much higher accuracy.

The documentation contains the full derivation of the TaylorFast approximation along with accuracy considerations. The documentation also notes the inclusion of a property, PropSarBpTaylorFastAddThirdOrder, that makes the approximation more accurate for certain scenarios. The SAR BP example has also been updated to support TaylorFast, along with the optional --taylor-fast-third-order to use the PropSarBpTaylorFastAddThirdOrder property. The unit tests have also been updated to support TaylorFast.

Adds a new SarBp ComputeType, TaylorFast. This variant leverages a Taylor expansion of a differential range function to compute per-pixel per-pulse differential ranges. After algebraic manipulation, the range calculation is low cost. TaylorFast currently requires that the PhaseLUTOptimization be set. TaylorFast has slightly lower accuracy relative to Double than FloatFloat, but it is the fastest option on hardware with both full and reduced rate double-precision. See the documentation added for TaylorFast for the derivation of the approximation along with accuracy considerations and the optional inclusion of a property to enable additional terms in the approximation. Signed-off-by: Thomas Benson <tbenson@nvidia.com>

copy-pr-bot · 2026-05-27T19:37:22Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

greptile-apps · 2026-05-27T19:52:42Z

Greptile Summary

This PR adds SarBpComputeType::TaylorFast, a new SAR backprojection mode that replaces the per-pixel sqrt with a local Taylor expansion of the range function centered on a per-CTA reference pixel. The approximation is mathematically derived in the new RST documentation (second-order by default; optionally third-order via PropSarBpTaylorFastAddThirdOrder). The PR also includes structural refactoring: SarBpSharedMemory → SarBpPulseBlockCache, ComputeBinToPixelFloatFloat → ComputeBinWeightToPixelFloatFloat returning a SarBpBinAndWeight struct, and several cleanup items in the example and tests.

Adds TaylorFast kernel path with SarBpTaylorFastSharedMemory (per-pulse unit vector, half-inverse-range, and split integer/fractional reference bin) and SarBpThreadState<TaylorFast,true>; requires PhaseLUTOptimization and is guarded by a compile-time static_assert.
Adds PropSarBpTaylorFastAddThirdOrder as a compile-time operator property that instantiates the TaylorFastAddThirdOrder=true kernel variant.
Extends the PointTarget unit test to validate TaylorFast (second- and third-order) with a 132×132 image exercising partial-edge CTAs.

Confidence Score: 5/5

The change is safe to merge. The new TaylorFast path is additive, does not modify any existing compute type, and is guarded by a compile-time property and a runtime PhaseLUT check.

The Taylor approximation math is correct and consistent with the RST derivation. Shared memory synchronization is correct. The SarBpTaylorFastSharedMemory struct is 8-byte aligned. The bin decomposition correctly uses floor(n + x) = n + floor(x). No existing compute paths are altered.

No files require special attention.

Important Files Changed

Filename	Overview
include/matx/kernels/sar_bp.cuh	Core kernel changes: adds SarBpTaylorFastSharedMemory, SarBpThreadState, ComputeBinWeightToPixelTaylorFast, and FillPulseBlockCache lambda. Math, synchronization, and struct alignment are all correct.
include/matx/transforms/sar_bp.h	Adds TaylorFast kernel dispatch (PhaseLUT=true only) and extends fp32_bin_path guard to TaylorFast via the existing !=Double check.
include/matx/operators/sar_bp.h	Adds TaylorFast enum value and PropSarBpTaylorFastAddThirdOrder property struct; extends SarBpOp with variadic CurrentProps and props<>() method.
test/00_transform/SarBp.cu	Adds TaylorFast second-order and third-order test cases to MixedPrecision and PointTarget tests. Image size changed to 132x132 to exercise partial edge CTAs.
examples/sarbp/sarbp.cu	Adds --precision taylor_fast and --taylor-fast-third-order CLI options with correct validation and null-guards before cudaFreeHost calls.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[sar_bp_impl] --> B{compute_type?}
    B -->|TaylorFast| C{PhaseLUT on?}
    C -->|No| D[throw: PhaseLUT required]
    C -->|Yes| E[Fill phase LUT double-to-float]
    E --> F[Launch SarBp kernel TaylorFast]
    F --> G[tid==0 selects reference pixel at block center]
    G --> H[syncthreads]
    H --> I[All threads compute local float offset ref_dx/dy/dz]
    I --> J[Pulse block loop]
    J --> K[FillPulseBlockCache: unit vector u, half_inv_R, ref_bin int+frac]
    K --> L[ComputeBinWeightToPixelTaylorFast]
    L --> M[s = dot u d]
    M --> N[q = norm_d_sq minus s_sq]
    N --> O[dR_2nd = s + q times half_inv_R]
    O --> P{AddThirdOrder?}
    P -->|Yes| Q[dR = dR_2nd minus correction term]
    P -->|No| R[dR = dR_2nd]
    Q --> S[bin_loc = ref_bin_frac + dR times dr_inv_f32]
    R --> S
    S --> T[bin_int = ref_bin_int + floor bin_loc]
    T --> U{bin in valid range?}
    U -->|Yes| V[Interpolate range profile, Apply phase LUT, Accumulate]
    U -->|No| W[skip pulse]

_{Reviews (2): Last reviewed commit: "Add comments that TaylorFast does not su..." | Re-trigger Greptile}

TaylorFast does not support cases where the antenna phase center is located at a pixel that would be used as a reference pixel in the kernel. This is because the calculated reference range would then be 0 and we divide by that reference range. We do not test for and handle this condition at run-time as it is uncommon and would be costly. Other ComputeTypes can support this case if it does occur in practice. Signed-off-by: Thomas Benson <tbenson@nvidia.com>

tbensonatl · 2026-05-27T20:30:58Z

/build

coveralls · 2026-05-28T03:20:10Z

Coverage is 93.606% — tbenson/sarbp-add-taylor-based-variant into main. No base build found for main.

tbensonatl self-assigned this May 27, 2026

greptile-apps Bot reviewed May 27, 2026

View reviewed changes

Comment thread include/matx/kernels/sar_bp.cuh

Comment thread include/matx/kernels/sar_bp.cuh

Comment thread test/00_transform/SarBp.cu

tbensonatl requested a review from cliffburdick May 27, 2026 20:31

cliffburdick approved these changes May 27, 2026

View reviewed changes

tbensonatl merged commit 7e427fa into main May 28, 2026
1 check passed

tbensonatl deleted the tbenson/sarbp-add-taylor-based-variant branch May 28, 2026 21:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add the TaylorFast SarBp ComputeType and documentation#1192

Add the TaylorFast SarBp ComputeType and documentation#1192
tbensonatl merged 2 commits into
mainfrom
tbenson/sarbp-add-taylor-based-variant

tbensonatl commented May 27, 2026

Uh oh!

copy-pr-bot Bot commented May 27, 2026

Uh oh!

greptile-apps Bot commented May 27, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tbensonatl commented May 27, 2026

Uh oh!

coveralls commented May 28, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

tbensonatl commented May 27, 2026

Uh oh!

copy-pr-bot Bot commented May 27, 2026

Uh oh!

greptile-apps Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tbensonatl commented May 27, 2026

Uh oh!

coveralls commented May 28, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

greptile-apps Bot commented May 27, 2026 •

edited

Loading