Add the TaylorFast SarBp ComputeType and documentation (#1192)
* Add the TaylorFast SarBp ComputeType and documentation
Adds a new SarBp ComputeType, TaylorFast. This variant leverages a
Taylor expansion of a differential range function to compute per-pixel
per-pulse differential ranges. After algebraic manipulation, the range
calculation is low cost. TaylorFast currently requires that the
PhaseLUTOptimization be set. Measured against Double, TaylorFast is
slightly less accurate than FloatFloat, but it is the fastest option on
hardware with either full-rate or reduced-rate double-precision throughput.
See the documentation added for TaylorFast for the derivation of the
approximation along with accuracy considerations and the optional
inclusion of a property to enable additional terms in the approximation.
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
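A rough illustration of the idea in the commit message above, not the derivation from the TaylorFast documentation: writing the pulse-to-pixel range as R0 * sqrt(1 + x) about a reference pixel turns the differential range into a short polynomial in x. The function names, the `terms` parameter (1 to 3 here), and the geometry are assumptions made for this sketch, and everything is evaluated in plain double precision on the host rather than in a GPU kernel.

```python
import math

# Coefficients of sqrt(1 + x) - 1 = x/2 - x^2/8 + x^3/16 - ...
_SQRT_SERIES = (0.5, -0.125, 0.0625)


def differential_range_exact(apc, ref, pix):
    """Range(apc, pix) minus range(apc, ref), evaluated directly in double."""
    return math.dist(apc, pix) - math.dist(apc, ref)


def differential_range_taylor(apc, ref, pix, terms=2):
    """Truncated-series approximation of the same quantity (hypothetical helper)."""
    r0 = [a - r for a, r in zip(apc, ref)]  # reference pixel -> antenna
    d = [p - r for p, r in zip(pix, ref)]   # reference pixel -> pixel
    ref_range_sq = sum(c * c for c in r0)
    ref_range = math.sqrt(ref_range_sq)
    # |apc - pix|^2 = R0^2 * (1 + x); note the division by the reference range.
    x = (sum(c * c for c in d) - 2.0 * sum(a * b for a, b in zip(r0, d))) / ref_range_sq
    series = sum(_SQRT_SERIES[k] * x ** (k + 1) for k in range(terms))
    return ref_range * series


apc = (300e3, 0.0, 600e3)  # assumed antenna phase center, metres
ref = (0.0, 0.0, 0.0)      # assumed reference pixel
pix = (50.0, 30.0, 0.0)    # pixel a few tens of metres from the reference
exact = differential_range_exact(apc, ref, pix)
err1 = abs(differential_range_taylor(apc, ref, pix, terms=1) - exact)
err2 = abs(differential_range_taylor(apc, ref, pix, terms=2) - exact)
print(f"exact={exact:.6f} m  err(1 term)={err1:.2e} m  err(2 terms)={err2:.2e} m")
```

In this sketch the one-term error is a fraction of a millimetre and the second term reduces it by several orders of magnitude, which is the kind of accuracy trade-off that enabling additional terms would affect.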
* Add comments that TaylorFast does not support a degenerate geometry
TaylorFast does not support cases where the antenna phase center is located at
a pixel that would be used as a reference pixel in the kernel. This is because the
calculated reference range would then be 0 and we divide by that reference range.
We do not test for and handle this condition at run-time as it is uncommon and
would be costly. Other ComputeTypes can support this case if it does occur in
practice.
Signed-off-by: Thomas Benson <tbenson@nvidia.com>
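Because the kernel does not test for this condition, a caller who suspects such a geometry could check for it on the host before selecting TaylorFast. The following is only a sketch of that idea: `min_reference_range`, `pick_compute_type`, the `min_range_m` threshold, and the `ref_pixels` argument (standing in for whatever reference pixels the kernel would use, which the commit message does not specify) are all hypothetical and not part of the library.

```python
import math


def min_reference_range(apcs, ref_pixels):
    """Smallest antenna-to-reference-pixel distance over all pulses, in metres."""
    return min(math.dist(a, r) for a in apcs for r in ref_pixels)


def pick_compute_type(apcs, ref_pixels, min_range_m=1.0):
    """Hypothetical caller-side policy: avoid TaylorFast near the degenerate case."""
    if min_reference_range(apcs, ref_pixels) < min_range_m:
        # The reference range would be (near) zero, and TaylorFast divides by it.
        return "FloatFloat"
    return "TaylorFast"


refs = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
print(pick_compute_type([(0.0, 0.0, 500e3)], refs))  # typical spaceborne geometry
print(pick_compute_type([(100.0, 0.0, 0.0)], refs))  # antenna sits on a reference pixel
```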
examples/sarbp/README.md: 3 additions & 1 deletion
@@ -106,14 +106,16 @@ real/imag, row-major), written to `output_image.raw` in this example.
|`-w {hamming,none}`| Window for range compression (default: hamming) |
|`-b {auto,all,0,N}`| Pulses per processing block. `auto` uses the GPU L2 cache size to choose a block size; `all` and `0` use all pulses (default: auto) |
|`--image-tiles N`| Process the image as N x N tiles during backprojection (default: 1) |
|`--warmup`| Warmup GPU kernels and FFT plans before timed run |
The `--precision` flag controls the arithmetic used by the `sar_bp` operator. For spaceborne SAR, `float` does not provide enough precision to store fractional wavelengths at the range-to-MCP magnitudes (hundreds of km), so pure `float` is not sufficient to produce focused images. The available modes are:
- `double` -- full double-precision arithmetic. Most accurate.
- `mixed` -- double-precision for range computation, single-precision elsewhere. Default. Close to `double` in image quality with slightly higher throughput on GPUs with reduced double-precision throughput. Other than `float`, this is the fastest option on hardware with full-throughput double-precision (e.g., A100, H100/H200, B200).
- `fltflt` -- float-float evaluation using two `float` values for the high-precision range math. Significantly higher throughput on GPUs where `double` throughput is reduced (e.g., RTX PROs, Jetson Orin/Thor, gaming GPUs).
+ - `taylor_fast` -- local Taylor approximation of the pulse-to-pixel range about a centered per-thread-block reference point. Highest-throughput experimental mode for spaceborne SAR geometries where moderate approximation error is acceptable.
- `float` -- single-precision throughout. Fastest but not accurate enough for most spaceborne data.
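To make the precision argument above concrete, the sketch below rounds a range value to single precision using only the Python standard library. The 700 km range and the roughly 3 cm wavelength are assumed example values, and the helper names are hypothetical. The hi/lo split at the end shows only the representation idea behind a float-float value (two `float` values whose sum carries the extra bits), not the arithmetic the operator performs.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_spacing(x):
    """Gap between adjacent float32 values near x (normal range only)."""
    _, exponent = math.frexp(abs(x))      # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return math.ldexp(1.0, exponent - 24)  # float32 keeps 24 significand bits


wavelength_m = 0.03             # assumed wavelength, a few centimetres
range_m = 700_000.0 + 0.01      # assumed range: 700 km plus a third of a wavelength

spacing = float32_spacing(range_m)
print(f"float32 spacing near 700 km: {spacing} m ({spacing / wavelength_m:.1f} wavelengths)")
print("fraction lost in float32:", to_float32(range_m) == 700_000.0)

# Float-float style split: hi holds the leading bits, lo holds the remainder.
hi = to_float32(range_m)
lo = to_float32(range_m - hi)
print(f"hi={hi} lo={lo:.6f} recombination error={abs((hi + lo) - range_m):.1e} m")
```

Near 700 km adjacent `float` values are several centimetres apart, so the sub-wavelength part of the range is rounded away, while the two-value split recovers it to well below a micrometre in this example.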