Commit f55e209

Add CUDA kernels for Sinusoidal, LAEA, Polar Stere, State Plane (#1045)
All 12 projections now have GPU CUDA kernels.

Performance on A6000:
- Sinusoidal: 18ms (56x vs pyproj)
- LAEA Europe: 18ms (92x)
- Polar Stere: 57ms (64-67x)
- State Plane tmerc: 23ms (88x)
- State Plane lcc ftUS: 36ms (124x)
- LCC France: 39ms (78x)

All bit-exact against CPU Numba kernels.

Updated README benchmark table and projection support matrix.
1 parent cca3c1b commit f55e209

2 files changed: +395 -17 lines changed

README.md

Lines changed: 10 additions & 10 deletions
@@ -225,9 +225,9 @@ Built-in Numba JIT and CUDA projection kernels bypass pyproj for common CRS pair
 | Lambert Conformal Conic | 2154, State Plane | ✅️ | ✅️ |
 | Albers Equal Area | 5070 | ✅️ | ✅️ |
 | Cylindrical Equal Area | 6933 | ✅️ | ✅️ |
-| Sinusoidal | MODIS grids | ✅️ | |
-| Lambert Azimuthal Equal Area | 3035, 6931, 6932 | ✅️ | |
-| Polar Stereographic | 3031, 3413, 3996 | ✅️ | |
+| Sinusoidal | MODIS grids | ✅️ | ✅️ |
+| Lambert Azimuthal Equal Area | 3035, 6931, 6932 | ✅️ | ✅️ |
+| Polar Stereographic | 3031, 3413, 3996 | ✅️ | ✅️ |
 
 Other CRS pairs fall back to pyproj automatically.
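
The fallback behaviour described in this hunk can be pictured as a lookup keyed on the CRS pair. The sketch below is only an illustration under stated assumptions: the registry `_FAST_KERNELS`, the `register` decorator, `transform_point`, and the caller-supplied `fallback` callable (standing in for pyproj) are hypothetical names, not the repository's API.

```python
import math
from typing import Callable, Dict, Tuple

Point = Tuple[float, float]
Kernel = Callable[[float, float], Point]

# Hypothetical registry: (source EPSG, destination EPSG) -> built-in kernel.
_FAST_KERNELS: Dict[Tuple[int, int], Kernel] = {}


def register(src: int, dst: int) -> Callable[[Kernel], Kernel]:
    """Register a built-in kernel for one CRS pair (illustrative only)."""
    def decorator(fn: Kernel) -> Kernel:
        _FAST_KERNELS[(src, dst)] = fn
        return fn
    return decorator


@register(4326, 3857)
def wgs84_to_web_mercator(lon_deg: float, lat_deg: float) -> Point:
    """Spherical Web Mercator forward transform on the WGS84 semi-major axis."""
    radius = 6378137.0
    x = radius * math.radians(lon_deg)
    y = radius * math.log(math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0))
    return x, y


def transform_point(src: int, dst: int, lon: float, lat: float,
                    fallback: Callable[[int, int, float, float], Point]):
    """Use a built-in kernel when one is registered, else call the fallback."""
    kernel = _FAST_KERNELS.get((src, dst))
    if kernel is not None:
        return kernel(lon, lat), "builtin"
    # In the real library this path would go through pyproj; here the caller
    # supplies a stand-in so the sketch has no third-party dependency.
    return fallback(src, dst, lon, lat), "fallback"
```

Keying on the pair rather than on the destination CRS alone keeps the lookup a single dictionary access and leaves every unregistered pair on the general-purpose path.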

@@ -238,15 +238,15 @@ Other CRS pairs fall back to pyproj automatically.
 | Web Mercator | 148ms (6x) | 6ms (146x) | 858ms |
 | UTM zone 33N | 221ms (8x) | 21ms (84x) | 1.78s |
 | Ell. Mercator | 273ms (10x) | 26ms (102x) | 2.64s |
-| LCC France | 329ms (9x) | | 3.02s |
+| LCC France | 329ms (9x) | 39ms (78x) | 3.02s |
 | Albers CONUS | 172ms (7x) | 14ms (92x) | 1.25s |
 | CEA EASE-Grid | 146ms (6x) | 43ms (19x) | 839ms |
-| Sinusoidal (MODIS) | 191ms (5x) | | 1.01s |
-| LAEA Europe | 196ms (8x) | | 1.65s |
-| Polar Stere Antarctic | 376ms (10x) | | 3.63s |
-| Polar Stere Arctic | 354ms (11x) | | 3.84s |
-| State Plane ME (tmerc) | 223ms (9x) | | 2.03s |
-| State Plane CA (lcc, ftUS) | 426ms (11x) | | 4.47s |
+| Sinusoidal (MODIS) | 191ms (5x) | 18ms (56x) | 1.01s |
+| LAEA Europe | 196ms (8x) | 18ms (92x) | 1.65s |
+| Polar Stere Antarctic | 376ms (10x) | 57ms (64x) | 3.63s |
+| Polar Stere Arctic | 354ms (11x) | 57ms (67x) | 3.84s |
+| State Plane ME (tmerc) | 223ms (9x) | 23ms (88x) | 2.03s |
+| State Plane CA (lcc, ftUS) | 426ms (11x) | 36ms (124x) | 4.47s |
 
 Speedups in parentheses are relative to pyproj. The Numba kernels port the PROJ C math (Krueger 6th-order series for Transverse Mercator, Newton iteration for LCC/Mercator inverse, authalic latitude Fourier series for equal-area projections) to `@njit(parallel=True)`. CUDA kernels use `@cuda.jit(device=True)` for the same per-pixel math.
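
The README paragraph above describes per-pixel math wrapped in `@njit(parallel=True)` on the CPU and reused as a CUDA device function. A minimal sketch of that pattern follows, assuming the simple spherical sinusoidal forward formula and the MODIS sphere radius. The names `sinu_forward_point` and `sinu_forward` are hypothetical, the commit's actual kernels are not shown on this page, and the code falls back to plain Python when Numba is not installed.

```python
import math

import numpy as np

try:
    from numba import njit, prange
except ImportError:  # keep the sketch runnable without Numba
    def njit(*args, **kwargs):
        # Support both @njit and @njit(parallel=True) as no-op decorators.
        if args and callable(args[0]):
            return args[0]
        return lambda func: func
    prange = range

R_MODIS = 6371007.181  # sphere radius in metres used by the MODIS sinusoidal grid


@njit
def sinu_forward_point(lon_deg, lat_deg):
    """Per-point math: spherical sinusoidal forward projection."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    return R_MODIS * lam * math.cos(phi), R_MODIS * phi


@njit(parallel=True)
def sinu_forward(lon, lat, out_x, out_y):
    """CPU driver: each iteration is independent, so prange can split the loop."""
    for i in prange(len(lon)):
        x, y = sinu_forward_point(lon[i], lat[i])
        out_x[i] = x
        out_y[i] = y


# On a GPU, the same undecorated Python function could in principle be wrapped
# with numba.cuda.jit(device=True) and called from a @cuda.jit kernel that
# indexes with cuda.grid(1). That part is omitted here because it needs a
# CUDA device, and whether the commit shares source this way is an assumption.

lon = np.array([0.0, 90.0, 180.0])
lat = np.array([0.0, 0.0, 60.0])
x = np.empty_like(lon)
y = np.empty_like(lat)
sinu_forward(lon, lat, x, y)
```

Keeping the projection math in a scalar function and the iteration in a thin driver is what lets the CPU and GPU paths agree: only the loop or thread-indexing wrapper differs between the two.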
