Generate JIT classes and LTO IR for single-block C2C fft2/ifft2 fusions, including shared-memory tiling through cuFFTDx 1D passes.
Teach the JIT launcher about grouped 2D blocks and vectorized EPT indexing so FFT2 operators can return multiple columns per thread.
Document the supported FFT2 JIT shape/type limits and add forward/inverse FFT2 JIT fusion coverage.
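As a rough illustration of what "2D through 1D passes" means in the commit message above, here is a minimal CPU-only sketch of the row-column decomposition: a 1D FFT over every row of a square power-of-two tile, then a 1D FFT over every column. It is not MatX or cuFFTDx code. The names `fft1d` and `fft2d` are hypothetical, a `std::vector` stands in for the shared-memory tile, and scaling the inverse by 1/(n*n) is a choice made for this sketch only.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cf = std::complex<double>;

// Hypothetical helper: in-place iterative radix-2 FFT over n elements that
// are `stride` apart in memory. inverse=true flips the twiddle sign and
// leaves the result unnormalized.
void fft1d(cf* data, std::size_t n, std::size_t stride, bool inverse) {
    assert(n != 0 && (n & (n - 1)) == 0);  // power-of-two lengths only

    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(data[i * stride], data[j * stride]);
    }

    // Butterfly stages of length 2, 4, ..., n.
    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const double ang = (inverse ? 2.0 : -2.0) * pi / static_cast<double>(len);
        const cf wlen(std::cos(ang), std::sin(ang));
        for (std::size_t i = 0; i < n; i += len) {
            cf w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k) {
                const cf u = data[(i + k) * stride];
                const cf v = data[(i + k + len / 2) * stride] * w;
                data[(i + k) * stride] = u + v;
                data[(i + k + len / 2) * stride] = u - v;
                w *= wlen;
            }
        }
    }
}

// Hypothetical helper: 2D C2C transform of an n x n row-major tile,
// done as a row pass followed by a column pass.
void fft2d(std::vector<cf>& tile, std::size_t n, bool inverse) {
    assert(tile.size() == n * n);  // square tiles only
    for (std::size_t r = 0; r < n; ++r) fft1d(tile.data() + r * n, n, 1, inverse);
    for (std::size_t c = 0; c < n; ++c) fft1d(tile.data() + c, n, n, inverse);
    if (inverse) {
        // Sketch-only convention: scale so that inverse(forward(x)) == x.
        const double scale = 1.0 / static_cast<double>(n * n);
        for (cf& v : tile) v *= scale;
    }
}
```

The column pass reads elements `n` apart, which is the access pattern that makes staging the tile in fast on-chip memory attractive on a GPU. The sketch says nothing about how the actual JIT kernels assign rows or columns to threads.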
docs_input/basics/fusion.rst: 4 additions & 5 deletions
@@ -59,8 +59,9 @@ CUDA JIT Kernel Fusion
 
 CUDA JIT kernel fusion is considered an experimental feature. There may be bugs that don't occur with JIT disabled, and new features are being added over time.
 
-MatX supports CUDA JIT kernel fusion that compiles the entire expression into a single kernel. Currently this is enabled
-for all standard MatX element-wise operators and FFT and GEMM operations via MathDx. To enable fusion with MathDx,
+MatX supports CUDA JIT kernel fusion that compiles the entire expression into a single kernel. Currently this is enabled
+for all standard MatX element-wise operators and FFT and GEMM operations via MathDx. cuFFTDx supports 1D FFT fusion and
+single-block complex-to-complex 2D ``fft2``/``ifft2`` fusion for supported power-of-two square transforms. To enable fusion with MathDx,
 the following options must be enabled: ``-DMATX_EN_MATHDX=ON``. Once enabled, the ``CUDAJITExecutor`` can be used perform JIT compilation
 in supported situations. If the expression cannot be JIT compiled, the JITExecutor may throw an error.
 
@@ -118,12 +119,10 @@ MathDx Compatibility
 - Enabled via ``-DMATX_EN_MATHDX=ON`` for GEMM fusion paths.
 * - cuFFTDx
 - Yes
-- Enabled via ``-DMATX_EN_MATHDX=ON`` for FFT fusion paths.
+- Enabled via ``-DMATX_EN_MATHDX=ON`` for 1D FFT fusion paths and supported single-block 2D C2C FFT fusion paths.