Rough guideline on making parts of Trixi.jl GPU-compatible:
https://github.com/trixi-framework/Trixi.jl/blob/main/docs/src/heterogeneous.md#writing-gpu-kernels

Discussion

- Pass
  - ✅ `typeof(mesh)` to kernels, instead of just `mesh`
  - `typeof(dg)` to kernels, instead of just `dg`?
  - `cache`?

Step 1: Minimal feature set

Step 2a: "Full" implementation without host-device memory transfers

- `SummaryCallback` (Add `trixi_backend_info!` to show which GPU backend is being used #2906)
- `StepsizeCallback`, in particular reduction of speeds (GPU-compatible reduction of speeds in step size computation #2823)
- `AnalysisCallback` (Use `AcceleratedKernels.mapreduce` in `max_scaled_speed` and `integrate_via_indices` #2882)
- `SaveSolutionCallback`
- `SaveRestartCallback`

Step 2b: Initial performance analysis & optimization

- `calc_volume_integral!` optimized kernel on quadrature nodes for conservative systems
- `calc_volume_integral!` optimized kernel on quadrature nodes for conservative systems (Optimized 3D flux differencing kernel for GPU #3015)
- `calc_volume_integral!` optimized kernel on quadrature nodes for nonconservative systems
- `calc_volume_integral!` optimized kernel on quadrature nodes for nonconservative systems
- `apply_jacobian!` 2D kernel on quadrature nodes (Optimized 2D GPU Kernel for `apply_jacobian!` #3013)
- `apply_jacobian!` 3D kernel on quadrature nodes (Optimized `apply_jacobian!` 3D GPU kernel #3017)
- `calc_source_terms!` 2D and 3D kernels on quadrature nodes (Add GPU kernel for `calc_sources!` #3012)
- `prolong2boundaries!` and `calc_boundary_flux!` (started in WIP: Add 2D GPU kernels for boundary conditions #3007; the 2D `prolong2boundaries_per_boundary!` in `dg_2d_gpu.jl` needs to be moved into the CPU file `dgsem_p4est/dg_2d.jl`, see comment regarding 2D kernel)
- `prolong2boundaries!` and `calc_boundary_flux!` (Add 3D GPU `calc_boundary_flux!` kernels #3034)
- `calc_surface_integral!` and `prolong2interfaces`
- `calc_source_terms!` and `apply_jacobian!` can be fused
- `prolong2boundaries` and `calc_boundary_flux!` (see Avoid launching a kernel for each boundary)
- `calc_surface_integrals!` (Better parallelism for `calc_surface_integral!` GPU kernels #3058)

Step 3: Advanced features

- `VecOfArrays` (Switch VecOfArrays for RecursiveArrayTools's VectorOfArray #2952)
- `backend` should be first argument to functions, or last, or ...

Later

- `mesh` vs. `MeshT`

Not planned

- TreeMesh
- StructuredMesh
- UnstructuredMesh