Apply GPU optimizations to TLSPH by efaulhaber · Pull Request #1139 · trixi-framework/TrixiParticles.jl

efaulhaber · 2026-04-13T10:51:53Z

Based on trixi-framework/PointNeighbors.jl#154. Tests will pass once PointNeighbors 0.6.6 is released.

Copilot

Pull request overview

This PR refactors Total Lagrangian SPH (TLSPH) neighbor interactions to better match GPU-friendly execution patterns (per-particle threading, reduced memory traffic, fewer repeated loads), following the newer PointNeighbors neighbor-iteration approach.

Changes:

Refactor TLSPH deformation gradient and RHS assembly to use per-particle `@threaded` loops with `foreach_neighbor`, accumulating into `Ref`s to reduce global writes.
Optimize penalty force and viscosity kernels for GPU performance (preload deformation gradients, use `div_fast`, and use `smoothing_kernel_unsafe` after cutoff filtering).
Add a SIMD-based fast path for extracting 2×2 matrices (`extract_smatrix`) and wire in the SIMD dependency; update tests’ mock systems accordingly.
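The per-particle accumulation pattern described in the first change can be illustrated with a minimal, self-contained sketch. This is not the PR's code: the function names, the 1D coordinates, and the brute-force neighbor loop are all assumptions made for illustration, and a plain local variable stands in for the `Ref` accumulator mentioned above. It only shows why one thread per particle with a local accumulator needs a single global write per particle instead of one per neighbor pair.

```julia
# Hypothetical toy example (not TrixiParticles.jl API): 1D particles with a
# brute-force neighbor loop, so that the example runs with Base Julia only.

# Variant A: every neighbor pair does a read-modify-write on the global array.
function accumulate_per_pair!(dv, coords, radius)
    fill!(dv, 0.0)
    n = length(coords)
    for i in 1:n, j in 1:n
        i == j && continue
        diff = coords[i] - coords[j]
        if abs(diff) <= radius
            dv[i] += diff  # global memory traffic for each pair
        end
    end
    return dv
end

# Variant B: one thread per particle, accumulate locally, write once.
function accumulate_per_particle!(dv, coords, radius)
    Threads.@threads for i in eachindex(coords)
        acc = 0.0       # thread-local accumulator
        xi = coords[i]  # load the particle's own data only once
        for j in eachindex(coords)
            i == j && continue
            diff = xi - coords[j]
            if abs(diff) <= radius
                acc += diff
            end
        end
        dv[i] = acc     # single global write per particle
    end
    return dv
end

coords = [0.0, 0.1, 0.25, 0.6, 0.65]
dv_pair = accumulate_per_pair!(zeros(5), coords, 0.2)
dv_particle = accumulate_per_particle!(zeros(5), coords, 0.2)
```

Both variants visit the neighbors in the same order, so they produce identical results. Variant B is also race-free, because each thread writes only to its own entry of `dv`.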

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file

| File | Description |
| --- | --- |
| test/systems/tlsph_system.jl | Updates TLSPH test mock system to a concrete struct (GPU-friendly) and adjusts numeric literals. |
| test/schemes/structure/total_lagrangian_sph/rhs.jl | Updates RHS tests’ mock system layout and adds `deformation_gradient` stub required by new RHS path. |
| src/schemes/structure/total_lagrangian_sph/viscosity.jl | Threads deformation gradient through viscosity path and applies `div_fast` in hot divisions. |
| src/schemes/structure/total_lagrangian_sph/system.jl | Reworks deformation gradient assembly into per-particle neighbor loops with reduced memory writes. |
| src/schemes/structure/total_lagrangian_sph/rhs.jl | Reworks RHS assembly similarly; passes deformation gradients into penalty/viscosity for fewer loads. |
| src/schemes/structure/total_lagrangian_sph/penalty_force.jl | Converts penalty force to an in-place accumulator API and switches to unsafe kernel + fast divisions. |
| src/schemes/boundary/wall_boundary/system.jl | Adds `smoothing_kernel_unsafe` specialization for wall boundary systems. |
| src/general/neighborhood_search.jl | Adds `foreach_neighbor` wrapper around PointNeighbors neighbor iteration. |
| src/general/abstract_system.jl | Adds `extract_smatrix` Val-specialization and a SIMD 2D fast path. |
| src/TrixiParticles.jl | Imports SIMD module for use in `extract_smatrix`. |
| Project.toml | Adds SIMD dependency + compat; minor reordering of weakdeps/extensions entries. |
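The "unsafe kernel after cutoff filtering" idea from the summary can be sketched as follows. This is a hedged illustration, not the library's implementation: `kernel_safe`, `kernel_unsafe`, and `density_sum` are hypothetical names, and the quadratic 1D kernel is chosen only for simplicity. The assumption is that "unsafe" means the kernel skips its own compact-support check, which is valid when the neighbor loop has already discarded all pairs beyond the cutoff.

```julia
# Hypothetical 1D kernel with compact support `h` (not TrixiParticles.jl API).
# The "unsafe" variant is only valid for distances r <= h.
kernel_unsafe(r, h) = (1 - r / h)^2

# The "safe" variant checks the support itself on every call.
kernel_safe(r, h) = r > h ? 0.0 : kernel_unsafe(r, h)

# The neighbor loop filters by the cutoff once, so the second check inside
# the kernel would be redundant work in the hot loop.
function density_sum(distances, h)
    total = 0.0
    for r in distances
        r > h && continue             # cutoff filtering by the neighbor loop
        total += kernel_unsafe(r, h)  # no additional support check
    end
    return total
end

distances = [0.0, 0.5, 1.0, 1.5]
safe_total = sum(r -> kernel_safe(r, 1.0), distances)
unsafe_total = density_sum(distances, 1.0)
```

Calling `kernel_unsafe` without the preceding filter would return a nonzero value for `r = 1.5`, which is why such a variant is only appropriate inside a loop that guarantees the cutoff.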

codecov · 2026-04-15T11:06:22Z

Codecov Report

❌ Patch coverage is 74.71264% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.26%. Comparing base (379db88) to head (d5766f0).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...es/structure/total_lagrangian_sph/penalty_force.jl | 14.28% | 12 Missing ⚠️ |
| ...chemes/structure/total_lagrangian_sph/viscosity.jl | 16.66% | 5 Missing ⚠️ |
| src/schemes/boundary/wall_boundary/system.jl | 0.00% | 3 Missing ⚠️ |
| src/general/abstract_system.jl | 84.61% | 2 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (379db88) and HEAD (d5766f0): HEAD has 1 upload fewer than BASE.

| Flag | BASE (379db88) | HEAD (d5766f0) |
| --- | --- | --- |
| total | 1 | 0 |

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1139       +/-   ##
===========================================
- Coverage   89.06%   67.26%   -21.81%     
===========================================
  Files         128      128               
  Lines        9868     9857       -11     
===========================================
- Hits         8789     6630     -2159     
- Misses       1079     3227     +2148

| Flag | Coverage Δ |
| --- | --- |
| total | `?` |
| unit | `67.26% <74.71%> (+0.08%)` ⬆️ |

efaulhaber · 2026-04-15T14:52:35Z

/run-gpu-tests

efaulhaber · 2026-04-15T15:02:46Z

/run-gpu-tests

Copilot

Pull request overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 3 comments.

efaulhaber · 2026-04-16T08:12:54Z

/run-gpu-tests

efaulhaber self-assigned this Apr 13, 2026

efaulhaber added performance gpu labels Apr 13, 2026

efaulhaber mentioned this pull request Apr 13, 2026

3x Speedup on GPUs: Checklist #1131

Open

7 tasks

efaulhaber force-pushed the tlsph-gpu-performance branch from 69f3560 to 006d4f5 Compare April 13, 2026 15:17

efaulhaber added 3 commits April 14, 2026 12:23

Improve performance of TLSPH RHS

b9b9395

Optimize deformation gradient

bcdc27d

Use new foreach_neighbor_unsafe

7518a8c

efaulhaber force-pushed the tlsph-gpu-performance branch from 006d4f5 to 7518a8c Compare April 14, 2026 10:26

efaulhaber added 3 commits April 14, 2026 12:28

Remove PR dependencies

d6e5e8f

Fix

30e5260

Add comments to extract_smatrix

ac173e1

efaulhaber marked this pull request as ready for review April 14, 2026 15:16

efaulhaber added 2 commits April 14, 2026 17:17

Fix unit tests

32479b4

Reformat

ec0de67

efaulhaber requested a review from Copilot April 14, 2026 15:19

Copilot started reviewing on behalf of efaulhaber April 14, 2026 15:19

Copilot AI reviewed Apr 14, 2026

Comment thread src/general/abstract_system.jl Outdated

Comment thread src/schemes/structure/total_lagrangian_sph/viscosity.jl Outdated

Fix

ed3d88c

LasNikas reviewed Apr 14, 2026

Comment thread src/schemes/structure/total_lagrangian_sph/rhs.jl

Comment thread src/schemes/structure/total_lagrangian_sph/system.jl

Merge branch 'main' into tlsph-gpu-performance

f18eaf6

efaulhaber requested a review from svchb April 14, 2026 15:42

svchb requested changes Apr 15, 2026

Comment thread src/schemes/structure/total_lagrangian_sph/penalty_force.jl Outdated

efaulhaber added 5 commits April 15, 2026 11:56

Fix penalty force and add regression test

d4b97d1

Fix allocations

e431d8b

Add warning and separate aligned function for the vloada extract_smatrix

f7215bb

Fix unit tests

088df40

Reformat

6908456

Fix deformation gradient

706149d

Add GPU tests

d5766f0

efaulhaber requested review from LasNikas, Copilot and svchb April 15, 2026 16:10

Copilot started reviewing on behalf of efaulhaber April 15, 2026 16:11

Copilot AI reviewed Apr 15, 2026

Comment thread src/schemes/structure/total_lagrangian_sph/system.jl

Comment thread src/general/neighborhood_search.jl

Comment thread src/schemes/structure/total_lagrangian_sph/system.jl

efaulhaber added 2 commits April 15, 2026 18:18

Fix aligned extract_smatrix calls

80d7424

Merge branch 'tlsph-gpu-performance' of github.com:efaulhaber/TrixiParticles.jl into tlsph-gpu-performance

92ed24e

efaulhaber marked this pull request as draft April 15, 2026 17:01

efaulhaber added 2 commits April 16, 2026 10:05

Remove extract_smatrix_aligned from this PR

2c60907

Remove SIMD.jl dependency

d4ce4d5

efaulhaber marked this pull request as ready for review April 16, 2026 13:13

svchb approved these changes Apr 16, 2026

LasNikas approved these changes Apr 16, 2026

efaulhaber merged commit 6ee0bf4 into trixi-framework:main Apr 16, 2026
15 of 17 checks passed

efaulhaber deleted the tlsph-gpu-performance branch April 16, 2026 14:26

efaulhaber merged 21 commits into trixi-framework:main from efaulhaber:tlsph-gpu-performance


4 participants
