Skip to content

Elements Lifetime (GPU)#1368

Merged
ax3l merged 6 commits into
BLAST-ImpactX:developmentfrom
ax3l:fix-elements-lifetime
Apr 10, 2026
Merged

Elements Lifetime (GPU)#1368
ax3l merged 6 commits into
BLAST-ImpactX:developmentfrom
ax3l:fix-elements-lifetime

Conversation

@ax3l
Copy link
Copy Markdown
Member

@ax3l ax3l commented Mar 27, 2026

On GPU, we have to more carefully deal with the lifetime of dynamic (array) data in elements. This PR is an infrastructure update for complex elements that have such dynamic GPU data, to address multiple usage bugs on GPU.

Part 1: This PR

  • ensure we can create elements before an initialized simulation object
  • add unit tests
  • clean them up more carefully after a simulation ends, incl. no error for elements that were never added to sim.lattice
  • we can even reuse the same elements between different simulations (with AMReX finalize and another init in between!)
  • similar logic for beam monitor/source: do not over-eagerly open files, but lazily on first use. Close on finalize (even if not in lattice): already done in Fix BeamMonitor: Delay Open #1178

Part 2: Follow-up PR

@ax3l ax3l added bug Something isn't working bug: affects latest release Bug also exists in latest release version component: elements Elements/maps/external fields component: python Python bindings labels Mar 27, 2026
@ax3l ax3l force-pushed the fix-elements-lifetime branch from 7f272a7 to 7b00400 Compare March 28, 2026 10:07
@ax3l ax3l marked this pull request as ready for review March 28, 2026 14:43
@ax3l ax3l force-pushed the fix-elements-lifetime branch from 7b00400 to 8837f7d Compare March 28, 2026 18:39
@ax3l ax3l added backend: cuda Specific to CUDA execution (GPUs) backend: sycl Specific to DPC++/SYCL execution (CPUs/GPUs) backend: hip Specific to ROCm execution (GPUs) labels Mar 28, 2026
@ax3l ax3l force-pushed the fix-elements-lifetime branch 4 times, most recently from ef43a24 to 6cf4c41 Compare March 28, 2026 22:40
@ax3l ax3l mentioned this pull request Mar 28, 2026
5 tasks
Comment thread src/elements/SoftSol.H Outdated
@ax3l ax3l force-pushed the fix-elements-lifetime branch 2 times, most recently from 71c9e9c to c82bdbf Compare March 29, 2026 05:28
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Mar 29, 2026

Merging this PR will not alter performance

✅ 37 untouched benchmarks


Comparing ax3l:fix-elements-lifetime (d2e1ffa) with development (3afd49b)

Open in CodSpeed

@ax3l ax3l force-pushed the fix-elements-lifetime branch from c82bdbf to 19f5905 Compare March 31, 2026 00:51
Comment thread src/elements/mixin/TrackedVector.H Dismissed
Comment on lines +1 to +12
/* Copyright 2022-2023 The Regents of the University of California, through Lawrence
* Berkeley National Laboratory (subject to receipt of any required
* approvals from the U.S. Dept. of Energy). All rights reserved.
*
* This file is part of ImpactX.
*
* FIXME: SHOULD BE REPLACED WITH amrex::Gpu::TrackedVector in v26.05+, see
* https://github.com/AMReX-Codes/amrex/pull/5253
*
* Authors: Axel Huebl
* License: BSD-3-Clause-LBNL
*/

Check notice

Code scanning / CodeQL

FIXME comment Note

FIXME comment: SHOULD BE REPLACED WITH amrex::Gpu::TrackedVector in v26.05+, see [...]
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be replaced after we released 26.04 of ImpactX, because it goes in AMReX 26.05+ only

Comment thread src/elements/mixin/dynamicdata.H Outdated
@ax3l ax3l force-pushed the fix-elements-lifetime branch 2 times, most recently from 8363c50 to c069964 Compare April 1, 2026 07:25
Comment thread src/elements/ExactCFbend.H Outdated
Comment thread src/elements/ExactMultipole.H Outdated
Comment thread src/elements/PolygonAperture.H Outdated
Comment thread src/elements/RFCavity.H Outdated
Comment thread src/elements/SoftQuad.H Outdated
@ax3l ax3l requested a review from EZoni April 3, 2026 21:38
@ax3l ax3l force-pushed the fix-elements-lifetime branch 3 times, most recently from f480634 to cca268d Compare April 5, 2026 05:10
ax3l added 4 commits April 5, 2026 21:14
Template for all elements with dynamic GPU data.
GPUDataRegistry in src/elements/mixin/dynamicdata.H uses
function-local statics (next_id, registry) inside template member
functions. With explicit template instantiation
(IMPACTX_GPUDATA_INSTANTIATE / IMPACTX_GPUDATA_INSTANTIATE), this works
for static builds. However, for DLL/shared library builds, each DLL
that links the library could get its own copy of these statics unless
the explicit instantiations are properly exported with visibility
attributes.
@ax3l ax3l force-pushed the fix-elements-lifetime branch from cca268d to 1a83a74 Compare April 6, 2026 04:15
@ax3l
Copy link
Copy Markdown
Member Author

ax3l commented Apr 7, 2026

@WeiqunZhang @EZoni this is ready to be reviewed/merged.

@ax3l ax3l added this to the Advanced Methods (SciDAC-5) milestone Apr 8, 2026
Comment on lines +57 to +80
def create_soft_solenoid(name="sol"):
"""Create a SoftSolenoid element with minimal coefficients."""
return elements.SoftSolenoid(
name=name,
ds=1.0,
bscale=0.1,
cos_coefficients=[0.5, 0.3, 0.1],
sin_coefficients=[0.0, 0.0, 0.0],
mapsteps=10,
nslice=1,
)


def create_soft_quadrupole(name="quad"):
"""Create a SoftQuadrupole element with minimal coefficients."""
return elements.SoftQuadrupole(
name=name,
ds=0.5,
gscale=1.0,
cos_coefficients=[1.0, 0.5],
sin_coefficients=[0.0, 0.0],
mapsteps=10,
nslice=1,
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add coverage for all other elements that use the new data infrastructure or is it too difficult to set up a meaningful/reasonable test simulation with all of those?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good questions! No that is not needed, there are 6 elements in ImpactX that use dynamic GPU data and they all use the same pattern. Testing one of them is sufficient for demonstrate the general lifetime issue and to ensure it does not happen again.

Comment thread tests/python/test_element_lifetime.py Outdated

# Verify simulation completed
ref = sim.particle_container().ref_particle()
assert ref.s == pytest.approx(1.5, rel=1e-6)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the 1.5 here some sort of expected analytical value? If so, it may be useful to add a short comment about it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just the sum of the .ds of the elements in sim.lattice - I can write this more explicitly, this is just a test the simulation actually did anything. We can actually remove this part as well.

Comment thread tests/python/test_element_lifetime.py Outdated

ref1 = sim1.particle_container().ref_particle()
s_after_sim1 = ref1.s
assert s_after_sim1 == pytest.approx(1.0, rel=1e-6)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the 1.0 here some sort of expected analytical value? If so, it may be useful to add a short comment about it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, just the ds of the one lattice element we modeled. Not super needed to validate really.

Comment thread tests/python/test_element_lifetime.py Outdated

ref2 = sim2.particle_container().ref_particle()
s_after_sim2 = ref2.s
assert s_after_sim2 == pytest.approx(1.0, rel=1e-6)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the 1.0 here some sort of expected analytical value? If so, it may be useful to add a short comment about it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, just checked the sim did anything. not needed to check.

Comment thread src/elements/ExactMultipole.H Outdated
Comment thread src/elements/ExactCFbend.H Outdated
@EZoni
Copy link
Copy Markdown
Member

EZoni commented Apr 9, 2026

@ax3l

As far as I can review, the PR looks good to me. I just left a few minor suggestions and will approve once a couple of inline comments are fixed (left-over from copy/paste).

Comment thread tests/python/test_element_lifetime.py Outdated
Comment thread tests/python/test_element_lifetime.py Outdated
Comment thread tests/python/test_element_lifetime.py Outdated
ax3l and others added 2 commits April 9, 2026 17:08
Not super needed to check that the simulation had a minimal
effect on the reference particle. We just want to check
element lifetimes here.

Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
Co-authored-by: Edoardo Zoni <59625522+EZoni@users.noreply.github.com>
Copy link
Copy Markdown
Member

@EZoni EZoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! As far as my understanding goes, this looks good to me. I'd let you or @WeiqunZhang finalize it and decide when it's ready to merge.

@ax3l ax3l merged commit 909ea97 into BLAST-ImpactX:development Apr 10, 2026
18 checks passed
@ax3l ax3l deleted the fix-elements-lifetime branch April 10, 2026 03:58
@ax3l
Copy link
Copy Markdown
Member Author

ax3l commented Apr 10, 2026

Thank you! 🎉

WeiqunZhang pushed a commit to AMReX-Codes/amrex that referenced this pull request Apr 10, 2026
## Summary

This adds a helper class for synchronizing a pair of host & device
vectors (w/o using managed memory).

In BLAST codes, I find this to cause significant boilerplate in
physics-focused parts of the code and this little helper class cuts down
on lines and mental book-keeping.

Additionally, especially in interactive pyAMReX and ImpactX workflows,
this construct is a building block for enabling user-friendly and even
multi-simulation-spanning user data (e.g., ImpactX element data) by
enabling users to define their inputs even before AMReX was initialized
and being able to reuse their input classes across many thousands of
simulations, e.g., in optimization loops, even with AMReX device arenas
being shut down/recreated in between.

## Additional background

The unit test shows the most common usage patterns & needs.

BLAST-ImpactX/impactx#1368

## Checklist

The proposed changes:
- [x] add new capabilities to AMReX
- [x] CPU build: write the optimized fallback path
- [x] GPU build: use AMReX-session agnostic pinned memory for host OR do
synchronous copies
- [x] include documentation in the code and/or rst files, if appropriate
- [x] is covered by unit tests

---------

Co-authored-by: Alexander Sinn <alexander.sinn@desy.de>
ax3l added a commit to ax3l/impactx that referenced this pull request May 4, 2026
NVCC 12.2 + C++20 evaluates the default initializer of an inline
static data member during implicit instantiation of the enclosing
class template, so any reference such as `DynamicData::get(id)`
from an element header triggers `inline static
std::map<int, std::shared_ptr<T>> s_registry{};` and fails with
"incomplete type is not allowed".

Move `s_next_id` / `s_registry` to plain `static` declarations
inside the class with out-of-class non-inline template definitions
in the same header. Implicit class instantiation no longer touches
the default initializer; the definitions are processed only when
an explicit instantiation actually requests them.

This preserves the cross-DSO guarantee from BLAST-ImpactX#1441: combined with
`IMPACTX_EXPORT`, `extern template struct` in element headers, and
`template struct` in each element .cpp, the data members emit as
`STB_GNU_UNIQUE` symbols and resolve to a single shared registry
even when pyImpactX is built with `-fvisibility=hidden` and LTO.
It also avoids the function-local-static design from BLAST-ImpactX#1368, which
under LTO + hidden visibility duplicated the registry inside
pyImpactX via the inlined accessor body of `emplace<Args...>`.

Only `dynamicdata.H` changes; call sites and the
`IMPACTX_GPUDATA_EXTERN` / `_INSTANTIATE` macros are unchanged.
ax3l added a commit that referenced this pull request May 4, 2026
* C++20

Like AMReX and WarpX, we transition to C++20 now.

* CI: CUDA 12.2

* Fix GPUDataRegistry for NVCC 12 + C++20

NVCC 12.2 + C++20 evaluates the default initializer of an inline
static data member during implicit instantiation of the enclosing
class template, so any reference such as `DynamicData::get(id)`
from an element header triggers `inline static
std::map<int, std::shared_ptr<T>> s_registry{};` and fails with
"incomplete type is not allowed".

Move `s_next_id` / `s_registry` to plain `static` declarations
inside the class with out-of-class non-inline template definitions
in the same header. Implicit class instantiation no longer touches
the default initializer; the definitions are processed only when
an explicit instantiation actually requests them.

This preserves the cross-DSO guarantee from #1441: combined with
`IMPACTX_EXPORT`, `extern template struct` in element headers, and
`template struct` in each element .cpp, the data members emit as
`STB_GNU_UNIQUE` symbols and resolve to a single shared registry
even when pyImpactX is built with `-fvisibility=hidden` and LTO.
It also avoids the function-local-static design from #1368, which
under LTO + hidden visibility duplicated the registry inside
pyImpactX via the inlined accessor body of `emplace<Args...>`.

Only `dynamicdata.H` changes; call sites and the
`IMPACTX_GPUDATA_EXTERN` / `_INSTANTIATE` macros are unchanged.

* CI: Bump ASAN

Clang 14 is known buggy/incomplete for std::source_location,
which toml11 uses (via openPMD-api).

* Doc: Update ROCm to 6+

* Fix NVCC Compile Issue in Envelope
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend: cuda Specific to CUDA execution (GPUs) backend: hip Specific to ROCm execution (GPUs) backend: sycl Specific to DPC++/SYCL execution (CPUs/GPUs) bug: affects latest release Bug also exists in latest release version bug Something isn't working component: elements Elements/maps/external fields component: python Python bindings

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants