Dual test for NumPy and CuPy in tests by tharittk · Pull Request #165 · PyLops/pylops-mpi

tharittk · 2025-08-03T09:53:15Z

Some bugs were found:

Infinity norm causes a segmentation fault when used with CuPy + MPI
? PyLops BlockDiag ** 3 seems to revert the engine to NumPy despite being initialized with CuPy

mrava87 · 2025-08-03T21:13:11Z

Some bugs were found on

Infinity norm causes segmentation fault when using with CuPy + MPI

This looks like our usual problem with CuPy+MPI. The infinity norms use recv_buf, as in this line:

pylops-mpi/pylops_mpi/DistributedArray.py, line 698 in b5a4c54

recv_buf, op=MPI.MAX)

but the others don't. However, I did a quick test and took recv_buf out of the _allreduce_subcomm calls, and this leads to a deadlock also for NumPy+MPI (which is probably the reason the code was written like this in the first place). The issue with using recv_buf is that the _allreduce_subcomm method uses self.sub_comm.Allreduce, which we know does not play well with CuPy arrays for now, as we are not doing any synchronization.
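A later commit in this PR is titled "temporary use CPU buffer for CuPy + MPI in inf and -inf norm". A minimal sketch of that general idea follows, not the code from the PR: the function name allreduce_scalar_via_host is hypothetical, comm is assumed to be an mpi4py-style communicator exposing Allreduce(sendbuf, recvbuf, op=...), and op is assumed to be something like MPI.MAX supplied by the caller.

```python
from array import array


def allreduce_scalar_via_host(comm, local_value, op):
    """Reduce one scalar across ranks using host (CPU) buffers only.

    Hypothetical helper, not part of pylops-mpi. `comm` is assumed to be an
    mpi4py-style communicator and `op` a reduction such as MPI.MAX.
    """
    # float() on a 0-d device array forces a copy to host memory, so the
    # buffer handed to Allreduce never points at GPU memory.
    send_buf = array("d", [float(local_value)])
    recv_buf = array("d", [0.0])
    comm.Allreduce(send_buf, recv_buf, op=op)
    # The result is a host float; move it back to the device if needed.
    return recv_buf[0]
```

With mpi4py this would be called as allreduce_scalar_via_host(comm, local_max, MPI.MAX), where local_max is the per-rank maximum of the absolute values.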

? Pylops BlockDiag ** 3 seems to reverse the engine to NumPy despite being initialized as CuPy

Found the issue in PyLops; it is fixed in PyLops/pylops#689. For now (until the next PyLops release) we can easily fix the test by using Pop = BDiag * BDiag * BDiag, which is technically equivalent to Pop = BDiag ** 3.
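To illustrate why the chained product is an equivalent stand-in for the power, here is a toy sketch. ToyOp is a hypothetical pure-Python class written only for this illustration; it is not the PyLops API and does not model the engine handling discussed here.

```python
class ToyOp:
    """Hypothetical stand-in for a linear operator (not the PyLops API)."""

    def __init__(self, matvec):
        self.matvec = matvec

    def __mul__(self, other):
        # (A * B) x = A (B x): apply the right-hand operator first.
        return ToyOp(lambda x: self.matvec(other.matvec(x)))

    def __pow__(self, n):
        if n < 1:
            raise ValueError("power must be a positive integer")
        out = self
        for _ in range(n - 1):
            out = out * self
        return out


# A diagonal operator used as a simple example.
scales = [2.0, -1.0, 0.5]
BDiag = ToyOp(lambda x: [s * v for s, v in zip(scales, x)])

x = [1.0, 2.0, 4.0]
y_power = (BDiag ** 3).matvec(x)
y_chain = (BDiag * BDiag * BDiag).matvec(x)
```

Both y_power and y_chain apply the same operator three times, so they hold the same values.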

mrava87 · 2025-08-03T22:06:30Z

Also, I think for all tests we need to add some logic like the following (pylops-mpi/tutorials_cupy/poststack_cupy.py, line 33 in b5a4c54):

cp.cuda.Device(rank % device_count).use();

to have different ranks use different GPUs; otherwise they will all run on the same GPU (default=0).
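A minimal sketch of that rank-to-GPU mapping, under a few assumptions: the helper names device_index_for_rank and bind_rank_to_gpu are hypothetical, comm is assumed to be an mpi4py communicator, and CuPy is imported lazily so the mapping itself can be checked on a machine without a GPU.

```python
def device_index_for_rank(rank, device_count):
    """Map an MPI rank to a GPU index in round-robin fashion."""
    if device_count < 1:
        raise ValueError("at least one GPU device is required")
    return rank % device_count


def bind_rank_to_gpu(comm):
    """Make the calling rank use its own GPU (hypothetical helper).

    `comm` is assumed to be an mpi4py communicator; CuPy must be installed.
    """
    import cupy as cp  # imported lazily so the mapping above works without CuPy

    device_count = cp.cuda.runtime.getDeviceCount()
    index = device_index_for_rank(comm.Get_rank(), device_count)
    cp.cuda.Device(index).use()
    return index
```

With 4 ranks and 2 GPUs, ranks 0 and 2 would share device 0 and ranks 1 and 3 would share device 1, instead of all four landing on device 0.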

mrava87

@tharittk good job!

I think this is nearly ready to go. I would maybe add some targets in the Makefile like we have in pylops (https://github.com/PyLops/pylops/blob/a94ea8eae3b9c06bf39637b2e29f6a45a0e7766f/Makefile#L54) and in the contributing part of the documentation (see again what we have in pylops, and maybe also add something about the NCCL tests and examples, which I just realized is missing).

mrava87 · 2025-08-06T08:09:54Z

@hongyx11 I think this is pretty much ready and a great addition to our test suite as we move forward trying to change MPI methods from objects to buffers… do you think we can put this into a self-hosted runner like we did for PyLops? I think a single node with even just 2 GPUs would be enough, as it will guarantee that we can do some checks on any change we make in the communication bits of our library 😀

hongyx11 · 2025-08-06T11:26:17Z

it's doable, let me give it a try, we need to use srun

tharittk and others added 2 commits August 3, 2025 04:51

Change scripts in tests/ to be able to run with CuPy and NumPy based on env var

51ab3e7
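The commit title above says the test scripts pick CuPy or NumPy based on an environment variable. A minimal sketch of that pattern follows; the variable name USE_CUPY_TESTS and both function names are hypothetical, since the actual name used in the PR is not shown here.

```python
import importlib
import os


def backend_name(environ=os.environ, var="USE_CUPY_TESTS"):
    """Return "cupy" when the (hypothetical) flag is set to "1", else "numpy"."""
    return "cupy" if environ.get(var, "0") == "1" else "numpy"


def load_backend(environ=os.environ):
    """Import and return the selected array module.

    Raises ImportError if the selected module is not installed.
    """
    return importlib.import_module(backend_name(environ))
```

A test module could then call load_backend() once at import time and build its inputs with the returned module, so the same test body runs on either backend.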

test: temporary fix to test_power

8d458b0

tharittk added 2 commits August 4, 2025 09:52

temporary use CPU buffer for CuPy + MPI in inf and -inf norm

33121a5

ensure unique gpu device for each mpi rank in CuPy MPI tests

66e3b16

tharittk marked this pull request as ready for review August 5, 2025 13:01

mrava87 reviewed Aug 5, 2025

Comment thread tests/test_blockdiag.py

mrava87 changed the title ~~Dual test for NumPy and CuPy in tests/~~ Dual test for NumPy and CuPy in tests Aug 6, 2025

tharittk added 2 commits August 6, 2025 10:11

remove slurm-related env in test scripts, fix Makefile type, add tests_gpu

99ff24b

add documentation on contributing page

af8977f

mrava87 approved these changes Aug 8, 2025

mrava87 merged commit 1cacf8b into PyLops:main Aug 8, 2025
61 checks passed

mrava87 merged 6 commits into PyLops:main from tharittk:dual_test
