
low level Waitall #639

Closed
t-bltg wants to merge 1 commit into JuliaParallel:master from t-bltg:wait_all


Conversation

@t-bltg (Collaborator) commented Sep 22, 2022

This is the same reasoning as for #638.

Add a Waitall method closer to the low-level MPI_Waitall signature.
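For reference, the C binding being mirrored is `int MPI_Waitall(int count, MPI_Request array_of_requests[], MPI_Status array_of_statuses[])`. A sketch of what such a thin wrapper could look like (this is not the PR's actual code; `libmpi`, `MPI_Request`, and `MPI_Status` are placeholders, since handle and status representations differ between MPI implementations):

```julia
# Hypothetical sketch of a wrapper mirroring the C binding
#   int MPI_Waitall(int count, MPI_Request array_of_requests[],
#                   MPI_Status array_of_statuses[]);
# `libmpi`, `MPI_Request`, and `MPI_Status` below are assumptions:
# MPICH uses integer handles, Open MPI uses pointers.
const libmpi = "libmpi"        # assumed library name; a JLL in practice
const MPI_Request = Cint       # MPICH-style integer handle (assumption)

function Waitall(reqs::Vector{MPI_Request}, statuses::Vector{UInt8})
    # Pass the request and status buffers straight through, as MPI_Waitall does.
    ret = ccall((:MPI_Waitall, libmpi), Cint,
                (Cint, Ptr{MPI_Request}, Ptr{Cvoid}),
                length(reqs), reqs, statuses)
    ret == Cint(0) || error("MPI_Waitall failed with code $ret")
    return nothing
end
```

The point of such a signature is that the caller manages the request array directly, rather than going through MPI.jl's higher-level request objects.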

Comment thread on src/nonblocking.jl

@simonbyrne (Member)

I'll admit that I'm a bit reluctant to add these, but not completely opposed.

I think if we do add this, it would be better not to reuse the main functions, but instead isolate them in a submodule: that way, it is clear whether you're using the "unsafe" functions or the MPI.jl ones. (Ideally we would just auto-generate all the wrappers, similar to how HDF5.jl does it.)

@t-bltg (Collaborator, Author) commented Sep 28, 2022

> I think if we do add this, it would be better to not reuse the main functions, but instead isolate them in a submodule

Agreed, let's make this part of a submodule.

> Ideally we would just auto-generate all the wrappers, similar to how HDF5.jl does it

I'm much in favor of this and can work on it, but I will need some direction and something to start from.
I'm guessing something similar to https://github.com/JuliaIO/HDF5.jl/tree/master/gen.

@simonbyrne (Member)

> I am much in favor of this, and I can work on this, but I will require some direction and something to start with. I'm guessing something similar to https://github.com/JuliaIO/HDF5.jl/tree/master/gen.

That's one model. Another option might be to use Clang.jl (you can see an effort to use that with HDF5.jl in JuliaIO/HDF5.jl#897). The problem is that the headers will differ between implementations.

It might also be possible to get a full list of the functions directly from the standard. Unfortunately it is published as a PDF, but you can request access to the LaTeX source here: https://github.com/mpi-forum/mpi-issues/wiki/Access-to-the-MPI-Forum-private-repository (I just made a request, as I'm curious to see what format it is in).
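As a sketch of the Clang.jl route, a generator script built on Clang.jl's Generators module could look roughly like the following (the header path and `generator.toml` file are assumptions for illustration, not an existing setup):

```julia
# Hypothetical generator script modeled on Clang.jl's Generators API;
# the header path and generator.toml options file are assumptions.
using Clang.Generators

headers = ["/path/to/MPICH_jll/include/mpi.h"]
options = load_options(joinpath(@__DIR__, "generator.toml"))
args    = get_default_args()   # default compiler flags for the host platform

ctx = create_context(headers, args, options)
build!(ctx)                    # emits the Julia wrapper file(s)
```

Running one such script per implementation (MPICH, Open MPI, ...) would surface exactly the header differences discussed above.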

@t-bltg (Collaborator, Author) commented Sep 28, 2022

Thanks for the pointers.

> The problem is that the headers will differ between implementations.

I can already see the difficulty there, but I'm sure we can come up with something clever.

I've never used Clang.jl, but I can try something out, at least for MPICH and Open MPI.

@sloede (Member) commented Sep 28, 2022

> The problem is that the headers will differ between implementations.

But the MPI API is standardized, and the ABI does not come into play if you keep using MPI types (e.g. MPI_Comm instead of int or void*), or am I missing something?

@simonbyrne (Member)

Thanks! Feel free to open a draft PR if you would like early feedback.

@simonbyrne (Member)

> But the MPI API is standardized and the ABI does not come into play if you keep using MPI types, e.g., MPI_Comm instead of int or void*, or am I missing something?

Yeah, it should be possible; the challenge is that Clang.jl can leak ABI details into the API specification (the Open MPI headers in particular are a mess of macros).

@lcw (Member) commented Sep 28, 2022

A radical option would be to fix MPI.jl to the MPItrampoline ABI. We could then make a simple low-level static wrapper, relying on MPItrampoline for all ABI translations.

@simonbyrne (Member)

I was able to get access to the MPI standard repo: it looks like they have some Python code that generates the C headers (https://github.com/mpi-forum/mpi-standard/tree/mpi-4.x/binding-tool); it may be possible to adapt that to generate appropriate Julia code.

@t-bltg (Collaborator, Author) commented Sep 28, 2022

On my side, I was able to generate the MPI signatures using Clang.jl for MPICH and Open MPI; that went surprisingly well.
I'm now working on integrating that into MPI.jl.

I have a minor issue regarding OpenMPI_jll: why doesn't it have the artifact_dir attribute as MPICH_jll does?

@t-bltg (Collaborator, Author) commented Sep 28, 2022

Another issue, this time with the MPICH headers (<MPICH_jll artifact>/include/mpi.h):

typedef int MPI_Datatype;
[...]
/* 
   The layouts for the types MPI_DOUBLE_INT etc are simply
   struct { 
       double var;
       int    loc;
   }
   This is documented in the man pages on the various datatypes.   
 */
#define MPI_FLOAT_INT         ((MPI_Datatype)0x8c000000)
#define MPI_DOUBLE_INT        ((MPI_Datatype)0x8c000001)
#define MPI_LONG_INT          ((MPI_Datatype)0x8c000002)
#define MPI_SHORT_INT         ((MPI_Datatype)0x8c000003)
#define MPI_2INT              ((MPI_Datatype)0x4c000816)
#define MPI_LONG_DOUBLE_INT   ((MPI_Datatype)0x8c000004)

What the heck: 0x8c000000 > typemax(Int32).
Naturally, in Julia it throws: ERROR: InexactError: check_top_bit(Int32, 2348810240).
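One workaround (a sketch, not necessarily what the generator should emit) is a bit-preserving conversion instead of a value conversion:

```julia
# In C, (MPI_Datatype)0x8c000000 relies on an implementation-defined
# unsigned-to-int conversion; Julia's Int32(0x8c000000) instead throws
# InexactError because the value exceeds typemax(Int32).
# reinterpret keeps the 32-bit pattern and yields the negative Int32:
MPI_FLOAT_INT  = reinterpret(Int32, 0x8c000000)   # -1946157056
MPI_DOUBLE_INT = reinterpret(Int32, 0x8c000001)   # -1946157055
```

This matches what the MPICH headers mean, since the handle is just an opaque 32-bit bit pattern.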

@t-bltg (Collaborator, Author) commented Sep 29, 2022

Closing in favor of #644.

@t-bltg t-bltg closed this Sep 29, 2022
@t-bltg t-bltg deleted the wait_all branch September 29, 2022 13:25