@@ -72,60 +72,19 @@ location to be automatically picked up by the software shipped with EESSI. This
7272/cvmfs/software.eessi.io/host_injections/2023.06/software/linux/aarch64/nvidia/grace/rpath_overrides/OpenMPI/system/lib
7373```
7474
75- **Validating the `libmpi.so.40` in `host_injections` from OpenMPI/5.0.7 on ARM nodes built with:**
75+ **OpenMPI/5.0.7 on ARM nodes built with:**
7676```
7777./configure --prefix=/cluster/installations/eessi/default/aarch64/software/OpenMPI/5.0.7-GCC-12.3.0 --with-cuda=${EBROOTCUDA} --with-cuda-libdir=${EBROOTCUDA}/lib64 --with-slurm --enable-mpi-ext=cuda --with-libfabric=${EBROOTLIBFABRIC} --with-ucx=${EBROOTUCX} --enable-mpirun-prefix-by-default --enable-shared --with-hwloc=/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/hwloc/2.9.1-GCCcore-12.3.0 --with-libevent=/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/libevent/2.1.12-GCCcore-12.3.0 --with-pmix=/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/PMIx/4.2.4-GCCcore-12.3.0 --with-ucc=/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/UCC/1.2.0-GCCcore-12.3.0 --with-prrte=internal
7878```
79- ```
80- ldd /cvmfs/software.eessi.io/host_injections/2023.06/software/linux/aarch64/nvidia/grace/rpath_overrides/OpenMPI/system/lib/libmpi.so.40
81-
82- linux-vdso.so.1 (0x0000fffcfd1d0000)
83- libucc.so.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/UCC/1.2.0-GCCcore-12.3.0/lib64/libucc.so.1 (0x0000fffcfce50000)
84- libucs.so.0 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/UCX/1.14.1-GCCcore-12.3.0/lib64/libucs.so.0 (0x0000fffcfcde0000)
85- libnuma.so.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/numactl/2.0.16-GCCcore-12.3.0/lib64/libnuma.so.1 (0x0000fffcfcdb0000)
86- libucm.so.0 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/UCX/1.14.1-GCCcore-12.3.0/lib64/libucm.so.0 (0x0000fffcfcd70000)
87- libopen-pal.so.80 => /cluster/installations/eessi/default/aarch64/software/OpenMPI/5.0.7-GCC-12.3.0/lib/libopen-pal.so.80 (0x0000fffcfcc40000)
88- libfabric.so.1 => /cvmfs/software.eessi.io/host_injections/2023.06/software/linux/aarch64/nvidia/grace/rpath_overrides/OpenMPI/system/lib/libfabric.so.1 (0x0000fffcfca50000)
89- librdmacm.so.1 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/usr/lib/../lib64/librdmacm.so.1 (0x0000fffcfca10000)
90- libefa.so.1 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/usr/lib/../lib64/libefa.so.1 (0x0000fffcfc9e0000)
91- libibverbs.so.1 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/usr/lib/../lib64/libibverbs.so.1 (0x0000fffcfc9a0000)
92- libcxi.so.1 => /cluster/installations/eessi/default/aarch64/software/shs-libcxi/1.7.0-GCCcore-12.3.0/lib64/libcxi.so.1 (0x0000fffcfc960000)
93- libcurl.so.4 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/usr/lib/../lib64/libcurl.so.4 (0x0000fffcfc8a0000)
94- libjson-c.so.5 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/json-c/0.16-GCCcore-12.3.0/lib64/libjson-c.so.5 (0x0000fffcfc870000)
95- libatomic.so.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/GCCcore/12.3.0/lib64/libatomic.so.1 (0x0000fffcfc840000)
96- libcudart.so.12 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/accel/nvidia/cc90/software/CUDA/12.1.1/lib64/libcudart.so.12 (0x0000fffcfc780000)
97- libcuda.so.1 => /usr/lib64/libcuda.so.1 (0x0000fffcf97d0000)
98- libnvidia-ml.so.1 => /usr/lib64/libnvidia-ml.so.1 (0x0000fffcf8980000)
99- libnl-route-3.so.200 => /cluster/installations/eessi/default/aarch64/software/libnl/3.11.0-GCCcore-12.3.0/lib64/libnl-route-3.so.200 (0x0000fffcf88d0000)
100- libnl-3.so.200 => /cluster/installations/eessi/default/aarch64/software/libnl/3.11.0-GCCcore-12.3.0/lib64/libnl-3.so.200 (0x0000fffcf8890000)
101- libpmix.so.2 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/PMIx/4.2.4-GCCcore-12.3.0/lib64/libpmix.so.2 (0x0000fffcf8690000)
102- libevent_core-2.1.so.7 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/libevent/2.1.12-GCCcore-12.3.0/lib64/libevent_core-2.1.so.7 (0x0000fffcf8630000)
103- libevent_pthreads-2.1.so.7 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/libevent/2.1.12-GCCcore-12.3.0/lib64/libevent_pthreads-2.1.so.7 (0x0000fffcf8600000)
104- libhwloc.so.15 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/hwloc/2.9.1-GCCcore-12.3.0/lib64/libhwloc.so.15 (0x0000fffcf8580000)
105- libpciaccess.so.0 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/libpciaccess/0.17-GCCcore-12.3.0/lib64/libpciaccess.so.0 (0x0000fffcf8550000)
106- libxml2.so.2 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/libxml2/2.11.4-GCCcore-12.3.0/lib64/libxml2.so.2 (0x0000fffcf83e0000)
107- libz.so.1 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/usr/lib/../lib64/libz.so.1 (0x0000fffcf83a0000)
108- liblzma.so.5 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/usr/lib/../lib64/liblzma.so.5 (0x0000fffcf8330000)
109- libm.so.6 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/lib/../lib64/libm.so.6 (0x0000fffcf8280000)
110- libc.so.6 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/lib/../lib64/libc.so.6 (0x0000fffcf80e0000)
111- /lib/ld-linux-aarch64.so.1 (0x0000fffcfd1e0000)
112- libcares.so.2 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/usr/lib/../lib64/libcares.so.2 (0x0000fffcf80a0000)
113- libnghttp2.so.14 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/usr/lib/../lib64/libnghttp2.so.14 (0x0000fffcf8050000)
114- libssl.so.1.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/OpenSSL/1.1/lib64/libssl.so.1.1 (0x0000fffcf7fb0000)
115- libcrypto.so.1.1 => /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/nvidia/grace/software/OpenSSL/1.1/lib64/libcrypto.so.1.1 (0x0000fffcf7d10000)
116- libdl.so.2 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/lib/../lib64/libdl.so.2 (0x0000fffcf7ce0000)
117- libpthread.so.0 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/lib/../lib64/libpthread.so.0 (0x0000fffcf7cb0000)
118- librt.so.1 => /cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64/lib/../lib64/librt.so.1 (0x0000fffcf7c80000)
119- ```
120-
12179### Testing
12280
12381We plan to provide more comprehensive test results in the future. In this blog post we want to report that the approach works in principle, and that the EESSI stack can pick up and use the custom OpenMPI build and extract
12482performance from the host interconnect **without the need to rebuild any software packages**.
12583
126- **1- Test using OSU-Micro-Benchmarks on 2-nodes (x86_64 AMD-CPUs)**:
84+ **1- Test using OSU-Micro-Benchmarks from EESSI on 2-nodes (x86_64 AMD-CPUs)**:
12785```
12886Environment set up to use EESSI (2023.06), have fun!
87+
12988hostname:
13089x1001c6s2b0n1
13190x1001c6s3b0n0
@@ -207,7 +166,7 @@ Currently Loaded Modules:
2071662097152 90.79
208167```
209168
210- **2- Test using OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0 on 2-nodes (Grace/Hopper GPUs)**:
169+ **2- Test using OSU-Micro-Benchmarks/7.5-gompi-2023b-CUDA-12.4.0 from EESSI on 2-nodes/2-GPUs (Grace/Hopper GPUs)**:
211170```
212171Environment set up to use EESSI (2023.06), have fun!
213172
@@ -297,6 +256,79 @@ Currently Loaded Modules:
2972562097152 93.98
2982574194304 180.14
299258```
300- ## Conclusion
301259
260+ **3- Test using OSU-Micro-Benchmarks/7.5 with PrgEnv-cray on 2-nodes/2-GPUs (Grace/Hopper GPUs)**:
261+ ```
262+
263+ hostname:
264+ x1000c4s4b1n0
265+ x1000c5s3b0n0
266+
267+ CPU info:
268+ Vendor ID: ARM
269+
270+ Currently Loaded Modules:
271+ 1) craype-arm-grace 8) craype/2.7.34
272+ 2) libfabric/1.22.0 9) cray-dsmml/0.3.1
273+ 3) craype-network-ofi 10) cray-mpich/8.1.32
274+ 4) perftools-base/25.03.0 11) cray-libsci/25.03.0
275+ 5) xpmem/2.11.3-1.3_gdbda01a1eb3d 12) PrgEnv-cray/8.6.0
276+ 6) cce/19.0.0 13) cudatoolkit/24.11_12.6
277+
278+ # OSU MPI-CUDA Bi-Directional Bandwidth Test v7.5
279+ # Datatype: MPI_CHAR.
280+ # Size Bandwidth (MB/s)
281+ 1 1.06
282+ 2 2.17
283+ 4 4.40
284+ 8 8.80
285+ 16 17.64
286+ 32 35.17
287+ 64 70.55
288+ 128 140.91
289+ 256 281.22
290+ 512 559.04
291+ 1024 1114.45
292+ 2048 2081.25
293+ 4096 4068.64
294+ 8192 1852.11
295+ 16384 18564.47
296+ 32768 22647.40
297+ 65536 33108.03
298+ 131072 39553.95
299+ 262144 43140.01
300+ 524288 44853.40
301+ 1048576 45761.69
302+ 2097152 46228.10
303+ 4194304 46470.29
304+
305+ # OSU MPI-CUDA Latency Test v7.5
306+ # Datatype: MPI_CHAR.
307+ # Size Avg Latency(us)
308+ 1 2.76
309+ 2 2.72
310+ 4 2.90
311+ 8 2.86
312+ 16 2.85
313+ 32 2.73
314+ 64 2.60
315+ 128 3.41
316+ 256 4.17
317+ 512 4.19
318+ 1024 4.29
319+ 2048 4.44
320+ 4096 4.66
321+ 8192 7.59
322+ 16384 8.17
323+ 32768 8.44
324+ 65536 9.92
325+ 131072 12.59
326+ 262144 18.07
327+ 524288 29.00
328+ 1048576 50.64
329+ 2097152 94.06
330+ 4194304 180.44
331+ ```
332+
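To put the EESSI and PrgEnv-cray runs side by side, the OSU tables can be parsed and compared per message size. The following is a minimal Python sketch, not part of the original test setup: the helper names `parse_osu_table` and `relative_difference` are hypothetical, and it assumes the last two rows shown for the EESSI Grace/Hopper run are the tail of its latency table.

```python
def parse_osu_table(text):
    """Return {message_size: value} from the numeric rows of one OSU table."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only "<size> <value>" rows; skip headers, comments and blanks.
        if len(parts) != 2 or line.lstrip().startswith("#"):
            continue
        try:
            rows[int(parts[0])] = float(parts[1])
        except ValueError:
            continue
    return rows


def relative_difference(candidate, baseline):
    """Percent difference of candidate vs. baseline for the sizes both share."""
    return {
        size: 100.0 * (candidate[size] - baseline[size]) / baseline[size]
        for size in sorted(candidate.keys() & baseline.keys())
    }


# Largest-message rows copied from the two Grace/Hopper runs above
# (assumed to be latency in microseconds for both runs).
eessi_latency = parse_osu_table("""
2097152 93.98
4194304 180.14
""")
cray_latency = parse_osu_table("""
# Size Avg Latency(us)
2097152 94.06
4194304 180.44
""")

for size, pct in relative_difference(eessi_latency, cray_latency).items():
    print(f"{size:>8} bytes: EESSI latency differs from PrgEnv-cray by {pct:+.2f}%")
```

For these two message sizes the difference is well under 1%. Feeding the full output of each run through `parse_osu_table` would extend the comparison to every message size.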
333+ ## Conclusion
302334The approach demonstrates EESSI's flexibility in accommodating specialized hardware requirements while preserving the benefits of a standardized software stack! There is plenty more testing to do, but the signs at this stage are very good!