Commit 7797b80

Merge pull request #15 from simongdg/main
removed MPI Warmup so that the NVSHMEM_DISABLE_CUDA_VMM=1 and extra i…
2 parents dc2daac + e746f2e commit 7797b80

12 files changed

Lines changed: 9 additions & 72 deletions

08-H_NCCL_NVSHMEM/.master/NVSHMEM/Instructions.ipynb

Lines changed: 0 additions & 7 deletions
@@ -64,13 +64,6 @@
 "GPUs listed. This is automatically done for the `sanitize`, `run` and\n",
 "`profile` make targets.\n",
 "\n",
-"`NVSHMEM_DISABLE_CUDA_VMM=1` is set for the `sanitize`, `run` and\n",
-"`profile` make targets. This is done to hide warnings and errors\n",
-"appearing only for NVSHMEM version 2.5.0 to be fixed in the next\n",
-"release. You might still see cuMemFree, symmetric heap, and\n",
-"`nvshmem_finalize` errors at the end of the program execution, depending\n",
-"on the system used to run the program. You may ignore these errors for\n",
-"now."
 ],
 "id": "e1e833fa-16a5-4510-8e6c-7d6abd44310e"
 }

08-H_NCCL_NVSHMEM/.master/NVSHMEM/Instructions.md

Lines changed: 0 additions & 1 deletion
@@ -42,4 +42,3 @@ Study the performance by glimpsing at the profile generated with
 
 The Slurm installation on JUWELS-Booster sets `CUDA_VISIBLE_DEVICES` automatically so that each spawned process only sees the GPU it should use (see [GPU Devices](https://apps.fz-juelich.de/jsc/hps/juwels/booster-overview.html#gpu-devices) in the JUWELS Booster Overview documentation). This is not supported for NVSHMEM. The automatic setting of `CUDA_VISIBLE_DEVICES` can be disabled by setting `CUDA_VISIBLE_DEVICES=0,1,2,3` in the shell that executes srun. With `CUDA_VISIBLE_DEVICES` set all spawned processes can see all GPUs listed. This is automatically done for the `sanitize`, `run` and `profile` make targets.
 
-`NVSHMEM_DISABLE_CUDA_VMM=1` is set for the `sanitize`, `run` and `profile` make targets. This is done to hide warnings and errors appearing only for NVSHMEM version 2.5.0 to be fixed in the next release. You might still see cuMemFree, symmetric heap, and `nvshmem_finalize` errors at the end of the program execution, depending on the system used to run the program. You may ignore these errors for now.

08-H_NCCL_NVSHMEM/.master/NVSHMEM/Makefile

Lines changed: 3 additions & 3 deletions
@@ -36,10 +36,10 @@ clean:
 	rm -f jacobi jacobi.o *.nsys-rep jacobi.*.compute-sanitizer.log
 
 sanitize: jacobi
-	NVSHMEM_DISABLE_CUDA_VMM=$(N_D_C_VMM) CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) compute-sanitizer --log-file jacobi.%q{SLURM_PROCID}.compute-sanitizer.log ./jacobi -niter 10
+	CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) compute-sanitizer --log-file jacobi.%q{SLURM_PROCID}.compute-sanitizer.log ./jacobi -niter 10
 
 run: jacobi
-	NVSHMEM_DISABLE_CUDA_VMM=$(N_D_C_VMM) CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) ./jacobi
+	CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) ./jacobi
 
 profile: jacobi
-	NVSHMEM_DISABLE_CUDA_VMM=$(N_D_C_VMM) CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) nsys profile --trace=mpi,cuda,nvtx -o jacobi.%q{SLURM_PROCID} ./jacobi -niter 10
+	CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) nsys profile --trace=mpi,cuda,nvtx -o jacobi.%q{SLURM_PROCID} ./jacobi -niter 10

08-H_NCCL_NVSHMEM/.master/NVSHMEM/jacobi.cu

Lines changed: 0 additions & 13 deletions
@@ -317,19 +317,6 @@ int main(int argc, char* argv[]) {
     real* l2_norm_h;
     CUDA_RT_CALL(cudaMallocHost(&l2_norm_h, sizeof(real)));
 
-    PUSH_RANGE("MPI_Warmup", 5)
-    for (int i = 0; i < 10; ++i) {
-        const int top = rank > 0 ? rank - 1 : (size - 1);
-        const int bottom = (rank + 1) % size;
-        MPI_CALL(MPI_Sendrecv(a_new + iy_start * nx, nx, MPI_REAL_TYPE, top, 0,
-                              a_new + (iy_end * nx), nx, MPI_REAL_TYPE, bottom, 0, MPI_COMM_WORLD,
-                              MPI_STATUS_IGNORE));
-        MPI_CALL(MPI_Sendrecv(a_new + (iy_end - 1) * nx, nx, MPI_REAL_TYPE, bottom, 0, a_new, nx,
-                              MPI_REAL_TYPE, top, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE));
-        std::swap(a_new, a);
-    }
-    POP_RANGE
-
     CUDA_RT_CALL(cudaDeviceSynchronize());
 
     if (!csv && 0 == rank) {
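
For readers unfamiliar with the removed `MPI_Warmup` loop: it exchanged boundary rows with the two neighbouring ranks, treating the ranks as a periodic ring. The following is only a minimal sketch of that neighbour computation, isolated from MPI so it can be read on its own. The `RingNeighbors` struct and the `ring_neighbors` function are hypothetical names introduced here for illustration; the original code simply uses two local `int` variables, `top` and `bottom`.

```cpp
#include <cassert>

// Hypothetical helper type (not part of jacobi.cu): holds the two
// neighbouring ranks that the removed warm-up loop communicated with.
struct RingNeighbors {
    int top;     // rank "above"; wraps around to size - 1 for rank 0
    int bottom;  // rank "below"; wraps around to 0 for the last rank
};

// Computes the periodic neighbours of `rank` in a ring of `size` ranks,
// using the same expressions as the removed loop.
inline RingNeighbors ring_neighbors(int rank, int size) {
    assert(size > 0 && rank >= 0 && rank < size);
    const int top = rank > 0 ? rank - 1 : (size - 1);
    const int bottom = (rank + 1) % size;
    return {top, bottom};
}
```

With four ranks, rank 0 has `top == 3` and `bottom == 1`, while rank 3 has `top == 2` and `bottom == 0`. With a single rank, both neighbours are the rank itself.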

08-H_NCCL_NVSHMEM/solutions/NVSHMEM/Instructions.ipynb

Lines changed: 0 additions & 7 deletions
@@ -64,13 +64,6 @@
 "GPUs listed. This is automatically done for the `sanitize`, `run` and\n",
 "`profile` make targets.\n",
 "\n",
-"`NVSHMEM_DISABLE_CUDA_VMM=1` is set for the `sanitize`, `run` and\n",
-"`profile` make targets. This is done to hide warnings and errors\n",
-"appearing only for NVSHMEM version 2.5.0 to be fixed in the next\n",
-"release. You might still see cuMemFree, symmetric heap, and\n",
-"`nvshmem_finalize` errors at the end of the program execution, depending\n",
-"on the system used to run the program. You may ignore these errors for\n",
-"now."
 ],
 "id": "e1e833fa-16a5-4510-8e6c-7d6abd44310e"
 }

08-H_NCCL_NVSHMEM/solutions/NVSHMEM/Instructions.md

Lines changed: 0 additions & 1 deletion
@@ -42,4 +42,3 @@ Study the performance by glimpsing at the profile generated with
 
 The Slurm installation on JUWELS-Booster sets `CUDA_VISIBLE_DEVICES` automatically so that each spawned process only sees the GPU it should use (see [GPU Devices](https://apps.fz-juelich.de/jsc/hps/juwels/booster-overview.html#gpu-devices) in the JUWELS Booster Overview documentation). This is not supported for NVSHMEM. The automatic setting of `CUDA_VISIBLE_DEVICES` can be disabled by setting `CUDA_VISIBLE_DEVICES=0,1,2,3` in the shell that executes srun. With `CUDA_VISIBLE_DEVICES` set all spawned processes can see all GPUs listed. This is automatically done for the `sanitize`, `run` and `profile` make targets.
 
-`NVSHMEM_DISABLE_CUDA_VMM=1` is set for the `sanitize`, `run` and `profile` make targets. This is done to hide warnings and errors appearing only for NVSHMEM version 2.5.0 to be fixed in the next release. You might still see cuMemFree, symmetric heap, and `nvshmem_finalize` errors at the end of the program execution, depending on the system used to run the program. You may ignore these errors for now.

08-H_NCCL_NVSHMEM/solutions/NVSHMEM/Makefile

Lines changed: 3 additions & 3 deletions
@@ -36,10 +36,10 @@ clean:
 	rm -f jacobi jacobi.o *.nsys-rep jacobi.*.compute-sanitizer.log
 
 sanitize: jacobi
-	NVSHMEM_DISABLE_CUDA_VMM=$(N_D_C_VMM) CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) compute-sanitizer --log-file jacobi.%q{SLURM_PROCID}.compute-sanitizer.log ./jacobi -niter 10
+	CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) compute-sanitizer --log-file jacobi.%q{SLURM_PROCID}.compute-sanitizer.log ./jacobi -niter 10
 
 run: jacobi
-	NVSHMEM_DISABLE_CUDA_VMM=$(N_D_C_VMM) CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) ./jacobi
+	CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) ./jacobi
 
 profile: jacobi
-	NVSHMEM_DISABLE_CUDA_VMM=$(N_D_C_VMM) CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) nsys profile --trace=mpi,cuda,nvtx -o jacobi.%q{SLURM_PROCID} ./jacobi -niter 10
+	CUDA_VISIBLE_DEVICES=$(C_V_D) $(JSC_SUBMIT_CMD) -n $(NP) nsys profile --trace=mpi,cuda,nvtx -o jacobi.%q{SLURM_PROCID} ./jacobi -niter 10

08-H_NCCL_NVSHMEM/solutions/NVSHMEM/jacobi.cu

Lines changed: 0 additions & 13 deletions
@@ -304,19 +304,6 @@ int main(int argc, char* argv[]) {
     real* l2_norm_h;
     CUDA_RT_CALL(cudaMallocHost(&l2_norm_h, sizeof(real)));
 
-    PUSH_RANGE("MPI_Warmup", 5)
-    for (int i = 0; i < 10; ++i) {
-        const int top = rank > 0 ? rank - 1 : (size - 1);
-        const int bottom = (rank + 1) % size;
-        MPI_CALL(MPI_Sendrecv(a_new + iy_start * nx, nx, MPI_REAL_TYPE, top, 0,
-                              a_new + (iy_end * nx), nx, MPI_REAL_TYPE, bottom, 0, MPI_COMM_WORLD,
-                              MPI_STATUS_IGNORE));
-        MPI_CALL(MPI_Sendrecv(a_new + (iy_end - 1) * nx, nx, MPI_REAL_TYPE, bottom, 0, a_new, nx,
-                              MPI_REAL_TYPE, top, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE));
-        std::swap(a_new, a);
-    }
-    POP_RANGE
-
     CUDA_RT_CALL(cudaDeviceSynchronize());
 
     if (!csv && 0 == rank) {

08-H_NCCL_NVSHMEM/tasks/NVSHMEM/Instructions.ipynb

Lines changed: 0 additions & 7 deletions
@@ -64,13 +64,6 @@
 "GPUs listed. This is automatically done for the `sanitize`, `run` and\n",
 "`profile` make targets.\n",
 "\n",
-"`NVSHMEM_DISABLE_CUDA_VMM=1` is set for the `sanitize`, `run` and\n",
-"`profile` make targets. This is done to hide warnings and errors\n",
-"appearing only for NVSHMEM version 2.5.0 to be fixed in the next\n",
-"release. You might still see cuMemFree, symmetric heap, and\n",
-"`nvshmem_finalize` errors at the end of the program execution, depending\n",
-"on the system used to run the program. You may ignore these errors for\n",
-"now."
 ],
 "id": "e1e833fa-16a5-4510-8e6c-7d6abd44310e"
 }

08-H_NCCL_NVSHMEM/tasks/NVSHMEM/Instructions.md

Lines changed: 0 additions & 1 deletion
@@ -42,4 +42,3 @@ Study the performance by glimpsing at the profile generated with
 
 The Slurm installation on JUWELS-Booster sets `CUDA_VISIBLE_DEVICES` automatically so that each spawned process only sees the GPU it should use (see [GPU Devices](https://apps.fz-juelich.de/jsc/hps/juwels/booster-overview.html#gpu-devices) in the JUWELS Booster Overview documentation). This is not supported for NVSHMEM. The automatic setting of `CUDA_VISIBLE_DEVICES` can be disabled by setting `CUDA_VISIBLE_DEVICES=0,1,2,3` in the shell that executes srun. With `CUDA_VISIBLE_DEVICES` set all spawned processes can see all GPUs listed. This is automatically done for the `sanitize`, `run` and `profile` make targets.
 
-`NVSHMEM_DISABLE_CUDA_VMM=1` is set for the `sanitize`, `run` and `profile` make targets. This is done to hide warnings and errors appearing only for NVSHMEM version 2.5.0 to be fixed in the next release. You might still see cuMemFree, symmetric heap, and `nvshmem_finalize` errors at the end of the program execution, depending on the system used to run the program. You may ignore these errors for now.
