Commit 24a1c01

debug: traces for compute engine
This shows strace for the inside and outside of the container. Signed-off-by: vsoch <vsoch@users.noreply.github.com>
1 parent 93cb53b commit 24a1c01

6 files changed

Lines changed: 174965 additions & 30 deletions

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
# Debugging

These traces look at running the OSU `osu_allreduce` benchmark with Flux (as the flux user) and with Singularity.
I traced it with two sets of `strace` flags:

- `strace -f`
- `strace -f -s 128`

I ran each from outside the container (tracing `flux run`) and from within it (tracing the benchmark binary).
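
The four traces below are the cross product of those two choices: two sets of flags, two placements of `strace`. As a minimal sketch of that layout, the following prints each command without running it. The `trace_cmd` helper and its argument names are hypothetical and not part of this repository. For context, `strace` truncates printed strings at 32 characters by default, which is what `-s 128` raises.

```bash
#!/bin/bash
# Hypothetical helper: print (do not run) the traced command for one variant.
#   $1: "outside" (strace wraps flux run) or "inside" (strace wraps the benchmark)
#   $2: strace flags, e.g. "-f" or "-f -s 128"
trace_cmd() {
  local where="$1" flags="$2"
  local flux='flux run -opmi=pmix --env OMPI_COMM_WORLD_LOCAL_RANK=0 -N 1 -n 8 -g 1 -o cpu-affinity=per-task -o gpu-affinity=per-task'
  local sing='singularity exec --nv --bind /usr/local/cuda /opt/containers/metric-osu-gpu_google-gpu.sif'
  local osu='/opt/osu-benchmark/build.openmpi/mpi/collective/osu_allreduce -d cuda H H'
  case "$where" in
    outside) printf 'strace %s %s %s /bin/bash -c "%s"\n' "$flags" "$flux" "$sing" "$osu" ;;
    inside)  printf '%s %s /bin/bash -c "strace %s %s"\n' "$flux" "$sing" "$flags" "$osu" ;;
    *) echo "unknown placement: $where" >&2; return 1 ;;
  esac
}

# Print all four variants: two placements times two flag sets.
for where in outside inside; do
  for flags in "-f" "-f -s 128"; do
    trace_cmd "$where" "$flags"
  done
done
```

In every variant the trace is written to stderr, so appending `2> <name>.txt` to the printed command captures it.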
## Traces

### [flux-singularity-trace-f.txt](flux-singularity-trace-f.txt)

```bash
strace -f flux run -opmi=pmix --env OMPI_COMM_WORLD_LOCAL_RANK=0 -N 1 -n 8 -g 1 -o cpu-affinity=per-task -o gpu-affinity=per-task singularity exec --nv --bind /usr/local/cuda /opt/containers/metric-osu-gpu_google-gpu.sif /bin/bash -c "/opt/osu-benchmark/build.openmpi/mpi/collective/osu_allreduce -d cuda H H" 2> flux-singularity-trace-f.txt
```
### [flux-singularity-trace-s-f.txt](flux-singularity-trace-s-f.txt)

```bash
strace -f -s 128 flux run -opmi=pmix --env OMPI_COMM_WORLD_LOCAL_RANK=0 -N 1 -n 8 -g 1 -o cpu-affinity=per-task -o gpu-affinity=per-task singularity exec --nv --bind /usr/local/cuda /opt/containers/metric-osu-gpu_google-gpu.sif /bin/bash -c "/opt/osu-benchmark/build.openmpi/mpi/collective/osu_allreduce -d cuda H H" 2> flux-singularity-trace-s-f.txt
```
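
One way to see what `-s 128` buys is to count how many trace lines still contain a truncated string. `strace` marks a truncated string with `...` immediately after the closing quote. The sketch below is a heuristic under that assumption, and `count_truncated` is a hypothetical helper name; a string whose literal content begins with `...` would also match.

```bash
#!/bin/bash
# Hypothetical helper: count lines in a trace file that contain a string
# argument followed by "..." (strace's truncation marker).
count_truncated() {
  # grep -c exits non-zero when the count is 0; "|| true" keeps that from
  # being treated as a failure.
  grep -c '"\.\.\.' "$1" || true
}

# Compare the two outside-the-container traces if they are present.
for f in flux-singularity-trace-f.txt flux-singularity-trace-s-f.txt; do
  if [ -f "$f" ]; then
    printf '%s\t%s\n' "$(count_truncated "$f")" "$f"
  fi
done
```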
### [flux-singularity-inside-container-trace-f.txt](flux-singularity-inside-container-trace-f.txt)

```bash
flux run -opmi=pmix --env OMPI_COMM_WORLD_LOCAL_RANK=0 -N 1 -n 8 -g 1 -o cpu-affinity=per-task -o gpu-affinity=per-task singularity exec --nv --bind /usr/local/cuda /opt/containers/metric-osu-gpu_google-gpu.sif /bin/bash -c "strace -f /opt/osu-benchmark/build.openmpi/mpi/collective/osu_allreduce -d cuda H H" 2> flux-singularity-inside-container-trace-f.txt
```
### [flux-singularity-inside-container-trace-s-f.txt](flux-singularity-inside-container-trace-s-f.txt)

```bash
flux run -opmi=pmix --env OMPI_COMM_WORLD_LOCAL_RANK=0 -N 1 -n 8 -g 1 -o cpu-affinity=per-task -o gpu-affinity=per-task singularity exec --nv --bind /usr/local/cuda /opt/containers/metric-osu-gpu_google-gpu.sif /bin/bash -c "strace -f -s 128 /opt/osu-benchmark/build.openmpi/mpi/collective/osu_allreduce -d cuda H H" 2> flux-singularity-inside-container-trace-s-f.txt
```
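
The trace files are large, so a quick summary can help decide where to look first. The sketch below assumes `strace`'s default stderr format, where `-f` prefixes lines from child processes with `[pid N]`. Because `2>` captures all of stderr, the files may also contain non-`strace` output, which these helpers skip. The function names `syscall_counts` and `failed_calls` are hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: count each syscall name in a trace, most frequent first.
# Lines that do not start with "name(" after the optional [pid N] prefix
# (resumed calls, signals, exit notices, program output) are ignored.
syscall_counts() {
  sed -E 's/^\[pid +[0-9]+\] //' "$1" \
    | sed -nE 's/^([a-z_][a-z0-9_]*)\(.*/\1/p' \
    | sort | uniq -c | sort -rn
}

# Hypothetical helper: print lines whose call returned -1 with an errno name.
failed_calls() {
  grep -E '= -1 E[A-Z0-9]+' "$1" || true
}
```

For example, `syscall_counts flux-singularity-inside-container-trace-f.txt | head` lists the most frequent calls, and `failed_calls` on the same file shows the calls that returned an error. Many of those errors are routine, such as `ENOENT` while the loader searches library paths.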
