Commit 8863022

docs: document Intel MPI multi-node SSH bootstrap workaround for missing bundled ssh
1 parent 0534d69 commit 8863022

1 file changed

Lines changed: 35 additions & 0 deletions

docs/documentation/intel-gpu-max.md

@@ -253,6 +253,41 @@ To run a case anyway (testing code correctness on CPU fallback), invoke
`pre_process` and `simulation` directly from their install paths, bypassing
the `./mfc.sh run` wrapper that calls `syscheck` first.

### Multi-node MPI with Intel MPI 2021.x

Intel MPI 2021.x expects a bundled `ssh` binary at `$I_MPI_ROOT/bin/ssh` that
understands the `--external-launcher` flag used by the hydra bootstrap. Some
oneAPI installations do not ship this binary, so the flag reaches the system
`ssh` instead and the SSH bootstrap fails with `unknown option -- -`.

Workaround: create a wrapper that strips the Intel-specific flag:

```bash
mkdir -p ~/bin
cat > ~/bin/ssh << 'EOF'
#!/bin/bash
# Options always passed to the real ssh: quiet, strict host keys, no prompts.
args=(-q -o StrictHostKeyChecking=yes -o BatchMode=yes)
for arg in "$@"; do
    # Drop the Intel-specific flag that the system ssh rejects.
    [[ "$arg" == "--external-launcher" ]] && continue
    # Stop forwarding arguments at a bare "--".
    [[ "$arg" == "--" ]] && break
    args+=("$arg")
done
exec /usr/bin/ssh "${args[@]}"
EOF
chmod +x ~/bin/ssh
```

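To see what the wrapper forwards without opening a connection, the filtering loop can be exercised on its own. The following is a minimal sketch: `filter_args` is a hypothetical helper name, and the sample arguments are made up for illustration rather than taken from a real hydra invocation.

```bash
#!/bin/bash
# Hypothetical helper: same filtering loop as the wrapper, but it prints the
# resulting argument list instead of exec'ing /usr/bin/ssh.
filter_args() {
    local args=(-q -o StrictHostKeyChecking=yes -o BatchMode=yes)
    local arg
    for arg in "$@"; do
        [[ "$arg" == "--external-launcher" ]] && continue
        [[ "$arg" == "--" ]] && break
        args+=("$arg")
    done
    printf '%s\n' "${args[@]}"
}

# Illustrative arguments only; the real launcher may pass different ones.
filter_args --external-launcher -x node1 hostname
```

This prints the five fixed option tokens followed by `-x`, `node1`, and `hostname`, one per line. Note that the loop stops at a bare `--`, so anything after it is not forwarded.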
Then run with:
```bash
PATH=$HOME/bin:$PATH \
I_MPI_HYDRA_BOOTSTRAP=rsh \
I_MPI_HYDRA_BOOTSTRAP_EXEC=$HOME/bin/ssh \
mpirun -n <ranks> -hosts <node1>,<node2> ./simulation
```

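For repeated runs it may be convenient to export the same three settings once per shell session. The sketch below assumes the wrapper lives at `$HOME/bin/ssh` unless another path is given; `use_ssh_wrapper` is a hypothetical function name, not part of MFC or Intel MPI.

```bash
#!/bin/bash
# Hypothetical helper: export the settings from the one-line invocation above,
# but only if the wrapper actually exists and is executable.
use_ssh_wrapper() {
    local wrapper="${1:-$HOME/bin/ssh}"
    if [[ ! -x "$wrapper" ]]; then
        echo "wrapper not found or not executable: $wrapper" >&2
        return 1
    fi
    # Put the wrapper's directory first so a plain `ssh` also resolves to it.
    export PATH="$(dirname "$wrapper"):$PATH"
    export I_MPI_HYDRA_BOOTSTRAP=rsh
    export I_MPI_HYDRA_BOOTSTRAP_EXEC="$wrapper"
}
```

After sourcing this in the launch shell, `use_ssh_wrapper && mpirun ...` runs the job only when the wrapper is in place, which avoids a confusing bootstrap failure if `~/bin/ssh` was never created.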
Every node must accept passwordless SSH from the launch node, and
`pam_slurm_adopt` must not block the connection (it rejects SSH logins from
users who have no job running on that node). Suppress the login message on
remote nodes with `touch ~/.hushlogin`.

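The SSH prerequisites can be checked before launching a job. This is a minimal pre-flight sketch: `check_nodes` and the `SSH_CMD` variable are hypothetical names, and the 5-second timeout is an arbitrary choice. It relies only on the standard OpenSSH options `BatchMode` and `ConnectTimeout`.

```bash
#!/bin/bash
# Hypothetical pre-flight check. SSH_CMD can be overridden, e.g. to point at
# the wrapper in ~/bin/ssh instead of the system client.
SSH_CMD="${SSH_CMD:-/usr/bin/ssh}"

check_nodes() {
    local node failed=0
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password,
        # so a node without key-based login is reported rather than hanging.
        if "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=5 "$node" true \
            </dev/null >/dev/null 2>&1; then
            echo "ok: $node"
        else
            echo "FAILED: $node"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling `check_nodes` with the same host names passed to `-hosts` reports each node on its own line and returns nonzero if any login fails, whether the cause is a missing key or a PAM restriction.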
### `libumf.so.1` not found at runtime
The 2026.0 Level Zero and OpenCL UR adapters link against `libumf.so.1`.
If not in `LD_LIBRARY_PATH`, all adapters fail silently and sycl-ls reports
