Commit f4de38e

Detect MPI with Singularity.
Uses a wrapper script to detect environment variables added by an MPI launcher such as mpirun or srun, and exports them as SINGULARITYENV_$KEY=$VALUE. Updates the MpiConfig of the MPIRequirement extension with the shared memory directory and a flag to enable or disable shared memory with Singularity (on by default). When enabled, a writable volume is mapped for that directory (default /dev/shm).
1 parent c3028e3 commit f4de38e

6 files changed

Lines changed: 395 additions & 46 deletions

README.rst

Lines changed: 40 additions & 35 deletions
@@ -663,41 +663,46 @@ The host-specific parameters are configured in a simple YAML file
 (specified with the ``--mpi-config-file`` flag). The allowed keys are
 given in the following table; all are optional.
 
-+----------------+------------------+----------+------------------------------+
-| Key            | Type             | Default  | Description                  |
-+================+==================+==========+==============================+
-| runner         | str              | "mpirun" | The primary command to use.  |
-+----------------+------------------+----------+------------------------------+
-| nproc_flag     | str              | "-n"     | Flag to set number of        |
-|                |                  |          | processes to start.          |
-+----------------+------------------+----------+------------------------------+
-| default_nproc  | int              | 1        | Default number of processes. |
-+----------------+------------------+----------+------------------------------+
-| extra_flags    | List[str]        | []       | A list of any other flags to |
-|                |                  |          | be added to the runner's     |
-|                |                  |          | command line before          |
-|                |                  |          | the ``baseCommand``.         |
-+----------------+------------------+----------+------------------------------+
-| env_pass       | List[str]        | []       | A list of environment        |
-|                |                  |          | variables that should be     |
-|                |                  |          | passed from the host         |
-|                |                  |          | environment through to the   |
-|                |                  |          | tool (e.g., giving the       |
-|                |                  |          | node list as set by your     |
-|                |                  |          | scheduler).                  |
-+----------------+------------------+----------+------------------------------+
-| env_pass_regex | List[str]        | []       | A list of python regular     |
-|                |                  |          | expressions that will be     |
-|                |                  |          | matched against the host's   |
-|                |                  |          | environment. Those that match|
-|                |                  |          | will be passed through.      |
-+----------------+------------------+----------+------------------------------+
-| env_set        | Mapping[str,str] | {}       | A dictionary whose keys are  |
-|                |                  |          | the environment variables set|
-|                |                  |          | and the values being the     |
-|                |                  |          | values.                      |
-+----------------+------------------+----------+------------------------------+
-
++----------------+------------------+------------+------------------------------+
+| Key            | Type             | Default    | Description                  |
++================+==================+============+==============================+
+| runner         | str              | "mpirun"   | The primary command to use.  |
++----------------+------------------+------------+------------------------------+
+| nproc_flag     | str              | "-n"       | Flag to set number of        |
+|                |                  |            | processes to start.          |
++----------------+------------------+------------+------------------------------+
+| default_nproc  | int              | 1          | Default number of processes. |
++----------------+------------------+------------+------------------------------+
+| extra_flags    | List[str]        | []         | A list of any other flags to |
+|                |                  |            | be added to the runner's     |
+|                |                  |            | command line before          |
+|                |                  |            | the ``baseCommand``.         |
++----------------+------------------+------------+------------------------------+
+| env_pass       | List[str]        | []         | A list of environment        |
+|                |                  |            | variables that should be     |
+|                |                  |            | passed from the host         |
+|                |                  |            | environment through to the   |
+|                |                  |            | tool (e.g., giving the       |
+|                |                  |            | node list as set by your     |
+|                |                  |            | scheduler).                  |
++----------------+------------------+------------+------------------------------+
+| env_pass_regex | List[str]        | []         | A list of python regular     |
+|                |                  |            | expressions that will be     |
+|                |                  |            | matched against the host's   |
+|                |                  |            | environment. Those that match|
+|                |                  |            | will be passed through.      |
++----------------+------------------+------------+------------------------------+
+| env_set        | Mapping[str,str] | {}         | A dictionary whose keys are  |
+|                |                  |            | the environment variables set|
+|                |                  |            | and the values being the     |
+|                |                  |            | values.                      |
++----------------+------------------+------------+------------------------------+
+| shm_enabled    | bool             | True       | Flag to control whether      |
+|                |                  |            | shared memory is used or not.|
++----------------+------------------+------------+------------------------------+
+| shm_dir        | str              | "/dev/shm" | Location to use for shared   |
+|                |                  |            | memory.                      |
++----------------+------------------+------------+------------------------------+
 
 Enabling Fast Parser (experimental)
 ===================================
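
For illustration, a host configuration that sets the two new keys could be generated as follows. This is a minimal sketch: the helper `render_flat_yaml`, the file name `mpi-config.yml`, and the example values are assumptions and not part of the commit; only the key names and their defaults come from the table in the hunk above.

```python
import json
import tempfile
from pathlib import Path


def render_flat_yaml(settings: dict) -> str:
    """Render a flat mapping of scalars as YAML text (hypothetical helper)."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, str):
            # A JSON string literal is also a valid YAML double-quoted scalar.
            rendered = json.dumps(value)
        else:
            rendered = str(value)
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines) + "\n"


# Example values are assumptions; every key is optional according to the table.
settings = {
    "runner": "srun",
    "nproc_flag": "-n",
    "default_nproc": 1,
    "shm_enabled": True,
    "shm_dir": "/dev/shm",
}

config_text = render_flat_yaml(settings)
config_path = Path(tempfile.mkdtemp()) / "mpi-config.yml"
config_path.write_text(config_text)
print(config_text)
```

The resulting file would then be passed with the ``--mpi-config-file`` flag described in the README text.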

cwltool/mpi.py

Lines changed: 9 additions & 0 deletions
@@ -23,6 +23,8 @@ def __init__(
         env_pass: list[str] | None = None,
         env_pass_regex: list[str] | None = None,
         env_set: Mapping[str, str] | None = None,
+        shm_enabled: bool = True,
+        shm_dir: str = "/dev/shm",  # nosec B108 - required for MPI/shared memory in containers
     ) -> None:
         """
         Initialize from the argument mapping.
@@ -35,6 +37,8 @@ def __init__(
         env_pass: []
         env_pass_regex: []
         env_set: {}
+        shm_enabled: True
+        shm_dir: "/dev/shm"
 
         Any unknown keys will result in an exception.
         """
@@ -45,6 +49,11 @@ def __init__(
         self.env_pass = env_pass or []
         self.env_pass_regex = env_pass_regex or []
         self.env_set = env_set or {}
+        self.shm_enabled = shm_enabled
+        # POSIX only contains functions to handle shared memory, but it does not
+        # specify the directory to be used, nor if a directory needs to be used
+        # at all -- ref: https://pubs.opengroup.org/onlinepubs/9699919799/
+        self.shm_dir = shm_dir
 
     @classmethod
     def load(cls: type[MpiConfigT], config_file_name: str) -> MpiConfigT:
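
The effect of the new constructor defaults can be sketched with a reduced stand-in class. `MpiConfigSketch` and `from_mapping` are hypothetical names, and the assumption that a loaded mapping is expanded into keyword arguments is not shown in this diff; it is simply one way to get the behavior the docstring describes ("Any unknown keys will result in an exception").

```python
from collections.abc import Mapping
from typing import Any


class MpiConfigSketch:
    """Hypothetical stand-in for MpiConfig, reduced to the two new options."""

    def __init__(self, shm_enabled: bool = True, shm_dir: str = "/dev/shm") -> None:
        self.shm_enabled = shm_enabled
        self.shm_dir = shm_dir

    @classmethod
    def from_mapping(cls, data: Mapping[str, Any]) -> "MpiConfigSketch":
        # Keyword expansion makes any unknown key fail with TypeError.
        return cls(**data)


# An empty mapping keeps the defaults: shared memory on, /dev/shm.
defaults = MpiConfigSketch.from_mapping({})

# A site could switch shared memory off or point it at another directory.
custom = MpiConfigSketch.from_mapping({"shm_enabled": False, "shm_dir": "/tmp/shm"})

try:
    MpiConfigSketch.from_mapping({"shm_path": "/dev/shm"})  # misspelled key
    unknown_key_rejected = False
except TypeError:
    unknown_key_rejected = True

print(defaults.shm_enabled, defaults.shm_dir, unknown_key_rejected)
```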

cwltool/singularity.py

Lines changed: 56 additions & 10 deletions
@@ -1,5 +1,6 @@
 """Support for executing Docker format containers using Singularity {2,3}.x or Apptainer 1.x."""
 
+import atexit
 import copy
 import hashlib
 import json
@@ -11,7 +12,10 @@
 import sys
 import threading
 from collections.abc import Callable, MutableMapping, MutableSequence
+from contextlib import suppress
+from importlib.resources import files as resource_files
 from subprocess import check_call, check_output, run  # nosec
+from tempfile import NamedTemporaryFile
 from typing import cast
 
 from cwl_utils.types import CWLDirectoryType, CWLFileType, CWLObjectType
@@ -29,6 +33,7 @@
 from .errors import WorkflowException
 from .job import ContainerCommandLineJob
 from .loghandler import _logger
+from .mpi import MPIRequirementName
 from .pathmapper import MapperEnt, PathMapper
 from .singularity_utils import singularity_supports_userns
 from .utils import create_tmp_dir, ensure_non_writable, ensure_writable
@@ -203,7 +208,7 @@ def __init__(
         hints: list[CWLObjectType],
         name: str,
     ) -> None:
-        """Builder for invoking the Singularty software container engine."""
+        """Builder for invoking the Singularity software container engine."""
         super().__init__(builder, joborder, make_path_mapper, requirements, hints, name)
 
     @staticmethod
@@ -592,14 +597,55 @@ def create_runtime(
         """Return the Singularity runtime list of commands and options."""
         any_path_okay = self.builder.get_requirement("DockerRequirement")[1] or False
 
-        runtime = [
-            "singularity",
-            "--quiet",
-            "run" if (is_apptainer_1_1_or_newer() or is_version_3_10_or_newer()) else "exec",
-            "--contain",
-            "--ipc",
-            "--cleanenv",
-        ]
+        mpi_req, is_req = self.builder.get_requirement(MPIRequirementName)
+        mpi_enabled = mpi_req and is_req
+        mpi_config = runtime_context.mpi_config
+        mpi_env_vars_reference_file_name: str | None = None
+        runtime: list[str] = []
+        if mpi_enabled:
+            # Save current environment variables. The ``singularity_wrapper.sh`` will
+            # diff it against the env vars produced by mpirun/srun/etc., and use the new
+            # env vars as SINGULARITYENV_... for Singularity.
+            with NamedTemporaryFile(mode="w+", delete=False) as f:
+                for k, v in os.environ.items():
+                    f.write(f"{k}={v}\n")
+                mpi_env_vars_reference_file_name = f.name
+
+            def delete_mpi_baseline_env() -> None:
+                """Clean up the MPI baseline environment variables file at exit."""
+                with suppress(FileNotFoundError):  # pragma: no cover
+                    os.remove(mpi_env_vars_reference_file_name)  # pragma: no cover
+
+            atexit.register(delete_mpi_baseline_env)
+
+            runtime.extend(
+                [
+                    str(resource_files("cwltool") / "singularity_wrapper.sh"),
+                    mpi_env_vars_reference_file_name,
+                    "singularity",
+                ]
+            )
+        else:
+            runtime.append("singularity")
+
+        runtime.extend(
+            [
+                "--quiet",
+                "run" if (is_apptainer_1_1_or_newer() or is_version_3_10_or_newer()) else "exec",
+                "--contain",
+                "--ipc",
+                "--cleanenv",
+            ]
+        )
+        if mpi_enabled and mpi_config.shm_enabled:
+            # MPI implementations like OpenMPI and MPICH use shared memory.
+            self.append_volume(
+                runtime,
+                runtime_context.create_tmpdir(),
+                mpi_config.shm_dir,
+                writable=True,
+            )
+
         if is_apptainer_1_1_or_newer() or is_version_3_10_or_newer():
             runtime.append("--no-eval")
 
@@ -665,4 +711,4 @@ def create_runtime(
         if container_HOME:
             # Restore HOME if we removed it above.
             self.environment["HOME"] = container_HOME
-        return (runtime, None)
+        return runtime, None
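
The command-line assembly in the ``create_runtime`` hunk can be modeled outside cwltool roughly as below. This is a sketch under stated assumptions: `write_baseline_env`, `build_runtime`, and the wrapper path are hypothetical, the version check is reduced to a `use_run` flag, and the shared-memory bind is left out because the argument format produced by `append_volume` is not shown in this diff.

```python
from __future__ import annotations

import atexit
import os
from contextlib import suppress
from tempfile import NamedTemporaryFile


def write_baseline_env(environ: dict[str, str]) -> str:
    """Write KEY=VALUE lines to a temporary file that is removed at exit."""
    with NamedTemporaryFile(mode="w", delete=False) as handle:
        for key, value in environ.items():
            handle.write(f"{key}={value}\n")
        path = handle.name

    def cleanup() -> None:
        # The file may already be gone; that is not an error.
        with suppress(FileNotFoundError):
            os.remove(path)

    atexit.register(cleanup)
    return path


def build_runtime(
    mpi_enabled: bool, wrapper: str, baseline: str | None, use_run: bool
) -> list[str]:
    """Assemble the start of the command line, with or without the wrapper."""
    runtime: list[str] = []
    if mpi_enabled and baseline is not None:
        # The wrapper receives the baseline file and the real binary name.
        runtime.extend([wrapper, baseline, "singularity"])
    else:
        runtime.append("singularity")
    runtime.extend(
        ["--quiet", "run" if use_run else "exec", "--contain", "--ipc", "--cleanenv"]
    )
    return runtime


WRAPPER = "/opt/example/singularity_wrapper.sh"  # hypothetical install path
baseline_path = write_baseline_env({"HOME": "/home/user", "PATH": "/usr/bin"})
with_mpi = build_runtime(True, WRAPPER, baseline_path, use_run=True)
without_mpi = build_runtime(False, WRAPPER, None, use_run=True)
print(with_mpi)
print(without_mpi)
```

Both variants share the same Singularity flags; only the leading elements differ, which is why the diff builds the prefix first and extends the list afterwards.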

cwltool/singularity_wrapper.sh

Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# singularity_wrapper.sh
+#
+# DESCRIPTION
+#   Wrapper around Singularity/Apptainer for CWL + MPI + Singularity.
+#
+#   This script identifies environment variables added by an MPI launcher
+#   (e.g. srun, mpirun) and adds these environment variables as Singularity
+#   environment variables using the format ``SINGULARITYENV_$KEY=$VALUE``.
+#
+#   This allows CWL (which uses ``--cleanenv``) to launch MPI + Singularity.
+#
+# USAGE
+#   singularity_wrapper.sh <baseline-env-file> <singularity-bin> [args...]
+#
+# ARGUMENTS
+#   <baseline-env-file>
+#     Path to the file containing KEY=VALUE pairs with the baseline env.
+#
+#   <singularity-bin>
+#     Path to singularity/apptainer executable.
+#
+#   [args...]
+#     Arguments passed to the singularity binary.
+#
+# EXAMPLE
+#   singularity_wrapper.sh env.txt singularity --cleanenv exec image.sif
+#
+# DEPENDENCIES
+#   It uses the following binaries:
+#   - printenv
+
+usage() {
+    cat >&2 <<EOF
+singularity_wrapper.sh
+
+Wrapper around Singularity/Apptainer for CWL + MPI + Singularity.
+
+USAGE:
+singularity_wrapper.sh <baseline-env-file> <singularity-bin> [args...]
+EOF
+    exit 1
+}
+
+if [[ "${1:-}" == "--help" ]]; then
+    usage
+fi
+
+[[ $# -ge 2 ]] || usage
+
+BASELINE_FILE="$1"
+SINGULARITY_BIN="$2"
+shift 2
+
+if [[ ! -f "$BASELINE_FILE" ]]; then
+    echo "Error: baseline env file not found: $BASELINE_FILE" >&2
+    exit 2
+fi
+
+# Read baseline env into a variable.
+BASELINE_CONTENT=$'\n'"$(cat "$BASELINE_FILE")"$'\n'
+
+# Build new environment variables for Singularity (i.e. ``SINGULARITYENV_KEY=VALUE``).
+# Skips entries with an empty name and names that are not valid POSIX identifiers (e.g.
+# ``BASH_FUNC_module%%=``, seen in some Bash environments on HPC clusters such as BSC MareNostrum5).
+while IFS='=' read -r k v; do
+    [[ -n "$k" ]] || continue
+    [[ "$k" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue
+    # If the current variable is not present in the baseline env (``BASELINE_CONTENT``),
+    # export it as ``SINGULARITYENV_$k`` so that Singularity passes it into the container.
+    # The check looks for the key in the BASELINE_CONTENT string in the form
+    # \n$KEY= (that's why BASELINE_CONTENT starts and ends with \n).
+    if [[ ! "$BASELINE_CONTENT" == *$'\n'"$k"=* ]]; then
+        # Debug
+        # echo "Adding env var for Singularity command: SINGULARITYENV_$k=$v" >&2
+        export "SINGULARITYENV_$k=$v"
+    fi
+done < <(printenv)
+
+# Launch the Singularity binary.
+exec "$SINGULARITY_BIN" "${@}"
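
The selection logic of the wrapper can be modeled in Python to make its behavior easier to reason about. This is a sketch, not the wrapper itself: `launcher_added_vars` is a hypothetical name, the current environment is passed as a dict instead of being read from `printenv`, and the variable names in the example are illustrative values only.

```python
import re

# Same shape as the name check in the shell script: POSIX-style identifiers only.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def launcher_added_vars(baseline_text: str, current_env: dict) -> dict:
    """Return SINGULARITYENV_-prefixed copies of variables absent from the baseline."""
    # Wrap the baseline in newlines so "\nKEY=" only matches at the start of a line.
    wrapped = "\n" + baseline_text.rstrip("\n") + "\n"
    exported = {}
    for key, value in current_env.items():
        if not NAME_PATTERN.fullmatch(key):
            continue  # e.g. exported Bash functions such as BASH_FUNC_module%%
        if f"\n{key}=" not in wrapped:
            exported[f"SINGULARITYENV_{key}"] = value
    return exported


baseline = "HOME=/home/user\nPATH=/usr/bin\nMYPATH=/opt\n"
current = {
    "HOME": "/home/user",
    "PATH": "/usr/local/bin:/usr/bin",  # changed by the launcher, but already known
    "OMPI_COMM_WORLD_RANK": "0",        # added by the launcher
    "BASH_FUNC_module%%": "() { :; }",  # not a valid identifier, skipped
}

result = launcher_added_vars(baseline, current)
print(result)
```

Note that the comparison is by name only: in this model, as in the script, a variable that already exists in the baseline is not forwarded even if the launcher changed its value.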
