Commit e28bb86

Add MPI linking to api_cc library with USE_MPI macro for conditional compilation

Committed by Copilot and njzjz
Co-authored-by: njzjz <9496702+njzjz@users.noreply.github.com>
1 parent e422ee2 · commit e28bb86

3 files changed: 38 additions & 9 deletions

doc/development/pytorch-profiler.md

Lines changed: 8 additions & 2 deletions

````diff
@@ -14,7 +14,11 @@ export DP_PYTORCH_PROFILER_OUTPUT_DIR=./profiler_results
 
 3. Check for profiler output in the specified directory:
 ```bash
+# For single-rank or non-MPI usage
 ls -la ./profiler_results/pytorch_profiler_trace.json
+
+# For MPI usage, each rank gets its own file
+ls -la ./profiler_results/pytorch_profiler_trace_rank*.json
 ```
 
 For MPI applications, you can use different output directories per rank:
@@ -36,11 +40,13 @@ The profiler uses PyTorch's modern `torch::profiler` API and automatically:
 - Creates the output directory if it doesn't exist
 - Profiles all forward pass operations in DeepPotPT and DeepSpinPT
 - Saves profiling results to a JSON file when the object is destroyed
+- Automatically includes MPI rank in filename when MPI is available and initialized
 
 ## Output Files
 
-- **All usage**: `pytorch_profiler_trace.json`
+- **Single-rank or non-MPI usage**: `pytorch_profiler_trace.json`
+- **MPI usage**: `pytorch_profiler_trace_rank{rank}.json` (e.g., `pytorch_profiler_trace_rank0.json`, `pytorch_profiler_trace_rank1.json`)
 
-For MPI applications, users can distinguish between ranks by setting different output directories per rank using the `DP_PYTORCH_PROFILER_OUTPUT_DIR` environment variable.
+This ensures that each MPI rank saves its profiling data to a separate file, preventing conflicts in multi-rank simulations.
 
 This is intended for development and debugging purposes.
````
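The documentation above configures the profiler through the `DP_PYTORCH_PROFILER_OUTPUT_DIR` environment variable. A minimal sketch of how such a variable might be resolved, using a hypothetical helper (`resolve_profiler_dir` is illustrative, not the library's actual API; the "." fallback is an assumption, and `setenv`/`unsetenv` are POSIX):

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical sketch: resolve the profiler output directory from the
// DP_PYTORCH_PROFILER_OUTPUT_DIR environment variable, falling back to
// the current directory when it is unset.
std::string resolve_profiler_dir() {
  const char* dir = std::getenv("DP_PYTORCH_PROFILER_OUTPUT_DIR");
  return dir != nullptr ? std::string(dir) : std::string(".");
}
```

For MPI runs, each rank can export a different value (e.g. `./profiler_results/rank0`) before launch, which is the per-rank-directory approach the doc mentions.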

source/api_cc/CMakeLists.txt

Lines changed: 15 additions & 0 deletions

```diff
@@ -49,6 +49,21 @@ set_target_properties(
   INSTALL_RPATH_USE_LINK_PATH TRUE
   BUILD_RPATH "$ORIGIN/../op/tf;$ORIGIN/../op/pt;$ORIGIN/../op/pd")
 target_compile_definitions(${libname} PRIVATE TF_PRIVATE)
+find_package(MPI)
+if(MPI_FOUND)
+  include(CheckCXXSymbolExists)
+  set(CMAKE_REQUIRED_INCLUDES ${MPI_CXX_INCLUDE_DIRS})
+  set(CMAKE_REQUIRED_LIBRARIES ${MPI_CXX_LIBRARIES})
+  check_cxx_symbol_exists(MPIX_Query_cuda_support "mpi.h" CUDA_AWARE)
+  if(NOT CUDA_AWARE)
+    check_cxx_symbol_exists(MPIX_Query_cuda_support "mpi.h;mpi-ext.h" OMP_CUDA)
+    if(NOT OMP_CUDA)
+      target_compile_definitions(${libname} PRIVATE NO_CUDA_AWARE)
+    endif()
+  endif()
+  target_link_libraries(${libname} PRIVATE MPI::MPI_CXX)
+  target_compile_definitions(${libname} PRIVATE USE_MPI)
+endif()
 if(CMAKE_TESTING_ENABLED)
   target_link_libraries(${libname} PRIVATE coverage_config)
 endif()
```
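The CMake logic above only defines macros; what they gate lives in C++. A minimal sketch of the resulting conditional-compilation pattern (`build_flavor` is a hypothetical helper; only `USE_MPI` and `NO_CUDA_AWARE` come from the diff — `USE_MPI` is defined when `find_package(MPI)` succeeds, `NO_CUDA_AWARE` when `MPIX_Query_cuda_support` is found in neither `mpi.h` nor `mpi-ext.h`):

```cpp
#include <cassert>
#include <string>

// Sketch: report which build flavor the CMake-defined macros selected.
std::string build_flavor() {
#ifdef USE_MPI
#ifdef NO_CUDA_AWARE
  return "mpi";             // MPI linked, but CUDA-awareness not detected
#else
  return "mpi+cuda-aware";  // MPI linked, CUDA-aware transport possible
#endif
#else
  return "serial";          // api_cc built without MPI
#endif
}
```

Because the checks run at compile time, a build without MPI carries no MPI symbols at all, which is the point of gating the `target_link_libraries` call behind `MPI_FOUND`.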

source/api_cc/src/common.cc

Lines changed: 15 additions & 7 deletions

```diff
@@ -11,10 +11,9 @@
 #include <sys/stat.h>
 #include <errno.h>
 
-// Note: MPI rank detection has been removed from api_cc library
-// to avoid MPI linking dependencies. The profiler will use a generic
-// filename. Users can still distinguish between ranks by using different
-// output directories per rank if needed.
+#ifdef USE_MPI
+#include <mpi.h>
+#endif
 
 #include "AtomMap.h"
 #include "device.h"
@@ -408,10 +407,19 @@ void deepmd::get_env_pytorch_profiler(bool& enable_profiler, std::string& output
 }
 
 int deepmd::get_mpi_rank() {
-  // MPI rank detection removed from api_cc to avoid MPI linking dependencies
-  // Always return -1 to indicate no MPI rank available
-  // Users can distinguish between ranks by using different output directories
+#ifdef USE_MPI
+  int rank = -1;  // -1 indicates MPI not available/initialized
+  int initialized = 0;
+  if (MPI_Initialized(&initialized) == MPI_SUCCESS && initialized) {
+    if (MPI_Comm_rank(MPI_COMM_WORLD, &rank) != MPI_SUCCESS) {
+      rank = -1;  // fall back to -1 if MPI_Comm_rank fails
+    }
+  }
+  return rank;
+#else
+  // MPI not available at compile time
   return -1;
+#endif
 }
 
 bool deepmd::create_directories(const std::string& path) {
```
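Given `get_mpi_rank()`'s contract (a non-negative rank when MPI is initialized, -1 otherwise), the per-rank trace filenames listed in the documentation diff can be derived with a small helper. This is a sketch of the naming scheme only, not the library's actual implementation:

```cpp
#include <cassert>
#include <string>

// Sketch: map get_mpi_rank()'s return value to the trace filename scheme
// described in the documentation. rank < 0 means "no MPI", so the plain
// filename is used; otherwise the rank is embedded.
std::string trace_filename(int rank) {
  if (rank < 0) {
    return "pytorch_profiler_trace.json";
  }
  return "pytorch_profiler_trace_rank" + std::to_string(rank) + ".json";
}
```

Treating -1 as "use the generic name" keeps the non-MPI path identical to the previous behavior while giving each MPI rank a distinct output file.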
