
add mpi in neighbor_search #7537

Open
19hello wants to merge 11 commits into deepmodeling:develop from 19hello:parallel_neighlist


@19hello 19hello commented Jun 30, 2026

Collaborator

Summary

This PR adds MPI-aware domain decomposition to NeighborSearch and wires the LJ esolver path to use it when MPI is enabled.

Changes include:

  • add MPI domain decomposition for rank-local owned atoms and cutoff-relevant ghost atoms
  • keep the existing single-rank NeighborSearch interface available for non-MPI callers
  • update LJ neighbor-list construction to use distributed neighbor search under __MPI
  • optimize ghost atom exchange by building send buffers per atom instead of scanning all owned atoms for every exchange slot
  • add neighbor_search_mpi_benchmark.cpp and register an MPI ctest target for a 4-rank benchmark run
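As a rough illustration of the owned/ghost split and the per-atom send-buffer idea listed above, here is a minimal single-process sketch. It assumes a 1D slab decomposition of a periodic box along x and a cutoff no larger than one slab width. `Atom`, `SendBuffers`, `owner_rank`, and `build_send_buffers` are hypothetical names chosen for this example and are not taken from the PR's sources.

```cpp
#include <cassert>
#include <vector>

// Hypothetical illustration types; none of these names come from the ABACUS sources.
struct Atom {
    int id;
    double x; // coordinate along the decomposed axis, assumed to lie in [0, box)
};

struct SendBuffers {
    std::vector<int> to_left;  // ids of owned atoms the lower-x neighbor rank needs as ghosts
    std::vector<int> to_right; // ids of owned atoms the higher-x neighbor rank needs as ghosts
};

// Rank that owns coordinate x when [0, box) is cut into nranks equal slabs.
int owner_rank(double x, double box, int nranks) {
    assert(box > 0.0 && nranks > 0 && x >= 0.0 && x < box);
    const int r = static_cast<int>(x / box * nranks);
    return r < nranks ? r : nranks - 1; // guard against round-up at the upper edge
}

// Single pass over the owned atoms: each atom decides which buffers it belongs to,
// instead of rescanning every owned atom once per exchange slot.
SendBuffers build_send_buffers(const std::vector<Atom>& owned, int rank,
                               double box, int nranks, double cutoff) {
    const double width = box / nranks;
    assert(cutoff > 0.0 && cutoff <= width); // sketch handles nearest-neighbor halos only
    const double lo = rank * width;
    const double hi = lo + width;
    SendBuffers buf;
    for (const Atom& a : owned) {
        if (a.x - lo < cutoff) buf.to_left.push_back(a.id);  // within cutoff of the lower face
        if (hi - a.x < cutoff) buf.to_right.push_back(a.id); // within cutoff of the upper face
    }
    return buf;
}
```

With a box of length 8 split over 4 ranks and a cutoff of 0.5, rank 1 owns [2, 4); an atom at x = 2.1 would go only to the left buffer and an atom at x = 3.8 only to the right buffer. An actual MPI implementation would then exchange these buffers with the neighboring ranks.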

Motivation

The previous neighbor search path built the full expanded atom set per caller, which did not let MPI ranks own disjoint subsets of central atoms. This change lets each rank build only its local central atoms and the halo atoms needed for cutoff checks, reducing duplicated neighbor-search work in MPI runs while preserving the existing non-MPI behavior.

Linked Issue

No linked issue. This PR implements MPI distribution work for neighbor search and LJ force evaluation directly in the feature branch.

Unit Tests and/or Case Tests for my changes

  • Added and updated focused neighbor-search tests for distributed initialization, local neighbor IDs, and MPI halo behavior.
  • Added MODULE_CELL_NEIGHBOR_neighbor_search_mpi_benchmark_np4, which compares distributed neighbor pairs against the serial reference for a 4-rank run.
  • Existing non-MPI neighbor-search behavior remains covered by the serial neighbor-search tests.
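The kind of check described above, comparing distributed neighbor pairs against a serial reference, can be sketched without MPI by simulating the ranks in a loop. This is an assumption-laden illustration, not the benchmark's code: it uses a cubic periodic box, brute-force distance checks, and x-slab ownership, and `dist2`, `serial_pairs`, and `distributed_pairs` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <set>
#include <utility>
#include <vector>

using Vec3 = std::array<double, 3>;
using PairSet = std::set<std::pair<int, int>>;

// Minimum-image squared distance in a cubic periodic box of edge `box`.
double dist2(const Vec3& a, const Vec3& b, double box) {
    double d2 = 0.0;
    for (int k = 0; k < 3; ++k) {
        double d = a[k] - b[k];
        d -= box * std::round(d / box);
        d2 += d * d;
    }
    return d2;
}

// Serial reference: every unordered pair (i < j) closer than the cutoff.
PairSet serial_pairs(const std::vector<Vec3>& pos, double box, double cutoff) {
    assert(cutoff > 0.0 && cutoff < 0.5 * box); // minimum image needs cutoff < box / 2
    const int n = static_cast<int>(pos.size());
    PairSet pairs;
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            if (dist2(pos[i], pos[j], box) < cutoff * cutoff) pairs.emplace(i, j);
    return pairs;
}

// Simulated distributed search: rank r owns the x-slab [r*w, (r+1)*w) and sees only
// its owned atoms plus ghosts lying within `cutoff` of that slab along x.
PairSet distributed_pairs(const std::vector<Vec3>& pos, double box, double cutoff, int nranks) {
    assert(nranks > 0 && cutoff > 0.0 && cutoff < 0.5 * box);
    const int n = static_cast<int>(pos.size());
    const double w = box / nranks;
    PairSet pairs;
    for (int r = 0; r < nranks; ++r) {
        const double center = (r + 0.5) * w;
        std::vector<int> owned, visible;
        for (int i = 0; i < n; ++i) {
            double d = pos[i][0] - center;
            d -= box * std::round(d / box); // periodic offset from the slab center
            if (static_cast<int>(std::floor(pos[i][0] / w)) == r) owned.push_back(i);
            if (std::fabs(d) < 0.5 * w + cutoff) visible.push_back(i); // owned + halo
        }
        for (int i : owned)
            for (int j : visible)
                if (j != i && dist2(pos[i], pos[j], box) < cutoff * cutoff)
                    pairs.emplace(std::min(i, j), std::max(i, j)); // set removes duplicates
    }
    return pairs;
}
```

If the halo is too thin, `distributed_pairs` drops pairs that `serial_pairs` finds, so comparing the two sets is a direct test of halo coverage.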

Exact Verification Performed

  • Commands run:
    • cmake --build /tmp/abacus-cmake-neigh-check --target MODULE_CELL_NEIGHBOR_neighbor_search_mpi_benchmark -j4
    • env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/openblas-openmp/libopenblas.so.0 ctest --test-dir /tmp/abacus-cmake-neigh-check -R ^MODULE_CELL_NEIGHBOR_neighbor_search_mpi_benchmark_np4$ --output-on-failure
    • env LD_PRELOAD=/usr/lib/x86_64-linux-gnu/openblas-openmp/libopenblas.so.0 /home/fy19/abacus-develop/toolchain/install/openmpi-5.0.7/bin/mpirun -np 1 /tmp/abacus-cmake-neigh-check/source/source_cell/module_neighlist/test/MODULE_CELL_NEIGHBOR_neighbor_search_mpi_benchmark 8 8 8 1 1.75 1.0 0.2 1
  • Result summary:
    • The benchmark target built successfully.
    • The 4-rank MPI ctest passed.
    • The 1-rank periodic ghost-exchange path reported ownership_ok 1, neighbor_pairs_ok 1, and neighbor_ids_ok 1.
  • Checks not run, with reason:
    • Full project CI was not run locally because the PR relies on GitHub Actions for the complete build matrix.

What's changed?

  • MPI builds can construct neighbor lists from rank-local owned atoms plus exchanged ghost atoms.
  • LJ force evaluation uses the distributed neighbor-search path under __MPI and reduces potential, virial, and force contributions across ranks.
  • The Makefile and CMake source lists now include domain_decomposition.cpp for the relevant build paths.
  • The serial NeighborSearch entry point remains available for existing non-MPI callers.
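To make the "reduce across ranks" step above concrete, the following single-process sketch sums per-rank LJ contributions and should reproduce the serial result. It is an illustration under stated assumptions, not the PR's implementation: ranks are simulated in a loop, ownership is a simple round-robin rule, the box is non-periodic, and the virial is omitted. `LJResult`, `rank_contribution`, and `reduce_sum` are hypothetical names; in an actual MPI build the element-wise sum would typically be a sum reduction such as `MPI_Allreduce` with `MPI_SUM`.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

struct LJResult {
    double potential = 0.0;
    std::vector<Vec3> force; // one entry per global atom
};

// Contribution of one simulated rank: loops over its owned central atoms only.
// Across all ranks each pair is visited from both ends, so the pair energy is halved.
LJResult rank_contribution(const std::vector<Vec3>& pos, int rank, int nranks,
                           double epsilon, double sigma, double cutoff) {
    assert(nranks > 0 && rank >= 0 && rank < nranks);
    LJResult out;
    out.force.assign(pos.size(), Vec3{0.0, 0.0, 0.0});
    for (std::size_t i = 0; i < pos.size(); ++i) {
        // Hypothetical ownership rule: atom i belongs to rank i % nranks.
        if (static_cast<int>(i % static_cast<std::size_t>(nranks)) != rank) continue;
        for (std::size_t j = 0; j < pos.size(); ++j) {
            if (j == i) continue;
            Vec3 d;
            double r2 = 0.0;
            for (int k = 0; k < 3; ++k) {
                d[k] = pos[i][k] - pos[j][k];
                r2 += d[k] * d[k];
            }
            if (r2 >= cutoff * cutoff) continue;
            const double s6 = std::pow(sigma * sigma / r2, 3); // (sigma / r)^6
            out.potential += 0.5 * 4.0 * epsilon * (s6 * s6 - s6);
            const double f_over_r = 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r2;
            for (int k = 0; k < 3; ++k) out.force[i][k] += f_over_r * d[k];
        }
    }
    return out;
}

// Element-wise sum over all simulated ranks (stand-in for an MPI sum reduction).
LJResult reduce_sum(const std::vector<LJResult>& parts) {
    assert(!parts.empty());
    LJResult total = parts.front();
    for (std::size_t r = 1; r < parts.size(); ++r) {
        total.potential += parts[r].potential;
        for (std::size_t i = 0; i < total.force.size(); ++i)
            for (int k = 0; k < 3; ++k) total.force[i][k] += parts[r].force[i][k];
    }
    return total;
}
```

Because each rank writes forces only for the atoms it owns, the sum introduces no double counting of forces; only the pair energy needs the factor of one half.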

Governance Checklist

  • Global dependencies: no new GlobalV, GlobalC, or PARAM cross-layer control was introduced.
  • Default parameters: no new default arguments were added to existing interfaces.
  • Headers: new header includes are required for concrete member types and shared neighbor-list type aliases; forward declarations are not practical for the stored value types.
  • Line endings: text files use LF.
  • Build linkage: new source files are listed in the relevant CMakeLists.txt and source/Makefile.Objects.
  • Documentation: no documentation update required because this changes internal neighbor-search implementation and MPI execution behavior, not user-facing INPUT syntax.
  • CodeRabbit: automatic review can be requested if it has not started.

INPUT Parameter Changes

  • Parameters added/removed/changed: none.
  • docs/parameters.yaml updated: not applicable.
  • docs/advanced/input_files/input-main.md updated: not applicable.
  • No documentation update required for INPUT files because this PR does not add, remove, or change INPUT parameters.

Core Module Impact

  • Affected core modules: source/source_cell/module_neighlist and source/source_esolver/esolver_lj.cpp.
  • Risk summary: MPI neighbor ownership, ghost atom exchange, and LJ force accumulation are touched; the main risk is incorrect halo coverage or duplicate/missing pair contributions.
  • Compatibility or performance impact: serial neighbor search remains compatible; MPI LJ runs should reduce duplicated neighbor-search work by distributing owned atoms and exchanging only needed ghost atoms.

Governance Exception

  • Rule: Header dependency review warnings for new neighbor-list headers.
  • Reason: The headers store concrete value types and use shared index aliases, so the declarations require these includes.
  • Scope: source/source_cell/module_neighlist headers touched by this PR.
  • User or maintenance risk: limited to compile-time dependency growth in the neighbor-list module.
  • Why the normal rule cannot be followed now: replacing these includes with forward declarations would not work for value members and type aliases used in public declarations.
  • Follow-up cleanup plan: revisit header factoring if the neighbor-list module grows broader public dependencies.
  • Requested approver: ABACUS maintainers reviewing this PR (#7537).
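The constraint behind this exception can be shown with a small generic C++ sketch. All names here (`Bounds`, `GhostExchangePlan`, `SlabDomain`) are hypothetical and do not come from the PR's headers; the point is only that a member stored by value needs the complete type, while a pointer member can make do with a forward declaration.

```cpp
#include <cassert>

class GhostExchangePlan; // forward declaration: enough for pointers and references

// Stands in for a type that would normally live in another header.
struct Bounds {
    double lo;
    double hi;
};

class SlabDomain {
public:
    SlabDomain(Bounds b, const GhostExchangePlan* plan) : bounds_(b), plan_(plan) {}
    double width() const { return bounds_.hi - bounds_.lo; }
    bool has_plan() const { return plan_ != nullptr; }

private:
    Bounds bounds_;                 // stored by value: the full definition of Bounds must be visible
    const GhostExchangePlan* plan_; // pointer member: an incomplete type is sufficient
};
```

Replacing the definition of `Bounds` with `struct Bounds;` would fail to compile, because the compiler needs its size to lay out `SlabDomain`.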

@mohanchen

Collaborator

If we revisit these code changes in the future, we will reopen this PR for further discussion.

@mohanchen mohanchen closed this Jul 3, 2026
@19hello 19hello reopened this Jul 3, 2026
@mohanchen mohanchen added the following labels Jul 3, 2026:

  • GPU & DCU & HPC: GPU and DCU and HPC related any issues
  • Features Needed: The features are indeed needed, and developers should have sophisticated knowledge
  • Refactor: Refactor ABACUS codes
  • Large Systems: Issues related to large-size systems
  • GeometryRelaxation: Issues related to geometry relaxation
  • MD & LAM: MD and Large Atomic Models
  • Tests/Examples: Issues/PR related to unit tests and integrate tests


2 participants