Commit dd55dff
perf: fix cuda-aware mpi in v3 (#4977)
This pull request updates the MPI CUDA-awareness detection and handling
logic in the `Border` autograd function, simplifying how CUDA support is
determined and removing some legacy checks. The changes ensure that
CUDA-aware MPI support is queried more directly, and some unnecessary
device synchronization calls are removed.
* The logic for checking CUDA-aware MPI support has been simplified:
version checks and redundant branches have been removed, and the code
now directly queries `MPIX_Query_cuda_support()` unless `NO_CUDA_AWARE`
is defined.
[[1]](diffhunk://#diff-7b7590fd4222d9c50f1dd7dde5ce7ed4b27695fbe591b536787db7575c35e32cL102-L112)
[[2]](diffhunk://#diff-7b7590fd4222d9c50f1dd7dde5ce7ed4b27695fbe591b536787db7575c35e32cL227-L237)
* Removed explicit `gpuDeviceSynchronize()` calls from both the forward
and backward paths, relying on PyTorch's internal synchronization
mechanisms instead.
[[1]](diffhunk://#diff-7b7590fd4222d9c50f1dd7dde5ce7ed4b27695fbe591b536787db7575c35e32cL196-L198)
[[2]](diffhunk://#diff-7b7590fd4222d9c50f1dd7dde5ce7ed4b27695fbe591b536787db7575c35e32cL332-L334)
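
The simplified detection described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `detect_cuda_aware` and the `SKETCH_WITH_OPENMPI` build flag are hypothetical names, and the sketch assumes Open MPI, whose `mpi-ext.h` header declares `MPIX_Query_cuda_support()`. Only `NO_CUDA_AWARE` and the query function itself come from the description above.

```cpp
#include <cassert>

// Build with -DSKETCH_WITH_OPENMPI (a flag invented for this sketch) to call
// the real Open MPI query. Without it, a stand-in that reports "no support"
// is compiled so the snippet builds on machines that have no MPI installed.
#ifdef SKETCH_WITH_OPENMPI
#include <mpi.h>
#include <mpi-ext.h>  // Open MPI extensions; declares MPIX_Query_cuda_support()
#endif

// Hypothetical helper, not taken from the PR.
// Returns 1 when MPI may be handed device pointers directly, 0 otherwise.
inline int detect_cuda_aware() {
#if defined(NO_CUDA_AWARE) || !defined(SKETCH_WITH_OPENMPI)
  const int aware = 0;  // opted out at compile time, or no MPI in this build
#else
  const int aware = MPIX_Query_cuda_support();  // direct query, no version checks
#endif
  assert(aware == 0 || aware == 1);
  return aware;
}
```

Compiling with `-DNO_CUDA_AWARE` forces the result to 0 regardless of what the MPI library reports, which matches the opt-out described above.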
## Summary by CodeRabbit
- Performance
  - Reduced explicit GPU synchronization, potentially improving throughput during distributed forward/backward operations.
- Compatibility
  - Safer default when CUDA-aware MPI isn’t present: automatically falls back to CPU-based transfers unless support is detected, improving stability across varied clusters.
- Reliability
  - Simplified CUDA-aware detection reduces edge-case misconfigurations in mixed MPI environments.
- No API Changes
  - Public interfaces remain unchanged; existing workflows continue to work.
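
The fallback rule in the Compatibility item can be written as a small decision function. The sketch below is an assumption about the shape of that rule, not code from the commit: `TransferPath` and `choose_transfer_path` are hypothetical names, and `cuda_aware` stands for the 0/1 result of the CUDA-aware MPI query.

```cpp
// Hypothetical names for illustration; none of them appear in the PR.
enum class TransferPath {
  DeviceDirect,  // hand GPU buffers straight to MPI
  HostBuffer     // communicate through CPU memory
};

// cuda_aware: 1 if CUDA-aware MPI support was detected, 0 otherwise.
// on_gpu:     whether the tensor to exchange lives on a GPU.
constexpr TransferPath choose_transfer_path(int cuda_aware, bool on_gpu) {
  // Device pointers go to MPI only when support was positively detected;
  // every other case uses CPU-based transfers.
  return (on_gpu && cuda_aware != 0) ? TransferPath::DeviceDirect
                                     : TransferPath::HostBuffer;
}

// Compile-time checks of the rule.
static_assert(choose_transfer_path(1, true) == TransferPath::DeviceDirect,
              "detected support on a GPU tensor uses the direct path");
static_assert(choose_transfer_path(0, true) == TransferPath::HostBuffer,
              "no detected support falls back to CPU-based transfers");
static_assert(choose_transfer_path(1, false) == TransferPath::HostBuffer,
              "CPU tensors always use host buffers");
```

Treating "not detected" as the default means an MPI build without CUDA support is never handed a device pointer, at the cost of an extra host copy on clusters where detection fails.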
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 34df2b4 commit dd55dff
1 file changed
Lines changed: 6 additions & 29 deletions
| Original file lines | New file lines | Change |
|---|---|---|
| 89 | 89 | 1 line replaced |
| 102-108 | 102-103 | 7 lines replaced by 2 |
| 110-112 | none | 3 lines removed |
| 196-199 | none | 4 lines removed |
| 215 | 203 | 1 line replaced |
| 227-233 | 215-216 | 7 lines replaced by 2 |
| 235-237 | none | 3 lines removed |
| 332-334 | none | 3 lines removed |