
Commit a830d10

committed
add docs
1 parent 25a36d1 commit a830d10

1 file changed

Lines changed: 29 additions & 0 deletions


doc/env.md

@@ -89,4 +89,33 @@ These environment variables also apply to third-party programs using the C++ int

List of customized OP plugin libraries to load, such as `/path/to/plugin1.so:/path/to/plugin2.so` on Linux and `/path/to/plugin1.dll;/path/to/plugin2.dll` on Windows.

:::{envvar} DP_PROFILER

Enable the built-in PyTorch Kineto profiler for the PyTorch C++ (inference) backend.

**Type**: string (output file stem)
**Default**: unset (disabled)

When set to a non-empty value, profiling stays enabled for the lifetime of the loaded PyTorch model (for example, for the duration of a LAMMPS run). A JSON trace file is written when profiling finishes. The file name is constructed as:
- `<ENV_VALUE>_gpu<ID>.json` when running on a GPU; the CUDA device id is appended, so processes on different GPUs write separate files
- `<ENV_VALUE>.json` when running on CPU

The trace can be opened in the [Perfetto UI](https://ui.perfetto.dev/) or in Chrome's built-in viewer at `chrome://tracing`, and it is compatible with PyTorch profiler tooling. It includes:
- CPU operator activities (always)
- CUDA activities (if a GPU is available)

Example:
```bash
export DP_PROFILER=result
mpirun -np 4 lmp -in in.lammps
# Produces result_gpuX.json, where X is the GPU id used by each MPI rank.
```
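The naming rule described above can also be written as a small shell helper, which may be handy for checking that every expected trace exists after a run. This is only a sketch: `dp_trace_name` is a hypothetical function written for this illustration, not part of DeePMD-kit, and it assumes the CUDA device id of each process is known in advance.

```bash
# Hypothetical helper (not part of DeePMD-kit): predict the trace file name
# from the DP_PROFILER value and an optional CUDA device id, following the
# naming rule documented above.
dp_trace_name() {
    local stem="$1" gpu_id="${2-}"
    if [ -n "$gpu_id" ]; then
        printf '%s_gpu%s.json\n' "$stem" "$gpu_id"   # GPU run: device id appended
    else
        printf '%s.json\n' "$stem"                   # CPU run: stem only
    fi
}

dp_trace_name result 0   # prints result_gpu0.json
dp_trace_name result     # prints result.json
```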
116+
117+
Tips:
118+
119+
- Large runs can generate sizable JSON files; consider limiting numbers of MD steps, like 20.
120+
- Currently this feature only supports single process, or multi-process runs where each process uses a distinct GPU on the same node.
92121
:::
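Because traces from long runs can be large, it can help to check what `DP_PROFILER` actually wrote before opening anything in a viewer. The following is a minimal sketch, assuming the stem `result` from the example; `report_traces` is a hypothetical helper written for this illustration, not part of DeePMD-kit.

```bash
# Hypothetical post-run check (not part of DeePMD-kit): list the trace files
# written for a given DP_PROFILER stem together with their sizes in bytes.
report_traces() {
    local stem="$1" f found=0
    for f in "${stem}.json" "${stem}"_gpu*.json; do
        [ -f "$f" ] || continue          # skip names or patterns that matched nothing
        found=1
        printf '%s\t%s bytes\n' "$f" "$(wc -c < "$f" | tr -d ' ')"
    done
    [ "$found" -eq 1 ]                   # non-zero exit status if no trace exists
}

report_traces result || echo "no trace files found for stem 'result'" >&2
```

If a file turns out to be too large to load comfortably, rerunning with fewer MD steps is the simplest remedy.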
