docs: add theory section to DPA3 documentation (#5262)
Add a comprehensive theory section to the DPA3 descriptor documentation,
including:
- **Line Graph Series (LiGS)**: Construction and geometric
interpretation of the line graph transform
- **Message Passing on LiGS**: Feature update mechanism with residual
connections across multiple graph orders
- **Descriptor Construction**: How atomic descriptors are built and used
for energy prediction
- **Physical Symmetries**: Translational, rotational, permutational
invariance and energy conservation
- **Default Configuration**: Explanation of the default LiGS order K=2
and model scaling
This follows the style of other model documentation (e.g., `se_e2_a.md`)
in the repository.
Authored by OpenClaw (model: glm-5)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Documentation**
* Expanded DPA3 docs with a new Theory section introducing LiGS (Line
Graph Series), multi-order graph construction, and recursive cross-layer
message passing for vertex and edge features.
* Added explicit math for updates, descriptor construction, and
dataset-augmented energy aggregation.
* Documented physical symmetries, force/virial relations, default
configuration guidance (K=2, scalable orders), extended hyperparameter
experiments with RMSE/training-time results, and minor editorial
updates.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: njzjz-bot <njzjz-bot@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: OpenClaw <njzjz2008@gmail.com>
Training example: `examples/water/dpa3/input_torch.json`.
17
17
18
+
## Theory
19
+
20
+
DPA3 is a graph neural network operating on the Line Graph Series (LiGS) constructed from atomic configurations.
21
+
22
+
### Line Graph Series (LiGS)
23
+
24
+
Given an initial graph $G^{(1)}$ representing the atomic system, where atoms are vertices and pairs of neighboring atoms within a cutoff radius $r_c$ are edges, the line graph transform $\mathcal{L}$ constructs a new graph $G^{(2)} = \mathcal{L}(G^{(1)})$ by:
25
+
26
+
1. Converting each edge in $G^{(1)}$ to a vertex in $G^{(2)}$
27
+
1. Creating edges in $G^{(2)}$ between vertices whose corresponding edges in $G^{(1)}$ share a common vertex
28
+
29
+
Recursively applying this transform generates a series of graphs $\{G^{(1)}, G^{(2)}, \ldots, G^{(K)}\}$, where $G^{(k)} = \mathcal{L}(G^{(k-1)})$. This sequence is called the Line Graph Series (LiGS) of order $K$.
30
+
31
+
Geometrically, vertices in $G^{(1)}$, $G^{(2)}$, $G^{(3)}$, and $G^{(4)}$ correspond to atoms, bonds (pairs of atoms), angles (three atoms with two bonds sharing a common atom), and dihedral angles (four atoms with two angles sharing a common bond), respectively.
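The line graph transform and its recursive application can be sketched in a few lines of Python. This is a minimal illustration, not the DeePMD-kit implementation: it assumes an undirected simple graph given as an edge list, and the helper names `line_graph` and `ligs` are hypothetical.

```python
from itertools import combinations


def line_graph(edges):
    """Return the edge list of the line graph of an undirected simple graph.

    Each input edge becomes a vertex of the new graph; two such vertices are
    connected when the original edges share an endpoint.
    """
    vertices = [tuple(sorted(e)) for e in edges]
    return [(a, b) for a, b in combinations(vertices, 2) if set(a) & set(b)]


def ligs(edges, order):
    """Return the edge lists of G^(1), ..., G^(order)."""
    series = [[tuple(sorted(e)) for e in edges]]
    for _ in range(order - 1):
        series.append(line_graph(series[-1]))
    return series


# Illustrative input: a four-atom chain 0-1-2-3 (three bonds within the cutoff).
chain = [(0, 1), (1, 2), (2, 3)]
series = ligs(chain, order=4)
edge_counts = [len(g) for g in series]
```

For this chain, the edges of $G^{(1)}$ are the three bonds, the edges of $G^{(2)}$ are the two angles 0-1-2 and 1-2-3, and the single edge of $G^{(3)}$ joins those two angles through their common bond, which corresponds to the dihedral 0-1-2-3.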
32
+
33
+
### Message Passing on LiGS
34
+
35
+
DPA3 performs message passing across all graphs in the LiGS. At layer $l$, the vertex and edge features on graph $G^{(k)}$ are denoted as $\mathbf{v}_\alpha^{(k,l)} \in \mathbb{R}^{d_v^{(k)}}$ and $\mathbf{e}_{\alpha\beta}^{(k,l)} \in \mathbb{R}^{d_e^{(k)}}$, where $\alpha$ and $\alpha\beta$ denote vertex and edge indices, and $d_v^{(k)}$, $d_e^{(k)}$ are per-graph feature dimensions (for example, in `RepFlowArgs`: $d_v^{(1)}=n_\text{dim}$, $d_e^{(1)}=e_\text{dim}$, $d_v^{(2)}=e_\text{dim}$, and $d_e^{(2)}=a_\text{dim}$).
36
+
37
+
The feature update follows a recursive formulation with residual connections. We use $\text{Update}_V$ and $\text{Update}_E$ to distinguish vertex and edge update modules, respectively:
38
+
39
+
**Edge updates (all graphs $G^{(k)}$):**
40
+
Edge features are updated based on messages from connected vertices:
where $(\alpha,\beta)$ denotes the edge in $G^{(k-1)}$ corresponding to vertex $\alpha$ in $G^{(k)}$. This identity eliminates redundant storage.
61
+
62
+
The same edge update rule also applies to $G^{(1)}$ edge features $\mathbf{e}_{\alpha\beta}^{(1,l)}$ (i.e., with $k=1$ in $\text{Update}_E^{(k)}$). Therefore, these features evolve across layers and, via the $\mathbf{v}^{(2,l)}$-$\mathbf{e}^{(1,l)}$ identity, drive the updates on $G^{(2)}$.
63
+
64
+
### Descriptor Construction
65
+
66
+
The final vertex features of $G^{(1)}$ serve as the descriptor representing the local environment of each atom:
67
+
68
+
```math
69
+
\mathcal{D}^\alpha = \mathbf{v}_\alpha^{(1,L)}
70
+
```
71
+
72
+
where $L$ is the total number of layers.
73
+
74
+
The descriptor output is then consumed by downstream fitting/model components for property prediction (e.g., energy). See the model/fitting documentation for those equations and training objectives.
75
+
76
+
### Physical Symmetries and Conservative Forces
77
+
78
+
DPA3 respects the physical symmetries of the potential energy surface:
79
+
80
+
1. **Translational invariance**: The model depends only on relative coordinates $\mathbf{r}_{\alpha\beta} = \mathbf{r}_\beta - \mathbf{r}_\alpha$, not absolute positions.
81
+
82
+
1. **Rotational invariance**: The final descriptor is rotationally invariant; intermediate equivariant representations are used internally and contracted to produce invariant atomic features.
83
+
84
+
1. **Permutational invariance**: Relabeling atoms of the same chemical species leaves the model output unchanged.
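The three invariances can be checked numerically. The sketch below does this for a toy energy that depends only on interatomic distances and species. `toy_energy` is a hypothetical stand-in for a model, not DPA3 itself, and the cutoff, species labels, and transformations are illustrative assumptions.

```python
import math
import random


def toy_energy(positions, species, r_cut=3.0):
    """Toy pairwise energy built from distances and species only (hypothetical)."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            if r < r_cut:
                pair_weight = 1.0 if species[i] == species[j] else 0.5
                # The term goes smoothly to zero at the cutoff.
                energy += pair_weight * (1.0 - r / r_cut) ** 2
    return energy


def translate(positions, shift):
    """Shift every atom by the same vector."""
    return [tuple(p + s for p, s in zip(pos, shift)) for pos in positions]


def rotate_z(positions, theta):
    """Rotate every atom about the z axis by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in positions]


def relabel(positions, species, order):
    """Reorder atoms; each atom keeps its own position and species."""
    return [positions[i] for i in order], [species[i] for i in order]


rng = random.Random(0)
positions = [tuple(rng.uniform(0.0, 4.0) for _ in range(3)) for _ in range(6)]
species = ["O", "H", "H", "O", "H", "H"]

e_ref = toy_energy(positions, species)
e_shift = toy_energy(translate(positions, (1.5, -2.0, 0.3)), species)
e_rot = toy_energy(rotate_z(positions, 0.7), species)
e_perm = toy_energy(*relabel(positions, species, [3, 0, 5, 1, 4, 2]))
```

All four energies agree up to floating-point rounding, because the toy energy sees only distances and species, both of which are unchanged by these operations.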
85
+
86
+
In addition, DPA3 is conservative by construction. Forces are derived from energy gradients:
Virials are similarly derived from gradients of the energy with respect to the cell tensor. Because forces and virials come from a single energy function, the model is conservative and suitable for molecular dynamics simulations.
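For reference, these relations are commonly written as follows. The symbols are assumptions introduced here for illustration ($E$ for the total energy, $\mathbf{F}_\alpha$ for the force on atom $\alpha$, $\Xi$ for the virial tensor, $h$ for the cell tensor), and the index and sign conventions may differ from those used elsewhere in the DPA3 documentation:

```math
\mathbf{F}_\alpha = -\frac{\partial E}{\partial \mathbf{r}_\alpha}, \qquad \Xi_{ij} = -\sum_{m} \frac{\partial E}{\partial h_{m i}} h_{m j}
```

Because both quantities are derivatives of the same scalar $E$, the work done by the forces around any closed path is zero.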
93
+
94
+
### Default Configuration
95
+
96
+
DPA3 uses LiGS order $K=2$ as the default configuration, which was found effective in prior work ([DPA3 paper](https://arxiv.org/abs/2506.01686)). The model can be scaled by increasing the number of layers $L$ (e.g., DPA3-L3, DPA3-L6, DPA3-L12, DPA3-L24).
97
+
18
98
## Hyperparameter tests
19
99
20
-
We systematically conducted DPA3 training on six representative DFT datasets (available at [AIS-Square](https://www.aissquare.com/datasets/detail?pageType=datasets&name=DPA3_hyperparameter_search&id=316)):
21
-
metallic systems (`Alloy`, `AlMgCu`, `W`), covalent material (`Boron`), molecular system (`Drug`), and liquid water (`Water`).
22
-
Under consistent training conditions (0.5M training steps, batch_size "auto:128"),
23
-
we rigorously evaluated the impacts of some critical hyperparameters on validation accuracy.
100
+
We systematically trained DPA3 on six representative DFT datasets (available at [AIS-Square](https://www.aissquare.com/datasets/detail?pageType=datasets&name=DPA3_hyperparameter_search&id=316)): metallic systems (`Alloy`, `AlMgCu`, `W`), a covalent material (`Boron`), a molecular system (`Drug`), and liquid water (`Water`).
101
+
Under consistent training conditions (0.5M training steps, `batch_size` = `auto:128`),
102
+
we evaluated the impact of key hyperparameters on validation accuracy.
24
103
25
-
The comparative analysis focused on average RMSEs (Root Mean Square Error) for both energy, force and virial predictions across all six systems,
26
-
with results tabulated below to guide scenario-specific hyperparameter selection:
104
+
The comparative analysis focused on average RMSEs (Root Mean Square Error) for energy, force, and virial predictions across the six systems.
105
+
The results are summarized below to guide scenario-specific hyperparameter selection:
27
106
28
107
| Model | comment | nlayers | n_dim | e_dim | a_dim | e_sel | a_sel | start_lr | stop_lr | loss prefactors | rmse_e (meV/atom) | rmse_f (meV/Å) | rmse_v (meV/atom) | Training wall time (h) |
The loss prefactors (0.2|20, 100|60, 0.02|1) correspond to (`start_pref_e`|`limit_pref_e`, `start_pref_f`|`limit_pref_f`, `start_pref_v`|`limit_pref_v`) respectively.
117
+
The loss prefactors (0.2|20, 100|60, 0.02|1) correspond to (`start_pref_e`|`limit_pref_e`, `start_pref_f`|`limit_pref_f`, `start_pref_v`|`limit_pref_v`), respectively.
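To make the start and limit values concrete, the sketch below assumes the prefactor schedule described in the DeePMD-kit loss documentation, where each prefactor moves linearly from its start value to its limit value as the learning rate decays. The helper `loss_prefactor` is hypothetical, and the learning-rate values are illustrative rather than taken from the table.

```python
def loss_prefactor(start_pref, limit_pref, lr, start_lr):
    """Interpolate a loss prefactor between its start and limit values.

    Assumed schedule: pref = limit * (1 - lr / start_lr) + start * (lr / start_lr).
    """
    ratio = lr / start_lr
    return limit_pref * (1.0 - ratio) + start_pref * ratio


start_lr = 1e-3  # illustrative value, not taken from the table

# Energy prefactor 0.2|20: equals 0.2 at the start, approaches 20 as lr -> 0.
pref_e_start = loss_prefactor(0.2, 20.0, lr=start_lr, start_lr=start_lr)
pref_e_end = loss_prefactor(0.2, 20.0, lr=0.0, start_lr=start_lr)

# Force prefactor 100|60, evaluated when lr has decayed to half its start value.
pref_f_mid = loss_prefactor(100.0, 60.0, lr=0.5 * start_lr, start_lr=start_lr)
```

Under this assumption, the energy and virial terms gain weight during training while the force term loses weight, since the force limit (60) is below its start value (100).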
39
118
Virial RMSEs were averaged exclusively for systems containing virial labels (`Alloy`, `AlMgCu`, `W`, and `Boron`).
40
119
41
-
Note that we set `float32` in all DPA3 models, while `float64` in other models by default.
120
+
Note that all DPA3 models use `float32`, while other models use `float64` by default.
42
121
43
122
## Requirements of installation from source code {{ pytorch_icon }} {{ paddle_icon }}