Commit 4147947

njzjz-bot, pre-commit-ci[bot], and OpenClaw authored
docs: add theory section to DPA3 documentation (#5262)
Add a comprehensive theory section to the DPA3 descriptor documentation, including:

- **Line Graph Series (LiGS)**: Construction and geometric interpretation of the line graph transform
- **Message Passing on LiGS**: Feature update mechanism with residual connections across multiple graph orders
- **Descriptor Construction**: How atomic descriptors are built and used for energy prediction
- **Physical Symmetries**: Translational, rotational, permutational invariance and energy conservation
- **Default Configuration**: Explanation of the default LiGS order K=2 and model scaling

This follows the style of other model documentation (e.g., `se_e2_a.md`) in the repository.

Authored by OpenClaw (model: glm-5)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **Documentation**
  * Expanded DPA3 docs with a new Theory section introducing LiGS (Line Graph Series), multi-order graph construction, and recursive cross-layer message passing for vertex and edge features.
  * Added explicit math for updates, descriptor construction, and dataset-augmented energy aggregation.
  * Documented physical symmetries, force/virial relations, default configuration guidance (K=2, scalable orders), extended hyperparameter experiments with RMSE/training-time results, and minor editorial updates.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: njzjz-bot <njzjz-bot@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: OpenClaw <njzjz2008@gmail.com>
1 parent 314b946 commit 4147947

1 file changed

Lines changed: 93 additions & 14 deletions

File tree

doc/model/dpa3.md

@@ -4,26 +4,105 @@
44
**Supported backends**: PyTorch {{ pytorch_icon }}, JAX {{ jax_icon }}, DP {{ dpmodel_icon }}
55
:::
66

7-
DPA3 is an advanced interatomic potential leveraging the message passing architecture.
8-
Designed as a large atomic model (LAM), DPA3 is tailored to integrate and simultaneously train on datasets from various disciplines,
9-
encompassing diverse chemical and materials systems across different research domains.
10-
Its model design ensures exceptional fitting accuracy and robust generalization both within and beyond the training domain.
11-
Furthermore, DPA3 maintains energy conservation and respects the physical symmetries of the potential energy surface,
12-
making it a dependable tool for a wide range of scientific applications.
7+
DPA3 is an advanced interatomic potential based on message passing.
8+
As a large atomic model (LAM), it is designed to integrate and jointly train on datasets from different domains,
9+
covering diverse chemical and materials systems.
10+
Its architecture provides high fitting accuracy and robust generalization both within and beyond the training domain.
11+
DPA3 also preserves energy conservation and the physical symmetries of the potential energy surface,
12+
making it a reliable model for a wide range of scientific applications.
1313

1414
Reference: [DPA3 paper](https://arxiv.org/abs/2506.01686).
1515

1616
Training example: `examples/water/dpa3/input_torch.json`.
1717

18+
## Theory
19+
20+
DPA3 is a graph neural network operating on the Line Graph Series (LiGS) constructed from atomic configurations.
21+
22+
### Line Graph Series (LiGS)
23+
24+
Given an initial graph $G^{(1)}$ representing the atomic system, where atoms are vertices and pairs of neighboring atoms within a cutoff radius $r_c$ are edges, the line graph transform $\mathcal{L}$ constructs a new graph $G^{(2)} = \mathcal{L}(G^{(1)})$ by:
25+
26+
1. Converting each edge in $G^{(1)}$ to a vertex in $G^{(2)}$
27+
1. Creating edges in $G^{(2)}$ between vertices whose corresponding edges in $G^{(1)}$ share a common vertex
28+
29+
Recursively applying this transform generates a series of graphs $\{G^{(1)}, G^{(2)}, \ldots, G^{(K)}\}$, where $G^{(k)} = \mathcal{L}(G^{(k-1)})$. This sequence is called the Line Graph Series (LiGS) of order $K$.
30+
31+
Geometrically, vertices in $G^{(1)}$, $G^{(2)}$, $G^{(3)}$, and $G^{(4)}$ correspond to atoms, bonds (pairs of atoms), angles (three atoms with two bonds sharing a common atom), and dihedral angles (four atoms with two angles sharing a common bond), respectively.
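
The line graph transform can be illustrated with a short Python sketch. This is not the DeePMD-kit implementation: it assumes an undirected graph given as a plain edge list, and the helper names `line_graph` and `ligs` are hypothetical. It ignores details such as directed edges or separate cutoffs for higher-order graphs.

```python
from itertools import combinations


def line_graph(edges):
    """Return the edge list of the line graph of an undirected graph.

    Vertex i of the line graph stands for edges[i] of the input graph.
    Two line-graph vertices are connected when their source edges share a vertex.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(edges)), 2)
        if set(edges[i]) & set(edges[j])
    ]


def ligs(edges, order):
    """Edge lists of G^(1), ..., G^(order) from repeated line graph transforms."""
    series = [list(edges)]
    for _ in range(order - 1):
        series.append(line_graph(series[-1]))
    return series


# Toy 4-atom chain 0-1-2-3 (an illustrative neighbor list, not a real system).
bonds = [(0, 1), (1, 2), (2, 3)]
g1, g2, g3 = ligs(bonds, order=3)
print(g2)  # edges of G^(2): pairs of bonds sharing an atom (angles)
print(g3)  # edges of G^(3): pairs of angles sharing a bond (dihedrals)
```

Each entry of `g2` indexes into `bonds`, and each entry of `g3` indexes into `g2`, which mirrors the statement that the edges of $G^{(k-1)}$ become the vertices of $G^{(k)}$.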
32+
33+
### Message Passing on LiGS
34+
35+
DPA3 performs message passing across all graphs in the LiGS. At layer $l$, the vertex and edge features on graph $G^{(k)}$ are denoted as $\mathbf{v}_\alpha^{(k,l)} \in \mathbb{R}^{d_v^{(k)}}$ and $\mathbf{e}_{\alpha\beta}^{(k,l)} \in \mathbb{R}^{d_e^{(k)}}$, where $\alpha$ and $\alpha\beta$ denote vertex and edge indices, and $d_v^{(k)}$, $d_e^{(k)}$ are per-graph feature dimensions (for example, in `RepFlowArgs`: $d_v^{(1)}=n_\text{dim}$, $d_e^{(1)}=e_\text{dim}$, $d_v^{(2)}=e_\text{dim}$, and $d_e^{(2)}=a_\text{dim}$).
36+
37+
The feature update follows a recursive formulation with residual connections. We use $\text{Update}_V$ and $\text{Update}_E$ to distinguish vertex and edge update modules, respectively:
38+
39+
**Edge updates (all graphs $G^{(k)}$):**
40+
Edge features are updated based on messages from connected vertices:
41+
42+
```math
43+
\mathbf{e}_{\alpha\beta}^{(k,l+1)} = \mathbf{e}_{\alpha\beta}^{(k,l)} + \text{Update}_E^{(k)}\left(\mathbf{e}_{\alpha\beta}^{(k,l)}, \mathbf{v}_\alpha^{(k,l)}, \mathbf{v}_\beta^{(k,l)}\right)
44+
```
45+
46+
**For $G^{(1)}$ (initial graph, vertex update):**
47+
Vertex features are updated through self-message and symmetrization:
48+
49+
```math
50+
\mathbf{v}_\alpha^{(1,l+1)} = \mathbf{v}_\alpha^{(1,l)} + \text{Update}_V^{(1)}\left(\mathbf{v}_\alpha^{(1,l)}, \{\mathbf{e}_{\alpha\beta}^{(1,l)}\}_{\beta \in \mathcal{N}(\alpha)}\right)
51+
```
52+
53+
**For $G^{(k)}$ with $k > 1$ (vertex identity):**
54+
The vertex feature of $G^{(k)}$ is identical to the edge feature of $G^{(k-1)}$:
55+
56+
```math
57+
\mathbf{v}_\alpha^{(k,l)} = \mathbf{e}_{\alpha\beta}^{(k-1,l)}
58+
```
59+
60+
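where the index $\alpha$ on the left-hand side labels a vertex of $G^{(k)}$, and $(\alpha,\beta)$ on the right-hand side labels the edge of $G^{(k-1)}$ that this vertex corresponds to (the two uses of $\alpha$ refer to different index sets). Because the two features are identical, they are stored only once.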
61+
62+
The same edge update rule also applies to $G^{(1)}$ edge features $\mathbf{e}_{\alpha\beta}^{(1,l)}$ (i.e., with $k=1$ in $\text{Update}_E^{(k)}$). Therefore, these features evolve across layers and, via the $\mathbf{v}^{(2,l)}$-$\mathbf{e}^{(1,l)}$ identity, drive the updates on $G^{(2)}$.
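
As a rough illustration of the residual updates on $G^{(1)}$, the NumPy sketch below applies one layer to a toy graph. The update modules here are hypothetical stand-ins (a single `tanh` layer each, with mean aggregation over neighboring edges); the actual $\text{Update}_E$ and $\text{Update}_V$ modules in DPA3 are more elaborate and are not specified in this section.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy G^(1): 3 atoms with directed edges (alpha, beta); sizes are illustrative only.
edges = np.array([(0, 1), (1, 0), (1, 2), (2, 1)])
n_dim, e_dim = 8, 4
v = rng.normal(size=(3, n_dim))           # vertex features v^(1,l)
e = rng.normal(size=(len(edges), e_dim))  # edge features e^(1,l)

# Hypothetical stand-ins for Update_E and Update_V: one tanh layer each.
W_e = rng.normal(size=(e_dim + 2 * n_dim, e_dim)) * 0.1
W_v = rng.normal(size=(n_dim + e_dim, n_dim)) * 0.1


def update_layer(v, e):
    """One residual layer using only layer-l features on the right-hand side."""
    src, dst = edges[:, 0], edges[:, 1]
    # Edge update: residual plus a function of (e_ab, v_a, v_b).
    e_new = e + np.tanh(np.concatenate([e, v[src], v[dst]], axis=1) @ W_e)
    # Vertex update: residual plus a function of v_a and the mean of its edge features.
    agg = np.zeros((len(v), e_dim))
    np.add.at(agg, src, e)
    counts = np.bincount(src, minlength=len(v))[:, None]
    agg = agg / np.maximum(counts, 1)
    v_new = v + np.tanh(np.concatenate([v, agg], axis=1) @ W_v)
    return v_new, e_new


v1, e1 = update_layer(v, e)
# Vertex features of G^(2) are the edge features of G^(1): no separate array is needed.
v_g2 = e1
```

The last line reflects the vertex identity: the updated edge features of $G^{(1)}$ are reused directly as the vertex features of $G^{(2)}$ rather than copied.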
63+
64+
### Descriptor Construction
65+
66+
The final vertex features of $G^{(1)}$ serve as the descriptor representing the local environment of each atom:
67+
68+
```math
69+
\mathcal{D}^\alpha = \mathbf{v}_\alpha^{(1,L)}
70+
```
71+
72+
where $L$ is the total number of layers.
73+
74+
The descriptor output is then consumed by downstream fitting/model components for property prediction (e.g., energy). See the model/fitting documentation for those equations and training objectives.
75+
76+
### Physical Symmetries and Conservative Forces
77+
78+
DPA3 respects the physical symmetries of the potential energy surface:
79+
80+
1. **Translational invariance**: The model depends only on relative coordinates $\mathbf{r}_{\alpha\beta} = \mathbf{r}_\beta - \mathbf{r}_\alpha$, not absolute positions.
81+
82+
1. **Rotational invariance**: The final descriptor is rotationally invariant; intermediate equivariant representations are used internally and contracted to produce invariant atomic features.
83+
84+
1. **Permutational invariance**: The model output is unchanged when atoms of the same chemical species are relabeled (permuted).
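
These three invariances can be checked numerically on any candidate model. The sketch below does so for a toy scalar built only from interatomic distances; `toy_energy` is a hypothetical stand-in for a single-species system, not the DPA3 model.

```python
import numpy as np


def toy_energy(pos):
    """Toy scalar that depends only on interatomic distances (not DPA3)."""
    diff = pos[None, :, :] - pos[:, None, :]  # r_ab = r_b - r_a
    dist = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(pos), k=1)       # each pair counted once
    return np.exp(-dist[iu]).sum()


rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 3))

# Random orthogonal matrix from a QR decomposition (rotation or reflection).
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
shift = rng.normal(size=3)
perm = rng.permutation(len(pos))

e0 = toy_energy(pos)
print(np.isclose(e0, toy_energy(pos + shift)))  # translation
print(np.isclose(e0, toy_energy(pos @ q.T)))    # orthogonal transform
print(np.isclose(e0, toy_energy(pos[perm])))    # relabeling of identical atoms
```

The same three checks apply unchanged to a trained model by replacing `toy_energy` with the model's energy evaluation, up to floating-point tolerance.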
85+
86+
In addition, DPA3 is inherently conservative, with forces obtained as negative gradients of the energy:
87+
88+
```math
89+
\mathbf{F}_\alpha = -\frac{\partial E}{\partial \mathbf{r}_\alpha}
90+
```
91+
92+
Virials are obtained analogously from the gradient of the energy with respect to the cell tensor. Because forces and virials both derive from a single energy function, the model is suitable for molecular dynamics simulations.
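
The force relation can be verified on a toy energy by comparing an analytic gradient with central finite differences. Everything in this sketch is an illustrative assumption: `toy_energy` is a simple pair potential, not DPA3, and deep learning backends typically obtain the gradient by automatic differentiation rather than finite differences.

```python
import numpy as np


def toy_energy(pos):
    """Toy pair energy E = sum_{a<b} exp(-|r_a - r_b|) (not DPA3)."""
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(pos), k=1)
    return np.exp(-dist[iu]).sum()


def forces_analytic(pos):
    """F_a = -dE/dr_a = sum_b exp(-r_ab) (r_a - r_b) / r_ab."""
    diff = pos[:, None, :] - pos[None, :, :]  # r_a - r_b
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, 1.0)  # avoid 0/0; diff is zero on the diagonal anyway
    return (np.exp(-dist)[:, :, None] * diff / dist[:, :, None]).sum(axis=1)


def forces_fd(pos, h=1e-5):
    """F_a = -dE/dr_a estimated by central finite differences."""
    f = np.zeros_like(pos)
    for a in range(pos.shape[0]):
        for c in range(3):
            plus, minus = pos.copy(), pos.copy()
            plus[a, c] += h
            minus[a, c] -= h
            f[a, c] = -(toy_energy(plus) - toy_energy(minus)) / (2 * h)
    return f


rng = np.random.default_rng(1)
pos = rng.normal(size=(4, 3))
f = forces_fd(pos)
print(np.allclose(f, forces_analytic(pos), atol=1e-6))  # gradients agree
print(np.abs(f.sum(axis=0)).max())  # net force is numerically close to zero
```

The vanishing net force is a direct consequence of translational invariance: shifting all atoms together leaves the energy unchanged, so the forces must sum to zero.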
93+
94+
### Default Configuration
95+
96+
DPA3 uses LiGS order $K=2$ as the default configuration, which was found effective in prior work ([DPA3 paper](https://arxiv.org/abs/2506.01686)). The model supports scaling through increasing the number of layers $L$ (e.g., DPA3-L3, DPA3-L6, DPA3-L12, DPA3-L24).
97+
1898
## Hyperparameter tests
1999

20-
We systematically conducted DPA3 training on six representative DFT datasets (available at [AIS-Square](https://www.aissquare.com/datasets/detail?pageType=datasets&name=DPA3_hyperparameter_search&id=316)):
21-
metallic systems (`Alloy`, `AlMgCu`, `W`), covalent material (`Boron`), molecular system (`Drug`), and liquid water (`Water`).
22-
Under consistent training conditions (0.5M training steps, batch_size "auto:128"),
23-
we rigorously evaluated the impacts of some critical hyperparameters on validation accuracy.
100+
We systematically trained DPA3 on six representative DFT datasets (available at [AIS-Square](https://www.aissquare.com/datasets/detail?pageType=datasets&name=DPA3_hyperparameter_search&id=316)): metallic systems (`Alloy`, `AlMgCu`, `W`), a covalent material (`Boron`), a molecular system (`Drug`), and liquid water (`Water`).
101+
Under consistent training conditions (0.5M training steps, `batch_size` = `auto:128`),
102+
we evaluated the impact of key hyperparameters on validation accuracy.
24103

25-
The comparative analysis focused on average RMSEs (Root Mean Square Error) for both energy, force and virial predictions across all six systems,
26-
with results tabulated below to guide scenario-specific hyperparameter selection:
104+
The comparative analysis focused on average RMSEs (Root Mean Square Error) for energy, force, and virial predictions across the six systems.
105+
The results are summarized below to guide scenario-specific hyperparameter selection:
27106

28107
| Model | comment | nlayers | n_dim | e_dim | a_dim | e_sel | a_sel | start_lr | stop_lr | loss prefactors | rmse_e (meV/atom) | rmse_f (meV/Å) | rmse_v (meV/atom) | Training wall time (h) |
29108
| ---------------- | --------------- | ------- | ------- | ------ | ----- | ------- | ------ | -------- | -------- | ------------------------- | ----------------- | -------------- | ----------------- | ---------------------- |
@@ -35,10 +114,10 @@ with results tabulated below to guide scenario-specific hyperparameter selection
35114
| | Large sel | 6 | 256 | 128 | 32 | **154** | **48** | 1e-3 | 3e-5 | 0.2\|20, 100\|60, 0.02\|1 | 4.76 | 78.4 | 40.2 | 31.8 |
36115
| DPA2-L6 (medium) | Default | 6 | - | - | - | - | - | 1e-3 | 3.51e-08 | 0.02\|1, 1000\|1, 0.02\|1 | 12.12 | 109.3 | 83.1 | 12.2 |
37116

38-
The loss prefactors (0.2|20, 100|60, 0.02|1) correspond to (`start_pref_e`|`limit_pref_e`, `start_pref_f`|`limit_pref_f`, `start_pref_v`|`limit_pref_v`) respectively.
117+
The loss prefactors (0.2|20, 100|60, 0.02|1) correspond to (`start_pref_e`|`limit_pref_e`, `start_pref_f`|`limit_pref_f`, `start_pref_v`|`limit_pref_v`), respectively.
39118
Virial RMSEs were averaged exclusively for systems containing virial labels (`Alloy`, `AlMgCu`, `W`, and `Boron`).
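
To make the `start`|`limit` notation concrete, the sketch below evaluates each prefactor at the start and end of training. It assumes the prefactor is interpolated linearly in the current learning rate, from the `start_pref_*` value at `start_lr` toward the `limit_pref_*` value as the learning rate decays; this schedule is an assumption here and should be checked against the loss documentation. The helper `prefactor` is hypothetical.

```python
def prefactor(start_pref, limit_pref, lr, start_lr):
    """Assumed schedule: linear in the current learning rate."""
    return limit_pref + (start_pref - limit_pref) * lr / start_lr


# Learning rates and prefactors taken from the DPA3 row shown in the table.
start_lr, stop_lr = 1e-3, 3e-5
schedule = {"e": (0.2, 20.0), "f": (100.0, 60.0), "v": (0.02, 1.0)}

for name, (start, limit) in schedule.items():
    at_start = prefactor(start, limit, start_lr, start_lr)
    at_stop = prefactor(start, limit, stop_lr, start_lr)
    print(f"pref_{name}: {at_start:.4g} -> {at_stop:.4g}")
```

Under this assumption, the energy and virial weights grow during training while the force weight shrinks, so early training is dominated by force fitting.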
40119

41-
Note that we set `float32` in all DPA3 models, while `float64` in other models by default.
120+
Note that all DPA3 models use `float32`, while other models use `float64` by default.
42121

43122
## Requirements of installation from source code {{ pytorch_icon }} {{ paddle_icon }}
44123
