feat(pt): type embedding can still be compressed even if attn_layer != 0 (deepmodeling#5066)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * Split compression into independent type-embedding (TEBD) and geometric modes, enabling partial compression when attention layer ≠ 0.
* **Documentation**
  * Expanded backend-specific guidance describing full vs partial compression rules and prerequisites (e.g., TEBD input mode).
* **Tests**
  * Added tests covering non-zero attention-layer scenarios to validate partial compression behavior.
* **Bug Fixes**
  * Improved eligibility checks and clearer runtime warnings when geometric compression is skipped.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
```diff
 doc_exclude_types="The excluded pairs of types which have no interaction with each other. For example, `[[0, 1]]` means no interaction between type 0 and type 1."
 doc_env_protection="Protection parameter to prevent division by zero errors during environment matrix calculations. For example, when using paddings, there may be zero distances of neighbors, which may make division by zero error during environment matrix calculations without protection."
 doc_attn="The length of hidden vectors in attention layers"
-doc_attn_layer="The number of attention layers. Note that model compression of `se_atten` is only enabled when attn_layer==0 and tebd_input_mode=='strip'"
+doc_attn_layer="The number of attention layers. Note that model compression of `se_atten` works for any attn_layer value when tebd_input_mode=='strip' (PyTorch backend only; other backends still require attn_layer=0 to compress). When attn_layer!=0, only the type embedding is compressed; the geometric parts are not compressed."
 doc_attn_dotr="Whether to do dot product with the normalized relative coordinates"
 doc_attn_mask="Whether to do mask on the diagonal in the attention matrix"
```
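The backend-dependent rule stated in the updated `doc_attn_layer` string can be sketched as a small decision helper. This is a hypothetical illustration only — `compression_mode` is not part of the deepmd-kit API, and the backend/mode strings are assumptions:

```python
def compression_mode(backend: str, attn_layer: int, tebd_input_mode: str) -> str:
    """Return which parts of a se_atten descriptor could be compressed.

    Hypothetical helper mirroring the documented rules; not a real
    deepmd-kit function.
    """
    if tebd_input_mode != "strip":
        return "none"       # compression always requires strip-mode type embedding
    if attn_layer == 0:
        return "full"       # type embedding and geometric parts are both tabulated
    if backend == "pytorch":
        return "tebd-only"  # geometric attention layers stay as neural networks
    return "none"           # other backends still require attn_layer == 0


print(compression_mode("pytorch", 2, "strip"))     # tebd-only
print(compression_mode("tensorflow", 2, "strip"))  # none
```

With `attn_layer != 0` the PyTorch backend degrades gracefully to partial compression instead of refusing outright, which is the behavior change this PR introduces.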
### doc/model/dpa2.md (+5, −1)
```diff
@@ -38,6 +38,10 @@ Type embedding is within this descriptor with the {ref}`tebd_dim <model[standard
 
 ## Model compression
 
-Model compression is supported when {ref}`repinit/tebd_input_mode <model[standard]/descriptor[dpa2]/repinit/tebd_input_mode>` is `strip`, but only the `repinit` part is compressed.
+Model compression is supported when {ref}`repinit/tebd_input_mode <model[standard]/descriptor[dpa2]/repinit/tebd_input_mode>` is `strip`.
+
+- If {ref}`repinit/attn_layer <model[standard]/descriptor[dpa2]/repinit/attn_layer>` is `0`, both the type embedding and geometric parts inside `repinit` are compressed.
+- If `repinit/attn_layer` is not `0`, only the type embedding tables are compressed and the geometric attention layers remain as neural networks.
+
 An example is given in `examples/water/dpa2/input_torch_compressible.json`.
 The performance improvement will be limited if other parts are more expensive.
```
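The DPA-2 rules above hinge on two `repinit` keys. A minimal sketch of the relevant fragment — only `tebd_input_mode` and `attn_layer` are taken from the documentation; the helper and everything else are illustrative, not deepmd-kit API:

```python
# Illustrative fragment of a DPA-2 "repinit" descriptor section.
repinit = {
    "tebd_input_mode": "strip",  # required for any compression at all
    "attn_layer": 0,             # 0 -> type embedding AND geometric parts compressed
}


def compressed_parts(repinit_cfg: dict) -> list[str]:
    """Hypothetical summary of what compression would tabulate."""
    if repinit_cfg.get("tebd_input_mode") != "strip":
        return []
    if repinit_cfg.get("attn_layer", 0) == 0:
        return ["type_embedding", "geometric"]
    return ["type_embedding"]


print(compressed_parts(repinit))  # ['type_embedding', 'geometric']
```

Setting `attn_layer` to a non-zero value in this sketch would shrink the result to the type-embedding tables only, matching the second bullet in the diff.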
### doc/model/train-se-atten.md (+10, −2)
```diff
@@ -134,7 +134,9 @@ You can use descriptor `"se_atten_v2"` and is not allowed to set `tebd_input_mod
 
 Practical evidence demonstrates that `"se_atten_v2"` offers better and more stable performance compared to `"se_atten"`.
 
-Notice: Model compression for the `se_atten_v2` descriptor is exclusively designed for models with the training parameter {ref}`attn_layer <model[standard]/descriptor[se_atten_v2]/attn_layer>` set to 0.
+:::{note}
+Model compression support differs across backends. See [Model compression](#model-compression) for backend-specific requirements.
+:::
 
 ## Type embedding
 
```
```diff
@@ -182,7 +184,13 @@ DPA-1 supports both the [standard data format](../data/system.md) and the [mixed
 
 ## Model compression
 
-Model compression is supported only when there is no attention layer (`attn_layer` is 0) and `tebd_input_mode` is `strip`.
+### TensorFlow {{ tensorflow_icon }}
+
+Model compression is supported only when the descriptor attention depth {ref}`attn_layer <model[standard]/descriptor[se_atten]/attn_layer>` is 0 and {ref}`tebd_input_mode <model[standard]/descriptor[se_atten]/tebd_input_mode>` is `"strip"`. Attention layers higher than 0 cannot be compressed in the TensorFlow implementation because the geometric part is tabulated from the static computation graph.
+
+### PyTorch {{ pytorch_icon }}
+
+Model compression is supported for any {ref}`attn_layer <model[standard]/descriptor[se_atten_v2]/attn_layer>` value when {ref}`tebd_input_mode <model[standard]/descriptor[se_atten_v2]/tebd_input_mode>` is `"strip"`. When `attn_layer` is 0, both the type embedding and geometric parts are compressed. When `attn_layer` is not 0, only the type embedding is compressed while the geometric part keeps the neural network implementation (a warning is emitted during compression).
```
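The PyTorch-backend behavior described in this diff (partial compression with a runtime warning when `attn_layer != 0`) can be sketched as follows. The function name and return shape are hypothetical; the real logic lives in deepmd-kit's compression path:

```python
import warnings


def enable_compression(attn_layer: int, tebd_input_mode: str) -> dict:
    """Sketch of the documented PyTorch compression flow (not the real API)."""
    if tebd_input_mode != "strip":
        raise ValueError("compression requires tebd_input_mode == 'strip'")
    compressed = {"type_embedding": True, "geometric": attn_layer == 0}
    if attn_layer != 0:
        # Mirrors the documented runtime warning: geometric attention
        # layers are skipped and keep their neural-network implementation.
        warnings.warn(
            "attn_layer != 0: geometric part is not compressed and keeps "
            "its neural-network implementation"
        )
    return compressed


# attn_layer = 2 -> only the type-embedding tables get tabulated
print(enable_compression(2, "strip"))  # {'type_embedding': True, 'geometric': False}
```

This mirrors why the speedup is partial for attention-based models: the per-pair type-embedding lookups become table reads, while the attention stack still runs as regular network layers.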