Batched workflows for TorchSim by akwarii · Pull Request #1505 · materialsproject/atomate2

akwarii · 2026-06-26T15:18:26Z

Summary


When using TorchSim, the deformed structures needed to compute elastic properties are batched.
Similarly, the displaced structures in the phonon workflow are also batched.

Note that batching can be disabled by setting socket=False. Also, I have not run any benchmarks yet, so I cannot quantify the performance improvement introduced by this PR.
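To make the effect of the flag concrete, here is a minimal sketch of the difference between batched and serial evaluation. It is not the atomate2 or TorchSim implementation: `run_all` and `fake_evaluate_batch` are hypothetical names, and the structures are stand-in strings rather than real structure objects.

```python
from typing import Callable, List, Sequence


def run_all(
    structures: Sequence[str],
    evaluate_batch: Callable[[List[str]], List[float]],
    socket: bool = True,
) -> List[float]:
    """Evaluate all structures in one call (socket=True) or one by one."""
    if socket:
        # Batched mode: a single call receives every structure.
        return evaluate_batch(list(structures))
    # Serial mode: one call per structure.
    results: List[float] = []
    for structure in structures:
        results.extend(evaluate_batch([structure]))
    return results


calls: List[int] = []


def fake_evaluate_batch(batch: List[str]) -> List[float]:
    """Hypothetical stand-in for a model call; records the batch size."""
    calls.append(len(batch))
    return [float(len(name)) for name in batch]


deformed = [f"deformation_{i}" for i in range(6)]

batched = run_all(deformed, fake_evaluate_batch, socket=True)
batched_calls = list(calls)

calls.clear()
serial = run_all(deformed, fake_evaluate_batch, socket=False)
serial_calls = list(calls)

print(batched_calls, serial_calls)
```

Both modes return the same results in this sketch; only the number of model calls differs.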

Additional dependencies introduced (if any)

I don't expect to introduce new dependencies in this PR

TODO (if any)


Debug the elastic workflow (currently returns a bulk and shear modulus of ~0.04 GPa instead of 9.7, as expected from the forcefields test)
Batch the phonon displacements (the workflow supports TS but displacements are currently processed in serial mode)

I am not sure I will implement the other workflows for the moment, as I don't really need them for my work, but if someone is interested, feel free to get in touch or to contribute to this PR.



JaGeo · 2026-06-26T15:21:06Z

@akwarii Please note that the elastic workflow depends a lot on the symmetry of the optimization step. If you can enforce symmetry there, it might stabilize the results. At least, this was true for our other force field implementations.

akwarii · 2026-06-26T15:28:49Z

Yes, I saw the issue discussing this. Next week I plan to try to enforce the symmetry using the TorchSim FixSymmetry filter, but I will probably have to update atomate2.torchsim.core.TorchSimOptimizeMaker to do so. From what I know, the phonon workflow doesn't have this problem, right?

JaGeo · 2026-06-26T15:37:46Z

@akwarii Yep, exactly. The phonon workflow is more robust in this regard.

I am fine with you updating the other optimizer as long as it is not a breaking change.

akwarii · 2026-06-29T14:03:37Z

Note:

When using the test MACE model ({test_dir}/forcefields/mace/MACE.model), the output is very different from ASE, but bigger models such as MACE medium (the default from download_mace_mp_checkpoint) give similar results. I only tested si_structure.

Energies:

| ASE (mace test) | TS (mace test) | ASE (mace medium) | TS (mace medium) |
| --- | --- | --- | --- |
| -0.0710307133271508 | -0.07015640801005693 | -10.827855209161402 | -10.82781378519521 |
| -0.07112084574852455 | -0.06979949936921036 | -10.829091596667464 | -10.829070856677937 |
| -0.07112152875655092 | -0.06893745091158408 | -10.829097121116778 | -10.829116852080723 |
| -0.07103527927723186 | -0.068435834774206 | -10.827897623996325 | -10.827937286730187 |
| -0.07115043598034938 | -0.06937597101480968 | -10.82606770845616 | -10.826066640116384 |
| -0.0711505573759968 | -0.06938844285313492 | -10.828641285443723 | -10.828640736180976 |
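As a rough way to quantify "very different" versus "similar", the sketch below computes the largest absolute energy difference between the two backends for each model, using the values from the table. The variable names are illustrative only, and the energies are taken in whatever unit the table uses.

```python
# Energies copied from the table above, one entry per structure.
ase_test = [
    -0.0710307133271508, -0.07112084574852455, -0.07112152875655092,
    -0.07103527927723186, -0.07115043598034938, -0.0711505573759968,
]
ts_test = [
    -0.07015640801005693, -0.06979949936921036, -0.06893745091158408,
    -0.068435834774206, -0.06937597101480968, -0.06938844285313492,
]
ase_medium = [
    -10.827855209161402, -10.829091596667464, -10.829097121116778,
    -10.827897623996325, -10.82606770845616, -10.828641285443723,
]
ts_medium = [
    -10.82781378519521, -10.829070856677937, -10.829116852080723,
    -10.827937286730187, -10.826066640116384, -10.828640736180976,
]


def max_abs_difference(reference, other):
    """Largest absolute difference between paired energies."""
    return max(abs(a - b) for a, b in zip(reference, other))


max_diff_test = max_abs_difference(ase_test, ts_test)
max_diff_medium = max_abs_difference(ase_medium, ts_medium)

print(f"test model:   {max_diff_test:.2e}")
print(f"medium model: {max_diff_medium:.2e}")
```

With these numbers the largest deviation for the test model is on the order of 1e-3, while for the medium model it is on the order of 1e-5.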

Elastic constants

| | ASE (mace test) | TS (mace test) | ASE (mace medium) | TS (mace medium) |
| --- | --- | --- | --- | --- |
| $C_{11}$ | 9.703 | 7.650 | 126.967 | 127.03 |
| $C_{12}$ | 9.699 | 7.647 | 65.841 | 65.89 |
| $C_{44}$ | 0.002 | 0.334 | 67.297 | 67.33 |
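For reference, the bulk and shear moduli mentioned in the TODO can be estimated from these three constants with the standard Voigt averages for a cubic crystal, $K = (C_{11} + 2C_{12})/3$ and $G_V = (C_{11} - C_{12} + 3C_{44})/5$. The sketch below applies them to the table, assuming the values are in GPa as the TODO note suggests; it is not the code path the workflow uses to derive these quantities.

```python
def cubic_voigt_moduli(c11, c12, c44):
    """Voigt bulk and shear moduli for a cubic crystal."""
    bulk = (c11 + 2.0 * c12) / 3.0
    shear = (c11 - c12 + 3.0 * c44) / 5.0
    return bulk, shear


# (C11, C12, C44) copied from the table above.
elastic_constants = {
    "ASE (mace test)": (9.703, 9.699, 0.002),
    "TS (mace test)": (7.650, 7.647, 0.334),
    "ASE (mace medium)": (126.967, 65.841, 67.297),
    "TS (mace medium)": (127.03, 65.89, 67.33),
}

moduli = {
    label: cubic_voigt_moduli(*constants)
    for label, constants in elastic_constants.items()
}

for label, (bulk, shear) in moduli.items():
    print(f"{label}: K = {bulk:.3f}, G_V = {shear:.3f}")
```

The ASE test-model column gives a bulk modulus of about 9.70, consistent with the 9.7 quoted in the TODO, while the TorchSim test-model column gives about 7.65. The two medium-model columns agree to within about 0.1.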

@JaGeo any idea what the problem could be here?

JaGeo · 2026-06-29T14:46:19Z

Is there maybe still a problem with the equilibrium structure? Have you compared the optimized structures? I am not familiar with the optimization in TorchSim.

akwarii · 2026-06-29T15:13:36Z

Good call! The lattice parameters change a lot when using the test model with the ASE backend. The bigger MACE model doesn't have this problem. This seems weird to me since both backends use the same algorithm to relax the box, i.e. the Frechet cell filter.

Original cell

Full Formula (Si2)
Reduced Formula: Si
abc   :   3.866975   3.866975   3.866975
angles:  60.000000  60.000000  60.000000
pbc   :       True       True       True
Sites (2)
  #  SP       a     b     c
---  ----  ----  ----  ----
  0  Si    0.75  0.75  0.75
  1  Si    0.5   0.5   0.5

TorchSim (test model)

Full Formula (Si2)
Reduced Formula: Si
abc   :   3.851698   3.851698   3.851698
angles:  60.000000  60.000000  60.000000
pbc   :       True       True       True
Sites (2)
  #  SP       a     b     c
---  ----  ----  ----  ----
  0  Si    0.75  0.75  0.75
  1  Si    0.5   0.5   0.5

ASE (test model)

Full Formula (Si2)
Reduced Formula: Si
abc   :   3.801120   3.801120   3.801120
angles:  60.000000  60.000000  60.000000
pbc   :       True       True       True
Sites (2)
  #  SP       a     b     c
---  ----  ----  ----  ----
  0  Si    0.75  0.75  0.75
  1  Si    0.5   0.5   0.5

TorchSim (mace medium)

Full Formula (Si2)
Reduced Formula: Si
abc   :   3.865821   3.865821   3.865821
angles:  60.000000  60.000000  60.000000
pbc   :       True       True       True
Sites (2)
  #  SP       a     b     c
---  ----  ----  ----  ----
  0  Si    0.75  0.75  0.75
  1  Si    0.5   0.5   0.5

ASE (mace medium)

Full Formula (Si2)
Reduced Formula: Si
abc   :   3.866058   3.866058   3.866058
angles:  60.000000  60.000000  60.000000
pbc   :       True       True       True
Sites (2)
  #  SP       a     b     c
---  ----  ----  ----  ----
  0  Si    0.75  0.75  0.75
  1  Si    0.5   0.5   0.5
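The size of the mismatch can be read directly off the relaxed lattice parameters listed above. The short sketch below computes the relative difference between the TorchSim and ASE results for each model; the dictionary layout and function name are illustrative only.

```python
# Relaxed lattice parameter a (= b = c) copied from the structures above.
relaxed_a = {
    ("test", "TorchSim"): 3.851698,
    ("test", "ASE"): 3.801120,
    ("medium", "TorchSim"): 3.865821,
    ("medium", "ASE"): 3.866058,
}


def relative_difference(value, reference):
    """Absolute difference relative to the reference value."""
    return abs(value - reference) / abs(reference)


rel_test = relative_difference(
    relaxed_a[("test", "TorchSim")], relaxed_a[("test", "ASE")]
)
rel_medium = relative_difference(
    relaxed_a[("medium", "TorchSim")], relaxed_a[("medium", "ASE")]
)

print(f"test model:   {100 * rel_test:.3f} %")
print(f"medium model: {100 * rel_medium:.4f} %")
```

For the test model the two backends differ by roughly 1.3 % in the lattice parameter, compared with well under 0.01 % for the medium model.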

JaGeo · 2026-06-29T15:19:13Z

Different floating point precision or stopping criterion?

akwarii · 2026-06-29T15:25:50Z

In both tests I have set fmax / force_tol=1e-5, and the dtype is the default torch.float64. I'm also running both on the CPU.

JaGeo · 2026-06-29T17:31:39Z

Any other specifics of the optimizer? Do they differ in some way? Potentially, the small test MACE model has additional local minima that one algorithm falls into and the other one does not.

akwarii · 2026-06-30T10:17:15Z

I was unable to find any differences between the two after investigating the class and function signatures. I also tried to bump TS to a newer commit, since they had an issue with constrained optimization (TorchSim/torch-sim#552), but it didn't change the results.

As you pointed out, the small model probably has more minima, and the optimizer might get stuck in one due to small implementation differences. However, I also ran TorchSim's test_optimizers_vs_ase.py using the same model and structure as here, and the test passed. At this point, I think someone with deeper knowledge of the differences between ASE and TorchSim will have to step in.

If we want to merge I can update the TS test values to make them pass (since real production models still give the same results), but I think we still need to see this issue through to the end.

JaGeo · 2026-06-30T10:30:45Z

Thanks.

It might be an option to ask the TorchSim developers for help. After all, it should be in their interest to get the same results or at least have an explanation for the differences.
I personally don't have experience with TorchSim

akwarii · 2026-06-30T15:16:37Z

Side note: I just saw that my modifications from #1504 are also included here, but in any case it should be ready.

JaGeo · 2026-07-01T08:34:27Z

Thanks! I will take a look by the beginning of next week!

JaGeo · 2026-07-01T08:42:43Z

+    assert task_doc.output.stress is not None
+    assert len(task_doc.output.stress) == 2
+    # Each stress should be a 3x3 matrix
+    for stress in task_doc.output.stress:


Does it make sense to test at least one numerical value here?

JaGeo · 2026-07-01T08:44:56Z

I think I have one real comment: please check at least one of the computed results so that drastic implementation changes are spotted. Beyond this, I am happy!
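One possible shape for such a check is sketched below: in addition to the shape assertion, each component of a stress tensor is compared against a stored reference within a tolerance. The helper name and all numbers are hypothetical placeholders, not values from the actual test suite.

```python
import math


def assert_stress_matches(stress, reference, rel_tol=1e-3, abs_tol=1e-6):
    """Check that a 3x3 stress tensor matches a reference component-wise."""
    assert len(stress) == 3 and all(len(row) == 3 for row in stress)
    for row, ref_row in zip(stress, reference):
        for value, ref in zip(row, ref_row):
            assert math.isclose(
                value, ref, rel_tol=rel_tol, abs_tol=abs_tol
            ), (value, ref)


# Hypothetical reference and computed values, for illustration only.
reference_stress = [
    [-1.25, 0.0, 0.0],
    [0.0, -1.25, 0.0],
    [0.0, 0.0, -1.25],
]
computed_stress = [
    [-1.2504, 1e-9, 0.0],
    [1e-9, -1.2504, 0.0],
    [0.0, 0.0, -1.2504],
]

assert_stress_matches(computed_stress, reference_stress)
```

The absolute tolerance matters for the off-diagonal components, where the reference is exactly zero and a purely relative comparison would always fail.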

akwarii · 2026-07-01T08:59:33Z

I completely agree, I will add the check.

Also, it seems the problem with the elastic test was due to an oversight on my side: in TS, enabling a cell filter doesn't by default mean that the cell forces are used in the convergence check (see TorchSim/torch-sim#582).
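A minimal sketch of why this matters, assuming a convergence criterion based on the largest force norm: if only the atomic forces are checked, a relaxation can be declared converged while the cell is still under stress. This is plausible for the two-atom Si cell, where the forces on the atoms vanish by symmetry. The function and the `include_cell_forces` flag are hypothetical and do not reflect TorchSim's API, and the force values are made up.

```python
import math


def max_norm(vectors):
    """Largest Euclidean norm among a list of 3-vectors."""
    return max(math.sqrt(sum(c * c for c in v)) for v in vectors)


def is_converged(atomic_forces, cell_forces, force_tol, include_cell_forces):
    """Force-based convergence test, optionally including cell forces."""
    converged = max_norm(atomic_forces) < force_tol
    if include_cell_forces:
        converged = converged and max_norm(cell_forces) < force_tol
    return converged


# Made-up example: atoms sit on symmetric sites (zero forces),
# but the cell still experiences a residual driving force.
atomic_forces = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
cell_forces = [[0.02, 0.0, 0.0], [0.0, 0.02, 0.0], [0.0, 0.0, 0.02]]

atoms_only = is_converged(atomic_forces, cell_forces, 1e-5, False)
with_cell = is_converged(atomic_forces, cell_forces, 1e-5, True)

print(atoms_only, with_cell)
```

In this sketch the atoms-only check reports convergence immediately, whereas the check that includes the cell forces does not.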

JaGeo · 2026-07-01T09:01:24Z

Ah! That's great to know!

JaGeo · 2026-07-01T09:17:02Z

One more point: do you think the current documentation is sufficient? Maybe you can use an LLM and one of your tests to provide more info on the socket implementations with TorchSim?

akwarii · 2026-07-01T09:21:41Z

I can extend the docstring explanation a bit and add examples to the TorchSim tutorial notebook. Would that be alright?

JaGeo · 2026-07-01T09:27:56Z

Yes! Absolutely!

JaGeo · 2026-07-01T14:35:05Z

Thanks! Do you want to add yourself to the list of contributors? And, is #1504 still needed?

akwarii · 2026-07-01T14:45:59Z

#1504 was superseded by this PR, so it can be closed without problem. As for the list of contributors, I will gladly be part of it.

JaGeo · 2026-07-01T15:31:03Z

I will close the other PR. Please raise a short PR to add your details 😃

akwarii and others added 13 commits June 10, 2026 16:22

Support batching for the elastic workflow

a9106cb

Update model construction due to API changes

526bc8b

Remove unsupported models

e574d06

Remove legacy models from TorchSimModelType

8df6fb0

Add tests for torchsim model wrappers

455b6df

Revert mattersim changes

cb3f1fc

skip test if torchsim is not installed

5e8a0a3

Merge remote-tracking branch 'origin/main' into batching

49ed367

Minimalistic elastic workflow for torchsim

b121a35

Add tests

5fb7b64

WIP: adapt common elastic jobs

b75e599

Fix Optimizer being casted to its string value when passed to a jobflow.job

daf6f43

squeeze the stress tensor

e0e842c

akwarii added 2 commits June 29, 2026 15:39

Support TorchSim symmetry constraints

5b44710

Convert the stress to kbar

406536b

Unit conversion without ASE

96934f3

akwarii added 3 commits June 29, 2026 18:20

Batched phonon workflow

de78ab8

Remove ase and torchsim dependencies from common phonon jobs

9e3c513

Use lighter model

affbceb

Merge branch 'main' into batching

993fd8e

Fix tests

7d0aa50

akwarii mentioned this pull request Jun 30, 2026

Relaxation discrepency with ASE when using small MACE model TorchSim/torch-sim#582

Open

akwarii added 2 commits June 30, 2026 16:49

Disable torchsim in test-non-ase jobs

2ce7666

Fix phonon maker type check and improve readability

b1865c0

JaGeo reviewed Jul 1, 2026


Comment thread src/atomate2/common/jobs/phonons.py

JaGeo reviewed Jul 1, 2026


Include cell forces to the convergence check

cc47120

akwarii added 3 commits July 1, 2026 12:12

Add values check on different phonon derived properties

c2b0dad

Improve socket keyword documentation

08ffbea

Add elastic workflow tutorial

81804d6

JaGeo changed the title ~~[WIP] Batched workflows for TorchSim~~ Batched workflows for TorchSim Jul 1, 2026

JaGeo reviewed Jul 1, 2026


Comment thread tests/torchsim/flows/test_phonons.py

Loosen test tolerance

3d9667a

JaGeo merged commit 9c1ef51 into materialsproject:main Jul 1, 2026
18 checks passed

akwarii deleted the batching branch July 1, 2026 15:37
