
Implement the molecule module internally#789

Merged
kfir4444 merged 107 commits into main from molecule on Aug 24, 2025


@alongd alongd commented May 9, 2025

This is a large PR that adds the Molecule class to ARC internally. ARC is now compatible with Python 3.12.
This PR follows the work done in #754, where ARC was given the capability to work directly with the RMG-database without using the RMG API. Now, finally, ARC does not depend on Julia.

The major change made in this PR required additional modifications. Although we appreciate small, focused PRs, it is very hard to decouple these changes. The main modifications are as follows:

  1. Naturally, the interface to Arkane had to be modified, since RMG is no longer a direct dependency. We now use Arkane only as a subprocess for statmech.
  2. ESS parsing was previously facilitated in part through Arkane. Now there is a new parser module in ARC with adapters for each ESS.
  3. Bugs were discovered in our two perception algorithms (we used to have species/xyz_to_2d and species/xyz_to_smiles, falling back to a single-bond version of the molecule if it could not be perceived). Now we have a new species/perceive algorithm, with a fallback to species/xyz_to_smiles if needed. Success rates are higher, and we always return a molecule with bond orders. This might be the end of the allow_nonisomorphic_2d flag in ARC. We'll keep it around for a while, but may deprecate it in the future.
  4. QCElemental was removed as a dependency. Combined with the removal of Arkane's API, this means that ARC now provides its own mapping from atomic numbers to atomic masses. This is done in common.py, with the data stored under data/elements.yml.
  5. The removal of QCElemental also impacted our atom mapping algorithm. It turns out that we used it as a fallback quite often for isomerization. The atom mapping engine and driver were updated, which is another positive outcome of this PR.
  6. Better, automated installation scripts were added under devtools for all the external dependencies, and the CI was updated as well, along with the Makefile. A big thanks to @calvinp0 for the endless hours he invested in this.
  7. ZMatrices and the H Abstraction heuristics modules were updated as well.
  8. The TS NMD checks have been updated, incorporating #768 (Fix the normal mode displacement TS check) into this PR.
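
The perception fallback described in item 3 can be illustrated with a minimal sketch. This is not ARC's implementation: `perceive_with_fallback`, `primary_stub`, and `fallback_stub` are hypothetical names, and the stubs only stand in for routines such as species/perceive and species/xyz_to_smiles.

```python
def perceive_with_fallback(xyz, perceivers):
    """Try each perception routine in order and return the first usable result.

    `perceivers` is an ordered list of callables (hypothetical stand-ins here).
    Returns None if every routine fails or returns None.
    """
    for perceive in perceivers:
        try:
            mol = perceive(xyz)
        except Exception:
            # A sketch-level catch-all; real code would catch narrower errors.
            mol = None
        if mol is not None:
            return mol
    return None


def primary_stub(xyz):
    """Hypothetical primary routine that fails for this input."""
    raise ValueError("could not perceive bond orders")


def fallback_stub(xyz):
    """Hypothetical fallback routine that returns a SMILES placeholder."""
    return "C"


methane_xyz = {"symbols": ("C", "H", "H", "H", "H")}  # illustrative input only
result = perceive_with_fallback(methane_xyz, [primary_stub, fallback_stub])
```

Keeping the routines in an ordered list makes the fallback order explicit and lets a caller detect total failure by checking for `None`.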

Tests were of course added. We still need to update the docs, specifically the installation instructions, and check the installation scripts again, since they were mainly tested in the context of the CI.

With this merged, we should soon tag a new version of ARC.

Comment thread arc/job/adapters/ts/heuristics_test.py Fixed
Comment thread arc/job/adapters/ts/heuristics_test.py Fixed
Comment thread arc/species/converter.py Fixed
Comment thread arc/species/converter.py Fixed
@alongd alongd force-pushed the molecule branch 13 times, most recently from a790669 to 5be4ca3 Compare May 16, 2025 17:35
@alongd alongd force-pushed the molecule branch 4 times, most recently from ccab586 to b4e5422 Compare May 17, 2025 17:55
@alongd alongd force-pushed the molecule branch 2 times, most recently from 147f6e2 to 5280a3c Compare May 25, 2025 18:34
@alongd alongd force-pushed the molecule branch 2 times, most recently from c3aadb0 to 0a5102f Compare June 12, 2025 00:41
alongd added 27 commits August 24, 2025 15:11
Previously it was in setter, and we would get a report every time rxn.multiplicity is called. Now we should only get one report of the reaction multiplicity, after it is determined.
copy other only once
consider aromatic structures in other as well
And improved the logic of get_angle_in_180_range()
Added more tests to get_angle_in_180_range()

Also, added a minor test to is_str_float
So that species can be used in dictionaries and sets
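
For context, a Python class that defines `__eq__` without `__hash__` becomes unhashable, so it cannot serve as a dictionary key or set member. The sketch below shows the general pattern. `SpeciesSketch` is a hypothetical stand-in, and hashing on the label is an assumption made for illustration, not necessarily what ARC does.

```python
class SpeciesSketch:
    """Hypothetical stand-in for a species object identified by its label."""

    def __init__(self, label, smiles):
        self.label = label
        self.smiles = smiles

    def __eq__(self, other):
        if not isinstance(other, SpeciesSketch):
            return NotImplemented
        return self.label == other.label

    def __hash__(self):
        # Must be consistent with __eq__: equal objects need equal hashes.
        return hash(self.label)


a = SpeciesSketch("CH4", "C")
b = SpeciesSketch("CH4", "C")
unique_species = {a, b}   # both entries collapse into one set member
energies = {a: -40.5}     # illustrative value, not real data
looked_up = energies[b]   # an equal object finds the same key
```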
Streamlines the execution commands for xtb_gsm and sella
Also uses `set -euo pipefail` to ensure scripts fail on errors.
Also, it adds an explicit error message when a conda environment
manager isn't found.
To avoid the warning:
invalid value encountered in divide
    v2_x_v1 /= float(np.linalg.norm(v2_x_v1))
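
This warning typically appears when the cross product is the zero vector (for example, for parallel vectors), so the normalization divides zero by zero. A minimal sketch of one possible guard follows, written in plain Python rather than NumPy to stay dependency-free. The helper names and the tolerance are assumptions, not the code from this commit.

```python
import math


def cross(a, b):
    """Cross product of two 3D vectors given as sequences of numbers."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def normalize(vector, tol=1e-12):
    """Return the unit vector, or None if the norm is (numerically) zero."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm < tol:
        # Avoid dividing by zero; the caller decides how to handle this case.
        return None
    return [c / norm for c in vector]


perpendicular = normalize(cross([1, 0, 0], [0, 1, 0]))
degenerate = normalize(cross([1, 0, 0], [2, 0, 0]))  # parallel inputs
```

Returning `None` for the degenerate case is a design choice for this sketch; returning a zero vector or raising an error would work equally well, depending on what the caller expects.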
Adds error handling to capture and display information
when a Sella run fails. This includes the return code,
command, last parts of standard error and standard output,
and the tail of the ts.log file. Also adds error
handling to display information when the output.yml
file is missing.
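
The kind of failure report described here can be sketched with the standard library as follows. `run_and_report` and `tail` are hypothetical helpers, and the failing command is a stand-in for a real Sella run. The ts.log tail and the output.yml check mentioned above are not covered.

```python
import subprocess
import sys


def tail(text, n=5):
    """Return the last n lines of a block of text."""
    return "\n".join(text.splitlines()[-n:])


def run_and_report(command):
    """Run a command; return a failure report string, or None on success."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode == 0:
        return None
    return (
        f"Command failed with return code {completed.returncode}\n"
        f"Command: {' '.join(command)}\n"
        f"--- stderr (tail) ---\n{tail(completed.stderr)}\n"
        f"--- stdout (tail) ---\n{tail(completed.stdout)}"
    )


# Stand-in for a failing external job: writes to both streams, then exits with 3.
failing_command = [
    sys.executable,
    "-c",
    "import sys; print('step 1'); sys.stderr.write('boom\\n'); sys.exit(3)",
]
report = run_and_report(failing_command)
```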
to save a YAML file of the thermo data from an Arkane run

@kfir4444 kfir4444 left a comment

LGTM, great work!

@calvinp0 calvinp0 left a comment

LGTM. Great work!

5 participants