Add OrbMol-v2 with learnable electrostatics by timduignan · Pull Request #162 · orbital-materials/orb-models

timduignan · 2026-05-07T07:26:29Z

Summary

Adds OrbMol-v2, which extends the OrbMol architecture with learnable per-atom electrostatics. A LatentChargeHead predicts per-atom charges that satisfy the system total-charge constraint, and a new CoulombModule adds long-range Coulomb energy on top of the GNN: a bare 1/r direct sum for non-periodic systems and Particle Mesh Ewald (via nvalchemiops) for periodic ones. The energy head (ChargeConditionedEnergyHead) is conditioned on the predicted per-atom charges and spins.
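The non-periodic path described above can be illustrated with a bare 1/r direct sum in NumPy. This is a minimal sketch, not the CoulombModule implementation: the function name `direct_coulomb_energy` is hypothetical, and the constant 14.3996 eV·Å is the standard value of e²/(4πε₀), used on the assumption that charges are in units of e and distances in Å.

```python
import numpy as np

# e^2 / (4 * pi * eps0) in eV * Angstrom (standard physical constant).
COULOMB_CONSTANT_EV_A = 14.3996


def direct_coulomb_energy(positions: np.ndarray, charges: np.ndarray) -> float:
    """Bare 1/r Coulomb energy (eV) of point charges in a non-periodic system.

    Hypothetical helper for illustration only: positions are (N, 3) in Angstrom,
    charges are (N,) in units of the elementary charge.
    """
    # Pairwise distance matrix r_ij.
    deltas = positions[:, None, :] - positions[None, :, :]
    distances = np.linalg.norm(deltas, axis=-1)
    # Sum each unordered pair once (i < j), which also skips the i == j diagonal.
    i, j = np.triu_indices(len(charges), k=1)
    pair_terms = charges[i] * charges[j] / distances[i, j]
    return float(COULOMB_CONSTANT_EV_A * np.sum(pair_terms))


# Two opposite unit charges 1 Angstrom apart.
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
charges = np.array([1.0, -1.0])
energy = direct_coulomb_energy(positions, charges)
```

The direct sum scales as O(N²) and cannot simply be truncated under periodic boundary conditions, which is presumably why the periodic path relies on Particle Mesh Ewald instead.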

The published checkpoint is at https://huggingface.co/orbital-materials/orbmol-v2. It was verified to reproduce internal reference values for H₂O and Cu fcc to within 1e-5 in energy (eV), forces (eV/Å), and stress (eV/Å³) on both CPU and H100.

What changed

New files

orb_models/forcefield/models/coulomb_module.py — CoulombModule, direct + PME paths
tests/forcefield/test_orbmol_v2_smoke.py — opt-in network test (ORB_RUN_NETWORK_TESTS=1) checking H₂O / Cu predictions against reference values
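An opt-in network test of this kind is typically gated on an environment variable. The sketch below uses the standard-library `unittest` to show the pattern. Only the variable name `ORB_RUN_NETWORK_TESTS` comes from this PR. The helper and the test class are hypothetical, and the real test file may well use pytest markers instead.

```python
import os
import unittest


def network_tests_enabled(environ=os.environ) -> bool:
    """Return True only when the opt-in flag is set to exactly "1"."""
    return environ.get("ORB_RUN_NETWORK_TESTS") == "1"


@unittest.skipUnless(
    network_tests_enabled(),
    "set ORB_RUN_NETWORK_TESTS=1 to run tests that download weights",
)
class OrbMolV2SmokeTest(unittest.TestCase):
    """Hypothetical placeholder for an end-to-end check of published weights."""

    def test_placeholder(self):
        # A real smoke test would load the published checkpoint here and compare
        # predicted energy, forces, and stress against stored reference values.
        self.assertTrue(network_tests_enabled())
```

With the flag unset, the whole class is reported as skipped, so a default test run never touches the network.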

New classes in existing files

forcefield_heads.py: ChargeConditionedEnergyHead, LatentChargeHead, LatentSpinHead
pretrained.py: orbmol_v2() loader (s3-hosted weights), registry entry

Surgical edits

conservative_regressor.py: new coulomb_module and pair_repulsion_node_aggregation kwargs; predicts charges/spins before energy; bifurcated forward path for ChargeConditionedEnergyHead; adds Coulomb energy and explicit forces/virial
pyproject.toml: bump nvalchemi-toolkit-ops to >=0.3.1,<0.4 (PME hybrid_forces API)
MODELS.md: documents orbmol-v2

Backwards compatibility

EnergyHead and other base heads are untouched — same forward, same predict(), same denormalize(). All existing v3 conservative omol/omat/mpa models behave identically.
ZBLBasis default changed to node_aggregation="sum". All existing architectures (orb-v3-conservative) keep their training-time aggregation ("mean").
4 BC guard tests added.

Test plan

All 92 forcefield unit tests pass on CPU (88 existing + 4 new BC guards)
Network smoke test reproduces reference H₂O / Cu values from a fresh clone, on both Mac CPU and H100 GPU

Wholesale port from the reference codebase (after the ZBL-sum and Coulomb constant fixes), targeting only the s11doh8x:v199 public release. Drops all backwards-compat for prior electrostatics_config checkpoints; skips the internal-only Fukui, global_context, and self_message features.

New:
- orb_models/forcefield/models/coulomb_module.py: CoulombModule + direct/PME
- scripts/convert_orbmol_v2_ckpt.py: extract EMA-applied flat state_dict from wandb-format checkpoint (orbmol_v2() expects flat state_dicts per orb-models S3 convention)

Modified:
- forcefield_heads.py: ChargeConditionedEnergyHead, LatentChargeHead, LatentSpinHead. Added EnergyHead.absolute_energy() helper.
- conservative_regressor.py: coulomb_module field, latent_charges/spins predicted before energy, ChargeConditionedEnergyHead path, Coulomb energy + explicit forces/virial plumbing.
- pair_repulsion.py: default node_aggregation "mean" -> "sum".
- pretrained.py: orbmol_v2_architecture() and orbmol_v2() loader mirroring the source codebase. CoulombModule() defaults (no erf damping); enforce_total_charge=True; no coulomb_constant override.

Verified against internal reference (gold) values for s11doh8x:v199 (with EMA applied, CPU fp32):
- H2O energy: -2079.86339 eV (diff 2.7e-7)
- H2O forces[0]: matches all components within 4e-6 eV/A
- Cu fcc energy: -178549.38592 eV (relative diff ~6e-10)
- Cu fcc stress (Voigt 6): matches within 5e-6 eV/A^3

All 88 existing forcefield tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Revert pair_repulsion.py default to "mean" (preserves BC for older orb-v3 conservative models trained pre ZBL-sum cutoff).
- ConservativeForcefieldRegressor accepts pair_repulsion_node_aggregation kwarg (defaults to "mean"); orbmol_v2_architecture passes "sum" explicitly.
- EnergyHead.absolute_energy: drop fp64 arg, always do the addition in fp64 and return fp64. OMol references reach ~1e5 eV so kJ/mol resolution requires fp64; option only added confusion. Used only by ChargeConditioned path so legacy heads are unaffected.
- ConservativeForcefieldRegressor.predict drops fp64_energy arg accordingly.
- Delete scripts/convert_orbmol_v2_ckpt.py; core/scripts/misc/export_model.py is the existing tool used for all other public orb-models releases.
- Update orbmol_v2() docstring to point at core's export_model.py.

Re-verified gold values match (H2O 2.7e-7; Cu fp32-noise level on energy, 1e-6 on stress). All 88 forcefield tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
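The fp64 argument above can be checked with a little floating-point arithmetic. The sketch below compares the gap between adjacent representable floats at the magnitude of the Cu fcc energy quoted in this PR with 1 kJ/mol expressed in eV. The variable names are illustrative only. The conversion 1 eV ≈ 96.485 kJ/mol is the standard value.

```python
import numpy as np

# 1 eV = 96.485 kJ/mol, so 1 kJ/mol is roughly 0.0104 eV.
KJ_PER_MOL_IN_EV = 1.0 / 96.485

# Magnitude of the Cu fcc energy quoted in this PR (eV).
reference_energy = 178549.38592

# Gap between adjacent representable floats at this magnitude.
fp32_spacing = float(np.spacing(np.float32(reference_energy)))
fp64_spacing = float(np.spacing(np.float64(reference_energy)))

# True when the float format can distinguish energies 1 kJ/mol apart.
fp32_resolves_kj_per_mol = fp32_spacing < KJ_PER_MOL_IN_EV
fp64_resolves_kj_per_mol = fp64_spacing < KJ_PER_MOL_IN_EV
```

At this magnitude a single fp32 step is already larger than 1 kJ/mol, while the fp64 step is many orders of magnitude smaller, which is consistent with always doing the reference-energy addition in fp64.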

- test_pair_repulsion_default_aggregation_is_mean: catches anyone changing the regressor default from "mean" to "sum", which would silently break all public orb-v3 conservative models on reload.
- test_pair_repulsion_sum_when_specified: confirms orbmol-v2 opt-in via the kwarg works.
- test_energy_head_does_not_have_absolute_energy: guards against the fp64-promoting helper migrating onto the base EnergyHead, where it would alter v3 conservative omol/omat/mpa predictions.
- test_orbmol_v2_architecture_uses_sum_zbl: integration check that the architecture wires CoulombModule + sum-aggregation ZBL + latent_charges/latent_spins heads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Default weights_path now points at HF (orbital-materials/orbmol-v2/resolve/main). Matches the date-in-filename versioning convention used by other public orb-models S3 ckpts; no separate revision pin (HF main behaves like a stable S3 path).
- Bump nvalchemi-toolkit-ops>=0.3.1: the orbmol_v2 PME path uses particle_mesh_ewald(..., hybrid_forces=True) which only exists on >=0.3.1.
- New tests/forcefield/test_orbmol_v2_smoke.py: end-to-end check that the published weights produce the gold H2O / Cu predictions from internal reference tests. Gated by ORB_RUN_NETWORK_TESTS=1 since it downloads ~100 MB; opt-in only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Upstream bumped APIs between 0.3.0 and 0.3.1 (added the hybrid_forces parameter we depend on for PME). Upper-bounding at the next minor prevents 0.4.x from silently breaking the orbmol-v2 PME path on user upgrades — matches the bound style of other deps internally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds an entry under OrbMol Models describing the learnable electrostatics extension (LatentChargeHead, LatentSpinHead, CoulombModule, and the ChargeConditionedEnergyHead) with usage example. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…cy note s11doh8x trains with CoulombModule defaults (no erf damping); the non-periodic Coulomb path is bare 1/r, not erf-damped. Also drops the size-consistency wording from the head description. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- README.md: add orbmol-v2 release note in the "What's new" section
- coulomb_module.py: fix misleading docstrings (the non-periodic path is bare 1/r when sigma is None, not erf-damped); update stale 14.33 → 14.40 reference for COULOMB_CONSTANT; minor typo fix
- forcefield_heads.py: drop size-consistency claims from ChargeConditionedEnergyHead docstrings (separate concern; not asserted in the public release)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Don't assert per-atom spins are physical observables (we haven't validated this; reword as "auxiliary per-atom features that take the system's spin multiplicity into account") in MODELS.md and README.md.
- Add a note in the README orbmol-v2 update that energies are now fp64 by default for kJ/mol resolution against OMol25-scale references; opt out via fp64_energy=False on predict().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- MODELS.md and README.md: rephrase LatentChargeHead/LatentSpinHead as predicting per-atom *latent features* (constrained at the system level) rather than asserting per-atom charges and spins. Caveat blockquote added in the previous commit explains the emergent nature.
- pyproject.toml: bump dev `torch-sim-atomistic` from >=0.5.1 to >=0.6.0. The 0.6.0 release adds `SimState.has_extras()`, which the existing `test_forcefield_adapter_parses_spin_and_charge_from_simstate` test already calls; 0.5.2 (allowed by the old constraint) does not have this method.

All 121 forcefield tests now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per Ben's review feedback (PR orbital-materials#162 thread):
- First bullet now leads with the CoulombModule, not the latent heads
- Adds Speed (H100 QPS at 1k/10k atoms, periodic systems) and Accuracy (GSCDB138 Normalized Error Ratio, ex single-atom-species reactions) highlights
- Moves the LatentChargeHead / LatentSpinHead detail and the Caution blockquote out of the README; they remain in MODELS.md
- Links out to MODELS.md for the full architecture description

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

aggregate_nodes(..., reduction="mean") now forwards n_node directly to scatter_mean, which takes pre-computed group sizes instead of building a divisor inside the graph from scatter_sum(ones, ...). The old form triggered a torch.compile + autograd miscompile (~1 eV on H2O / ethanol / NaCl). Math is bit-identical in eager. Drops segment_mean() and its test (unused after this change). Ports orbital-materials/orb#3074. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
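The idea of dividing by pre-computed group sizes, rather than rebuilding the divisor by scattering a vector of ones, can be sketched as follows. This is a NumPy stand-in rather than the library's scatter_mean. The function name `segment_mean_from_counts` is hypothetical, and the sketch assumes every graph has at least one node.

```python
import numpy as np


def segment_mean_from_counts(node_values: np.ndarray, n_node: np.ndarray) -> np.ndarray:
    """Per-graph mean of node features, dividing by the known graph sizes.

    Hypothetical helper: node_values is (total_nodes, d) and n_node is
    (n_graphs,) with n_node.sum() == total_nodes.
    """
    n_graphs = len(n_node)
    # Graph index of every node, e.g. n_node=[2, 3] -> [0, 0, 1, 1, 1].
    graph_index = np.repeat(np.arange(n_graphs), n_node)
    sums = np.zeros((n_graphs, node_values.shape[1]), dtype=node_values.dtype)
    # Unbuffered scatter-add of node rows into their graph's row.
    np.add.at(sums, graph_index, node_values)
    # The divisor comes straight from n_node, not from summing a vector of ones.
    return sums / n_node[:, None]


node_values = np.array([[1.0], [3.0], [2.0], [4.0], [6.0]])
n_node = np.array([2, 3])
means = segment_mean_from_counts(node_values, n_node)
```

Both forms give the same result in eager execution. The difference described above is only in which operations end up inside the traced graph.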

LatentChargeHead / LatentSpinHead centering now uses repeat_interleave over n_node instead of gather-by-node_batch_index. Compiles cleanly under dynamic shapes. conservative_regressor drops the in-place interaction_energy += coulomb_energy. pair_repulsion casts the poly cutoff exponent with p.to(torch.float32) instead of float(p), which avoids a Tensor.item() graph break, and zeros stress entries beyond 1e10. Ports orbital-materials/orb#3074. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
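One way to picture the centering step is a uniform per-graph shift that makes the per-atom values sum to the requested system total, with the per-graph correction broadcast back to atoms by repeating it n_node times. The sketch below is an assumption about the general technique, not the LatentChargeHead code. `center_charges` is a hypothetical name, `np.repeat` stands in for `torch.repeat_interleave`, and empty graphs are assumed not to occur.

```python
import numpy as np


def center_charges(
    raw_charges: np.ndarray, n_node: np.ndarray, total_charge: np.ndarray
) -> np.ndarray:
    """Shift raw per-atom values so each graph sums to its target total charge."""
    # Start offset of each graph in the flat node array.
    starts = np.concatenate(([0], np.cumsum(n_node)[:-1]))
    raw_totals = np.add.reduceat(raw_charges, starts)
    # Per-graph correction, spread uniformly over that graph's atoms.
    per_graph_shift = (raw_totals - total_charge) / n_node
    # Broadcast graph-level shifts to nodes by repetition (no index gather).
    return raw_charges - np.repeat(per_graph_shift, n_node)


raw_charges = np.array([0.2, 0.1, 0.3, -0.5, 0.5])
n_node = np.array([3, 2])
total_charge = np.array([0.0, 1.0])
centered = center_charges(raw_charges, n_node, total_charge)
```

Because the repetition is driven by n_node alone, the broadcast needs no per-node batch index.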

Removes the backbone-only .compile() overrides on the {Conservative,Direct}ForcefieldRegressor classes. model.compile(...) now wraps the full forward via standard nn.Module.compile(). Note: the GraphRegressor.compile() override was already absent in public, so no change there. CoulombModule's PME path is split into two @torch.compiler.disable helpers (_particle_mesh_ewald and _estimate_pme_params_and_neighbors) so dynamo skips nvalchemiops's ctypes-using PME entirely while the rest of the regressor compiles. End-to-end (per core PR): ~1.5-1.6x faster inference (1.62x at 10k atoms, single H100); compiled-vs-eager numerical diff at float-noise level. Adds tests/forcefield/test_direct_regressor.py and tests/expensive_tests/test_compilation.py; updates conftest and test_conservative compile tests. Adds scripts/compile_numerical_check.py for offline diagnosis (adapted to public pretrained.orbmol_v2 loader). Ports orbital-materials/orb#3074. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Updates weights_path to the new teqabfhg checkpoint (orbmol-v2-teqabfhg-20260523.ckpt) which has no LatentSpinHead. Adds use_per_atom_spins to orb_v3_conservative_architecture (default False) so the spin head and ChargeConditionedEnergyHead.use_spins are controlled together. System-level charge/spin conditioning via ChargeSpinConditioner is unchanged. A future spin-enabled checkpoint can pass use_per_atom_spins=True from its pretrained.* entry without further architecture changes.

Updates test_backwards_compatibility goldens to match the new checkpoint:
- H2O: -2079.86339 -> -2079.86222 eV
- Cu fcc: -178549.3860 -> -178550.9810 eV; stress retuned
- forces/stress vectors regenerated on Mac arm64 CPU

Old checkpoint URL (orbmol-v2-s11doh8x-20260507.ckpt) was removed from S3 and now 404s; users loading that file via state_dict manually will hit a shape mismatch on the energy head MLP and the missing latent spin head.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

orbmol_v2's default checkpoint (teqabfhg) has no per-atom spin head. Updates the MODELS.md description to mention only LatentChargeHead and clarifies that system-level total charge / spin multiplicity still flow through the ChargeSpinConditioner. Caution paragraph updated to drop the per-atom spin language. README headline GSCDB138 numbers reflect the previous s11doh8x checkpoint and will be refreshed once a full per-category breakdown for teqabfhg is available. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Updates GSCDB138 numbers from the s11doh8x measurement to the new teqabfhg checkpoint (same 109 evaluable subsets / 5152 reactions):
- Overall NER: 6.05 -> 1.62 (was 1.83 on s11doh8x)
- NC, TC, TM, BH, INC, ISO per-category numbers refreshed
- ISO regresses slightly (1.03 -> 1.36); other categories improve

Adds a second bullet covering GMTKN55 WTMAD-2 (5.41 -> 4.37 kcal/mol), WIGGLE150 RMSE (1.23 -> 1.19), and the two new evaluation entries BEGDB (MAE 0.235) and ACONFL (RMSE 0.40).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

orbmol_v2's ChargeSpinConditioner requires total_charge and spin_multiplicity in system_features. Without them the adapter's _get_charge_and_spin returns an empty dict and the conditioner errors. Defaulting to charge=0, spin=1 (singlet) matches the convention used in the internal speed harness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
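The defaulting behaviour described above amounts to filling in missing keys while leaving caller-supplied values alone. A minimal sketch, assuming the system features are a plain dict: the helper name `with_default_charge_and_spin` is hypothetical, while the keys `total_charge` and `spin_multiplicity` are the ones named in this commit message.

```python
from typing import Any, Dict

DEFAULT_TOTAL_CHARGE = 0.0       # neutral system
DEFAULT_SPIN_MULTIPLICITY = 1.0  # singlet


def with_default_charge_and_spin(system_features: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of system_features with charge/spin defaults filled in."""
    filled = dict(system_features)
    # setdefault only writes when the key is absent, so explicit values win.
    filled.setdefault("total_charge", DEFAULT_TOTAL_CHARGE)
    filled.setdefault("spin_multiplicity", DEFAULT_SPIN_MULTIPLICITY)
    return filled


neutral_singlet = with_default_charge_and_spin({})
anion = with_default_charge_and_spin({"total_charge": -1.0})
```

Returning a copy keeps the caller's dict untouched, which avoids surprising side effects when the same features are reused across models.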

Replaces the s11doh8x-era "44 QPS at 1k atoms / 9 QPS at 10k atoms / within ~5% of v1" claim with absolute ms timings from BENCH on the new teqabfhg checkpoint (full-model torch.compile, single GPU, periodic systems): 30/42/116/191 ms at 100/1k/5k/10k atoms. Notes that the no-spin variant is 1-12% faster than the prior spin-having v2 development variant at every size we tested. The v1 parity claim is removed pending a fresh v1-vs-v2 comparison run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the standalone v2-teqabfhg ms numbers with a v1-vs-v2 table at 100/1k/5k/10k atoms. v2 is slower at small sizes (electrostatics overhead) and faster at large sizes; crossover around 5k atoms, 1.46x speedup at 10k. v1 runs with backbone-only compile (its best available), v2 with full-model compile thanks to the port of orbital-materials/orb#3074. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

timduignan · 2026-05-22T16:32:46Z

@vsimkus mind giving this another look? Since your earlier pass it now also includes:

Port of orbital-materials/orb#3074 (full-model torch.compile for orbmol-v2)
Switch of the orbmol_v2 default checkpoint to teqabfhg (no per-atom spin head) with use_per_atom_spins parameterized for the future spin-having checkpoint
Refreshed GSCDB138 + GMTKN55 / WIGGLE150 / BEGDB / ACONFL numbers
v1-vs-v2 speed table in the README

BC test goldens were regenerated on Mac arm64; Linux CI may want a small atol tweak if it triggers.

The full-model torch.compile changes ported from #3074 also benefit OrbMol-v1, not just v2. New BENCH numbers show v1-full-compile is 1.3-1.8x faster than the old backbone-only workaround (biggest at 10k atoms: 278 -> 158 ms). v2-teqabfhg is ~20-60% slower than v1 full-compile at the same system size, reflecting the real cost of the PME / Coulomb path; the previous narrative ("v2 crosses over and becomes faster than v1 at 5k+ atoms") was an artifact of comparing v1-backbone-compile to v2-full-compile. Also softens the lead-in line about LES being free; it isn't, but the accuracy gain (3.7x lower NER) is the trade. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Previously is_conservative_model = "conservative" in args.base_model, which misclassified orbmol_v2 as direct. Build the model first, then check "grad_forces" in model.loss_weights, and push custom CLI weights via model.loss_weights.update(...).
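The misclassification is easy to reproduce with stand-in objects. In the sketch below the two classes are hypothetical placeholders for real models, and every loss-weight key other than `grad_forces` is made up for illustration.

```python
class DirectModel:
    """Stand-in for a direct-force model."""

    loss_weights = {"energy": 1.0, "forces": 1.0}


class ConservativeModel:
    """Stand-in for a conservative (gradient-force) model such as orbmol_v2."""

    loss_weights = {"energy": 1.0, "grad_forces": 1.0}


def is_conservative_by_name(base_model: str) -> bool:
    # Old check: relies on the substring appearing in the model name.
    return "conservative" in base_model


def is_conservative_by_model(model) -> bool:
    # New check: inspects the built model instead of its name.
    return "grad_forces" in model.loss_weights


name_check = is_conservative_by_name("orbmol_v2")               # wrongly False
model_check = is_conservative_by_model(ConservativeModel())     # True
```

A later review comment in this thread replaces the loss_weights inspection with an isinstance check against ConservativeForcefieldRegressor, which removes the dependence on dictionary keys as well.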

- Remove scripts/compile_numerical_check.py (covered by internal tests now)
- MODELS.md: reword charge/spin requirement to match orbmol-v1 phrasing
- README.md: drop internal commit ref; trim May 2026 update to two bullets (electrostatics+headline result; full-model compile speedup)
- test_backwards_compatibility.py: restore full-precision cu_energy_gold (-178550.98098)

timduignan · 2026-05-25T11:49:36Z

Thanks for the review @vsimkus — hopefully all addressed now:

Removed scripts/compile_numerical_check.py
MODELS.md: reworded to "Similar to orbmol-v1, system-level total charge and spin are required."
README.md: dropped the internal commit reference; trimmed to two bullets (electrostatics + GSCDB138 result; full-model compile speedup)
test_backwards_compatibility.py: restored cu_energy_gold = -178550.98098

vsimkus

LGTM! 🥳 A few minor comments, but otherwise it's good.

- finetune.py: use isinstance(model, ConservativeForcefieldRegressor) instead of inspecting loss_weights; fix stale comment
- pretrained.py:418: drop "spins" from orbmol_v2 docstring
- README.md:27: drop "applies to v1 and v2" parenthetical (compile speedup applies to all models, the parenthetical was misleading)

benrhodes26

LGTM!

timduignan · 2026-05-26T10:29:59Z

Brilliant thanks guys!

timduignan requested review from benrhodes26, jg8610, reactiv and vsimkus as code owners May 7, 2026 07:26

vsimkus reviewed May 7, 2026

Comment thread orb_models/forcefield/pretrained.py Outdated

Comment thread orb_models/forcefield/pretrained.py Outdated

Comment thread MODELS.md Outdated

Comment thread MODELS.md Outdated

Comment thread README.md Outdated

Comment thread README.md Outdated

timduignan and others added 8 commits May 7, 2026 19:28

vsimkus force-pushed the orbmol-v2-port branch 3 times, most recently from a7587b9 to 7dfdd99 Compare May 7, 2026 22:37

Ported changes from the internal repo

7ac3c29

vsimkus force-pushed the orbmol-v2-port branch from 7dfdd99 to 7ac3c29 Compare May 7, 2026 22:43

timduignan force-pushed the orbmol-v2-port branch from d2af74c to d359837 Compare May 8, 2026 03:03

benrhodes26 reviewed May 8, 2026

Comment thread README.md Outdated

benrhodes26 reviewed May 8, 2026

Comment thread README.md Outdated

vsimkus force-pushed the orbmol-v2-port branch from ee7f097 to f4a2a09 Compare May 8, 2026 16:37

Handle node/graph features/targets in from_ase_atoms_list

d5e3fae

vsimkus force-pushed the orbmol-v2-port branch from f4a2a09 to d5e3fae Compare May 8, 2026 16:43

Update examples to orbmol_v2

7dd0e20

vsimkus force-pushed the orbmol-v2-port branch from befe1e6 to 7dd0e20 Compare May 8, 2026 17:30

timduignan and others added 9 commits May 22, 2026 16:05

vsimkus reviewed May 25, 2026

Comment thread scripts/compile_numerical_check.py Outdated

Comment thread MODELS.md Outdated

Comment thread README.md Outdated

Comment thread README.md Outdated

Comment thread tests/expensive_tests/test_backwards_compatibility.py Outdated

timduignan added 2 commits May 25, 2026 12:49

vsimkus approved these changes May 26, 2026

Comment thread orb_models/forcefield/pretrained.py Outdated

Comment thread README.md Outdated

Comment thread finetune.py Outdated

benrhodes26 approved these changes May 26, 2026

vsimkus merged commit 05c86ea into orbital-materials:main May 26, 2026
5 checks passed
