Add OrbMol-v2 with learnable electrostatics #162
Merged
vsimkus
reviewed
May 7, 2026
vsimkus
reviewed
May 7, 2026
Wholesale port from the reference codebase (after the ZBL-sum and correct
Coulomb constant fixes), targeting only the s11doh8x:v199
public release. Drops all backwards-compat for prior electrostatics_config
checkpoints; skips Fukui, global_context, and self_message internal-only
features.
New:
- orb_models/forcefield/models/coulomb_module.py: CoulombModule + direct/PME
- scripts/convert_orbmol_v2_ckpt.py: extract EMA-applied flat state_dict
from wandb-format checkpoint (orbmol_v2() expects flat state_dicts per
orb-models S3 convention)
Modified:
- forcefield_heads.py: ChargeConditionedEnergyHead, LatentChargeHead,
LatentSpinHead. Added EnergyHead.absolute_energy() helper.
- conservative_regressor.py: coulomb_module field, latent_charges/spins
predicted before energy, ChargeConditionedEnergyHead path, Coulomb
energy + explicit forces/virial plumbing.
- pair_repulsion.py: default node_aggregation "mean" -> "sum".
- pretrained.py: orbmol_v2_architecture() and orbmol_v2() loader
mirroring the source codebase. CoulombModule() defaults (no
erf damping); enforce_total_charge=True; no coulomb_constant override.
Verified against internal reference (gold) values for s11doh8x:v199 (with EMA
applied, CPU fp32):
- H2O energy: -2079.86339 eV (diff 2.7e-7)
- H2O forces[0]: matches all components within 4e-6 eV/A
- Cu fcc energy: -178549.38592 eV (relative diff ~6e-10)
- Cu fcc stress (Voigt 6): matches within 5e-6 eV/A^3
All 88 existing forcefield tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Revert pair_repulsion.py default to "mean" (preserves BC for older orb-v3 conservative models trained pre ZBL-sum cutoff).
- ConservativeForcefieldRegressor accepts pair_repulsion_node_aggregation kwarg (defaults to "mean"); orbmol_v2_architecture passes "sum" explicitly.
- EnergyHead.absolute_energy: drop fp64 arg, always do the addition in fp64 and return fp64. OMol references reach ~1e5 eV so kJ/mol resolution requires fp64; option only added confusion. Used only by ChargeConditioned path so legacy heads are unaffected.
- ConservativeForcefieldRegressor.predict drops fp64_energy arg accordingly.
- Delete scripts/convert_orbmol_v2_ckpt.py; core/scripts/misc/export_model.py is the existing tool used for all other public orb-models releases.
- Update orbmol_v2() docstring to point at core's export_model.py.
Re-verified gold values match (H2O 2.7e-7; Cu fp32-noise level on energy, 1e-6 on stress). All 88 forcefield tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
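To make the fp64 point above concrete, here is a minimal sketch (standard library only, not the orb-models code) of how coarse fp32 is near 1e5 eV. The helper names to_fp32 and fp32_spacing and the two example energies are illustrative assumptions.

```python
import math
import struct

KJ_MOL_PER_EV = 96.485  # 1 eV is approximately 96.485 kJ/mol


def to_fp32(x: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def fp32_spacing(x: float) -> float:
    """Gap between adjacent fp32 values near x (fp32 has a 24-bit significand)."""
    _, exponent = math.frexp(abs(x))  # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return 2.0 ** (exponent - 24)


reference_energy = -100000.0  # hypothetical OMol-scale reference energy, eV
interaction_energy = 0.001    # hypothetical small learned contribution, eV

spacing_ev = fp32_spacing(reference_energy)
spacing_kj_mol = spacing_ev * KJ_MOL_PER_EV

# Storing the sum in fp32 rounds the small contribution away entirely,
# while the fp64 sum keeps it.
sum_fp32 = to_fp32(reference_energy + interaction_energy)
sum_fp64 = reference_energy + interaction_energy
```

Near 1e5 eV, adjacent fp32 values are 2^-7 eV apart (roughly 0.75 kJ/mol), so an fp32 total cannot reliably resolve differences at the kJ/mol scale.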
- test_pair_repulsion_default_aggregation_is_mean: catches anyone changing the regressor default from "mean" to "sum", which would silently break all public orb-v3 conservative models on reload.
- test_pair_repulsion_sum_when_specified: confirms orbmol-v2 opt-in via the kwarg works.
- test_energy_head_does_not_have_absolute_energy: guards against the fp64-promoting helper migrating onto the base EnergyHead, where it would alter v3 conservative omol/omat/mpa predictions.
- test_orbmol_v2_architecture_uses_sum_zbl: integration check that the architecture wires CoulombModule + sum-aggregation ZBL + latent_charges/latent_spins heads.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Default weights_path now points at HF (orbital-materials/orbmol-v2/resolve/main). Matches the date-in-filename versioning convention used by other public orb-models S3 ckpts; no separate revision pin (HF main behaves like a stable S3 path).
- Bump nvalchemi-toolkit-ops>=0.3.1: the orbmol_v2 PME path uses particle_mesh_ewald(..., hybrid_forces=True) which only exists on >=0.3.1.
- New tests/forcefield/test_orbmol_v2_smoke.py: end-to-end check that the published weights produce the gold H2O / Cu predictions from internal reference tests. Gated by ORB_RUN_NETWORK_TESTS=1 since it downloads ~100 MB; opt-in only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
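The opt-in gating described above can be sketched as follows. This is an assumption about the pattern, not the contents of test_orbmol_v2_smoke.py: it uses the standard-library unittest module (the real test may use a different framework), and network_tests_enabled and OrbMolV2SmokeTest are hypothetical names.

```python
import os
import unittest
from typing import Mapping


def network_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the opt-in flag is set to exactly "1"."""
    return env.get("ORB_RUN_NETWORK_TESTS") == "1"


@unittest.skipUnless(network_tests_enabled(), "set ORB_RUN_NETWORK_TESTS=1 to run")
class OrbMolV2SmokeTest(unittest.TestCase):
    def test_published_weights(self) -> None:
        # Placeholder: a real test would download the weights and compare
        # H2O / Cu predictions against reference values.
        self.skipTest("illustrative placeholder only")
```

With the flag unset, the whole class is skipped at collection time, so the ~100 MB download never starts on ordinary test runs.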
Upstream bumped APIs between 0.3.0 and 0.3.1 (added the hybrid_forces parameter we depend on for PME). Upper-bounding at the next minor prevents 0.4.x from silently breaking the orbmol-v2 PME path on user upgrades — matches the bound style of other deps internally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an entry under OrbMol Models describing the learnable electrostatics extension (LatentChargeHead, LatentSpinHead, CoulombModule, and the ChargeConditionedEnergyHead) with usage example. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cy note s11doh8x trains with CoulombModule defaults (no erf damping); the non-periodic Coulomb path is bare 1/r, not erf-damped. Also drops the size-consistency wording from the head description. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README.md: add orbmol-v2 release note in the "What's new" section
- coulomb_module.py: fix misleading docstrings (the non-periodic path is bare 1/r when sigma is None, not erf-damped); update stale 14.33 → 14.40 reference for COULOMB_CONSTANT; minor typo fix
- forcefield_heads.py: drop size-consistency claims from ChargeConditionedEnergyHead docstrings (separate concern; not asserted in the public release)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
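For readers unfamiliar with the distinction, here is a minimal pure-Python sketch of a non-periodic direct Coulomb sum. It is not the CoulombModule implementation. The function name and signature are hypothetical, and the erf-damped branch uses one common Gaussian-smearing form that is only an assumption, since this PR does not state which damping form the module uses.

```python
import math
from typing import Optional, Sequence

COULOMB_CONSTANT = 14.3996  # e^2 / (4*pi*eps0) in eV*Angstrom, i.e. about 14.40

Vec3 = Sequence[float]


def coulomb_energy(
    positions: Sequence[Vec3],
    charges: Sequence[float],
    sigma: Optional[float] = None,
) -> float:
    """Direct pairwise Coulomb sum in eV for a non-periodic system.

    sigma=None gives the bare 1/r kernel. A finite sigma applies an assumed
    damping, erf(r / (sqrt(2) * sigma)) / r, which may differ from the form
    used by CoulombModule.
    """
    energy = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(i + 1, n):  # each pair counted once
            r = math.dist(positions[i], positions[j])
            kernel = 1.0 / r
            if sigma is not None:
                kernel *= math.erf(r / (math.sqrt(2.0) * sigma))
            energy += COULOMB_CONSTANT * charges[i] * charges[j] * kernel
    return energy


pair = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]  # two atoms 1 Angstrom apart
bare = coulomb_energy(pair, [1.0, -1.0])
damped = coulomb_energy(pair, [1.0, -1.0], sigma=1.0)
```

For two opposite unit charges 1 Å apart, the bare kernel gives about -14.40 eV, and any erf damping of this kind reduces the magnitude at short range.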
a7587b9 to
7dfdd99
Compare
- Don't assert per-atom spins are physical observables (we haven't validated this; reword as "auxiliary per-atom features that take the system's spin multiplicity into account") in MODELS.md and README.md.
- Add a note in the README orbmol-v2 update that energies are now fp64 by default for kJ/mol resolution against OMol25-scale references; opt out via fp64_energy=False on predict().
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- MODELS.md and README.md: rephrase LatentChargeHead/LatentSpinHead as predicting per-atom *latent features* (constrained at the system level) rather than asserting per-atom charges and spins. Caveat blockquote added in the previous commit explains the emergent nature.
- pyproject.toml: bump dev `torch-sim-atomistic` from >=0.5.1 to >=0.6.0. The 0.6.0 release adds `SimState.has_extras()`, which the existing `test_forcefield_adapter_parses_spin_and_charge_from_simstate` test already calls; 0.5.2 (allowed by the old constraint) does not have this method.
All 121 forcefield tests now pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
benrhodes26
reviewed
May 8, 2026
benrhodes26
reviewed
May 8, 2026
Per Ben's review feedback (PR orbital-materials#162 thread):
- First bullet now leads with the CoulombModule, not the latent heads
- Adds Speed (H100 QPS at 1k/10k atoms, periodic systems) and Accuracy (GSCDB138 Normalized Error Ratio, ex single-atom-species reactions) highlights
- Moves the LatentChargeHead / LatentSpinHead detail and the Caution blockquote out of the README; they remain in MODELS.md
- Links out to MODELS.md for the full architecture description
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aggregate_nodes(..., reduction="mean") now forwards n_node directly to scatter_mean, which takes pre-computed group sizes instead of building a divisor inside the graph from scatter_sum(ones, ...). The old form triggered a torch.compile + autograd miscompile (~1 eV on H2O / ethanol / NaCl). Math is bit-identical in eager. Drops segment_mean() and its test (unused after this change). Ports orbital-materials/orb#3074. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
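As a plain-Python illustration of the reduction described above (not the orb-models or torch code), the sketch below computes a per-graph mean by dividing by the known graph sizes rather than by a count rebuilt from the node data. The name aggregate_per_graph and its signature are hypothetical.

```python
from typing import List, Sequence


def aggregate_per_graph(
    node_values: Sequence[float], n_node: Sequence[int], reduction: str = "mean"
) -> List[float]:
    """Reduce per-node values to one value per graph.

    Nodes are assumed to be stored contiguously, graph by graph, and every
    graph is assumed to have at least one node.
    """
    if reduction not in ("sum", "mean"):
        raise ValueError(f"unknown reduction: {reduction}")
    out: List[float] = []
    start = 0
    for count in n_node:
        total = sum(node_values[start : start + count])
        # "mean" divides by the known graph size instead of re-counting nodes.
        out.append(total / count if reduction == "mean" else total)
        start += count
    return out


values = [1.0, 2.0, 3.0, 10.0, 20.0]
sizes = [3, 2]
per_graph_sum = aggregate_per_graph(values, sizes, "sum")
per_graph_mean = aggregate_per_graph(values, sizes, "mean")
```

The same sketch also shows why the "mean" versus "sum" default matters for checkpoint compatibility: the two reductions differ by a factor of the graph size.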
LatentChargeHead / LatentSpinHead centering now uses repeat_interleave over n_node instead of gather-by-node_batch_index. Compiles cleanly under dynamic shapes. conservative_regressor drops the in-place interaction_energy += coulomb_energy. pair_repulsion casts the poly cutoff exponent with p.to(torch.float32) instead of float(p), which avoids a Tensor.item() graph break, and zeros stress entries beyond 1e10. Ports orbital-materials/orb#3074. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
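The centering step can be sketched in plain Python as below. This is an assumption about the general technique, not the head's actual code: it assumes the constraint is enforced by a uniform per-atom shift within each system, and the list-based repeat_interleave helper only mimics what torch.repeat_interleave does on tensors. All function names here are illustrative.

```python
from typing import List, Sequence


def repeat_interleave(values: Sequence[float], counts: Sequence[int]) -> List[float]:
    """Repeat values[i] counts[i] times (list analogue of torch.repeat_interleave)."""
    out: List[float] = []
    for value, count in zip(values, counts):
        out.extend([value] * count)
    return out


def center_to_total_charge(
    raw: Sequence[float], n_node: Sequence[int], total_charge: Sequence[float]
) -> List[float]:
    """Shift raw per-atom values so each system sums to its total charge."""
    shifts: List[float] = []
    start = 0
    for count, target in zip(n_node, total_charge):
        # Excess over the target, spread evenly over that system's atoms.
        excess = sum(raw[start : start + count]) - target
        shifts.append(excess / count)
        start += count
    # Expand one shift per system to one shift per atom using the node counts.
    per_atom_shift = repeat_interleave(shifts, n_node)
    return [q - s for q, s in zip(raw, per_atom_shift)]


raw = [0.3, -0.1, 0.4, 1.0, 2.0]  # two systems: 3 atoms, then 2 atoms
n_node = [3, 2]
total = [0.0, 1.0]
centered = center_to_total_charge(raw, n_node, total)
```

Expanding by counts needs only n_node, not a per-node batch index lookup, which is the difference between the two formulations the commit contrasts.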
Removes the backbone-only .compile() overrides on the
{Conservative,Direct}ForcefieldRegressor classes. model.compile(...)
now wraps the full forward via standard nn.Module.compile(). Note: the
GraphRegressor.compile() override was already absent in public, so no
change there.
CoulombModule's PME path is split into two @torch.compiler.disable
helpers (_particle_mesh_ewald and _estimate_pme_params_and_neighbors)
so dynamo skips nvalchemiops's ctypes-using PME entirely while the
rest of the regressor compiles.
End-to-end (per core PR): ~1.5-1.6x faster inference (1.62x at 10k
atoms, single H100); compiled-vs-eager numerical diff at float-noise
level.
Adds tests/forcefield/test_direct_regressor.py and
tests/expensive_tests/test_compilation.py; updates conftest and
test_conservative compile tests. Adds scripts/compile_numerical_check.py
for offline diagnosis (adapted to public pretrained.orbmol_v2 loader).
Ports orbital-materials/orb#3074.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Updates weights_path to the new teqabfhg checkpoint (orbmol-v2-teqabfhg-20260523.ckpt) which has no LatentSpinHead. Adds use_per_atom_spins to orb_v3_conservative_architecture (default False) so the spin head and ChargeConditionedEnergyHead.use_spins are controlled together. System-level charge/spin conditioning via ChargeSpinConditioner is unchanged. A future spin-enabled checkpoint can pass use_per_atom_spins=True from its pretrained.* entry without further architecture changes.
Updates test_backwards_compatibility goldens to match the new checkpoint:
- H2O: -2079.86339 -> -2079.86222 eV
- Cu fcc: -178549.3860 -> -178550.9810 eV; stress retuned
- forces/stress vectors regenerated on Mac arm64 CPU
Old checkpoint URL (orbmol-v2-s11doh8x-20260507.ckpt) was removed from S3 and now 404s; users loading that file via state_dict manually will hit a shape mismatch on the energy head MLP and the missing latent spin head.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
orbmol_v2's default checkpoint (teqabfhg) has no per-atom spin head. Updates the MODELS.md description to mention only LatentChargeHead and clarifies that system-level total charge / spin multiplicity still flow through the ChargeSpinConditioner. Caution paragraph updated to drop the per-atom spin language. README headline GSCDB138 numbers reflect the previous s11doh8x checkpoint and will be refreshed once a full per-category breakdown for teqabfhg is available. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Updates GSCDB138 numbers from the s11doh8x measurement to the new teqabfhg checkpoint (same 109 evaluable subsets / 5152 reactions):
- Overall NER: 6.05 -> 1.62 (was 1.83 on s11doh8x)
- NC, TC, TM, BH, INC, ISO per-category numbers refreshed
- ISO regresses slightly (1.03 -> 1.36); other categories improve
Adds a second bullet covering GMTKN55 WTMAD-2 (5.41 -> 4.37 kcal/mol), WIGGLE150 RMSE (1.23 -> 1.19), and the two new evaluation entries BEGDB (MAE 0.235) and ACONFL (RMSE 0.40).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
orbmol_v2's ChargeSpinConditioner requires total_charge and spin_multiplicity in system_features. Without them the adapter's _get_charge_and_spin returns an empty dict and the conditioner errors. Defaulting to charge=0, spin=1 (singlet) matches the convention used in the internal speed harness. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
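The defaulting behaviour described above can be sketched as a small helper. The function name with_default_charge_and_spin is hypothetical and the dict-based interface is an assumption; only the keys total_charge and spin_multiplicity and the default values come from the commit message.

```python
from typing import Any, Dict, Mapping


def with_default_charge_and_spin(system_features: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy with neutral-singlet defaults filled in where missing."""
    out = dict(system_features)
    out.setdefault("total_charge", 0)       # neutral system
    out.setdefault("spin_multiplicity", 1)  # singlet
    return out


defaults_only = with_default_charge_and_spin({})
anion = with_default_charge_and_spin({"total_charge": -1})
```

Values supplied by the caller are left untouched, so only inputs that omit charge or spin fall back to the neutral singlet.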
Replaces the s11doh8x-era "44 QPS at 1k atoms / 9 QPS at 10k atoms / within ~5% of v1" claim with absolute ms timings from BENCH on the new teqabfhg checkpoint (full-model torch.compile, single GPU, periodic systems): 30/42/116/191 ms at 100/1k/5k/10k atoms. Notes that the no-spin variant is 1-12% faster than the prior spin-having v2 development variant at every size we tested. The v1 parity claim is removed pending a fresh v1-vs-v2 comparison run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the standalone v2-teqabfhg ms numbers with a v1-vs-v2 table at 100/1k/5k/10k atoms. v2 is slower at small sizes (electrostatics overhead) and faster at large sizes; crossover around 5k atoms, 1.46x speedup at 10k. v1 runs with backbone-only compile (its best available), v2 with full-model compile thanks to the port of orbital-materials/orb#3074. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Author
@vsimkus mind giving this another look? Since your earlier pass it now also includes:
BC test goldens were regenerated on Mac arm64; Linux CI may want a small atol tweak if it triggers.
The full-model torch.compile changes ported from #3074 also benefit
OrbMol-v1, not just v2. New BENCH numbers show v1-full-compile is
1.3-1.8x faster than the old backbone-only workaround (biggest at 10k
atoms: 278 -> 158 ms). v2-teqabfhg is ~20-60% slower than v1
full-compile at the same system size, reflecting the real cost of the
PME / Coulomb path; the previous narrative ("v2 crosses over and
becomes faster than v1 at 5k+ atoms") was an artifact of comparing
v1-backbone-compile to v2-full-compile.
Also softens the lead-in line about LES being free; it isn't, but the
accuracy gain (3.7x lower NER) is the trade.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
vsimkus
reviewed
May 25, 2026
Previously is_conservative_model = "conservative" in args.base_model, which misclassified orbmol_v2 as direct. Build the model first, then check "grad_forces" in model.loss_weights, and push custom CLI weights via model.loss_weights.update(...).
- Remove scripts/compile_numerical_check.py (covered by internal tests now)
- MODELS.md: reword charge/spin requirement to match orbmol-v1 phrasing
- README.md: drop internal commit ref; trim May 2026 update to two bullets (electrostatics+headline result; full-model compile speedup)
- test_backwards_compatibility.py: restore full-precision cu_energy_gold (-178550.98098)
Contributor
Author
Thanks for the review @vsimkus — hopefully all addressed now:
vsimkus
approved these changes
May 26, 2026
Contributor
vsimkus
left a comment
LGTM! 🥳 A few minor comments, but otherwise it's good.
- finetune.py: use isinstance(model, ConservativeForcefieldRegressor) instead of inspecting loss_weights; fix stale comment
- pretrained.py:418: drop "spins" from orbmol_v2 docstring
- README.md:27: drop "applies to v1 and v2" parenthetical (compile speedup applies to all models, the parenthetical was misleading)
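The difference between the name-based check and the type-based check can be sketched as follows. The two classes are empty stand-ins that only borrow the names used in this PR, not the orb-models classes, and the model name string is an assumed example value for args.base_model.

```python
class DirectForcefieldRegressor:  # stand-in, not the orb-models class
    pass


class ConservativeForcefieldRegressor:  # stand-in, not the orb-models class
    pass


def is_conservative_by_name(base_model: str) -> bool:
    """Earlier heuristic: substring match on the model name."""
    return "conservative" in base_model


def is_conservative_by_type(model: object) -> bool:
    """Reviewed approach: decide from the constructed model's class."""
    return isinstance(model, ConservativeForcefieldRegressor)


# The name carries no "conservative" substring even though the model is one.
by_name = is_conservative_by_name("orbmol_v2")
by_type = is_conservative_by_type(ConservativeForcefieldRegressor())
```

The type check requires building the model first, but it stays correct regardless of how future checkpoints are named.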
Contributor
Author
Brilliant thanks guys!
Summary
Adds OrbMol-v2, which extends the OrbMol architecture with learnable per-atom electrostatics: a LatentChargeHead predicts charges that satisfy the system total-charge constraint, and a new CoulombModule adds long-range Coulomb energy on top of the GNN (bare 1/r direct sum for non-periodic systems, Particle Mesh Ewald via nvalchemiops for periodic). The energy head (ChargeConditionedEnergyHead) is conditioned on the predicted charges and spins per atom.
The published checkpoint is at https://huggingface.co/orbital-materials/orbmol-v2, verified to reproduce internal reference values for H₂O and Cu fcc to ≤1e-5 eV / eV/Å / eV/ų on both CPU and H100.
What changed
New files
- orb_models/forcefield/models/coulomb_module.py: CoulombModule, direct + PME paths
- tests/forcefield/test_orbmol_v2_smoke.py: opt-in network test (ORB_RUN_NETWORK_TESTS=1) checking H₂O / Cu predictions against reference values
New classes in existing files
- forcefield_heads.py: ChargeConditionedEnergyHead, LatentChargeHead, LatentSpinHead
- pretrained.py: orbmol_v2() loader (s3-hosted weights), registry entry
Surgical edits
- conservative_regressor.py: new coulomb_module and pair_repulsion_node_aggregation kwargs; predicts charges/spins before energy; bifurcated forward path for ChargeConditionedEnergyHead; adds Coulomb energy and explicit forces/virial
- pyproject.toml: bump nvalchemi-toolkit-ops to >=0.3.1,<0.4 (PME hybrid_forces API)
- MODELS.md: documents orbmol-v2
Backwards compatibility
- EnergyHead and other base heads are untouched: same forward, same predict(), same denormalize(). All existing v3 conservative omol/omat/mpa models behave identically.
- ZBLBasis default changed to node_aggregation="sum". All existing architectures (orb-v3-conservative) keep their training-time aggregation ("mean").
Test plan