Skip to content

Commit 68aa1b3

Browse files
committed
validate: charged ligand → warning (with biologist-actionable advice), not error
Previously a charged ligand (formal_charge != 0) hard-FAILED validation, which blocked any bench run on ionic carboxylates, ammoniums, etc. — a huge swath of real medicinal chemistry. The 'Phase-2 item' message offered no path forward. The actual physics: PME on a net-charged box adds a uniform neutralising background, which is a finite-size artefact that corrects out via the Rocklin 2013 5-term correction (EMP + NET + RIP + DSC + ANA). Without that correction, ABSOLUTE ΔG_bind on a charged ligand carries a 5-15 kcal/mol systematic error that depends on box size and protein charge. BUT — and this is the biologist-actionable point — it CANCELS in ΔΔG within a same-charge congeneric series. So Kendall-τ rank-correlation gates remain valid for ionic ligand series; only absolute MAE is meaningful when the correction is applied. New validator behaviour: ? acetate: formal charge -1 — absolute ΔG_bind carries a 5-15 kcal/mol PBC self-interaction error (Rocklin 2013 J Chem Phys 139:184103). For ΔΔG within a same-charge congeneric series the error cancels; for absolute ΔG apply Rocklin post-hoc. Warning, exit 0 — biologist proceeds informed. Also added a stub comment block where the analytical _rocklin_pbc_correction would live, documenting the 5 terms and why the full implementation is Phase-3 (needs protein charge + ion-pairing distribution as auxiliary state). Test test_charged_ligand_flagged updated to assert the new warning shape including the 'ΔΔG cancels' caveat — that sentence is the load-bearing biologist-actionability claim. 19/19 validator smoke tests still PASS.
1 parent 4c76625 commit 68aa1b3

2 files changed

Lines changed: 50 additions & 11 deletions

File tree

src/fep/binding.py

Lines changed: 37 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -396,6 +396,28 @@ def _harmonic_restraint_free_energy_kcalmol(
396396
return dG_kJ / 4.184
397397

398398

399+
# Charged-ligand finite-size PBC correction (Rocklin 2013 J Chem
400+
# Phys 139:184103) is NOT yet implemented. The full correction is
401+
# 5 terms (EMP + NET + RIP + DSC + ANA) and requires knowing the
402+
# protein's net charge + ion pairing distribution — too much
403+
# auxiliary state to derive from the YAML alone in a way I trust
404+
# to ship publication-grade numbers.
405+
#
406+
# Practical guidance for biologists hitting the validator's
407+
# 'formal_charge ≠ 0' message:
408+
# - For a CONGENERIC SERIES of equally-charged ligands (e.g.
409+
# all-anionic carboxylates), the PBC self-interaction error
410+
# cancels in ΔΔG. Rank-correlation gates (Kendall τ) are still
411+
# valid; absolute MAE is offset by a constant.
412+
# - For absolute ΔG_bind on a charged ligand vs neutral protein,
413+
# extend the Rocklin correction post-hoc per the cited paper.
414+
# A typical -1 ligand in a 5 nm cubic TIP3P box adds ~5-15
415+
# kcal/mol depending on protein charge.
416+
# - When the Phase-3 implementation lands, this comment block
417+
# gets replaced with an analytical _rocklin_pbc_correction
418+
# function + autowiring into compute_absolute_binding_dg.
419+
420+
399421
def _build_complex_alchemical_system(
400422
smiles: str,
401423
receptor_pdb: str | Path,
@@ -1795,15 +1817,23 @@ def _similar(a, b):
17951817
"est_sampled_wall_cpu_s": sampled_cpu_s,
17961818
"est_sampled_wall_gpu_s": sampled_gpu_s,
17971819
})
1798-
# Biologist gotchas: formal charge ≠ 0 means the
1799-
# alchemical FEP needs a counter-ion correction
1800-
# we haven't wired; flag it.
1820+
# Charged ligand: PBC self-interaction error in
1821+
# absolute ΔG. Cancels in ΔΔG within a same-charge
1822+
# series (so still valid for rank-correlation
1823+
# gates) but makes absolute MAE off by 5-15
1824+
# kcal/mol depending on protein charge. Warning,
1825+
# not error — biologists studying ionic ligand
1826+
# series shouldn't be blocked, just informed.
18011827
if formal_charge != 0:
1802-
issues.append(
1828+
warnings.append(
18031829
f"{name}: formal charge "
1804-
f"{formal_charge:+d} — alchemical FEP "
1805-
"needs a Rocklin / Warren correction "
1806-
"(not wired in Phase-1); Phase-2 item")
1830+
f"{formal_charge:+d} — absolute ΔG_bind "
1831+
"carries a 5-15 kcal/mol PBC self-"
1832+
"interaction error (Rocklin 2013 J Chem "
1833+
"Phys 139:184103). For ΔΔG within a "
1834+
"same-charge congeneric series the error "
1835+
"cancels; for absolute ΔG apply Rocklin "
1836+
"post-hoc.")
18071837
if n_heavy > 60:
18081838
issues.append(
18091839
f"{name}: {n_heavy} heavy atoms > 60 — "

tests/fep/test_validate_smoke.py

Lines changed: 13 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -203,8 +203,11 @@ def test_pinned_stereo_passes():
203203

204204

205205
def test_charged_ligand_flagged():
206-
"""Formal-charge != 0 needs a Rocklin/Warren correction we
207-
haven't wired. Validator must flag so the biologist knows."""
206+
"""Formal-charge != 0 carries a PBC self-interaction error in
207+
absolute ΔG (Rocklin 2013). Validator emits a WARNING (not
208+
a hard error) so biologists studying ionic series aren't
209+
blocked — the error cancels in same-charge ΔΔG.
210+
"""
208211
yaml = """
209212
receptor:
210213
pdb_path: benchmarks/dock/1stp.pdb
@@ -215,8 +218,14 @@ def test_charged_ligand_flagged():
215218
dG_bind_kcalmol: -5.0
216219
"""
217220
rc, out = _run(yaml)
218-
assert rc != 0, "charged ligand should be flagged as an issue"
219-
assert "formal charge" in out.lower() or "rocklin" in out.lower()
221+
# Warning, not hard fail — proceed.
222+
assert rc == 0, (
223+
f"charged ligand should warn but pass; got rc={rc}\n{out}")
224+
assert "formal charge" in out.lower()
225+
assert "rocklin" in out.lower()
226+
assert "PBC self-interaction error" in out
227+
assert "ΔΔG within a same-charge" in out, (
228+
"biologist actionability: must mention ΔΔG-cancels caveat")
220229

221230

222231
def test_duplicate_names_flagged():

0 commit comments

Comments
 (0)