src/bridge: Campaign-1 → Campaign-2 rate-law emitter (API surface only)

Rockman6 · Rockman6 · commit 511dd7c14f19 · 2026-04-22T16:26:54.000+08:00
The bridge module retires the prof's 'closed ontology / circular
validation' critique by giving Campaign-2 pathway models a single
import surface for binding rate constants WITH explicit method
provenance and calibrated uncertainty.

Today's commit ships the API contract + closed-form thermodynamic
conversions, NOT any pathway simulation work (per the prof's gate
that no Campaign-2 work proceeds until Milestone A clears):

  binding_to_hill(dG_kcalmol, *, T_K, n_hill, uncertainty_kcalmol)
    Hill equation, K_d = exp(ΔG/RT) in M (1 M standard state). σ on
    ΔG propagates multiplicatively to K_d (95% CI =
    K_d × exp(±1.96σ/RT)). The source tag on the returned
    RateLawPrior distinguishes 'FEP' (today's
    compute_absolute_binding_dg output, calibrated σ) from
    'phenomenological' (the OLD/ Tier-0 BindingMatcher
    descriptor-fit, inflated σ).

  affinity_to_michaelis(kcat_per_s, KM_M, *, ...)
    Pass-through for now: wraps caller-supplied kcat and K_M in a
    RateLawPrior. Placeholder for when CellSim grows a
    reactive-FEP / Eyring transition-state module.

  RateLawPrior dataclass
    Carries: type ('hill' | 'michaelis'), parameters dict, 95% CI
    per parameter, source ('FEP' | 'phenomenological' | 'literature'),
    method tag (e.g. 'FEP-DDM-MBAR'), temperature, ISO timestamp.
    Campaign-2 ODE configs ingest this and audit-trace back to a
    specific FEP run.
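
A minimal standalone sketch of the closed-form conversion described
above, using only the standard library. The helper name kd_with_ci95
is hypothetical and is not part of src.bridge; it assumes a 1 M
standard state and the same R value the module uses.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872041  # molar gas constant, kcal/(mol*K)


def kd_with_ci95(dg_kcalmol, sigma_kcalmol, t_k=298.15):
    """Hypothetical helper: return (K_d, K_d_lo, K_d_hi) in M."""
    rt = R_KCAL_PER_MOL_K * t_k
    kd = math.exp(dg_kcalmol / rt)      # K_d = exp(dG/RT), 1 M standard state
    sigma_ln = sigma_kcalmol / rt       # 1-sigma uncertainty on ln(K_d)
    factor = math.exp(1.96 * sigma_ln)  # multiplicative 95% half-width
    return kd, kd / factor, kd * factor


kd, lo, hi = kd_with_ci95(-10.0, 0.5)
print(f"K_d = {kd:.3e} M, 95% CI = [{lo:.3e}, {hi:.3e}] M")
```

For ΔG = -10 kcal/mol this lands in the tens-of-nanomolar range, and
the interval is obtained by dividing and multiplying K_d by the same
factor rather than by adding and subtracting a fixed width.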

6/6 smoke tests, CI-wired:
  - K_d matches closed-form thermodynamics
  - σ propagates multiplicatively (CI symmetric on log scale,
    asymmetric on linear scale)
  - No σ → empty CI (consumer can detect uncalibrated estimate)
  - Phenomenological source tag preserved
  - Michaelis pass-through correct
  - One-line summary stable

Out-of-scope for Campaign 1 (matches src/bridge/README.md):
  - No pathway simulation (Campaign 2 work).
  - No bulk emission to a Campaign-2 YAML config (Campaign 2 work).
  - No SQLite cache integration (Campaign 2 will own the lookup
    layer; this module is a pure-function converter).

When Campaign 2 resumes after Milestone A clears, its first
pathway model imports from src.bridge and gets calibrated FEP-
derived priors instead of hand-tuned phenomenology — exactly
the cycle-break the prof asked for.
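
As a rough sketch of how a downstream consumer might use such a prior
(not Campaign-2 code, which does not exist yet), the Hill occupancy
can be evaluated at the nominal K_d and at both CI bounds. The dicts
below are stand-ins for the RateLawPrior fields with illustrative
values, and hill_fraction_bound is a hypothetical helper.

```python
def hill_fraction_bound(ligand_m, kd_m, n_hill=1.0):
    """Fraction bound = [L]^n / (K_d^n + [L]^n); concentrations in M."""
    return ligand_m ** n_hill / (kd_m ** n_hill + ligand_m ** n_hill)


# Stand-ins for fields a RateLawPrior would carry (values are illustrative).
parameters = {"Kd_M": 5e-8, "n_hill": 1.0}
parameter_ci95 = {"Kd_M": (1e-8, 2.5e-7)}

ligand_m = 5e-8  # ligand concentration set equal to the nominal K_d
nominal = hill_fraction_bound(ligand_m, parameters["Kd_M"], parameters["n_hill"])
kd_lo, kd_hi = parameter_ci95["Kd_M"]
# A larger K_d means weaker binding, so the upper K_d bound gives the lower occupancy.
occ_lo = hill_fraction_bound(ligand_m, kd_hi, parameters["n_hill"])
occ_hi = hill_fraction_bound(ligand_m, kd_lo, parameters["n_hill"])
print(f"occupancy {nominal:.2f} (95% band {occ_lo:.2f} to {occ_hi:.2f})")
```

With the ligand at the nominal K_d the occupancy is one half, and the
K_d interval maps to an occupancy band with the bounds swapped.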
diff --git a/.github/workflows/smoke.yml b/.github/workflows/smoke.yml
@@ -181,6 +181,9 @@ jobs:
       - name: fep-binding bench --resume regression
         run: python -u tests/fep/test_bench_resume_smoke.py
 
+      - name: bridge — Campaign-1 → Campaign-2 rate-law emitter
+        run: python -u tests/bridge/test_binding_to_hill_smoke.py
+
       - name: fep sampled binding smoke (opt-in, ~10 min, manual)
         if: >
           github.event_name == 'workflow_dispatch' &&
diff --git a/src/bridge/__init__.py b/src/bridge/__init__.py
@@ -0,0 +1,202 @@
+"""src/bridge — Campaign-1 → Campaign-2 rate-law emitter.
+
+This module is the EXIT PATH from the professor's "closed
+ontology / circular validation" critique. Campaign-2 pathway
+models will not hand-tune binding rate constants; they will
+cite this module's output with explicit method provenance and
+calibrated uncertainty.
+
+What ships here today (Campaign-1 scope)
+----------------------------------------
+- `binding_to_hill(dG_kcalmol, *, T_K=298.15, n_hill=1.0,
+   uncertainty_kcalmol=None)` — convert an absolute binding ΔG
+   (typically from `cellsim fep-binding bench --sample`) into a
+   Hill-equation rate-law record with CI.
+- `affinity_to_michaelis(kcat_per_s, KM_M, *,
+   uncertainty_kcat=None, uncertainty_KM=None)` — same idea for
+   enzyme kinetics. Pass-through for now since FEP doesn't
+   produce kcat directly; the module exists so Campaign-2
+   loaders have a single place to import from.
+
+Provenance dataclass (`RateLawPrior`) carries:
+  - the rate-law type ('hill' | 'michaelis')
+  - the parameter values + CI (95%)
+  - source ('FEP', 'phenomenological', 'literature')
+  - method tag (default 'FEP-DDM-MBAR', or whatever produced the input)
+  - timestamp
+
+Why a thin module: Campaign-2's signalling ODEs read these
+priors at config time. Centralising the conversion + provenance
+here means any Campaign-2 model can be audited back to a
+specific FEP run; if we later switch from amber14 to ff19SB,
+the audit trail tells us which rate-law records need re-emission.
+
+Scope NOT in this commit (Campaign-2 work, gated by prof's
+Milestone-A clearance):
+  - Anything that runs a cell-level ODE.
+  - Bulk emission to a Campaign-2 YAML config.
+  - Cache integration (the `(ligand_hash, receptor_hash) → ΔG`
+    lookup that lets Campaign 2 ingest 10⁴ compounds).
+
+Non-AI: the conversions are closed-form thermodynamics
+(Kd = exp(ΔG/RT)) + propagation of σ via the dG-uncertainty.
+No learned surrogate.
+"""
+from __future__ import annotations
+
+import math
+import time
+from dataclasses import dataclass, field
+from typing import Optional
+
+
+# Molar gas constant R (= k_B × N_A) in kcal/(mol·K), the unit FEP outputs use.
+_KB_KCAL_PER_MOL_K = 0.0019872041
+
+
+@dataclass
+class RateLawPrior:
+    """Provenance-tracked rate-law record consumed by Campaign-2.
+
+    Concentrations are in molar (M = mol/L) and rates in s⁻¹ so a
+    Campaign-2 ODE can plug them in without unit conversion.
+    """
+    type: str                       # 'hill' or 'michaelis'
+    parameters: dict                # see per-type docs below
+    parameter_ci95: dict            # 2-tuples (lo, hi) per param
+    source: str                     # 'FEP' | 'phenomenological' | 'literature'
+    method: str                     # e.g. 'FEP-DDM-MBAR'
+    temperature_K: float
+    timestamp_iso: str = field(
+        default_factory=lambda: time.strftime("%Y-%m-%dT%H:%M:%SZ",
+                                                time.gmtime()))
+    notes: str = ""
+
+    def summary(self) -> str:
+        """One-line summary suitable for Campaign-2 config logs."""
+        params = ", ".join(
+            f"{k}={v}" for k, v in self.parameters.items())
+        return (f"[{self.type}] {params}  "
+                f"src={self.source}/{self.method}  "
+                f"T={self.temperature_K:.1f}K")
+
+
+def binding_to_hill(
+    dG_kcalmol: float,
+    *,
+    T_K: float = 298.15,
+    n_hill: float = 1.0,
+    uncertainty_kcalmol: Optional[float] = None,
+    method: str = "FEP-DDM-MBAR",
+) -> RateLawPrior:
+    """Absolute binding ΔG → Hill-equation rate-law prior.
+
+    Hill equation:
+        fraction_bound([L]) = [L]^n / (K_d^n + [L]^n)
+
+    where K_d = exp(ΔG / RT). Uncertainty in ΔG (1σ) propagates
+    multiplicatively to K_d:
+        σ(ln K_d) = σ(ΔG) / RT
+        K_d_lo = K_d × exp(-1.96 × σ_ln)   (95% CI lower)
+        K_d_hi = K_d × exp(+1.96 × σ_ln)
+
+    Args:
+        dG_kcalmol: absolute binding free energy. Negative for
+            binders; sign convention matches `compute_absolute_
+            binding_dg` output.
+        T_K: temperature (default 298.15 K, i.e. 25 °C, not the
+            physiological 310.15 K; matches the FEP sampler default).
+        n_hill: Hill cooperativity coefficient (default 1.0 —
+            non-cooperative single-site binding; Campaign-2
+            allosteric models override).
+        uncertainty_kcalmol: 1σ uncertainty on ΔG (typically MBAR
+            σ from the bench CSV's uncertainty_kcalmol column).
+            None → no CI propagated; the resulting record will
+            have parameter_ci95 = {} so the consumer knows the
+            estimate is uncalibrated.
+        method: provenance string (default 'FEP-DDM-MBAR'). The
+            source tag is derived from its prefix: 'FEP...' gives
+            'FEP', anything else gives 'phenomenological' (e.g.
+            'phenomenological-Lipinski' for OLD/ Tier-0 BindingMatcher).
+
+    Returns: RateLawPrior with parameters['Kd_M'] = K_d in M
+             (NOT mM), parameters['n_hill'] = n_hill.
+    """
+    RT = _KB_KCAL_PER_MOL_K * T_K
+    Kd_M = math.exp(dG_kcalmol / RT)
+
+    parameters = {"Kd_M": Kd_M, "n_hill": float(n_hill)}
+    parameter_ci95: dict = {}
+    if uncertainty_kcalmol is not None and uncertainty_kcalmol > 0:
+        sigma_ln_Kd = uncertainty_kcalmol / RT
+        Kd_lo = Kd_M * math.exp(-1.96 * sigma_ln_Kd)
+        Kd_hi = Kd_M * math.exp(+1.96 * sigma_ln_Kd)
+        parameter_ci95 = {"Kd_M": (Kd_lo, Kd_hi)}
+
+    return RateLawPrior(
+        type="hill",
+        parameters=parameters,
+        parameter_ci95=parameter_ci95,
+        source="FEP" if method.startswith("FEP") else "phenomenological",
+        method=method,
+        temperature_K=T_K,
+    )
+
+
+def affinity_to_michaelis(
+    kcat_per_s: float,
+    KM_M: float,
+    *,
+    T_K: float = 298.15,
+    uncertainty_kcat: Optional[float] = None,
+    uncertainty_KM: Optional[float] = None,
+    method: str = "literature",
+) -> RateLawPrior:
+    """Enzyme kinetics → Michaelis-Menten rate-law prior.
+
+    Pass-through for now: FEP doesn't directly produce kcat (that
+    requires transition-state theory + a reactive sub-tier we
+    haven't implemented). The function exists so Campaign-2
+    loaders have a single import surface — when CellSim grows a
+    reactive-FEP / Eyring TS module, the body changes here and
+    every downstream consumer benefits.
+
+    Args:
+        kcat_per_s: turnover number in s⁻¹.
+        KM_M: Michaelis constant in M.
+        T_K: temperature.
+        uncertainty_kcat: 1σ on kcat in s⁻¹ (default None → no CI).
+        uncertainty_KM: 1σ on KM in M (default None → no CI).
+        method: provenance ('literature' if the values are from
+            BRENDA / SABIO; 'reactive-FEP' once that ships).
+
+    Returns: RateLawPrior with type='michaelis'.
+    """
+    parameters = {"kcat_per_s": kcat_per_s, "KM_M": KM_M}
+    parameter_ci95: dict = {}
+    if uncertainty_kcat is not None and uncertainty_kcat > 0:
+        parameter_ci95["kcat_per_s"] = (  # lower bound clamped at 0
+            max(0.0, kcat_per_s - 1.96 * uncertainty_kcat),
+            kcat_per_s + 1.96 * uncertainty_kcat)
+    if uncertainty_KM is not None and uncertainty_KM > 0:
+        parameter_ci95["KM_M"] = (  # lower bound clamped at 0
+            max(0.0, KM_M - 1.96 * uncertainty_KM),
+            KM_M + 1.96 * uncertainty_KM)
+
+    return RateLawPrior(
+        type="michaelis",
+        parameters=parameters,
+        parameter_ci95=parameter_ci95,
+        source="FEP" if method.startswith("reactive-FEP") else
+               "literature" if method == "literature" else
+               "phenomenological",
+        method=method,
+        temperature_K=T_K,
+    )
+
+
+__all__ = [
+    "RateLawPrior",
+    "binding_to_hill",
+    "affinity_to_michaelis",
+]
diff --git a/tests/bridge/__init__.py b/tests/bridge/__init__.py
diff --git a/tests/bridge/test_binding_to_hill_smoke.py b/tests/bridge/test_binding_to_hill_smoke.py
@@ -0,0 +1,122 @@
+"""src/bridge — Campaign-1 → Campaign-2 rate-law emitter smoke.
+
+Pin the closed-form thermodynamic conversions so a future
+refactor of the Hill / Michaelis primitives can't silently
+drift the K_d / CI numbers Campaign-2 will consume.
+"""
+from __future__ import annotations
+
+import math
+import sys
+from pathlib import Path
+
+REPO_ROOT = Path(__file__).resolve().parents[2]
+sys.path.insert(0, str(REPO_ROOT))
+
+from src.bridge import (  # noqa: E402
+    RateLawPrior,
+    binding_to_hill,
+    affinity_to_michaelis,
+)
+
+
+def test_binding_to_hill_kd_matches_closed_form():
+    """ΔG = -10 kcal/mol at 298.15 K should give K_d ≈ 50 nM
+    (the canonical 'good drug' affinity)."""
+    prior = binding_to_hill(-10.0)
+    Kd_M = prior.parameters["Kd_M"]
+    # K_d = exp(-10 / (R*T)) at T=298.15
+    expected = math.exp(-10.0 / (0.0019872041 * 298.15))
+    assert abs(Kd_M - expected) < 1e-15
+    # Sanity: ~50 nM (5e-8 M)
+    assert 1e-8 < Kd_M < 1e-7, f"K_d {Kd_M} M not in nM range"
+
+
+def test_binding_to_hill_uncertainty_propagates_multiplicatively():
+    """1σ on ΔG → multiplicative CI on K_d (since K_d is
+    exponential in ΔG)."""
+    prior = binding_to_hill(-10.0, uncertainty_kcalmol=0.5)
+    Kd_M = prior.parameters["Kd_M"]
+    Kd_lo, Kd_hi = prior.parameter_ci95["Kd_M"]
+
+    # 95% CI is ±1.96σ in ΔG → exp(±1.96σ/RT) in K_d.
+    sigma_ln = 0.5 / (0.0019872041 * 298.15)
+    expected_lo = Kd_M * math.exp(-1.96 * sigma_ln)
+    expected_hi = Kd_M * math.exp(+1.96 * sigma_ln)
+    assert abs(Kd_lo - expected_lo) / expected_lo < 1e-12
+    assert abs(Kd_hi - expected_hi) / expected_hi < 1e-12
+
+    # CI must be symmetric on a log scale (asymmetric on a linear one):
+    # log10(Kd_hi/Kd_M) should equal log10(Kd_M/Kd_lo).
+    geom_lo = math.log10(Kd_M / Kd_lo)
+    geom_hi = math.log10(Kd_hi / Kd_M)
+    assert abs(geom_lo - geom_hi) < 1e-12
+
+
+def test_binding_to_hill_no_uncertainty_no_ci():
+    """No σ → no CI populated. Campaign-2 consumer can branch
+    on `bool(prior.parameter_ci95)` to know if estimate is
+    calibrated."""
+    prior = binding_to_hill(-10.0)
+    assert prior.parameter_ci95 == {}
+    assert prior.source == "FEP"  # default method starts with 'FEP'
+    assert prior.method == "FEP-DDM-MBAR"
+
+
+def test_binding_to_hill_phenomenological_source_tag():
+    """When the input came from the OLD/ Tier-0 BindingMatcher
+    (descriptor-fit), the consumer needs to know the prior is
+    NOT physics-grounded — propagate that via source='phenomenological'."""
+    prior = binding_to_hill(
+        -7.0, method="phenomenological-Lipinski",
+        uncertainty_kcalmol=2.5)  # inflated σ for the descriptor fit
+    assert prior.source == "phenomenological"
+    assert "phenomenological" in prior.method
+
+
+def test_affinity_to_michaelis_passes_through():
+    """Pass-through impl for now. Just verify the dataclass shape
+    is sane so Campaign-2 loaders can unpack consistently."""
+    prior = affinity_to_michaelis(
+        kcat_per_s=100.0, KM_M=1e-5,
+        uncertainty_kcat=10.0, uncertainty_KM=2e-6)
+    assert prior.type == "michaelis"
+    assert prior.parameters == {"kcat_per_s": 100.0, "KM_M": 1e-5}
+    assert "kcat_per_s" in prior.parameter_ci95
+    assert "KM_M" in prior.parameter_ci95
+    # Default method is 'literature'
+    assert prior.source == "literature"
+
+
+def test_summary_one_line():
+    prior = binding_to_hill(-12.0, uncertainty_kcalmol=0.4)
+    s = prior.summary()
+    assert s.startswith("[hill]")
+    assert "src=FEP" in s
+    assert "T=298.1K" in s
+
+
+if __name__ == "__main__":
+    funcs = [
+        test_binding_to_hill_kd_matches_closed_form,
+        test_binding_to_hill_uncertainty_propagates_multiplicatively,
+        test_binding_to_hill_no_uncertainty_no_ci,
+        test_binding_to_hill_phenomenological_source_tag,
+        test_affinity_to_michaelis_passes_through,
+        test_summary_one_line,
+    ]
+    fails = []
+    for f in funcs:
+        try:
+            f()
+            print(f"[PASS] {f.__name__}")
+        except AssertionError as e:
+            print(f"[FAIL] {f.__name__}: {e}")
+            fails.append(f.__name__)
+        except Exception as e:
+            import traceback
+            traceback.print_exc()
+            print(f"[ERROR] {f.__name__}: {e}")
+            fails.append(f.__name__)
+    print(f"{len(funcs) - len(fails)}/{len(funcs)} PASS")
+    sys.exit(0 if not fails else 1)