Skip to content

Commit f1f3773

Browse files
docs: add contacts, distances, dssp, projections example notebooks
New examples/gromacs/ notebooks completing coverage of mdpp.analysis: - contacts.ipynb: compute_contacts, compute_contact_frequency, compute_native_contacts, contact map heatmap, Q(t) plot - distances.ipynb: compute_distances, compute_minimum_distance, backend selection comparison - dssp.ipynb: compute_dssp, per-residue frequency bar chart, time evolution heatmap - projections.ipynb: PCA scree/cumulative variance, PCA/TICA projection scatter (colored by time), Ramachandran diagram Updated CLAUDE.md to list all 9 example notebooks individually.
1 parent e987f54 commit f1f3773

5 files changed

Lines changed: 686 additions & 1 deletion

File tree

CLAUDE.md

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -66,7 +66,16 @@ scripts/ # shell scripts (NOT packaged, copy to MD working directori
6666
│ ├── quickrun/ # quickrun.sh, quickrun.sbatch
6767
│ └── runtime/ # check_status.sh, monitor.sbatch
6868
examples/ # worked examples and notebooks
69-
├── gromacs/ # GROMACS analysis notebooks (RMSD, RMSF, DCCM, FES, I/O)
69+
├── gromacs/ # GROMACS analysis notebooks
70+
│ ├── io_preprocessing.ipynb # trajectory loading, atom selection, alignment
71+
│ ├── rmsd_rmsf.ipynb # RMSD, RMSF, SASA, radius of gyration
72+
│ ├── dccm.ipynb # DCCM, hydrogen bonds
73+
│ ├── contacts.ipynb # inter-residue contacts, contact frequency, Q(t)
74+
│ ├── distances.ipynb # pairwise distances, minimum distance, backends
75+
│ ├── dssp.ipynb # secondary structure (DSSP)
76+
│ ├── fes.ipynb # PCA/TICA + free energy surfaces
77+
│ ├── projections.ipynb # scree plot, cumulative variance, Ramachandran
78+
│ └── clustering.ipynb # all 7 clustering methods (Gromos/DBSCAN/...)
7079
├── openfe/ # OpenFE RBFE workflow notebook + input PDBs
7180
└── browndye/ # BrownDye2 complex PQR preparation
7281

examples/gromacs/contacts.ipynb

Lines changed: 161 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,161 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"id": "a1b2c3d4",
6+
"metadata": {},
7+
"source": [
8+
"# `mdpp` Example: Contact Analysis\n",
9+
"\n",
10+
"This notebook demonstrates inter-residue contact analysis:\n",
11+
"\n",
12+
"- `compute_contacts` -- per-frame inter-residue distances\n",
13+
"- `compute_contact_frequency` -- contact frequency maps\n",
14+
"- `compute_native_contacts` -- fraction of native contacts Q(t)\n",
15+
"\n",
16+
"Use this notebook after the IO/alignment workflow.\n"
17+
]
18+
},
19+
{
20+
"cell_type": "code",
21+
"execution_count": null,
22+
"id": "e5f6a7b8",
23+
"metadata": {},
24+
"outputs": [],
25+
"source": [
26+
"from __future__ import annotations\n",
27+
"\n",
28+
"from pathlib import Path\n",
29+
"\n",
30+
"import matplotlib.pyplot as plt\n",
31+
"import numpy as np\n",
32+
"from mplplots.utils import auto_ticks\n",
33+
"\n",
34+
"from mdpp.analysis.contacts import (\n",
35+
" compute_contact_frequency,\n",
36+
" compute_contacts,\n",
37+
" compute_native_contacts,\n",
38+
")\n",
39+
"from mdpp.core.trajectory import align_trajectory, load_trajectory\n",
40+
"from mdpp.plots.contacts import contact_frequency_to_matrix, plot_contact_map\n",
41+
"from mdpp.plots.timeseries import plot_native_contacts\n",
42+
"\n",
43+
"plt.style.use(\"mplplots.styles.GraphPadPrism\")"
44+
]
45+
},
46+
{
47+
"cell_type": "code",
48+
"execution_count": null,
49+
"id": "c9d0e1f2",
50+
"metadata": {},
51+
"outputs": [],
52+
"source": [
53+
"TOPOLOGY_PATH = Path(\"/path/to/topology.pdb\")\n",
54+
"TRAJECTORY_PATH = Path(\"/path/to/trajectory.xtc\")\n",
55+
"STRIDE = 5\n",
56+
"\n",
57+
"if not TOPOLOGY_PATH.exists() or not TRAJECTORY_PATH.exists():\n",
58+
" raise FileNotFoundError(\n",
59+
" \"Update TOPOLOGY_PATH and TRAJECTORY_PATH before running analysis cells.\"\n",
60+
" )\n",
61+
"\n",
62+
"traj = load_trajectory(\n",
63+
" trajectory_path=TRAJECTORY_PATH,\n",
64+
" topology_path=TOPOLOGY_PATH,\n",
65+
" stride=STRIDE,\n",
66+
" atom_selection=\"protein\",\n",
67+
")\n",
68+
"traj = align_trajectory(traj, atom_selection=\"name CA\")\n",
69+
"\n",
70+
"print(f\"Frames: {traj.n_frames}, Atoms: {traj.n_atoms}\")"
71+
]
72+
},
73+
{
74+
"cell_type": "markdown",
75+
"id": "a3b4c5d6",
76+
"metadata": {},
77+
"source": [
78+
"## Inter-Residue Contacts\n"
79+
]
80+
},
81+
{
82+
"cell_type": "code",
83+
"execution_count": null,
84+
"id": "e7f8a9b0",
85+
"metadata": {},
86+
"outputs": [],
87+
"source": [
88+
"contacts = compute_contacts(traj, scheme=\"closest-heavy\")\n",
89+
"\n",
90+
"print(f\"Pairs: {contacts.residue_pairs.shape[0]}\")\n",
91+
"print(f\"Mean distance: {contacts.distances_nm.mean():.3f} nm\")"
92+
]
93+
},
94+
{
95+
"cell_type": "markdown",
96+
"id": "c1d2e3f4",
97+
"metadata": {},
98+
"source": [
99+
"## Contact Frequency Map\n"
100+
]
101+
},
102+
{
103+
"cell_type": "code",
104+
"execution_count": null,
105+
"id": "a5b6c7d8",
106+
"metadata": {},
107+
"outputs": [],
108+
"source": [
109+
"frequency, pairs = compute_contact_frequency(traj, cutoff_nm=0.45, scheme=\"closest-heavy\")\n",
110+
"n_residues = traj.topology.n_residues\n",
111+
"matrix = contact_frequency_to_matrix(frequency, pairs, n_residues)\n",
112+
"\n",
113+
"fig, ax = plt.subplots(figsize=(7, 6), dpi=120)\n",
114+
"plot_contact_map(matrix, ax=ax)\n",
115+
"ax.set_title(\"Contact Frequency Map (cutoff = 0.45 nm)\")\n",
116+
"auto_ticks(ax)\n",
117+
"fig.tight_layout()"
118+
]
119+
},
120+
{
121+
"cell_type": "markdown",
122+
"id": "e9f0a1b2",
123+
"metadata": {},
124+
"source": [
125+
"## Native Contacts Q(t)\n"
126+
]
127+
},
128+
{
129+
"cell_type": "code",
130+
"execution_count": null,
131+
"id": "c3d4e5f6",
132+
"metadata": {},
133+
"outputs": [],
134+
"source": [
135+
"native = compute_native_contacts(traj, reference_frame=0, cutoff_nm=0.45, scheme=\"closest-heavy\")\n",
136+
"\n",
137+
"print(f\"Native pairs: {native.native_pairs.shape[0]}\")\n",
138+
"print(f\"Mean Q: {native.fraction.mean():.3f}\")\n",
139+
"\n",
140+
"fig, ax = plt.subplots(figsize=(10, 4), dpi=120)\n",
141+
"plot_native_contacts(native, ax=ax, label=\"Q(t)\")\n",
142+
"ax.set_title(\"Fraction of Native Contacts\")\n",
143+
"auto_ticks(ax)\n",
144+
"fig.tight_layout()"
145+
]
146+
}
147+
],
148+
"metadata": {
149+
"kernelspec": {
150+
"display_name": "mdpp",
151+
"language": "python",
152+
"name": "python3"
153+
},
154+
"language_info": {
155+
"name": "python",
156+
"version": "3.13.12"
157+
}
158+
},
159+
"nbformat": 4,
160+
"nbformat_minor": 5
161+
}

examples/gromacs/distances.ipynb

Lines changed: 169 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,169 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"id": "a1b2c3d4",
6+
"metadata": {},
7+
"source": [
8+
"# `mdpp` Example: Pairwise Distances\n",
9+
"\n",
10+
"Track specific atom-pair and inter-group distances over time.\n",
11+
"\n",
12+
"- `compute_distances`\n",
13+
"- `compute_minimum_distance`\n",
14+
"\n",
15+
"Use this notebook after the IO/alignment workflow.\n"
16+
]
17+
},
18+
{
19+
"cell_type": "code",
20+
"execution_count": null,
21+
"id": "e5f6a7b8",
22+
"metadata": {},
23+
"outputs": [],
24+
"source": [
25+
"from __future__ import annotations\n",
26+
"\n",
27+
"from pathlib import Path\n",
28+
"\n",
29+
"import matplotlib.pyplot as plt\n",
30+
"import numpy as np\n",
31+
"from mplplots.utils import auto_ticks\n",
32+
"\n",
33+
"from mdpp.analysis.distance import compute_distances, compute_minimum_distance\n",
34+
"from mdpp.core.trajectory import load_trajectory, select_atom_indices\n",
35+
"from mdpp.plots.timeseries import plot_distances\n",
36+
"\n",
37+
"plt.style.use(\"mplplots.styles.GraphPadPrism\")"
38+
]
39+
},
40+
{
41+
"cell_type": "code",
42+
"execution_count": null,
43+
"id": "c9d0e1f2",
44+
"metadata": {},
45+
"outputs": [],
46+
"source": [
47+
"TOPOLOGY_PATH = Path(\"/path/to/topology.pdb\")\n",
48+
"TRAJECTORY_PATH = Path(\"/path/to/trajectory.xtc\")\n",
49+
"STRIDE = 5\n",
50+
"\n",
51+
"if not TOPOLOGY_PATH.exists() or not TRAJECTORY_PATH.exists():\n",
52+
" raise FileNotFoundError(\n",
53+
" \"Update TOPOLOGY_PATH and TRAJECTORY_PATH before running analysis cells.\"\n",
54+
" )\n",
55+
"\n",
56+
"traj = load_trajectory(\n",
57+
" trajectory_path=TRAJECTORY_PATH,\n",
58+
" topology_path=TOPOLOGY_PATH,\n",
59+
" stride=STRIDE,\n",
60+
" atom_selection=\"protein\",\n",
61+
")\n",
62+
"\n",
63+
"print(f\"Frames: {traj.n_frames}, Atoms: {traj.n_atoms}\")"
64+
]
65+
},
66+
{
67+
"cell_type": "markdown",
68+
"id": "a3b4c5d6",
69+
"metadata": {},
70+
"source": [
71+
"## Pairwise Atom Distances\n"
72+
]
73+
},
74+
{
75+
"cell_type": "code",
76+
"execution_count": null,
77+
"id": "e7f8a9b0",
78+
"metadata": {},
79+
"outputs": [],
80+
"source": [
81+
"# Track distances between specific atom pairs (e.g., CA atoms)\n",
82+
"ca_indices = select_atom_indices(traj.topology, \"name CA\")\n",
83+
"pairs = np.array([[ca_indices[0], ca_indices[-1]], [ca_indices[10], ca_indices[50]]])\n",
84+
"\n",
85+
"result = compute_distances(traj, atom_pairs=pairs)\n",
86+
"print(f\"Pairs monitored: {result.atom_pairs.shape[0]}\")\n",
87+
"print(f\"Frames: {result.distances_nm.shape[0]}\")\n",
88+
"\n",
89+
"fig, ax = plt.subplots(figsize=(10, 4), dpi=120)\n",
90+
"plot_distances(result, ax=ax, pair_labels=[\"CA_first-CA_last\", \"CA_10-CA_50\"])\n",
91+
"ax.set_title(\"Pairwise CA Distances\")\n",
92+
"auto_ticks(ax)\n",
93+
"fig.tight_layout()"
94+
]
95+
},
96+
{
97+
"cell_type": "markdown",
98+
"id": "c1d2e3f4",
99+
"metadata": {},
100+
"source": [
101+
"## Minimum Distance Between Groups\n"
102+
]
103+
},
104+
{
105+
"cell_type": "code",
106+
"execution_count": null,
107+
"id": "a5b6c7d8",
108+
"metadata": {},
109+
"outputs": [],
110+
"source": [
111+
"# Minimum distance between two residue groups\n",
112+
"min_dist = compute_minimum_distance(traj, group1=\"resid 10\", group2=\"resid 50\")\n",
113+
"\n",
114+
"fig, ax = plt.subplots(figsize=(10, 4), dpi=120)\n",
115+
"plot_distances(min_dist, ax=ax, pair_labels=[\"min(resid 10 - resid 50)\"])\n",
116+
"ax.set_title(\"Minimum Inter-Residue Distance\")\n",
117+
"auto_ticks(ax)\n",
118+
"fig.tight_layout()"
119+
]
120+
},
121+
{
122+
"cell_type": "markdown",
123+
"id": "e9f0a1b2",
124+
"metadata": {},
125+
"source": [
126+
"## Backend Selection\n",
127+
"\n",
128+
"Five backends are available for pairwise distance computation:\n",
129+
"\n",
130+
"| Backend | Device | PBC | Dependency |\n",
131+
"|---------|--------|-----|------------|\n",
132+
"| `mdtraj` (default) | CPU (1 thread) | Yes | built-in |\n",
133+
"| `numba` | CPU (all cores) | No | built-in (numba) |\n",
134+
"| `cupy` | GPU (CUDA) | No | `pip install cupy-cuda12x` |\n",
135+
"| `torch` | GPU (CUDA) / CPU | No | `pip install torch` |\n",
136+
"| `jax` | GPU / TPU / CPU | No | `pip install jax[cuda12]` |\n",
137+
"\n",
138+
"Only `mdtraj` supports periodic boundary conditions. Use `numba` for\n",
139+
"faster non-periodic computation on multi-core machines.\n"
140+
]
141+
},
142+
{
143+
"cell_type": "code",
144+
"execution_count": null,
145+
"id": "c3d4e5f6",
146+
"metadata": {},
147+
"outputs": [],
148+
"source": [
149+
"# Numba backend: faster for non-periodic systems\n",
150+
"result_numba = compute_distances(traj, atom_pairs=pairs, backend=\"numba\", periodic=False)\n",
151+
"np.testing.assert_allclose(result.distances_nm, result_numba.distances_nm, atol=1e-5)\n",
152+
"print(\"mdtraj and numba backends agree.\")"
153+
]
154+
}
155+
],
156+
"metadata": {
157+
"kernelspec": {
158+
"display_name": "mdpp",
159+
"language": "python",
160+
"name": "python3"
161+
},
162+
"language_info": {
163+
"name": "python",
164+
"version": "3.13.12"
165+
}
166+
},
167+
"nbformat": 4,
168+
"nbformat_minor": 5
169+
}

0 commit comments

Comments
 (0)