
Commit cea2ef6

Small edits
1 parent ae1eabd commit cea2ef6

2 files changed

Lines changed: 33 additions & 13 deletions


-36.3 KB

education/HADDOCK24/HADDOCK24-protein-glycan/index.md

Lines changed: 33 additions & 13 deletions
@@ -24,7 +24,7 @@ by glycosidic bonds. Glycans are involved in a wide range of biological processe
2424
cell-cell recognition, cell adhesion, and immune response. Glycans are highly diverse and complex
2525
in their structure, as they can involve multiple *branches* and different *linkages*, namely different ways
2626
in which a glycosidic bond can connect two monosaccharides. This complexity together with their flexibility
27-
makes the prediction of glycan-protein interactions a challenging task.
27+
makes the prediction of protein-glycan interactions a challenging task.
2828

2929
In this tutorial we will be working with *Family 16 Carbohydrate Binding Domain Module 1* of the *Caldanaerobius polysaccharolyticus* thermophile
3030
(PDB code [2ZEW](https://www.ebi.ac.uk/pdbe/entry/pdb/2ZEW){:target="_blank"}) and a linear homopolymer,
@@ -92,13 +92,13 @@ Can you already identify a possible binding site for a long, linear, unbranched
9292

9393
Here we assume that we have enough information about the glycan binding site on the protein, but no knowledge about which monosaccharide units are relevant for the binding. In this case (see Fig. 1), all the five monosaccharide units are at the interface, although this might not be true in general, especially when longer glycans are considered.
9494

95-
The following residues correspond to the protein binding site, as calculated from the crystal structure of the complex are:
95+
The residues corresponding to the glycan binding site on the protein (calculated from the crystal structure of the complex) are:
9696

9797
<pre style="background-color:#DAE4E7">
9898
23,24,80,82,84,96,98,100,124,126,128
9999
</pre>
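
The comma-separated residue list above is the form used when defining residues in the docking setup, whereas PyMOL selections join residue numbers with `+`. As a minimal sketch (the helper name `to_pymol_selection` and the default chain `A` are assumptions made for illustration, not part of the tutorial), the conversion could look like this:

```python
def to_pymol_selection(residues: str, chain: str = "A") -> str:
    """Turn a comma-separated residue list into a PyMOL selection string.

    `residues` is a string such as "23,24,80"; `chain` is the chain ID
    (assumed here to be "A" for the protein).
    """
    # Parse each token as an integer so that stray spaces or typos fail early.
    numbers = [int(token) for token in residues.split(",") if token.strip()]
    # PyMOL selections use "+" to combine residue numbers.
    return f"chain {chain} and (resi " + "+".join(str(n) for n in numbers) + ")"


binding_site = "23,24,80,82,84,96,98,100,124,126,128"
print(to_pymol_selection(binding_site))
```

The printed string can be pasted after a `select <name>,` command in the PyMOL command window.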
100100

101-
Let us visualize the interface on our unbound protein structure. For this start PyMol and load the PDB file of the unbound protein:
101+
Let us visualize this interface on our unbound protein structure. For this, start PyMOL and load the PDB file of the unbound protein:
102102

103103
<a class="prompt prompt-pymol">
104104
File menu -> Open -> select 2ZEW_clean.pdb
@@ -160,13 +160,13 @@ fetch 2ZEX
160160
align 2ZEX_l_u, 2ZEX, cycles=0
161161
</a>
162162

163-
The `cycles=0` option will make sure that no atoms are neglected during the alignment and RMSD calculation. You can check what happens if you don't use this option.
163+
The `cycles=0` option will make sure that no atoms are neglected during the alignment and RMSD calculation. You can check what happens if you do not use this option.
164164

165165
<a class="prompt prompt-pymol">
166166
align 2ZEX_l_u, 2ZEX
167167
</a>
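
To see why the two commands can report different RMSD values, the following simplified Python sketch mimics outlier rejection on already-paired atoms. It is an illustration under stated assumptions, not PyMOL's implementation: the function names are hypothetical, the default values (`cycles=5`, `cutoff=2.0`) are taken to match PyMOL's documented `align` defaults, and the superposition refit that PyMOL performs at each cycle is deliberately left out.

```python
import math


def rmsd(pairs):
    """Root-mean-square deviation over a list of (a, b) coordinate pairs."""
    squared = [math.dist(a, b) ** 2 for a, b in pairs]
    return math.sqrt(sum(squared) / len(squared))


def rmsd_with_rejection(pairs, cycles=5, cutoff=2.0):
    """Return (rmsd, number_of_pairs_kept) after outlier rejection.

    In each cycle, pairs deviating by more than `cutoff` times the current
    RMSD are dropped. With cycles=0 no pair is dropped, so the RMSD is
    computed over all atoms.
    """
    kept = list(pairs)
    for _ in range(cycles):
        threshold = cutoff * rmsd(kept)
        filtered = [(a, b) for a, b in kept if math.dist(a, b) <= threshold]
        # Stop when nothing was rejected (or when everything would be).
        if len(filtered) == len(kept) or not filtered:
            break
        kept = filtered
    return rmsd(kept), len(kept)


# Toy data: nine well-matched atom pairs and one poorly matched pair.
pairs = [((float(i), 0.0, 0.0), (i + 0.1, 0.0, 0.0)) for i in range(9)]
pairs.append(((20.0, 0.0, 0.0), (25.0, 0.0, 0.0)))

print(rmsd_with_rejection(pairs, cycles=0))  # all ten pairs contribute
print(rmsd_with_rejection(pairs))            # the outlier pair is rejected
```

In this toy case the single badly placed atom dominates the all-atom value, while the default settings discard it and report a much lower number over fewer atoms. This is why the count of atoms used should be read together with the RMSD.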
168168

169-
<a class="prompt prompt-question">What is the RMSD between the two glycan structures? In which of the five monosaccharide units is the model accurate? In which ones is it not?</a>
169+
<a class="prompt prompt-question">What is the RMSD between the two glycan structures (find the value reported by PyMOL)? In which of the five monosaccharide units is the model the most accurate? In which ones is it not?</a>
170170

171171
<figure align="center">
172172
<img width="90%" src="/education/HADDOCK24/HADDOCK24-protein-glycan/glycan_2zex_comparison.png">
@@ -176,7 +176,7 @@ align 2ZEX_l_u, 2ZEX
176176
</center>
177177
<br>
178178

179-
The two structures are pretty close to each other..let's see if HADDOCK can create a reasonable model of the interaction!
179+
The two structures are pretty close to each other... Let us next see if HADDOCK can create a reasonable model of the interaction!
180180

181181
<hr>
182182
<hr>
@@ -247,7 +247,7 @@ If everything went well, the interface window should have updated itself and it
247247
Active residues (directly involved in the interaction) -> 23,24,80,82,84,96,98,100,124,126,128
248248
</a>
249249

250-
Then uncheck the option to automatically define passive residues: in our case we're defining the whole protein pocket as active, so this is not required.
250+
Then uncheck the option to automatically define passive residues: in our case we are defining the whole protein pocket as active, and the glycan will be defined as passive only, so this is not required.
251251

252252
<a class="prompt prompt-info">Automatically define passive residues around the active residues -> **uncheck** (checked by default)
253253
</a>
@@ -263,15 +263,15 @@ By default HADDOCK will automatically filter our residues that have a relative s
263263

264264
* **Step 7:** Specify the residues for the second molecule. For this, unfold the "Molecule 2 - parameters" if not already unfolded.
265265

266-
Here we want to select the full glycan as passive, as we don't know whether all the monosaccharide units take part in the interaction.
266+
Here we want to select the full glycan as passive, as we do not know whether all the monosaccharide units take part in the interaction.
267267

268268
<a class="prompt prompt-info">Automatically define passive residues around the active residues -> **uncheck** (checked by default)
269269
</a>
270270

271271
<a class="prompt prompt-info">Click on the sequence (XXXXX) box to select the whole sequence of the glycan.
272272
</a>
273273

274-
<a class="prompt prompt-info">Cut the active residues selection (1,2,3,4,5) and paste it to the passive residues box.
274+
<a class="prompt prompt-info">This should automatically add the 5 glycan residues (1,2,3,4,5) to the passive residues box.
275275
</a>
276276

277277
<a class="prompt prompt-info">Click on the Visualize residues button and make sure all the glycan monosaccharide units have been selected. They should be highlighted in green to indicate that they are selected as passive.
@@ -283,7 +283,7 @@ Here we want to select the full glycan as passive, as we don't know whether all
283283

284284
Here we will tweak a few parameters to make the docking more accurate.
285285

286-
**Sampling parameters** : here let's just remove the final refinement step, as it is not needed for this type of docking.
286+
**Sampling parameters**: Here we will remove the final refinement step since, as in the case of small ligands, it is not recommended for this type of molecule.
287287

288288
<a class="prompt prompt-info">Sampling parameters -> Perform final refinement -> **uncheck** (checked by default)
289289
</a>
@@ -330,6 +330,7 @@ In case you do not want to wait for your runs to be finished, a precalculated ru
330330
<a class="prompt prompt-question">Inspect the result page</a>
331331

332332
<a class="prompt prompt-question">How many clusters are generated?</a>
333+
333334
In the figure below you can see different parts of the result page.
334335

335336
**In A** the result page reports the number of clusters and for the top 10 clusters also the related statistics (HADDOCK score, Size, RMSD, Energies, BSA and Z-score).
@@ -418,15 +419,34 @@ rms_cur 2ZEX_target and chain B, cluster10_1 and chain B <br>
418419
What is the l-RMSD of the best model of the top cluster? What about the second and third clusters? Which of them is the best one?
419420
</a>
420421

421-
<a class="prompt prompt-question">Did the glycan conformation improve thanks to the refinement in any of the selected models?</a>
422+
423+
Let us now focus on the conformation of the glycan itself.
424+
425+
<a class="prompt prompt-question">Did the flexible refinement improve the glycan conformation?</a>
422426

423427
To address this question you can use the standard align command, focusing on chain B:
424428

425429
<a class="prompt prompt-pymol">
426430
align cluster10_1 and chain B, 2ZEX_target and chain B, cycles=0
427431
</a>
428432

429-
Let’s now check if the active residues which we have defined (the protein binding site) are actually part of the interface. In the PyMOL command window type:
433+
Compare the RMSD values you obtained with that of the conformation we used originally for docking.
434+
435+
<details style="background-color:#DAE4E7">
436+
<summary style="bold">
437+
<i>See RMSD values of the original and docked glycan conformations with respect to that of the reference crystal structure</i>
438+
</summary>
439+
<pre>
440+
2ZEX_l_u.pdb
441+
cluster1_1.pdb
442+
cluster2_1.pdb
443+
cluster3_1.pdb
444+
...
445+
</pre>
446+
<br>
447+
</details>
448+
449+
Let us now check if the active residues which we have defined (the protein binding site) are actually part of the interface. In the PyMOL command window type:
430450

431451
<a class="prompt prompt-pymol">
432452
select binding_site, chain A and (resi 23+24+80+82+84+96+98+100+124+126+128) and not 2ZEX_target<br>
@@ -455,7 +475,7 @@ Are the residues of the binding_site at the interface with the glycan?
455475

456476
## Conclusions
457477

458-
In this tutorial we have demonstrated the use of the HADDOCK 2.4 webserver to predict the structure of a protein-glycan complex using information about the protein binding site. Always check and compare multiple clusters, don't blindly trust the cluster with the best HADDOCK score! We have also discussed the analysis of the docking results and the comparison with the reference structure.
478+
In this tutorial we have demonstrated the use of the HADDOCK 2.4 webserver to predict the structure of a protein-glycan complex using information about the protein binding site. Always check and compare multiple clusters; do not blindly trust the cluster with the best HADDOCK score! We have also discussed the analysis of the docking results and the comparison with the reference structure.
459479

460480
We hope you have enjoyed this tutorial and that you have learned something new. If you have any questions or feedback, please do not hesitate to contact us on the [HADDOCK forum][link-forum]{:target="_blank"}.
461481
