education/HADDOCK24/HADDOCK24-protein-glycan/index.md
+33 -13 (33 additions & 13 deletions)
@@ -24,7 +24,7 @@ by glycosidic bonds. Glycans are involved in a wide range of biological processe
24
24
cell-cell recognition, cell adhesion, and immune response. Glycans are highly diverse and complex
25
25
in their structure, as they can involve multiple *branches* and different *linkages*, namely different ways
26
26
in which a glycosidic bond can connect two monosaccharides. This complexity together with their flexibility
27
-
makes the prediction of glycan-protein interactions a challenging task.
27
+
makes the prediction of protein-glycan interactions a challenging task.
28
28
29
29
In this tutorial we will be working with *Family 16 Carbohydrate Binding Domain Module 1* of the *Caldanaerobius polysaccharolyticus* thermophile
30
30
(PDB code [2ZEW](https://www.ebi.ac.uk/pdbe/entry/pdb/2ZEW){:target="_blank"}) and a linear homopolymer,
@@ -92,13 +92,13 @@ Can you already identify a possible binding site for a long, linear, unbranched
92
92
93
93
Here we assume that we have enough information about the glycan binding site on the protein, but no knowledge about which monosaccharide units are relevant for the binding. In this case (see Fig. 1), all the five monosaccharide units are at the interface, although this might not be true in general, especially when longer glycans are considered.
94
94
95
-
The following residues correspond to the protein binding site, as calculated from the crystal structure of the complex are:
95
+
The residues corresponding to the glycan binding site on the protein (calculated from the crystal structure of the complex) are:
96
96
97
97
<pre style="background-color:#DAE4E7">
98
98
23,24,80,82,84,96,98,100,124,126,128
99
99
</pre>
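
The comma-separated residue list above is the form expected by the HADDOCK web interface, whereas PyMOL selections join residue numbers with `+`. As a minimal sketch (the helper name `residues_to_pymol_selection` is hypothetical and not part of HADDOCK or PyMOL), the conversion could look like this in plain Python:

```python
def residues_to_pymol_selection(residue_list: str, chain: str = "A") -> str:
    """Turn a comma-separated residue list into a PyMOL selection string."""
    # Split on commas, skip empty entries, and check that every entry is an integer.
    numbers = [int(token) for token in residue_list.split(",") if token.strip()]
    # PyMOL joins several residue numbers with '+' inside a 'resi' selector.
    resi = "+".join(str(number) for number in numbers)
    return f"chain {chain} and (resi {resi})"


# Binding-site residues listed in the tutorial; chain A is assumed for the protein.
binding_site = "23,24,80,82,84,96,98,100,124,126,128"
selection = residues_to_pymol_selection(binding_site)
print(selection)
```

The resulting string can be pasted after a `select binding_site,` command in the PyMOL command window.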
100
100
101
-
Let us visualize the interface on our unbound protein structure. For this start PyMol and load the PDB file of the unbound protein:
101
+
Let us visualize this interface on our unbound protein structure. For this, start PyMOL and load the PDB file of the unbound protein:
102
102
103
103
<a class="prompt prompt-pymol">
104
104
File menu -> Open -> select 2ZEW_clean.pdb
@@ -160,13 +160,13 @@ fetch 2ZEX
160
160
align 2ZEX_l_u, 2ZEX, cycles=0
161
161
</a>
162
162
163
-
The `cycles=0` option will make sure that no atoms are neglected during the alignment and RMSD calculation. You can check what happens if you don't use this option.
163
+
The `cycles=0` option will make sure that no atoms are neglected during the alignment and RMSD calculation. You can check what happens if you do not use this option.
164
164
165
165
<a class="prompt prompt-pymol">
166
166
align 2ZEX_l_u, 2ZEX
167
167
</a>
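
To see why the two commands can report different RMSD values, the following sketch imitates the outlier rejection that `align` performs when `cycles` is greater than zero. It is a simplified illustration, not PyMOL's actual implementation: it assumes the default settings of five cycles and a cutoff of 2.0, it does not re-superimpose the structures between cycles, and the per-atom deviations are hypothetical numbers chosen only for the example.

```python
import math


def rmsd(deviations):
    """Root-mean-square of a list of per-atom distances (in Angstrom)."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def rmsd_with_rejection(deviations, cycles=5, cutoff=2.0):
    """Return (RMSD, number of atoms kept) after iterative outlier rejection.

    In each cycle, atoms deviating by more than cutoff * current RMSD are dropped.
    With cycles=0 the loop never runs, so every atom contributes to the RMSD.
    """
    kept = list(deviations)
    for _ in range(cycles):
        current = rmsd(kept)
        filtered = [d for d in kept if d <= cutoff * current]
        if len(filtered) == len(kept):
            break  # nothing was rejected, so further cycles change nothing
        kept = filtered
    return rmsd(kept), len(kept)


# Hypothetical deviations: nine well-fitted atoms and one badly placed atom.
deviations = [0.5] * 9 + [6.0]

full_rmsd, full_count = rmsd_with_rejection(deviations, cycles=0)
refined_rmsd, refined_count = rmsd_with_rejection(deviations)
print(f"cycles=0: RMSD {full_rmsd:.2f} over {full_count} atoms")
print(f"cycles=5: RMSD {refined_rmsd:.2f} over {refined_count} atoms")
```

In this toy case the single outlier is discarded when rejection is enabled, so the reported RMSD looks much better than the all-atom value. This is why `cycles=0` is the appropriate choice when the goal is to judge the whole glycan.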
168
168
169
-
<a class="prompt prompt-question">What is the RMSD between the two glycan structures? In which of the five monosaccharide units is the model accurate? In which ones is it not?</a>
169
+
<a class="prompt prompt-question">What is the RMSD between the two glycan structures (find the value reported by PyMOL)? In which of the five monosaccharide units is the model most accurate? In which ones is it least accurate?</a>
The two structures are pretty close to each other..let's see if HADDOCK can create a reasonable model of the interaction!
179
+
The two structures are pretty close to each other... Let us next see if HADDOCK can create a reasonable model of the interaction!
180
180
181
181
<hr>
182
182
<hr>
@@ -247,7 +247,7 @@ If everything went well, the interface window should have updated itself and it
247
247
Active residues (directly involved in the interaction) -> 23,24,80,82,84,96,98,100,124,126,128
248
248
</a>
249
249
250
-
Then uncheck the option to automatically define passive residues: in our case we're defining the whole protein pocket as active, so this is not required.
250
+
Then uncheck the option to automatically define passive residues: in our case we are defining the whole protein pocket as active, and the glycan will be defined as passive only, so this is not required.
251
251
252
252
<a class="prompt prompt-info">Automatically define passive residues around the active residues -> **uncheck** (checked by default)
253
253
</a>
@@ -263,15 +263,15 @@ By default HADDOCK will automatically filter our residues that have a relative s
263
263
264
264
***Step 7:** Specify the residues for the second molecule. For this, unfold the "Molecule 2 - parameters" if not already unfolded.
265
265
266
-
Here we want to select the full glycan as passive, as we don't know whether all the monosaccharide units take part in the interaction.
266
+
Here we want to select the full glycan as passive, as we do not know whether all the monosaccharide units take part in the interaction.
267
267
268
268
<a class="prompt prompt-info">Automatically define passive residues around the active residues -> **uncheck** (checked by default)
269
269
</a>
270
270
271
271
<a class="prompt prompt-info">Click on the sequence (XXXXX) box to select the whole sequence of the glycan.
272
272
</a>
273
273
274
-
<a class="prompt prompt-info">Cut the active residues selection (1,2,3,4,5) and paste it to the passive residues box.
274
+
<a class="prompt prompt-info">This should automatically add the 5 glycan residues (1,2,3,4,5) to the passive residues box.
275
275
</a>
276
276
277
277
<a class="prompt prompt-info">Click on the Visualize residues button and make sure all the glycan monosaccharide units have been selected. They should be highlighted in green to indicate that they are selected as passive.
@@ -283,7 +283,7 @@ Here we want to select the full glycan as passive, as we don't know whether all
283
283
284
284
Here we will tweak a few parameters to make the docking more accurate.
285
285
286
-
**Sampling parameters** : here let's just remove the final refinement step, asit is not needed for this type of docking.
286
+
**Sampling parameters**: Here we will remove the final refinement step since, as in the case of small ligands, it is not recommended for this type of molecule.
287
287
288
288
<a class="prompt prompt-info">Sampling parameters -> Perform final refinement -> **uncheck** (checked by default)
289
289
</a>
@@ -330,6 +330,7 @@ In case you do not want to wait for your runs to be finished, a precalculated ru
330
330
<a class="prompt prompt-question">Inspect the result page</a>
331
331
332
332
<a class="prompt prompt-question">How many clusters are generated?</a>
333
+
333
334
In the figure below you can see different parts of the result page.
334
335
335
336
**In A** the result page reports the number of clusters and for the top 10 clusters also the related statistics (HADDOCK score, Size, RMSD, Energies, BSA and Z-score).
@@ -418,15 +419,34 @@ rms_cur 2ZEX_target and chain B, cluster10_1 and chain B <br>
418
419
What is the l-RMSD of the best model of the top cluster? What about the second and third clusters? Which of them is the best one?
419
420
</a>
420
421
421
-
<a class="prompt prompt-question">Did the glycan conformation improve thanks to the refinement in any of the selected models?</a>
422
+
423
+
Let us now focus on the conformation of the glycan itself.
424
+
425
+
<a class="prompt prompt-question">Did the flexible refinement improve the glycan conformation?</a>
422
426
423
427
To address this question you can use the standard align command, focusing on chain B:
424
428
425
429
<a class="prompt prompt-pymol">
426
430
align cluster10_1 and chain B, 2ZEX_target and chain B, cycles=0
427
431
</a>
428
432
429
-
Let’s now check if the active residues which we have defined (the protein binding site) are actually part of the interface. In the PyMOL command window type:
433
+
Compare the RMSD values you obtained with that of the conformation we used originally for docking.
434
+
435
+
<details style="background-color:#DAE4E7">
436
+
<summary style="bold">
437
+
<i>See RMSD values of the original and docked glycan conformations with respect to that of the reference crystal structure</i>
438
+
</summary>
439
+
<pre>
440
+
2ZEX_l_u.pdb
441
+
cluster1_1.pdb
442
+
cluster2_1.pdb
443
+
cluster3_1.pdb
444
+
...
445
+
</pre>
446
+
<br>
447
+
</details>
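
When several cluster representatives have to be compared, typing one `align` command per model becomes repetitive. The sketch below only builds the command strings, so that they can be pasted into the PyMOL command window. It assumes that the models have been loaded as objects named `cluster1_1`, `cluster2_1` and `cluster3_1`, that the reference is loaded as `2ZEX_target`, and that the glycan is chain B in all of them. The helper name `glycan_align_commands` is hypothetical.

```python
def glycan_align_commands(models, reference="2ZEX_target", chain="B"):
    """Build one PyMOL align command per model, restricted to the glycan chain."""
    commands = []
    for model in models:
        # cycles=0 disables outlier rejection, so all glycan atoms enter the RMSD.
        commands.append(
            f"align {model} and chain {chain}, "
            f"{reference} and chain {chain}, cycles=0"
        )
    return commands


# Object names assumed to match the PDB files loaded in PyMOL.
models = ["cluster1_1", "cluster2_1", "cluster3_1"]
commands = glycan_align_commands(models)
for command in commands:
    print(command)
```

Each printed line has the same form as the `align` command shown above for `cluster10_1`, so the RMSD values reported by PyMOL are directly comparable across models.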
448
+
449
+
Let us now check if the active residues which we have defined (the protein binding site) are actually part of the interface. In the PyMOL command window type:
430
450
431
451
<a class="prompt prompt-pymol">
432
452
select binding_site, chain A and (resi 23+24+80+82+84+96+98+100+124+126+128) and not 2ZEX_target<br>
@@ -455,7 +475,7 @@ Are the residues of the binding_site at the interface with the glycan?
455
475
456
476
## Conclusions
457
477
458
-
In this tutorial we have demonstrated the use of the HADDOCK 2.4 webserver to predict the structure of a protein-glycan complex using information about the protein binding site. Always check and compare multiple clusters, don't blindly trust the cluster with the best HADDOCK score! We have also discussed the analysis of the docking results and the comparison with the reference structure.
478
+
In this tutorial we have demonstrated the use of the HADDOCK 2.4 webserver to predict the structure of a protein-glycan complex using information about the protein binding site. Always check and compare multiple clusters; do not blindly trust the cluster with the best HADDOCK score! We have also discussed the analysis of the docking results and the comparison with the reference structure.
459
479
460
480
We hope you have enjoyed this tutorial and that you have learned something new. If you have any questions or feedback, please do not hesitate to contact us on the [HADDOCK forum][link-forum]{:target="_blank"}.