|
21 | 21 | "source": [ |
22 | 22 | "## Overview\n", |
23 | 23 | "\n", |
24 | | - "Bandage can be used to visualize and interact with pangenomic graphs. You will learn how to visualize graphs with Bandage and how to BLAST DNA sequences and visualize them on the graph." |
| 24 | + "In this module, you will learn how to visualize graphs with Bandage and how to BLAST DNA sequences and visualize them on the graph." |
25 | 25 | ] |
26 | 26 | }, |
27 | 27 | { |
|
38 | 38 | "cell_type": "markdown", |
39 | 39 | "metadata": {}, |
40 | 40 | "source": [ |
41 | | - "## Get Started\n", |
| 41 | + "## Getting Started\n", |
42 | 42 | "\n", |
43 | 43 | "In this submodule you will learn how to visualize graphs with Bandage, how to BLAST query sequences directly against the graph in Bandage, and how to trace paths through the graph.\n", |
44 | 44 | "\n", |
|
98 | 98 | "\n", |
99 | 99 | "1. Open up Bandage in your browser:\n", |
100 | 100 | "\n", |
101 | | - "From the Launcher tab (File : New Launcher), scroll down to the bottom. In the \"Visualization Software\" section, click on \"Bandage\".\n", |
102 | | - "\n", |
103 | | - "<details>\n", |
104 | | - " \n", |
105 | | - "<summary>Resetting Bandage</summary>\n", |
106 | | - "\n", |
107 | | - "Sometimes the Bandage software can break, i.e. there will be a message that say \"KasmVNC encountered an error.\"\n", |
108 | | - "When this occurs, you can reset the Bandage software by opening a Terminal in JupyterLab (\"File\" -> \"New Launcher\" -> \"Terminal\") and running the following commands:\n", |
109 | | - "```bash\n", |
110 | | - "cd ~\n", |
111 | | - "docker compose -f NIGMS-Sandbox-Pangenomics-Module/bandage/compose.yml up -d --build --force-recreate\n", |
112 | | - "```\n", |
113 | | - "\n", |
114 | | - "</details>" |
| 101 | + "From the Launcher tab (\"File\" -> \"New Launcher\"), scroll down to the bottom. In the \"Visualization Software\" section, click on \"Bandage\"." |
115 | 102 | ] |
116 | 103 | }, |
117 | 104 | { |
|
120 | 107 | "source": [ |
121 | 108 | "### Load Graph\n", |
122 | 109 | "\n", |
123 | | - "2. Once in Bandage, choose \"File : Load Graph\". Navigate to \"yprp.chrVIII.pggb.gfa\" (click on \"Computer\" then \"/\" then navigate to /home/jupyter/NIGMS-Sandbox-Pangenomics-Module/module_notebooks/graphs/yprp.chrVIII.pggb.gfa). Click \"Open Graph\".\n", |
| 110 | + "2. Once in Bandage, choose \"File\" -> \"Load Graph\". Navigate to \"yprp.chrVIII.pggb.gfa\" (click on \"Computer\" then \"/\" then navigate to /home/jupyter/NIGMS-Sandbox-Pangenomics-Module/module_notebooks/graphs/yprp.chrVIII.pggb.gfa). Click \"Open Graph\".\n", |
124 | 111 | "\n", |
125 | 112 | "\n", |
126 | | - "3. Click on \"Draw Graph\"." |
| 113 | + "3. Click on \"Draw graph\"." |
127 | 114 | ] |
128 | 115 | }, |
129 | 116 | { |
|
137 | 124 | "cell_type": "markdown", |
138 | 125 | "metadata": {}, |
139 | 126 | "source": [ |
140 | | - "4. Under \"Graph Information\", choose \"More Info\" to see more stats about the graph." |
| 127 | + "4. Under \"Graph information\", choose \"More info\" to see more stats about the graph." |
141 | 128 | ] |
142 | 129 | }, |
143 | 130 | { |
|
152 | 139 | "\n", |
153 | 140 | "2. Click on \"Build BLAST database\" to build a database from the graph.\n", |
154 | 141 | "\n", |
155 | | - "3. Click on \"Load from FASTA file\" and choose the \"genes.fa\" file.\n", |
| 142 | + "3. Click on \"Load from FASTA file\" and choose the \"genes.fa\" file (click on \"Computer\" then \"/\" then navigate to /home/jupyter/NIGMS-Sandbox-Pangenomics-Module/module_notebooks/genes/genes.fa).\n", |
156 | 143 | "\n", |
157 | 144 | "4. Click \"Run BLAST search\"." |
158 | 145 | ] |
|
163 | 150 | "source": [ |
164 | 151 | "5. Look at the table below to see the hits for each of the two genes and the color assigned to each.\n", |
165 | 152 | "\n", |
166 | | - "6. Hit \"close\"." |
| 153 | + "6. Hit \"Close\"." |
167 | 154 | ] |
168 | 155 | }, |
169 | 156 | { |
170 | 157 | "cell_type": "markdown", |
171 | 158 | "metadata": {}, |
172 | 159 | "source": [ |
173 | | - "7. Under \"Graph Drawing\", change scope to \"Around Query Hits\" and in the \"Distance\" box, type in 100. This will show the CUP1 and YHR054C gene hits plus the surrounding 100 nodes.\n", |
| 160 | + "7. Under \"Graph drawing\", change scope to \"Around query hits\" and in the \"Distance\" box, type in 100. This will show the CUP1 and YHR054C gene hits plus the surrounding 100 nodes. Then click \"Draw graph.\"\n", |
174 | 161 | "\n", |
175 | | - "8. Under \"Graph Display\", change \"Random Colours\" to \"Gray Color\" so that the gene hits will stand out.\n", |
| 162 | + "8. Under \"Graph display\", change \"Random colours\" to \"Gray color\" so that the gene hits will stand out.\n", |
176 | 163 | "\n", |
177 | | - "9. Scroll down on the lefthand side and click on \"Annotations\". Double-click on the \"Blast Hits\" that appears under annotations. Click on \"Solid\" and click on the \"x\" in the upper right to close the screen. This will show the blast hits in solid colors." |
| 164 | + "9. Scroll down on the left-hand side and click on \"Annotations.\" Double-click on the \"Blast Hits\" that appears under annotations. Click on \"Solid\" and click on the \"x\" in the upper-right corner to close the window. This will show the BLAST hits in solid colors." |
178 | 165 | ] |
179 | 166 | }, |
180 | 167 | { |
181 | 168 | "cell_type": "markdown", |
182 | 169 | "metadata": {}, |
183 | 170 | "source": [ |
184 | | - "<div class=\"alert alert-block alert-info\"> <b>NOTE:</b> If you don't see both blue (CUP1) and green (YHR054C) genes then choose \"Query: All\" under \"Graph Search\"." |
| 171 | + "<div class=\"alert alert-block alert-info\"> <b>NOTE:</b> If you don't see both blue (CUP1) and green (YHR054C) genes, then choose \"Query: all\" under \"Graph search\".</div>" |
185 | 172 | ] |
186 | 173 | }, |
187 | 174 | { |
|
230 | 217 | "cell_type": "markdown", |
231 | 218 | "metadata": {}, |
232 | 219 | "source": [ |
233 | | - "1. Look at the SK1 path through the CUP1 region of the graph. On the righthand side under \"Find paths\", in the name box, start typing \"SK1_chrVIII\" (it should pop up as you type and you can choose it).\n", |
| 220 | + "1. Look at the SK1 path through the CUP1 region of the graph. On the right-hand side under \"Find paths,\" in the name box, start typing \"SK1_chrVIII\" (it should pop up as you type and you can choose it).\n", |
234 | 221 | "\n", |
235 | | - "2. Click on \"Find Path\" and dismiss the \"Nodes not found\" dialog that pops up (\"x\" in the upper right of the dialog box). The nodes not found are the nodes in the SK1 genome assembly path that are not in our view.\n", |
| 222 | + "2. Click on \"Find path\" and close the \"Nodes not found\" dialog that pops up (\"x\" in the upper-right corner of the dialog box). The nodes not found are the nodes in the SK1 genome assembly path that are not in our view.\n", |
236 | 223 | "\n", |
237 | | - "3. Click on \"recolor\" and then \"set colour\" and choose a color. This will color any nodes this genome goes through, though the coloring will be underneath the BLAST hit colors. It will also change the color under \"Graph Display\" to \"Custom Colours.\"" |
| 224 | + "3. Under \"Selected nodes\" click on \"Set colour\" and choose a color. This will color any nodes this genome goes through, though the coloring will be underneath the BLAST hit colors. It will also change the color under \"Graph display\" to \"Custom colours.\"" |
238 | 225 | ] |
239 | 226 | }, |
240 | 227 | { |
|
248 | 235 | "cell_type": "markdown", |
249 | 236 | "metadata": {}, |
250 | 237 | "source": [ |
251 | | - "4. See if you can figure out how to trace the highlighted path through the graph. To help, you can have Bandage show the node labels (`Node labels: Name` on the lefthand panel) and compare them to the paths from the GFA file (see more information in the video linked below).\n", |
| 238 | + "4. See if you can figure out how to trace the highlighted path through the graph. To help, you can have Bandage show the node labels (by selecting \"Name\" under \"Node labels\" in the left-hand panel) and compare them to the paths from the GFA file (see more information in the video linked below).\n", |
252 | 239 | "\n", |
253 | 240 | "Example: Extract the S288C subpath for the CUP1 region out of the GFA file. Based on the Bandage visualization with node labels on, the relevant nodes range from ~7200-7700. We will use `grep` to get the S288C_chrVIII path line, `sed` to introduce hard returns so that each node is on its own line, and then `awk` to get the relevant node numbers. We will redirect the output into a file called *S288C_CUP1_subpath.txt*. Then take a look at the file using `head`." |
254 | 241 | ] |
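
Review note on the example above: the grep / sed / awk pipeline it describes can be sketched as follows. This is a minimal sketch, not the notebook's actual cell. It assumes the path is stored as a GFA `P` line whose third tab-separated column is a comma-separated list of oriented node IDs, that GNU `sed` is available (for `\n` in the replacement), and it adds a `cut` step that the text does not mention. The toy file, its node IDs, and the `workdir` variable are hypothetical stand-ins for `yprp.chrVIII.pggb.gfa`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical toy GFA in a scratch directory; the real input would be
# yprp.chrVIII.pggb.gfa from the module's graphs directory.
workdir="$(mktemp -d)"
gfa="$workdir/toy.gfa"
printf 'H\tVN:Z:1.0\n' > "$gfa"
printf 'P\tS288C_chrVIII\t7198+,7199+,7200+,7201-,7700+,7701+\t*\n' >> "$gfa"
printf 'P\tSK1_chrVIII\t7198+,7200+,7702+\t*\n' >> "$gfa"

# 1) grep: keep only the S288C_chrVIII path line.
# 2) cut:  keep column 3, the comma-separated list of oriented node IDs
#          (an added step, assumed here to isolate the node list).
# 3) sed:  put each node on its own line (assumes GNU sed).
# 4) awk:  strip the trailing +/- orientation and keep IDs in 7200..7700.
grep -w 'S288C_chrVIII' "$gfa" \
  | cut -f 3 \
  | sed 's/,/\n/g' \
  | awk '{ id = $0; sub(/[+-]$/, "", id); if (id + 0 >= 7200 && id + 0 <= 7700) print }' \
  > "$workdir/S288C_CUP1_subpath.txt"

# Inspect the first lines of the extracted subpath.
head "$workdir/S288C_CUP1_subpath.txt"
```

On the toy input, only the nodes inside the 7200-7700 window survive the `awk` filter, and their orientation signs are preserved in the output file. The window bounds come from reading node labels in Bandage, so they are approximate and would need adjusting for a different region.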
|
347 | 334 | "cell_type": "markdown", |
348 | 335 | "metadata": {}, |
349 | 336 | "source": [ |
350 | | - "And here is the published figure that you saw in the previous chapter. The number of CUP1 genes and YHR054C genes we found in our graph match those in the figures (see video XXX for more details)." |
| 337 | + "And here is the published figure that you saw in the previous chapter. The numbers of CUP1 and YHR054C genes we found in our graph match those in the figure." |
351 | 338 | ] |
352 | 339 | }, |
353 | 340 | { |
|
376 | 363 | "----------------------\n", |
377 | 364 | "\n", |
378 | 365 | "## Conclusion\n", |
379 | | - "You have learned how to visualize a pangenomic graph, find genes using BLAST, and interact with the graph structures.\n", |
| 366 | + "In this submodule, you learned how to visualize a pangenomic graph, find genes using BLAST, and interact with the graph structures.\n", |
380 | 367 | "\n", |
381 | 368 | "In the next submodule, you will learn how to index graphs to get them ready for downstream analyses.\n", |
382 | 369 | "\n", |
|
391 | 378 | "\n", |
392 | 379 | "<div class=\"alert alert-warning\">No cleanup is necessary for this submodule. Don't forget to shut down your Workbench when you are done working through this module!</div>" |
393 | 380 | ] |
| 381 | + }, |
| 382 | + { |
| 383 | + "cell_type": "markdown", |
| 384 | + "metadata": {}, |
| 385 | + "source": [ |
| 386 | + "## Video\n", |
| 387 | + "\n", |
| 388 | + "Here is a video that walks through the contents of this submodule: XXX" |
| 389 | + ] |
394 | 390 | } |
395 | 391 | ], |
396 | 392 | "metadata": { |
|
401 | 397 | "uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m127" |
402 | 398 | }, |
403 | 399 | "kernelspec": { |
404 | | - "display_name": "nigms-pangenomics", |
| 400 | + "display_name": "nigms-pangenomics (Local)", |
405 | 401 | "language": "python", |
406 | 402 | "name": "conda-env-nigms-pangenomics-nigms-pangenomics" |
407 | 403 | }, |
|
415 | 411 | "name": "python", |
416 | 412 | "nbconvert_exporter": "python", |
417 | 413 | "pygments_lexer": "ipython3", |
418 | | - "version": "3.12.9" |
| 414 | + "version": "3.12.10" |
419 | 415 | } |
420 | 416 | }, |
421 | 417 | "nbformat": 4, |
|