Commit 18e024e

Update closing_remarks.md
1 parent 537e0fd commit 18e024e

1 file changed

Lines changed: 48 additions & 3 deletions

closing_remarks.md

@@ -14,6 +14,12 @@
1414
----------
1515

1616
## Final take-aways from the workshop
17+
- Spend the time (and money) to plan, consult, and practice in order to generate high-quality datasets
18+
19+
- If you are going to do a lot of bioinformatics, you should get really good at the command line (Bash); otherwise, pre-processing will be slow and painful
20+
21+
- Identify, understand, and check key QC metrics to ensure the quality of your results
22+
1723
- Spend the time to plan your experiment well (enough replicates, appropriate sequencing depth, etc.). No analysis can rescue a bad dataset.
1824

1925
- If you will perform differential expression analysis regularly, you should build your experience in R, as well as your knowledge of fundamental statistical concepts
@@ -29,6 +35,8 @@
2935

3036
- Edit/annotate the code, run sub-sections, and read the `help` pages for important functions
3137

38+
- Practice with the complete dataset (all chromosomes), which is available to you for approx. 1 month on Discovery in `/scratch/rnaseq1/data/`. This will give you experience running data from the entire genome, and an appreciation for the computational resources and time required to complete these tasks.
39+
3240
- Read the methods sections of published papers that perform differential expression analyses. This will help you to understand how many of the concepts we have discussed are applied and reported in practice
3341

3442
- Read reviews like [this one](https://pubmed.ncbi.nlm.nih.gov/31341269/) from Stark *et al*, 2019, *Nat. Rev. Genetics*, `RNA Sequencing: The Teenage Years`.
@@ -37,6 +45,44 @@
3745

3846
----------
3947

48+
## High performance computing (HPC)
49+
50+
51+
During the workshop, we all logged onto Discovery and ran our workflow interactively. Generally, this type of 'interactive' work is not encouraged on Discovery, and is better performed on other servers such as Andes & Polaris (see [this article](https://rc.dartmouth.edu/index.php/hrf_faq/how-do-i-run-interactive-jobs/) on the topic from Research Computing).
52+
53+
However, working interactively with genomics data can be quite slow, since many operations need to be run in sequence across multiple samples. As an alternative, you can use the scheduler system, which controls how jobs are submitted and managed on the Discovery cluster.
54+
55+
Using the scheduler will allow you to make use of specific resources (such as high memory nodes etc.) and streamline your workflows more efficiently. Dartmouth just transitioned to using [Slurm](https://services.dartmouth.edu/TDClient/1806/Portal/KB/ArticleDet?ID=132625).
56+
57+
We encourage you to get comfortable using the scheduler and submitting jobs using an HPC system. [Research Computing](https://rc.dartmouth.edu/) has a lot of great material regarding using Discovery on their website.
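
As a rough illustration of what a scheduler job can look like, here is a minimal sketch of a Slurm batch script that loops over several samples. The job name, resource values, sample names, and the `process_sample` helper are all hypothetical placeholders, not settings recommended for Discovery; check the Research Computing documentation for the values that apply to your account.

```shell
#!/bin/bash
# Minimal Slurm batch script sketch. All values below are illustrative
# assumptions, not Discovery-specific recommendations.
#SBATCH --job-name=rnaseq_demo
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=01:00:00
#SBATCH --output=rnaseq_demo_%j.log   # %j is replaced by the job ID

# Hypothetical sample names; replace with your own.
samples=(sampleA sampleB sampleC)

# Hypothetical helper standing in for a real per-sample command
# (e.g. trimming, alignment, or counting).
process_sample() {
  local sample="$1"
  # SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given;
  # fall back to 1 when the script is run outside the scheduler.
  echo "processing ${sample} with ${SLURM_CPUS_PER_TASK:-1} thread(s)"
}

# Run the same step on each sample in turn.
for s in "${samples[@]}"; do
  process_sample "$s"
done
```

Saved under a name of your choosing (for example `rnaseq_demo.sh`), a script like this would typically be submitted with `sbatch rnaseq_demo.sh` and monitored with `squeue -u $USER`. Because the `#SBATCH` lines are ordinary comments to Bash, the same file can also be run directly to check the loop logic before submitting it.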
58+
59+
----------
60+
61+
## Suggested reading:
62+
63+
Reading manuscripts that use RNA-seq, or reviews specifically focused on RNA-seq are excellent ways to further consolidate your learning.
64+
65+
In addition, reading the original manuscripts behind common tools will improve your understanding of how that tool works, and allow you to leverage more complicated options and implementations of that tool when required.
66+
67+
Below we provide some suggested reading to help get you on your way:
68+
69+
#### Review articles
70+
- [Stark *et al*, 2019, *Nat. Rev. Genetics*.](https://pubmed.ncbi.nlm.nih.gov/31341269/) `RNA Sequencing: The Teenage Years`
71+
- [Conesa *et al*, 2016, *Genome Biology*.](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0881-8) `A survey of best practices for RNA-seq data analysis`
72+
- [Wang *et al*, 2009, *Nat. Rev. Genetics*.](https://www.nature.com/articles/nrg2484) `RNA-Seq: a revolutionary tool for transcriptomics`
73+
- [Cresko Lab, University of Oregon.](https://rnaseq.uoregon.edu/) `RNA-seqlopedia`: provides an overview of RNA-seq and of the choices necessary to carry out a successful RNA-seq experiment.
75+
76+
#### Original manuscripts: Popular RNA-seq tools
77+
- [Cutadapt:](http://journal.embnet.org/index.php/embnetjournal/article/view/200) `Cutadapt Removes Adapter Sequences From High-Throughput Sequencing Reads`.
78+
- [STAR:](https://academic.oup.com/bioinformatics/article/29/1/15/272537) `STAR: ultrafast universal RNA-seq aligner`
79+
- [HISAT2:](https://www.nature.com/articles/s41587-019-0201-4) `Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype`
80+
- [Bowtie2:](https://www.nature.com/articles/nmeth.1923) `Fast gapped-read alignment with Bowtie 2`
81+
- [HTSeq-count:](https://academic.oup.com/bioinformatics/article/31/2/166/2366196) `HTSeq—a Python framework to work with high-throughput sequencing data`
82+
- [DESeq2:](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8) `Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2`
83+
84+
----------
85+
4086
## Post DE analysis
4187

4288
After completing a DE analysis, we are usually left with a handful of genes that we wish to extract further meaning from. Depending on our hypothesis and what data is available to us, there are several ways to do this.
@@ -81,8 +127,7 @@ This workshop will be offered again, in addition to our other bioinformatics wor
81127

82128
---------
83129

84-
## Bioinformatics office hours & consultations
130+
## Bioinformatics consultations
85131

86-
Please reach out to us with questions related to content from this workshop, or for analysis consultations. We also host **bioinformatics office hours** on **Fridays 1-2pm** for general questions and inquiries (currently on Zoom: https://dartmouth.zoom.us/s/96998379866, password: *bioinfo*)
132+
Please reach out to us with questions related to content from this workshop, or for analysis consultations. We also host office hours and consultations by request.
87133

88-
### Now.... Discussion/question time!
