appyters/Bulk_RNA_seq/RNA_seq_Analysis_Pipeline.ipynb
22 additions & 25 deletions
@@ -479,7 +479,7 @@
479
479
"cell_type": "markdown",
480
480
"metadata": {},
481
481
"source": [
482
-
"# Load datasets"
482
+
"# Loaded datasets"
483
483
]
484
484
},
485
485
{
@@ -630,7 +630,7 @@
630
630
"source": [
631
631
"%%appyter markdown\n",
632
632
"{% if visualization_method.value == \"PCA\" %}\n",
633
-
"Principal Component Analysis (PCA) (Clark et al. 2011) is a statistical technique used to identify global patterns in high-dimensional datasets. It is commonly used to explore the similarity of biological samples in RNA-seq datasets. To achieve this, gene expression values are transformed into Principal Components (PCs), a set of linearly uncorrelated features which represent the most relevant sources of variance in the data, and subsequently visualized using a scatter plot.\n",
633
+
"Principal Component Analysis (PCA) [1] is a statistical technique used to identify global patterns in high-dimensional datasets. It is commonly used to explore the similarity of biological samples in RNA-seq datasets. To achieve this, gene expression values are transformed into Principal Components (PCs), a set of linearly uncorrelated features which represent the most relevant sources of variance in the data, and subsequently visualized using a scatter plot.\n",
"Clustergrammer (Fernandez et al. 2017) is a web-based tool for visualizing and analyzing high-dimensional data as interactive and hierarchically clustered heatmaps. It is commonly used to explore the similarity between samples in an RNA-seq dataset. In addition to identifying clusters of samples, it also allows to identify the genes which contribute to the clustering."
674
+
"Clustergrammer [2] is a web-based tool for visualizing and analyzing high-dimensional data as interactive and hierarchically clustered heatmaps. It is commonly used to explore the similarity between samples in an RNA-seq dataset. In addition to identifying clusters of samples, it also allows users to identify the genes which contribute to the clustering."
676
675
]
677
676
},
678
677
{
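Note on the PCA description in the hunk above: the transformation of gene expression values into principal components can be illustrated with a minimal sketch. This is not the notebook's implementation; it assumes a small samples-by-genes matrix and uses power iteration on the gene covariance matrix to recover only the first PC. The function name `first_principal_component` and the toy `expression` matrix are hypothetical.

```python
import math

def first_principal_component(samples, n_iter=200):
    """Return (direction, scores) of PC1 for a samples x genes matrix."""
    n, p = len(samples), len(samples[0])
    # Center each gene (column) on its mean across samples.
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Sample covariance between genes (p x p).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, provided the start vector is not orthogonal to it.
    v = [float(j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("expression matrix has no variance")
        v = [x / norm for x in w]
    # Project each centered sample onto PC1 to get its scatter-plot coordinate.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores

# Toy data: two samples per condition, three genes (the third is constant).
expression = [
    [10.0, 1.0, 5.0],
    [11.0, 1.2, 5.0],
    [2.0, 8.0, 5.0],
    [1.0, 9.0, 5.0],
]
pc1, scores = first_principal_component(expression)
```

In this toy example the two conditions receive PC1 scores of opposite sign, and the constant gene contributes nothing to the component. The sign of a principal component is arbitrary, so only the separation between groups is meaningful.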
@@ -720,7 +719,6 @@
720
719
" fig.show(renderer=\"png\")\n",
721
720
"else:\n",
722
721
" fig.show()\n",
723
-
724
722
"plot_name = \"library_size_plot.png\"\n",
725
723
"fig.write_image(plot_name)\n",
726
724
"figure_counter, notebook_metadata = display_object(figure_counter, \"Histogram of the total number of reads mapped for each sample. The figure contains an interactive bar chart which displays the number of samples according to the total number of reads mapped to each RNA-seq sample in the dataset. Additional information for each sample is available by hovering over the bars.\", notebook_metadata, saved_filename=plot_name, istable=False)"
@@ -737,7 +735,7 @@
737
735
"cell_type": "markdown",
738
736
"metadata": {},
739
737
"source": [
740
-
"Gene expression signatures are alterations in the patterns of gene expression that occur as a result of cellular perturbations such as drug treatments, gene knock-downs or diseases. They can be quantified using differential gene expression (DGE) methods (Ritchie et al. 2015, Clark et al. 2014), which compare gene expression between two groups of samples to identify genes whose expression is significantly altered in the perturbation. "
738
+
"Gene expression signatures are alterations in the patterns of gene expression that occur as a result of cellular perturbations such as drug treatments, gene knock-downs or diseases. They can be quantified using differential gene expression (DGE) methods [3, 4], which compare gene expression between two groups of samples to identify genes whose expression is significantly altered in the perturbation. "
741
739
]
742
740
},
743
741
{
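Note on the signature description in the hunk above: comparing gene expression between two groups of samples can be sketched as follows. The notebook cites limma [3] and the characteristic direction [4]; this sketch implements neither and instead substitutes a plain Welch t-statistic per gene as a simpler stand-in. The function names, gene names, and sample identifiers are hypothetical, and the values are assumed to be log2-scale expression.

```python
import math
from statistics import mean, variance

def welch_t(control, treated):
    """Welch's t-statistic for one gene measured in two groups of samples."""
    se = math.sqrt(variance(control) / len(control)
                   + variance(treated) / len(treated))
    return (mean(treated) - mean(control)) / se

def signature(expr, control_ids, treated_ids):
    """Return (gene, log2 fold change, t) tuples sorted by decreasing t.

    expr maps gene -> {sample id: log2 expression}.
    """
    rows = []
    for gene, values in expr.items():
        c = [values[s] for s in control_ids]
        t = [values[s] for s in treated_ids]
        # On the log2 scale, the difference of means is the log2 fold change.
        rows.append((gene, mean(t) - mean(c), welch_t(c, t)))
    return sorted(rows, key=lambda r: r[2], reverse=True)

controls = ["c1", "c2", "c3"]
treated = ["t1", "t2", "t3"]
expr = {
    "GENE_UP":   {"c1": 5.0, "c2": 5.2, "c3": 4.9, "t1": 8.1, "t2": 7.9, "t3": 8.3},
    "GENE_FLAT": {"c1": 6.0, "c2": 6.1, "c3": 5.9, "t1": 6.05, "t2": 5.95, "t3": 6.0},
    "GENE_DOWN": {"c1": 9.0, "c2": 9.3, "c3": 8.8, "t1": 4.1, "t2": 4.4, "t3": 3.9},
}
ranked = signature(expr, controls, treated)
```

Genes at the top of `ranked` are candidates for the up-regulated set and genes at the bottom for the down-regulated set. A real analysis would also need multiple-testing correction, which this sketch omits.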
@@ -798,7 +796,7 @@
798
796
"cell_type": "markdown",
799
797
"metadata": {},
800
798
"source": [
801
-
"Enrichment analysis is a statistical procedure used to identify biological terms which are over-represented in a given gene set. These include signaling pathways, molecular functions, diseases, and a wide variety of other biological terms obtained by integrating prior knowledge of gene function from multiple resources. Enrichr (Kuleshov et al. 2016) is a web-based application which allows to perform enrichment analysis using a large collection of gene-set libraries and various interactive approaches to display enrichment results."
799
+
"Enrichment analysis is a statistical procedure used to identify biological terms which are over-represented in a given gene set. These include signaling pathways, molecular functions, diseases, and a wide variety of other biological terms obtained by integrating prior knowledge of gene function from multiple resources. Enrichr [5] is a web-based application which allows users to perform enrichment analysis using a large collection of gene-set libraries and various interactive approaches to display enrichment results."
"table_counter, notebook_metadata = display_object(table_counter, \"The table displays links to Enrichr containing the results of enrichment analyses generated by analyzing the up-regulated and down-regulated genes from a differential expression analysis. By clicking on these links, users can interactively explore and download the enrichment results from the Enrichr website.\", notebook_metadata=notebook_metadata, saved_filename=\"enrichr_links.csv\", df=enrichr_link_df, ishtml=True)"
" figure_counter, notebook_metadata = display_object(figure_counter, \"Enrichment Analysis Results for {} in {}. The figure contains interactive bar charts displaying the results of the pathway enrichment analysis generated using Enrichr. The x axis indicates the -log10(P-value) for each term. Significant terms are highlighted in bold. Additional information about enrichment results is available by hovering over each bar.\".format(label, gene_set_library), notebook_metadata, saved_filename=plot_name, istable=False)\n",
930
926
"{% endif %}"
931
927
]
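Note on the enrichment analysis description above: over-representation of a term in a gene set is commonly scored with a one-sided hypergeometric (Fisher's exact) test. The sketch below shows that calculation only; it does not call Enrichr, and the function name `enrichment_p_value`, the gene symbols, and the background size are hypothetical.

```python
from math import comb

def enrichment_p_value(gene_list, term_genes, background_size):
    """P(overlap >= observed) under the hypergeometric distribution."""
    gene_list, term_genes = set(gene_list), set(term_genes)
    k = len(gene_list & term_genes)   # observed overlap
    n = len(gene_list)                # size of the input gene set
    K = len(term_genes)               # genes annotated to the term
    N = background_size               # all genes that could have been drawn
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return tail / comb(N, n)

# Toy example: 3 input genes, 2 of which belong to a 4-gene term,
# drawn from a background of 10 genes.
p = enrichment_p_value(["A", "B", "X"], ["A", "B", "C", "D"], 10)
```

The -log10 of such a p-value is what the bar charts described above place on the x axis; the sketch leaves out the multiple-testing correction applied across terms.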
@@ -1124,45 +1120,46 @@
1124
1120
"cell_type": "markdown",
1125
1121
"metadata": {},
1126
1122
"source": [
1123
+
"1. Clark, N.R. and Ma’ayan, A. (2011) Introduction to statistical methods to analyze large data sets: principal components analysis. Sci. Signal., 4, tr3-tr3.\n",
1124
+
"<br>\n",
1125
+
"2. Fernandez, Nicolas F., et al. \"Clustergrammer, a web-based heatmap visualization and analysis tool for high-dimensional biological data.\" Scientific data 4 (2017): 170151.\n",
1126
+
"<br>\n",
1127
+
"3. Ritchie, Matthew E., et al. \"limma powers differential expression analyses for RNA-sequencing and microarray studies.\" Nucleic acids research 43.7 (2015): e47-e47.\n",
1128
+
"<br>\n",
1129
+
"4. Clark, Neil R., et al. \"The characteristic direction: a geometrical approach to identify differentially expressed genes.\" BMC bioinformatics 15.1 (2014): 79.\n",
1130
+
"<br>\n",
1131
+
"5. Kuleshov, M.V., Jones, M.R., Rouillard, A.D., Fernandez, N.F., Duan, Q., Wang, Z., Koplev, S., Jenkins, S.L., Jagodnik, K.M. and Lachmann, A. (2016) Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic acids research, 44, W90-W97.\n",
1132
+
"<br>\n",
1133
+
"\n",
1127
1134
"Agarwal, Vikram, et al. \"Predicting effective microRNA target sites in mammalian mRNAs.\" elife 4 (2015): e05005.\n",
1128
1135
"<br>\n",
1129
1136
"Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S. and Eppig, J.T. (2000) Gene Ontology: tool for the unification of biology. Nature genetics, 25, 25.\n",
1130
1137
"<br>\n",
1131
1138
"Chou, Chih-Hung, et al. \"miRTarBase 2016: updates to the experimentally validated miRNA-target interactions database.\" Nucleic acids research 44.D1 (2016): D239-D247.\n",
1132
1139
"<br>\n",
1133
-
"Clark, N.R. and Ma’ayan, A. (2011) Introduction to statistical methods to analyze large data sets: principal components analysis. Sci. Signal., 4, tr3-tr3.\n",
1134
-
"<br>\n",
1135
-
"Clark, Neil R., et al. \"The characteristic direction: a geometrical approach to identify differentially expressed genes.\" BMC bioinformatics 15.1 (2014): 79.\n",
1136
-
"<br>\n",
1137
1140
"Consortium, E.P. (2004) The ENCODE (ENCyclopedia of DNA elements) project. Science, 306, 636-640.\n",
1138
1141
"<br>\n",
1139
1142
"Croft, David, et al. \"The Reactome pathway knowledgebase.\" Nucleic acids research 42.D1 (2014): D472-D477.\n",
1140
1143
"<br>\n",
1141
1144
"Duan, Q., et al. \"L1000cds2: Lincs l1000 characteristic direction signatures search engine. NPJ Syst Biol Appl. 2016; 2: 16015.\" (2016).\n",
1142
1145
"<br>\n",
1143
-
"Fernandez, Nicolas F., et al. \"Clustergrammer, a web-based heatmap visualization and analysis tool for high-dimensional biological data.\" Scientific data 4 (2017): 170151.\n",
1144
-
"<br>\n",
1145
1146
"Kanehisa, M. and Goto, S. (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic acids research, 28, 27-30.\n",
1146
1147
"<br>\n",
1147
1148
"Kelder, Thomas, et al. \"WikiPathways: building research communities on biological pathways.\" Nucleic acids research 40.D1 (2012): D1301-D1307.\n",
1148
1149
"<br>\n",
1149
-
"Kuleshov, M.V., Jones, M.R., Rouillard, A.D., Fernandez, N.F., Duan, Q., Wang, Z., Koplev, S., Jenkins, S.L., Jagodnik, K.M. and Lachmann, A. (2016) Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic acids research, 44, W90-W97.\n",
1150
-
"<br>\n",
1151
1150
"Lachmann, A., Xu, H., Krishnan, J., Berger, S.I., Mazloom, A.R. and Ma'ayan, A. (2010) ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments. Bioinformatics, 26, 2438-2444.\n",
1152
1151
"<br>\n",
1153
1152
"Lachmann, Alexander, and Avi Ma'ayan. \"KEA: kinase enrichment analysis.\" Bioinformatics 25.5 (2009): 684-686.\n",
1154
1153
"<br>\n",
1155
-
"Ritchie, Matthew E., et al. \"limma powers differential expression analyses for RNA-sequencing and microarray studies.\" Nucleic acids research 43.7 (2015): e47-e47.\n",
1156
-
"<br>\n",
1157
1154
"Wang, Zichen, et al. \"L1000FWD: fireworks visualization of drug-induced transcriptomic signatures.\" Bioinformatics 34.12 (2018): 2150-2152."