README.md (2 additions, 1 deletion)
@@ -12,7 +12,8 @@
 
 ## Introduction
 
-**rich_longTranscriptomics** is a nextflow pipeline that is used for the processing of direct RNA nanopore sequencing data, providing multiple transcript reconstruction options, and quantification with the use of a reference genome, and transcriptome annotation.
+**LongTranscriptomics** is a nextflow pipeline that is used for the processing of direct RNA nanopore sequencing data, providing multiple transcript reconstruction options, and quantification with the use of a reference genome, and transcriptome annotation.
+
 <!-- Additionally, it performs post transcriptome reconstruction assessment, and recovery. -->
 
 The pipeline currently _only_ accepts sequencing data from directRNA Oxford
assets/multiqc_config.yml (2 additions, 2 deletions)
@@ -2,11 +2,11 @@ report_comment: >
 This report has been generated by the <a href="https://github.com/number-25/LongTranscriptomics/tree/dev" target="_blank">number-25/LongTranscriptomics</a>
 - [Extensive QC of alignments](#Alignment-quality-control)
@@ -150,7 +150,7 @@ toolkit for working with SAM/BAM files. It is used to sort the output from
 minimap2 (SAM format) and output it in compressed BAM format, and then index
 this file.
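
The sort-then-index step described above could be sketched roughly as follows. This is only an illustration, not the pipeline's actual process definition: the function name, file names and thread count are hypothetical, and the commands are built as argument lists rather than executed so the sketch runs without samtools installed.

```python
import shlex


def samtools_sort_index_commands(sam_path, bam_path, threads=4):
    """Return the sort and index command lines as argument lists (nothing is run)."""
    # Coordinate-sort the SAM from the aligner and write compressed BAM.
    sort_cmd = [
        "samtools", "sort",
        "-@", str(threads),   # additional threads
        "-O", "bam",          # output format
        "-o", bam_path,       # output file
        sam_path,
    ]
    # Index the sorted BAM so tools and genome browsers can seek by region.
    index_cmd = ["samtools", "index", bam_path]
    return [sort_cmd, index_cmd]


if __name__ == "__main__":
    # Hypothetical file names, printed as shell-ready strings.
    for cmd in samtools_sort_index_commands("sample.sam", "sample.sorted.bam"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` on a machine where samtools is available.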
 
-## Create files to visualise mapping
+## Create files to visualise mapping coverage
 
 ### bedtools
 
@@ -184,7 +184,15 @@ tool that is part of a broad UCSC software suite. It has one specific
 function that can be guessed from its very name. You guessed it, it converts a
 bedgraph to a BigWig file, that's it. Once created, the BigWig files can be
 loaded into a genome browser such as IGV, allowing the mapping to be visualised
-in a lightweight way.
+in a lightweight way. The bigWig format is an indexed binary format useful for
+displaying dense, continuous data in genome browsers such as UCSC and IGV.
+This avoids the need to load the much larger BAM file for data visualisation
+purposes, which is slower and can cause memory issues. The bigWig format is
+also supported by various bioinformatics software for downstream processing
+such as meta-profile plotting.
+
+bigBed files are more useful for displaying the distribution of reads across exon
+intervals, as is typically observed for RNA-seq data.
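
To make the "dense, continuous data" idea concrete, here is a minimal sketch of how aligned intervals collapse into bedGraph-style coverage rows, the kind of input that is then converted to bigWig. It is not the bedtools implementation; the function name and toy intervals are made up, and the intervals are assumed to be 0-based, half-open aligned blocks.

```python
from collections import defaultdict


def intervals_to_bedgraph(intervals):
    """Collapse (chrom, start, end) intervals into (chrom, start, end, depth) rows.

    Coordinates are assumed to be 0-based and half-open. Only regions with
    depth > 0 are reported, and adjacent runs of equal depth are merged.
    """
    # Record +1 where an interval starts and -1 where it ends.
    deltas = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in intervals:
        deltas[chrom][start] += 1
        deltas[chrom][end] -= 1

    rows = []
    for chrom in sorted(deltas):
        depth = 0
        prev = None
        for pos in sorted(deltas[chrom]):
            if prev is not None and depth > 0:
                last = rows[-1] if rows else None
                if last and last[0] == chrom and last[2] == prev and last[3] == depth:
                    # Same depth continues: extend the previous row.
                    rows[-1] = (chrom, last[1], pos, depth)
                else:
                    rows.append((chrom, prev, pos, depth))
            depth += deltas[chrom][pos]
            prev = pos
    return rows


if __name__ == "__main__":
    toy = [("chr1", 0, 10), ("chr1", 5, 15), ("chr2", 3, 4)]
    for row in intervals_to_bedgraph(toy):
        print("\t".join(map(str, row)))
```

Storing one row per run of constant depth, rather than one record per read, is what keeps coverage tracks small compared with the BAM file they are derived from.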
 
 ## Alignment quality control
 
@@ -360,8 +368,7 @@ variants for each gene locus. StringTie does not perform read correction.
 
 </details>
 
-[gffcompare](https://ccb.jhu.edu/software/stringtie/gff.shtml#gffcompare) can be used to compare, merge, annotate and estimate
-accuracy of one or more GFF files (the "query" files), when compared with a
+[gffcompare](https://ccb.jhu.edu/software/stringtie/gff.shtml#gffcompare) can be used to compare, merge, annotate and estimate accuracy of one or more GFF files (the "query" files), when compared with a
 - `*.quant.gz`: a tab-separated file listing the quantified targets, as well as information about their length and other metadata. The num_reads column provides the estimate of the number of reads originating from each target.
+- `*.meta_info.json`: a JSON format file containing information about relevant parameters with which oarfish was run, and other relevant information from the processed sample apart from the actual transcript quantifications.
 
 </details>
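
As a rough illustration of working with the `*.quant.gz` output described above, the sketch below sums the `num_reads` column of a gzip-compressed, tab-separated file. Only `num_reads` is taken from the description; the other column names (`tname`, `len`), the helper names and the toy values are assumptions made for the example.

```python
import csv
import gzip
import os
import tempfile


def total_num_reads(quant_path):
    """Sum the num_reads column of a gzip-compressed, tab-separated quant file."""
    total = 0.0
    with gzip.open(quant_path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            total += float(row["num_reads"])
    return total


def write_toy_quant(path):
    """Write a tiny made-up quant file; tname and len are assumed column names."""
    rows = [
        ("tname", "len", "num_reads"),
        ("tx1", "1200", "10.5"),
        ("tx2", "800", "5.0"),
    ]
    with gzip.open(path, "wt", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        toy = os.path.join(tmp, "toy.quant.gz")
        write_toy_quant(toy)
        print(total_num_reads(toy))
```

Because the estimates come from probabilistic allocation, `num_reads` values need not be integers, which is why they are parsed as floats here.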
 
 [oarfish](https://github.com/COMBINE-lab/oarfish) is a program for quantifying
 transcript-level expression from long-read (i.e. Oxford nanopore cDNA and
 direct RNA and PacBio) sequencing technologies. oarfish requires a sample of
-sequencing reads aligned to the transcriptome (currntly not to the genome). It
+sequencing reads aligned to the transcriptome (currently not to the genome). It
 handles multi-mapping reads through the use of probabilistic allocation via an
// TODO - this version of bambu currently only outputs the "extended annotation", which is the reference annotation + the detected transcripts in the sample, so there's no point to doing gffcompare as sensitivity and accuracy are 100%