
Commit 1257951

committed
adding to docs
1 parent 39c9d49 commit 1257951

2 files changed

Lines changed: 120 additions & 36 deletions

File tree

docs/output.md

Lines changed: 85 additions & 16 deletions
@@ -4,42 +4,111 @@
44

55
This document describes the output produced by the pipeline. Most of the plots are taken from the MultiQC report, which summarises results at the end of the pipeline.
66

7-
The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.
8-
9-
<!-- nf-core: Write this documentation describing your workflow's output -->
7+
Directories corresponding to the stages listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.
108

119
## Pipeline overview
1210

1311
The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:
1412

15-
- [FastQC](#fastqc) - Raw read QC
13+
- [FASTQ quality control and summary stats](#FASTQ-quality-control)
14+
- [NANOQ](#NANOQ)
15+
- [SEQUALI](#SEQUALI)
16+
- [Reference genome mapping](#Reference-genome-mapping)
17+
- [minimap2](#minimap2)
18+
- [samtools](#samtools-sort-index)
19+
- [Create bigWig coverage files](#Create-files-to-visualise-mapping)
20+
- [bedtools](#bedtools)
21+
- [bedGraphToBigWig](#bedGraphToBigWig)
22+
- [Extensive QC of alignments](#Alignment-quality-control)
23+
- [samtools](#samtools-flagstats)
24+
- [cramino](#cramino)
25+
- [alfred](#alfred)
26+
- [ngs-bits](#ngs-bits)
27+
- [Transcriptome reconstruction](#Transcriptome-reconstruction)
28+
- [FLAIR](#FLAIR)
29+
- [bambu](#bambu)
30+
- [IsoQuant](#IsoQuant)
31+
- [StringTie](#StringTie)
32+
<!-- 7. Fusion gene detection [`JAFFA`](github.com/Oshlack/JAFFA) -->
33+
- [Transcriptome assessment](#Transcriptome-assessment)
34+
- [gffutils](#gffutils)
35+
- [Transcript quantification](#Transcript-quantification)
36+
- [TranSigner](#TranSigner)
37+
- [oarfish](#oarfish)
1638
- [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline
1739
- [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution
1840

19-
### FastQC
41+
## FASTQ-quality-control
42+
43+
### NANOQ
2044

2145
<details markdown="1">
2246
<summary>Output files</summary>
2347

24-
- `fastqc/`
25-
- `*_fastqc.html`: FastQC report containing quality metrics.
26-
- `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images.
48+
- `fastq_qc/nanoq/`
49+
- `*_nanoq.json`: `json` formatted file containing quality metrics.
50+
- `*_nanoq.stats`: basic NANOQ report containing quality metrics.
51+
- `*_nanoq_stats.verbose`: verbose NANOQ report containing quality metrics.
2752

2853
</details>
2954

30-
[FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your sequenced reads. It provides information about the quality score distribution across your reads, per base sequence content (%A/T/G/C), adapter contamination and overrepresented sequences. For further reading and documentation see the [FastQC help pages](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/).
55+
[NANOQ](https://github.com/esteinig/nanoq) provides general quality statistics
56+
about the nanopore sequence reads. It outputs the statistics in both verbose and
57+
minimal reports, which can be formatted in `json` format.
58+
59+
```
60+
Nanoq Read Summary
61+
====================
62+
63+
Number of reads: 100000
64+
Number of bases: 400398234
65+
N50 read length: 5154
66+
Longest read: 44888
67+
Shortest read: 5
68+
Mean read length: 4003
69+
Median read length: 3256
70+
Mean read quality: NaN
71+
Median read quality: NaN
72+
73+
74+
Read length thresholds (bp)
75+
76+
> 200 99104 99.1%
77+
> 500 96406 96.4%
78+
> 1000 90837 90.8%
79+
> 2000 73579 73.6%
80+
> 5000 25515 25.5%
81+
> 10000 4987 05.0%
82+
> 30000 47 00.0%
83+
> 50000 0 00.0%
84+
> 100000 0 00.0%
85+
> 1000000 0 00.0%
3186
32-
![MultiQC - FastQC sequence counts plot](images/mqc_fastqc_counts.png)
3387
34-
![MultiQC - FastQC mean quality scores plot](images/mqc_fastqc_quality.png)
88+
Top ranking read lengths (bp)
3589
36-
![MultiQC - FastQC adapter content plot](images/mqc_fastqc_adapter.png)
90+
1. 44888
91+
2. 40044
92+
3. 37441
93+
4. 36543
94+
5. 35630
95+
```
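
The plain-text summary shown above is mostly `label: value` lines, so it can be pulled into a dictionary for quick checks. The following is a minimal sketch rather than part of the pipeline: `parse_nanoq_summary` is a hypothetical helper, and it assumes the report keeps the layout of the example above. Threshold and ranking rows, which have no `label: value` shape, are skipped.

```python
import re


def parse_nanoq_summary(text: str) -> dict:
    """Collect 'label: value' lines from a nanoq text summary into a dict.

    Hypothetical helper: assumes the layout of the example report above.
    """
    stats = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9 ]+?):\s+(\S+)\s*$", line)
        if not match:
            continue  # headings, separators, threshold rows, rankings
        label, raw = match.groups()
        key = label.strip().lower().replace(" ", "_")
        try:
            stats[key] = int(raw)
        except ValueError:
            try:
                stats[key] = float(raw)  # also accepts "NaN"
            except ValueError:
                continue  # not numeric, ignore
    return stats


# A few lines copied from the example report above.
sample = """\
Nanoq Read Summary
====================

Number of reads: 100000
Number of bases: 400398234
N50 read length: 5154
Mean read quality: NaN
"""

stats = parse_nanoq_summary(sample)
print(stats["number_of_reads"], stats["n50_read_length"])
```

Values reported as `NaN`, such as the read quality fields in the example, are kept as floating-point NaN so that downstream code can decide how to treat them.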
96+
97+
### SEQUALI
98+
99+
<details markdown="1">
100+
<summary>Output files</summary>
101+
102+
- `fastq_qc/sequali/`
103+
- `*_sequali.json`: `json` formatted file containing quality metrics.
104+
- `*_sequali.html`: `html` formatted report containing quality metrics.
105+
106+
</details>
37107

38-
:::note
39-
The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They may contain adapter sequence and potentially regions with low quality.
40-
:::
108+
[Sequali](https://github.com/rhpvorderman/sequali) provides sequence quality
109+
metrics for both short and long reads. It writes its report in both `json` and
110+
`html` format.
41111

42-
### MultiQC
43112

44113
<details markdown="1">
45114
<summary>Output files</summary>

docs/usage.md

Lines changed: 35 additions & 20 deletions
@@ -57,17 +57,16 @@ If you wish to repeatedly use the same parameters for multiple runs, rather than
5757

5858
Pipeline settings can be provided in a `yaml` or `json` file via `-params-file <file>`.
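
Because the params file may be `json` as well as `yaml`, it can also be generated from a script. The sketch below is only an illustration: `write_params_file` is a hypothetical helper, `params.json` is an assumed file name, and the only parameter used is the `input` key from the `params.yaml` example in this document.

```python
import json
from pathlib import Path


def write_params_file(params: dict, path: str) -> Path:
    """Write pipeline parameters to a JSON file (hypothetical helper)."""
    out = Path(path)
    out.write_text(json.dumps(params, indent=2) + "\n")
    return out


# 'input' mirrors the params.yaml example; add further keys as needed.
params = {"input": "./samplesheet.csv"}
params_path = write_params_file(params, "params.json")

# Print the command that would use the generated file.
print(f"nextflow run . -profile singularity -params-file {params_path}")
```

Keeping the parameters in a generated file makes it easier to store them next to the results of each run.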
5959

60-
:::warning
61-
Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args).
62-
:::
60+
> [!WARNING]
61+
> Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args).
6362
6463
The above pipeline run specified with a params file in yaml format:
6564

6665
```bash
6766
nextflow run . -profile singularity -params-file params.yaml
6867
```
6968

70-
with `params.yaml` containing:
69+
The `params.yaml` file contains:
7170

7271
```yaml
7372
input: './samplesheet.csv'
@@ -83,7 +82,12 @@ pipeline), finally entering `.` as the path to the workflow. Once completed, a c
8382

8483
### Updating the pipeline
8584

86-
When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline:
85+
When you run the above command, Nextflow automatically pulls the pipeline code
86+
from GitHub and stores it as a cached version. When running the pipeline after
87+
this, it will always use the cached version if available - even if the pipeline
88+
has been updated since. To make sure that you're running the latest version of
89+
the pipeline, make sure that you regularly update the cached version of the
90+
pipeline:
8791

8892
```bash
8993
git pull https://github.com/number-25/LongTranscriptomics
@@ -99,34 +103,28 @@ This version number will be logged in reports when you run the pipeline, so that
99103

100104
To further assist in reproducibility, you can share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.
101105

102-
:::tip
103-
If you wish to share such profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster specific paths to files, nor institutional specific profiles.
104-
:::
106+
> [!TIP]
107+
> If you wish to share such a profile (for example, by uploading it as supplementary material for academic publications), make sure NOT to include cluster-specific paths to files or institution-specific profiles.
105108
106109
## Core Nextflow arguments
107110

108-
//TODO
109-
110-
:::note
111-
These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
112-
:::
111+
> [!NOTE]
112+
> These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
113113
114114
### `-profile`
115115

116116
Use this parameter to choose a configuration profile. Profiles can give configuration presets for different compute environments.
117117

118118
Several generic profiles are bundled with the pipeline which instruct the pipeline to use software packaged using different methods (Docker, Singularity, Podman, Shifter, Charliecloud, Apptainer, Conda) - see below.
119119

120-
:::info
121-
We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported.
122-
:::
120+
> [!IMPORTANT]
121+
> We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported.
123122
124123
The pipeline also dynamically loads configurations from [https://github.com/nf-core/configs](https://github.com/nf-core/configs) when it runs, making multiple config profiles for various institutional clusters available at run time. For more information and to see if your system is available in these configs please see the [nf-core/configs documentation](https://github.com/nf-core/configs#documentation).
125124

126-
Note that multiple profiles can be loaded, for example: `-profile test,docker` - the order of arguments is important!
127-
They are loaded in sequence, so later profiles can overwrite earlier profiles.
125+
Note that multiple profiles can be loaded, for example: `-profile test,docker` - the order of arguments is important! They are loaded in sequence, so later profiles can overwrite earlier profiles.
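
The effect of profile order can be pictured with a toy model. This is not how Nextflow implements configuration merging; it is a flat-dictionary sketch of the "later profiles overwrite earlier ones" behaviour described above, and the settings `max_cpus` and `container_engine` are invented for illustration.

```python
def merge_profiles(profiles: dict, requested: str) -> dict:
    """Apply profiles left to right; later values replace earlier ones.

    Toy model only: real Nextflow configs are nested and richer than this.
    """
    merged = {}
    for name in requested.split(","):
        merged.update(profiles[name.strip()])
    return merged


# Hypothetical profile contents, invented for this example.
profiles = {
    "test": {"max_cpus": 2, "container_engine": None},
    "docker": {"container_engine": "docker"},
}

print(merge_profiles(profiles, "test,docker"))
print(merge_profiles(profiles, "docker,test"))
```

In this model, `test,docker` ends with the container engine set by `docker`, while `docker,test` ends with the value from `test`, which is why the order of the list matters.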
128126

129-
If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`. This is _not_ recommended, since it can lead to different results on different machines dependent on the computer enviroment.
127+
If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`. This is _not_ recommended, since it can lead to different results on different machines dependent on the computer environment.
130128

131129
- `test`
132130
- A profile with a complete minimal configuration for rapid, automated testing
@@ -157,6 +155,9 @@ Specify this when restarting a pipeline. Nextflow will use cached results from a
157155

158156
You can also supply a run name to resume a specific run: `-resume [run-name]`. Use the `nextflow log` command to show previous run names.
159157

158+
> [!CAUTION]
159+
> Very high-volume use of the `-resume` option has been known to cause inconsistent errors in a small number of cases. Keep this in mind if you get stuck while debugging.
160+
160161
### `-c`
161162

162163
Specify the path to a specific config file (this is a core Nextflow command). See the [nf-core website documentation](https://nf-co.re/usage/configuration) for more information.
@@ -183,7 +184,21 @@ To learn how to provide additional arguments to a particular tool of the pipelin
183184

184185
### nf-core/configs
185186

186-
In most cases, you will only need to create a custom config as a one-off but if you and others within your organisation are likely to be running nf-core pipelines regularly and need to use the same settings regularly it may be a good idea to request that your custom config file is uploaded to the `nf-core/configs` git repository. Before you do this please can you test that the config file works with your pipeline of choice using the `-c` parameter. You can then create a pull request to the `nf-core/configs` repository with the addition of your config file, associated documentation file (see examples in [`nf-core/configs/docs`](https://github.com/nf-core/configs/tree/master/docs)), and amending [`nfcore_custom.config`](https://github.com/nf-core/configs/blob/master/nfcore_custom.config) to include your custom profile.
187+
> [!NOTE]
188+
> Some portions of the following section will only apply to official nf-core pipelines. As this current pipeline is NOT part of nf-core, attempting a custom config for this
189+
190+
In most cases, you will only need to create a custom config as a one-off but if
191+
you and others within your organisation are likely to be running nf-core
192+
pipelines regularly and need to use the same settings regularly it may be a
193+
good idea to request that your custom config file is uploaded to the
194+
`nf-core/configs` git repository. Before you do this please can you test that
195+
the config file works with your pipeline of choice using the `-c` parameter.
196+
You can then create a pull request to the `nf-core/configs` repository with the
197+
addition of your config file, associated documentation file (see examples in
198+
[`nf-core/configs/docs`](https://github.com/nf-core/configs/tree/master/docs)),
199+
and amending
200+
[`nfcore_custom.config`](https://github.com/nf-core/configs/blob/master/nfcore_custom.config)
201+
to include your custom profile.
187202

188203
See the main [Nextflow documentation](https://www.nextflow.io/docs/latest/config.html) for more information about creating your own configuration files.
189204
