docs/usage.md
+25 -14 (25 additions & 14 deletions)
@@ -14,8 +14,7 @@ You will need to create a samplesheet with information about the samples you wou
### Full samplesheet
-
The pipeline will auto-detect whether the sequencing summary files, and reads are in the paths listed on the samplesheet. Each row represents a fastq file. Replicate refers to a technical replicate, biological replicates should be named uniquely. Be sure to pay attention to sample naming, in
-
order to avoid duplication and file overwriting.
+
The pipeline will auto-detect whether the sequencing summary files and reads are in the paths listed on the samplesheet. Each row represents a FASTQ file. `replicate` refers to a technical replicate; biological replicates should be given unique sample names. Pay attention to sample naming to avoid duplication and file overwriting.
A final samplesheet file consisting of long-read data may look something like the one below. This is for **one biological** sample which has been sequenced twice, giving two technical replicates.
|`sample`| Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). |
-
|`fastq_1`| Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
-
|`fastq_2`| Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
+
|`sample`| Sample name. |
+
|`replicate`| Technical replicate number. |
+
|`sequencing_summary_path`| Full path to nanopore sequencing summary file (usually a .txt file). |
+
|`read_path`| Full path to fastq reads. |
An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.
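As a rough illustration of the column layout described above, the sketch below writes a two-row samplesheet for one biological sample with two technical replicates and checks that every row has four fields. The file name `samplesheet_example.csv`, the sample name, and all paths are hypothetical placeholders, and the column order and header row are assumed from the table rather than confirmed against the pipeline's schema.

```bash
# Write a samplesheet: one biological sample, two technical replicates.
# All paths below are hypothetical placeholders; substitute your own.
cat > samplesheet_example.csv <<'EOF'
sample,replicate,sequencing_summary_path,read_path
sampleA,1,/data/run1/sequencing_summary.txt,/data/run1/reads.fastq.gz
sampleA,2,/data/run2/sequencing_summary.txt,/data/run2/reads.fastq.gz
EOF

# Sanity check: every line must have exactly four comma-separated fields.
awk -F',' 'NF != 4 { printf "line %d has %d fields\n", NR, NF; bad = 1 } END { exit bad }' \
  samplesheet_example.csv && echo "samplesheet looks well-formed"
```

Both rows share the same `sample` value and differ only in `replicate`, which matches the "one biological sample, two technical replicates" case described earlier.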
@@ -38,16 +38,17 @@ An [example samplesheet](../assets/samplesheet.csv) has been provided with the p
The typical command for running the pipeline is as follows:
This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles.
+
This will launch the pipeline with the `singularity` configuration profile. See below for more information about profiles.
Note that the pipeline will create the following files in your working directory:
```bash
work # Directory containing the nextflow working files
-
<OUTDIR># Finished results in specified location (defined with --outdir)
.nextflow_log # Log file from Nextflow
# Other nextflow hidden files, eg. history of pipeline runs and old logs.
```
@@ -63,26 +64,30 @@ Do not use `-c <file>` to specify parameters as this will result in errors. Cust
The above pipeline run specified with a params file in yaml format:
```bash
-
nextflow run number-25/rich_directRNA -profile docker -params-file params.yaml
+
nextflow run . -profile singularity -params-file params.yaml
```
with `params.yaml` containing:
```yaml
input: './samplesheet.csv'
outdir: './results/'
-
genome: 'GRCh37'
+
genome_fasta: '<path/to/genome_fasta>'
+
annotation_gtf: '<path/to/annotation_gtf>'
<...>
```
-
You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch).
+
To generate this custom params file, we can launch an interactive module (either online or on the command line) with
+
`nf-core pipelines launch`, selecting a local pipeline (not a GitHub
+
pipeline), and finally entering `.` as the path to the workflow. Once completed, a custom `params.yaml` file will be generated, which can be provided to the workflow with `-params-file params.yaml`.
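For those who prefer to write the file by hand, a minimal sketch follows. It assumes only the four keys shown in the `params.yaml` excerpt above; the reference paths and the file name `params_example.yaml` are hypothetical, and the launch command is printed rather than executed.

```bash
# Hypothetical reference paths; substitute your own files.
GENOME_FASTA="/refs/genome.fa"
ANNOTATION_GTF="/refs/annotation.gtf"

# Write the params file; the unquoted heredoc lets the variables expand.
cat > params_example.yaml <<EOF
input: './samplesheet.csv'
outdir: './results/'
genome_fasta: '${GENOME_FASTA}'
annotation_gtf: '${ANNOTATION_GTF}'
EOF

# Print the launch command instead of running it, so the sketch is safe to try.
echo "nextflow run . -profile singularity -params-file params_example.yaml"
```

Keeping the reference paths in variables makes it easier to reuse the same snippet across machines where those files live in different places.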
+
### Updating the pipeline
When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. On subsequent runs it will always use the cached version if available, even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, regularly update the cached version:
@@ -93,14 +98,17 @@ First, go to the [number-25/rich_directRNA releases page](https://github.com/num
This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. For example, at the bottom of the MultiQC reports.
-
To further assist in reproducbility, you can use share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.
+
To further assist in reproducibility, you can share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.
:::tip
If you wish to share such a profile (for example, by uploading it as supplementary material for an academic publication), make sure NOT to include cluster-specific file paths or institution-specific profiles.
:::
## Core Nextflow arguments
+
//TODO
+
+
:::note
These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
:::
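To make the distinction concrete, the sketch below assembles a command that mixes both kinds of option. `-profile` is taken from this document and `--outdir` corresponds to the `outdir` parameter shown earlier; the combination `test,singularity` and the results path are illustrative assumptions. The command is only printed, not run.

```bash
# -profile is a Nextflow option (single hyphen);
# --outdir is a pipeline parameter (double hyphen).
# Profiles are comma-separated and loaded in the order given.
CMD="nextflow run . -profile test,singularity --outdir ./results"
echo "$CMD"
```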
@@ -123,8 +131,11 @@ They are loaded in sequence, so later profiles can overwrite earlier profiles.
If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`. This is _not_ recommended, since it can lead to different results on different machines depending on the computer environment.
- `test`
-
  - A profile with a complete configuration for automated testing
+
  - A profile with a complete, minimal configuration for rapid, automated testing
  - Includes links to test data so needs no other parameters
+
- `test_full`
+
  - A profile with a complete, thorough configuration for automated testing of the entire pipeline with full-size data
+
  - Requires the user to provide real-world sequencing data and reference files
- `docker`
  - A generic configuration profile to be used with [Docker](https://docker.com/)