Commit 51ccdb2

committed
small additions to docs
1 parent 7547e93 commit 51ccdb2

2 files changed

Lines changed: 25 additions & 14 deletions

docs/usage.md

Lines changed: 25 additions & 14 deletions
@@ -14,8 +14,7 @@ You will need to create a samplesheet with information about the samples you wou
1414

1515
### Full samplesheet
1616

17-
The pipeline will auto-detect whether the sequencing summary files, and reads are in the paths listed on the samplesheet. Each row represents a fastq file. Replicate refers to a technical replicate, biological replicates should be named uniquely. Be sure to pay attention to sample naming, in
18-
order to avoid duplication and file overwriting.
17+
The pipeline will auto-detect whether the sequencing summary files and reads are in the paths listed on the samplesheet. Each row represents a fastq file. Replicate refers to a technical replicate; biological replicates should be named uniquely. Pay close attention to sample naming in order to avoid duplication and file overwriting.
1918

2019
A final samplesheet file consisting of long-read data may look something like the one below. This is for **one biological** sample which has been sequenced twice, giving two technical replicates.
2120

@@ -27,9 +26,10 @@ CONTROL1,2,data/long_reads_sequencingsummary_2.txt,data/long_reads_2.fastq.gz
2726

2827
| Column | Description |
2928
| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
30-
| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). |
31-
| `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
32-
| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". |
29+
| `sample` | Sample name. |
30+
| `replicate` | Technical replicate number |
31+
| `sequencing_summary_path` | Full path to nanopore sequencing summary file (usually a .txt file). |
32+
| `read_path` | Full path to fastq reads. |
3333

3434
An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.
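
As a rough illustration of the naming advice above, a samplesheet could be checked before launching the pipeline. The sketch below is a hypothetical pre-flight check, not the pipeline's own validation: the function name `check_samplesheet` is invented here, the column order is assumed from the table above, and the first data row in the example is illustrative.

```python
import csv
import io

# Column names come from the table above; their order is an assumption.
EXPECTED_COLUMNS = ["sample", "replicate", "sequencing_summary_path", "read_path"]


def check_samplesheet(handle):
    """Return a list of problems found in a samplesheet (empty if none)."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]

    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        sample = row["sample"] or ""
        replicate = row["replicate"] or ""
        if not replicate.isdigit():
            problems.append(f"line {line_no}: replicate '{replicate}' is not a number")
        # The same sample/replicate pair appearing twice risks overwriting files.
        key = (sample, replicate)
        if key in seen:
            problems.append(f"line {line_no}: duplicate sample/replicate {key}")
        seen.add(key)
    return problems


example = io.StringIO(
    "sample,replicate,sequencing_summary_path,read_path\n"
    "CONTROL1,1,data/long_reads_sequencingsummary_1.txt,data/long_reads_1.fastq.gz\n"
    "CONTROL1,2,data/long_reads_sequencingsummary_2.txt,data/long_reads_2.fastq.gz\n"
)
print(check_samplesheet(example))
```

An empty list means no problems were found under these assumed rules; the check does not confirm that the listed files exist.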
3535

@@ -38,16 +38,17 @@ An [example samplesheet](../assets/samplesheet.csv) has been provided with the p
3838
The typical command for running the pipeline is as follows:
3939

4040
```bash
41-
nextflow run number-25/rich_directRNA --input ./samplesheet.csv --outdir ./results --genome GRCh37 -profile docker
41+
mkdir results
42+
43+
nextflow run . --input ./samplesheet.csv --outdir ./results --genome_fasta <path/to/genome> --annotation_gtf <path/to/annotation> -profile singularity
4244
```
4345

44-
This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles.
46+
This will launch the pipeline with the `singularity` configuration profile. See below for more information about profiles.
4547

4648
Note that the pipeline will create the following files in your working directory:
4749

4850
```bash
4951
work # Directory containing the nextflow working files
50-
<OUTDIR> # Finished results in specified location (defined with --outdir)
5152
.nextflow.log # Log file from Nextflow
5253
# Other nextflow hidden files, eg. history of pipeline runs and old logs.
5354
```
@@ -63,26 +64,30 @@ Do not use `-c <file>` to specify parameters as this will result in errors. Cust
6364
The above pipeline run specified with a params file in yaml format:
6465

6566
```bash
66-
nextflow run number-25/rich_directRNA -profile docker -params-file params.yaml
67+
nextflow run . -profile singularity -params-file params.yaml
6768
```
6869

6970
with `params.yaml` containing:
7071

7172
```yaml
7273
input: './samplesheet.csv'
7374
outdir: './results/'
74-
genome: 'GRCh37'
75+
genome_fasta: '<path/to/genome_fasta>'
76+
annotation_gtf: '<path/to/annotation_gtf>'
7577
<...>
7678
```
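
Nextflow's `-params-file` option accepts JSON as well as YAML, so a params file can also be written by a script. The following is a minimal sketch under that assumption: the helper name `write_params` and the output file name `params.json` are hypothetical, and the angle-bracket paths are placeholders to be replaced with real locations.

```python
import json


def write_params(path, input_csv, outdir, genome_fasta, annotation_gtf):
    """Write the four parameters shown above to a JSON params file."""
    params = {
        "input": input_csv,
        "outdir": outdir,
        "genome_fasta": genome_fasta,
        "annotation_gtf": annotation_gtf,
    }
    with open(path, "w") as fh:
        json.dump(params, fh, indent=2)
        fh.write("\n")
    return params


if __name__ == "__main__":
    # Placeholder paths; replace them before running the pipeline.
    write_params(
        "params.json",
        "./samplesheet.csv",
        "./results/",
        "<path/to/genome_fasta>",
        "<path/to/annotation_gtf>",
    )
```

The resulting file would then be passed in the same way as the YAML version, with `-params-file params.json`.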
7779

78-
You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch).
80+
To generate this custom params file, we can launch an interactive module (either online or on the command line) with
81+
`nf-core pipelines launch`, selecting a local pipeline (not a GitHub
82+
pipeline), finally entering `.` as the path to the workflow. Once completed, a custom params.yaml file will be generated, which can be provided to the workflow with `-params-file params.yaml`.
83+
7984

8085
### Updating the pipeline
8186

8287
When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline:
8388

8489
```bash
85-
nextflow pull number-25/rich_directRNA
90+
git pull https://github.com/number-25/rich_directRNA
8691
```
8792

8893
### Reproducibility
@@ -93,14 +98,17 @@ First, go to the [number-25/rich_directRNA releases page](https://github.com/num
9398

9499
This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. For example, at the bottom of the MultiQC reports.
95100

96-
To further assist in reproducbility, you can use share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.
101+
To further assist in reproducibility, you can share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.
97102

98103
:::tip
99104
If you wish to share such a profile (for example, as supplementary material for academic publications), make sure NOT to include cluster-specific paths to files or institution-specific profiles.
100105
:::
101106

102107
## Core Nextflow arguments
103108

109+
//TODO
110+
111+
104112
:::note
105113
These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
106114
:::
@@ -123,8 +131,11 @@ They are loaded in sequence, so later profiles can overwrite earlier profiles.
123131
If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`. This is _not_ recommended, since it can lead to different results on different machines depending on the computing environment.
124132

125133
- `test`
126-
- A profile with a complete configuration for automated testing
134+
- A profile with a complete minimal configuration for rapid, automated testing
127135
- Includes links to test data so needs no other parameters
136+
- `test_full`
137+
- A profile with a complete, thorough configuration for automated testing of the entire pipeline with full-size data
138+
- Requires the user to provide real-world sequencing data and reference files
128139
- `docker`
129140
- A generic configuration profile to be used with [Docker](https://docker.com/)
130141
- `singularity`
-11.1 KB
Binary file not shown.
