nf-core tools are for everyone, with commands intended to help both **users** and **developers**. For users, the tools make it easier to execute workflows. For developers, the tools make it easier to develop and test your workflows using best practices. You can read about the nf-core commands on the [tools page](https://nf-co.re/tools/) of the nf-core website or using the command line.
134
134
135
-
!!! example "Exercise 1.2.3"
135
+
!!! example "Exercise 1.2.3.1"
136
136
137
137
Find out what nf-core tools commands and options are available using the `--help` option:
138
138
@@ -152,8 +152,6 @@ nf-core tools is updated with new features and fixes regularly so it's best to k
152
152
153
153
One very useful nf-core tools command is `nf-core pipelines download`. Sometimes you may need to execute an nf-core workflow on a computer with no internet connection, for example if you have highly protected data. In this case, you will need to fetch the workflow files and manually transfer them to your offline system. The `nf-core pipelines download` command makes this process easier and ensures accurate retrieval of correctly versioned code and software containers.
154
154
155
-
The `nf-core pipelines download` command will download both the workflow code and the institutional nf-core/configs files. It can also optionally download singularity image file.
156
-
157
155
```bash
158
156
nf-core pipelines download
159
157
```
@@ -176,9 +174,126 @@ Alternatively, you could build your own execution command with the command line
176
174
177
175
{width=100%}
178
176
179
-
## 1.2.4 Executing a workflow
177
+
The command line method also gives you a few additional options, including the ability to download all of the [nf-core institutional configs](https://nf-co.re/configs). This lets you run a workflow completely offline while still having access to these community-created configurations. **Note** that you must use the command line argument `--download-configuration yes` to do this; the interactive mode doesn't support this option yet.
178
+
179
+
!!! example "Exercise 1.2.3.2"
180
+
181
+
Have a go at using `nf-core pipelines download` to download an nf-core pipeline along with the nf-core institutional configs. Tell the tool to:
182
+
183
+
- Download the `nf-core/rnaseq` pipeline
184
+
- Pull the `3.23.0` version of the pipeline
185
+
- Download the institutional configs
186
+
- **Not** download the singularity images for the pipeline (doing so might take a while!)
187
+
- **Not** compress the downloaded data
188
+
189
+
Consult `nf-core pipelines download --help` to help you find the right arguments.
190
+
191
+
??? success "Solution"
192
+
193
+
The arguments we want are:
194
+
195
+
- `--revision 3.23.0`: this pulls the specific version we want
196
+
- `--download-configuration yes`: this pulls the institutional configs
197
+
- `--container-system none`: this tells the tool to not download any images
198
+
- `--compress none`: this tells the tool to not compress the data
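Putting the arguments above together gives a command along the following lines. This is a minimal sketch: every flag comes from the list above, but the `download_cmd` variable and the `RUN_DOWNLOAD` guard are our own additions for illustration, so that the command can be reviewed before anything is downloaded.

```bash
# Assemble the download command from the arguments listed above.
# Each flag is taken from the solution; only the layout is ours.
download_cmd="nf-core pipelines download nf-core/rnaseq \
--revision 3.23.0 \
--download-configuration yes \
--container-system none \
--compress none"

# Print the command for review before running it.
echo "$download_cmd"

# Run it only when nf-core tools is installed and you have opted in
# (RUN_DOWNLOAD is a hypothetical guard variable for this sketch).
if [ "${RUN_DOWNLOAD:-no}" = "yes" ] && command -v nf-core >/dev/null 2>&1; then
    $download_cmd
fi
```

Setting `RUN_DOWNLOAD=yes` before running the snippet performs the download; otherwise it only prints the command.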
We can see that the `3_23_0` folder contains the pipeline code, including its `main.nf` file, `nextflow.config` file, its `modules` and `subworkflows`, along with its configuration folder `conf`.
271
+
272
+
Meanwhile the `configs` folder is where the institutional configs were downloaded. The config files themselves are under the `conf` directory:
273
+
274
+
```bash
275
+
ls nf-core-rnaseq_3.23.0/configs/conf
276
+
```
277
+
278
+
```console title="Output"
279
+
abims.config
280
+
adcra.config
281
+
alice.config
282
+
alliance_canada.config
283
+
apollo.config
284
+
arcc.config
285
+
awsbatch.config
286
+
aws_tower.config
287
+
azurebatch.config
288
+
azurebatchdev.config
289
+
...
290
+
```
291
+
292
+
The pipeline code is set up to find these and include them when you request the appropriate profile; for example, if you run the pipeline with `-profile nci_gadi`, it will find the config file stored at `nf-core-rnaseq_3.23.0/configs/conf/nci_gadi.config` and include it in the pipeline's configuration.
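Before launching offline, it can be worth confirming that the config for your profile was actually downloaded. The sketch below assumes the directory layout shown above; `has_profile_config` is a hypothetical helper that only checks the filesystem and does not call Nextflow.

```bash
# Hypothetical helper: report whether a downloaded institutional config
# exists for a given profile name. It only checks the filesystem.
has_profile_config() {
    config_dir="$1"   # e.g. nf-core-rnaseq_3.23.0/configs/conf
    profile="$2"      # e.g. nci_gadi
    [ -f "${config_dir}/${profile}.config" ]
}

if has_profile_config nf-core-rnaseq_3.23.0/configs/conf nci_gadi; then
    echo "nci_gadi config is available offline"
else
    echo "nci_gadi config not found; check the download"
fi
```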
293
+
294
+
## 1.2.4 Downloading and executing workflows with `nextflow`
180
295
181
-
Nextflow seamlessly integrates with code repositories such as [GitHub](https://github.com/). This feature allows you to manage your project code and use public Nextflow workflows — including nf-core workflows — quickly, consistently, and transparently.
296
+
The `nextflow` command itself can also be used to download pipelines. Nextflow seamlessly integrates with code repositories such as [GitHub](https://github.com/), allowing you to use public Nextflow workflows, including nf-core workflows, quickly, consistently, and transparently.
182
297
183
298
The Nextflow `pull` command will download a workflow from a hosting platform into your global cache `$HOME/.nextflow/assets` folder.
184
299
@@ -202,19 +317,17 @@ nextflow clone foo/bar
202
317
203
318
This is equivalent to pulling the GitHub repository directly with `git clone https://github.com/foo/bar`. The `nextflow clone` syntax simply shortens and cleans up the command.
204
319
205
-
The Nextflow `run` command is used to initiate the execution of a workflow:
320
+
Once the workflow is downloaded, use the Nextflow `run` command to execute it:
206
321
207
322
```bash
208
323
nextflow run foo/bar
209
324
```
210
325
211
-
If you `run` a workflow, it will look for a local file with the workflow name you’ve specified. If that file does not exist, it will next look in your `$HOME/.nextflow/assets` folder to see if you have previously `pull`ed the pipeline. Failing that, it will look for a public repository with the same name on GitHub (unless otherwise specified). If it is found, Nextflow will automatically `pull` the workflow to your global cache and execute it.
212
-
213
326
!!! warning "Warning"
214
327
215
328
Be aware of what is already in your current working directory where you launch your workflow. If there are other workflows (or configuration files) within the directory, you may encounter unexpected results.
216
329
217
-
!!! example "Exercise 1.2.4.1"
330
+
!!! example "Exercise 1.2.4"
218
331
219
332
Use the `nextflow` command line tool to clone the [`nextflow-io/hello` Nextflow repository](https://github.com/nextflow-io/hello) to your local directory, then execute it.
220
333
@@ -260,48 +373,19 @@ If you `run` a workflow, it will look for a local file with the workflow name yo
260
373
261
374
Note that the second line says ``Launching `hello/main.nf` ...``, which indicates that it was launched from the local directory.
262
375
263
-
!!! example "Exercise 1.2.4.2"
264
-
265
-
Try executing the workflow directly from `nextflow-io` [GitHub](https://github.com/nextflow-io/hello) repository.
266
-
267
-
??? success "Solution"
268
-
269
-
Use the `run` command again, but this time use include the `nextflow-io/` prefix in the workflow name:
376
+
### 1.2.4.1 More on `nextflow run`
270
377
271
-
```bash
272
-
nextflow run nextflow-io/hello
273
-
```
274
-
275
-
Since there is no local directory called `nextflow-io/hello`, the workflow will be automatically pulled from GitHub and executed. You should see:
276
-
277
-
```console title="Output"
378
+
When you run a pipeline with `nextflow run some_pipeline`, it will look for a local folder with the workflow name you’ve specified and a `main.nf` file within. If that file does not exist, it will next look in your `$HOME/.nextflow/assets` folder to see if you have previously `pull`ed the pipeline. Failing that, it will look for a public repository with the same name on GitHub (unless otherwise specified). If it is found, Nextflow will automatically `pull` the workflow to your global cache and execute it.
278
379
279
-
N E X T F L O W ~ version 25.10.4
280
-
281
-
Pulling nextflow-io/hello ...
282
-
downloaded from https://github.com/nextflow-io/hello.git
This means it is possible to seamlessly `run` public Nextflow pipelines without having to manually download them first.
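The lookup order described above can be illustrated with a small shell function. This is only a sketch of the decision order, not Nextflow's actual implementation: `resolve_pipeline` is a hypothetical helper, and it assumes the asset cache lives at `$HOME/.nextflow/assets` unless the `NXF_ASSETS` environment variable points elsewhere.

```bash
# Hypothetical helper mirroring the lookup order described above.
# It only inspects the filesystem; it never contacts GitHub.
resolve_pipeline() {
    name="$1"
    # Assumption: the cache is $HOME/.nextflow/assets unless NXF_ASSETS is set.
    assets="${NXF_ASSETS:-$HOME/.nextflow/assets}"
    if [ -f "$name/main.nf" ]; then
        echo "local"    # a local folder containing main.nf wins
    elif [ -d "$assets/$name" ]; then
        echo "cached"   # previously pulled into the global cache
    else
        echo "remote"   # would be pulled from the hosting platform
    fi
}

resolve_pipeline nextflow-io/hello
```

Running this from a directory that already contains a `hello/main.nf` and asking for `hello` would print `local`, which is why the contents of your working directory matter.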
288
381
289
-
Ciao world!
290
-
291
-
Hello world!
292
-
293
-
Hola world!
294
-
295
-
```
382
+
!!! note "Our recommendation"
296
383
297
-
Note that now the output reads:
384
+
As you can see, there are a few different ways you can go about running a Nextflow or nf-core pipeline. Generally, **we recommend always downloading the code to your working directory** and **not** using `nextflow pull` (and by extension `nextflow run` with pipelines you haven't already downloaded). This means using either the `nextflow clone` command or directly cloning the repository with `git clone` to make sure the code is in your working directory first.
298
385
299
-
```console title="Output"
300
-
Pulling nextflow-io/hello ...
301
-
downloaded from https://github.com/nextflow-io/hello.git
302
-
```
303
-
304
-
This indicates that the pipeline was pulled from the repository rather than executed from the local directory.
386
+
If you need to execute an **nf-core pipeline** in an environment **without an internet connection**, you can use the `nf-core pipelines download` method [mentioned above](#nf-core-pipelines-download) on a computer with internet and transfer it to where it will run. Again, this method ensures that you localise the pipeline code to your working directory first, and makes sure you have all of the required configuration files and singularity images ready for offline use.
387
+
388
+
We recommend this approach because it is the most flexible: it gives you control over exactly which version of the workflow is downloaded and where it is downloaded to (instead of all pipelines going to `$HOME/.nextflow/assets` as with `nextflow pull`/`nextflow run`). It also makes it easier to modify configuration files to suit the needs of your data and infrastructure, for example process CPU and memory resources.
305
389
306
390
More information about the Nextflow `run`, `pull`, and `clone` commands can be found in the Nextflow documentation:
307
391
@@ -323,11 +407,6 @@ More information about the Nextflow `run`, `pull`, and `clone` commands can be f
323
407
nextflow run -r 1.2.0 foo/bar
324
408
```
325
409
326
-
!!! note "Our recommendation"
327
-
328
-
As you can see, there are a few different ways you can go about running a nextflow or nf-core pipeline. We recommend using either the `nextflow clone` command or directly cloning the repository with `git clone`. This is because it is the most flexible approach and gives you control over exacly what version of the workflow is being downloaded and where it is being downloaded to (instead of all pipelines going to `$HOME/.nextflow/assets` as with `nextflow pull`/`nextflow run`). In addition, it provides greater flexibility in modifying configuration files to suit the needs of your data and infrastructure, for example process CPU and memory resources.
329
-
330
-
The one exception to this is when you need to execute an **nf-core pipeline** in an environment **without an internet connection**. In this case, the recommendation is to use the `nf-core pipelines download` method [mentioned above](#nf-core-pipelines-download), as this tool allows you to localise all of the required configuration files and singularity images for offline use.
331
410
332
411
## 1.2.5 Nextflow log
333
412
@@ -389,6 +468,19 @@ When running large (and possibly expensive) workflows, we want to be sure that i
389
468
390
469
In Nextflow, we can utilise this cache by using the `-resume` option. The cache works by keeping track of the file paths, file sizes, and modification times of all input files to a process. It also keeps track of the process definition itself. If these are unchanged between runs, the **cached** outputs are re-used. If any of these values have changed, the process will be re-run.
391
470
471
+
!!! note "The Nextflow cache can be sensitive!"
472
+
473
+
It's important to note that the Nextflow cache looks for any change that might affect the output of each process. This includes:
474
+
475
+
- Input file modification times
476
+
- Changes to the script
477
+
- Changes to the container or conda environment used to run the process
478
+
- Changes to the `ext` properties, e.g. `ext.args`
479
+
480
+
Sometimes, you can run into issues where the cache is **invalidated** and a re-run of a process is forced even when nothing seems to have changed. This often happens on HPC systems where file modification times aren't synchronised perfectly across parallel file systems. In these cases, it can help to apply the `process.cache = 'lenient'` configuration option to tell Nextflow to only use the file name and size, but not the modification time, to determine whether the cache is valid or not.
481
+
482
+
See the [Nextflow cache documentation](https://docs.seqera.io/nextflow/cache-and-resume) for further information on the cache and how to configure it.
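One way to apply the lenient cache setting mentioned above is through a small extra config file passed on the command line. The sketch below is illustrative: the file name `lenient_cache.config` is a hypothetical choice, and `foo/bar` is the placeholder pipeline name used earlier in this section. It writes the file and prints the command rather than launching a run.

```bash
# Write a small config file that relaxes cache checks.
# The file name lenient_cache.config is a hypothetical choice.
cat > lenient_cache.config <<'EOF'
// Compare input files by name and size only, ignoring modification time
process.cache = 'lenient'
EOF

# Show the command that would apply it together with -resume;
# foo/bar is the placeholder pipeline name used earlier.
echo "nextflow run foo/bar -c lenient_cache.config -resume"
```

Keeping the setting in its own file means it can be added only on systems where timestamps are unreliable, while the default, stricter cache behaviour is kept everywhere else.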
483
+
392
484
!!! note "Key points"
393
485
394
486
- Environment variables can be used to control your Nextflow runtime