Commit 23caebd

committed
addressing session 1 feedback
1 parent 6881fd3 commit 23caebd

2 files changed

Lines changed: 71 additions & 70 deletions

docs/session_1/1.3_configure.md

Lines changed: 49 additions & 48 deletions
@@ -2,34 +2,25 @@
22

33
!!! tip "Objectives"
44

5-
- Learn how to customize the execution of an nf-core workflow.
6-
- Customize a toy example of an nf-core workflow.
5+
- Learn how to customize the execution of an nf-core workflow
6+
- Customize a toy example of an nf-core workflow
77

88
## 1.3.1 Introduction to Nextflow configuration
99

10-
Nextflow pipelines are configured by way of special `.config` files. These files are used to set a wide variety of options that control how Nextflow runs, such as defining computing environments, specifying containerisation systems, and setting resources required by processes. They are also used to set various parameters of a pipeline.
10+
There are two ways in which we can control how nf-core and other Nextflow pipelines run: through Nextflow **configuration files** and through **workflow parameters**. Configuration files are special `.config` files which are used to set a wide variety of options that control how the Nextflow engine runs, such as defining computing environments, specifying containerisation systems, and setting resources required by processes. Workflow parameters, on the other hand, are specific to the pipeline you are running and are used to pass information to the pipeline such as sample information, filtering thresholds, reference files, etc.
1111

1212
When a workflow is launched, Nextflow will look for configuration files in several locations. Because these files can contain conflicting settings, the sources are ranked to decide which settings to apply. The configuration sources are listed below in order of decreasing priority:
1313

14-
1. Parameters specified on the command line (`--parameter`)
15-
2. Parameters that are provided using the `-params-file` option
16-
3. Config files that are provided using the `-c` option
17-
4. The config file named `nextflow.config` in the current directory
18-
5. The config file named `nextflow.config` in the workflow project directory
19-
6. The config file `$HOME/.nextflow/config`
20-
7. Values defined within the workflow script itself (e.g., `main.nf`)
14+
1. Config files that are provided using the `-c` option
15+
2. The config file named `nextflow.config` in the current directory
16+
3. The config file named `nextflow.config` in the workflow project directory
17+
4. The config file `$HOME/.nextflow/config`
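
As a minimal sketch of how this ranking plays out, suppose a custom file (hypothetically named `custom.config`) is supplied with `-c`. The file name and the values are assumptions for illustration; any setting in it that conflicts with one in the lower-priority `nextflow.config` files listed above would be the one applied:

```groovy
// custom.config (hypothetical file name), supplied with:
//   nextflow run <workflow> -c custom.config
// Settings here outrank conflicting ones in the launch-directory nextflow.config,
// the workflow's own nextflow.config, and $HOME/.nextflow/config.
report {
    enabled = true                 // illustrative value
    file    = 'my_report.html'     // illustrative value
}
```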
2118

22-
Notably, while some of these files are already included in the nf-core workflow repository (e.g., the `nextflow.config` file in the nf-core workflow repository), others are automatically identified on your local system (e.g., the `nextflow.config` in the launch directory), and others are only included if they are specified using `run` options (e.g., `-params-file`, and `-c`). Understanding how and when these files are interpreted by Nextflow is critical for the accurate configuration of a workflows execution.
19+
Notably, while some of these files are already included in the nf-core workflow repository (e.g., the `nextflow.config` file in the nf-core workflow repository), others are automatically identified on your local system (e.g., the `nextflow.config` in the launch directory), and others are only included if they are specified using the `-c` option on the command line. Understanding how and when these files are interpreted by Nextflow is critical for the accurate configuration of a workflow's execution.
2320

2421
This system allows you to finely tune how Nextflow runs for you. An nf-core workflow will come with its own `nextflow.config` file, but you can use your own `nextflow.config` file in your launch directory to override the defaults. You can also have a config file at `$HOME/.nextflow/config` to set configuration options that you frequently use across pipelines.
2522

26-
!!! warning "IMPORTANT!"
27-
28-
nf-core workflow **parameters** must be passed via the command line (`--<parameter>`) or Nextflow `-params-file` option. Custom config files, including those provided by the `-c` option, can be used to provide any configuration **except** for parameters.
29-
30-
Each nf-core workflow has its own **configuration** and **parameter** defaults. Typically, the default parameters are chosen to be applicable to the average user, but you will almost certainly want to modify these to fit your own purposes and system requirements.
31-
32-
Notably, configuration files can also contain the definition of one or more **profiles**. A profile is a set of configuration attributes that can be activated when launching a workflow by using the `-profile` command option:
23+
Notably, configuration files typically define one or more **profiles**. A profile is a set of configuration attributes that can be activated when launching a workflow by using the `-profile` command option:
3324

3425
```bash
3526
nextflow run <workflow> -profile <profile>
@@ -38,10 +29,10 @@ nextflow run <workflow> -profile <profile>
3829
Profiles used by nf-core workflows include:
3930

4031
- **Software management profiles**
41-
- Profiles for the management of software using software management tools, e.g., `docker`, `singularity`, and `conda`.
32+
- Profiles for the management of software using software management tools, e.g., `docker`, `singularity`, and `conda`
4233
- **Test profiles**
43-
- Profiles to execute the workflow with a standardized set of test data and parameters, e.g., `test` and `test_full`.
44-
- Test profiles are typically used during development to check whether new changes work as expected. They are also a great place to start when learning how to use a new pipeline.
34+
- Profiles to execute the workflow with a standardized set of test data and parameters, e.g., `test` and `test_full`
35+
- Test profiles are typically used during development to check whether new changes work as expected; they are also a great place to start when learning how to use a new pipeline
4536

4637
Multiple profiles can be specified in a comma-separated (`,`) list when you execute your command. The order of profiles is important as they will be read from left to right. If there is any overlap in the configuration options set by each profile, the right-most profile will take precedence.
4738

@@ -168,9 +159,17 @@ nextflow run <workflow> -profile test,singularity
168159

169160
```
170161

171-
## 1.3.2 Viewing parameters
162+
## 1.3.2 Workflow parameters
163+
164+
Every nf-core workflow defines a set of workflow parameters that can be used to pass data, filtering thresholds, options, and other information to the pipeline and influence how it runs. These parameters are defined and given sensible defaults in the main `nextflow.config` file in the pipeline repository. While these default values are chosen to be applicable to the average user, you will almost certainly want to modify some of these to fit your own purposes and datasets.
172165

173-
Every nf-core workflow has an associated **schema** which describes the full list of **parameters** it accepts. You can find this manifest on the pipeline's page on the nf-core website, which displays a description and value type of each parameter. Most parameters will have additional text to help you understand when and how a parameter should be used. For example, the following is from the parameter page for `nf-core/rnaseq`:
166+
!!! warning "IMPORTANT!"
167+
168+
nf-core workflow **parameters** must be passed via the command line (`--<parameter>`) or Nextflow `-params-file` option. Custom config files, including those provided by the `-c` option, can be used to provide any configuration **except** for parameters.
169+
170+
We discuss both of these methods for passing parameters further down.
171+
172+
To help with setting parameters, each nf-core pipeline has an associated **schema** which describes the full list of **parameters** it accepts. You can find this manifest on the pipeline's page on the nf-core website, which displays a description and value type of each parameter. Most parameters will have additional text to help you understand when and how a parameter should be used. For example, the following is from the parameter page for `nf-core/rnaseq`:
174173

175174
[![](../assets/1.3_params.excalidraw.png){width=80%}](https://nf-co.re/rnaseq/3.23.0/parameters)
176175

@@ -254,6 +253,8 @@ At the highest level, parameters can be customized using the command line. Any p
254253
nextflow run <workflow> --<parameter>
255254
```
256255

256+
Parameters set via the command line this way take the highest precedence and will override conflicting parameters defined in any `-params-file` or in the default configuration.
257+
257258
!!! tip
258259

259260
Remember that workflow parameters are prefixed with a **double dash** (`--`), while options to the `nextflow run` command itself are prefixed with a single dash (`-`).
@@ -342,28 +343,28 @@ All parameters will have a default setting that is defined using the `nextflow.c
342343
There are also several `includeConfig` statements in the `nextflow.config` file that are used to include additional `.config` files from the `conf/` folder. Each additional `.config` file contains categorized configuration information for your workflow execution, some of which can be optionally included:
343344

344345
- `base.config`
345-
- Included by the workflow by default.
346-
- Generous resource allocations using labels.
347-
- Does not specify any method for software management and expects software to be available (or specified elsewhere).
346+
- Included by the workflow by default
347+
- Generous resource allocations using labels
348+
- Does not specify any method for software management and expects software to be available (or specified elsewhere)
348349
- `igenomes.config`
349-
- Included by the workflow by default.
350-
- Default configuration to access reference files stored on [AWS iGenomes](https://ewels.github.io/AWS-iGenomes/).
350+
- Included by the workflow by default
351+
- Default configuration to access reference files stored on [AWS iGenomes](https://ewels.github.io/AWS-iGenomes/)
351352
- Note that the iGenomes reference data may be out of date, so use with caution!
352353
- `modules.config`
353-
- Included by the workflow by default.
354-
- Module-specific configuration options (both mandatory and optional).
354+
- Included by the workflow by default
355+
- Module-specific configuration options (both mandatory and optional)
355356
- `test.config`
356-
- Only included if specified as a profile.
357-
- A configuration profile to test the workflow with a small test dataset.
357+
- Only included if specified as a profile
358+
- A configuration profile to test the workflow with a small test dataset
358359
- `test_full.config`
359-
- Only included if specified as a profile.
360-
- A configuration profile to test the workflow with a full-size test dataset.
360+
- Only included if specified as a profile
361+
- A configuration profile to test the workflow with a full-size test dataset
361362

362363
nf-core workflows are required to define **software containers** and **conda environments** that can be activated using profiles. Although it is possible to run the workflows with software installed by other methods (e.g., environment modules or manual installation), using containerisation systems like Docker or Singularity is more convenient and more reproducible.
363364

364365
!!! tip
365366

366-
If you're computer has internet access and one of Conda, Singularity, or Docker installed, you should be able to run any nf-core workflow with the `test` profile and the respective software management profile 'out of the box'. The `test` data profile will pull small test files directly from the `nf-core/test-data` GitHub repository and run it on your local system. The `test` profile is an important control to check the workflow is working as expected and is a great way to trial a workflow. Some workflows have multiple test `profiles` for you to test.
367+
If your computer has internet access and one of Conda, Singularity, or Docker installed, you should be able to run any nf-core workflow with the `test` profile and the respective software management profile 'out of the box'. The `test` data profile will pull small test files directly from the `nf-core/test-data` GitHub repository and run the workflow on your local system. The `test` profile is an important control to check the workflow is working as expected and is a great way to trial a workflow. Some workflows have multiple test profiles for you to trial.
367368

368369
## 1.3.5 Shared configuration files
369370

@@ -409,7 +410,7 @@ nextflow run <workflow> -profile test,docker -params-file /path/to/custom_params
409410

410411
!!! example "Exercise 1.3.6.1"
411412

412-
Run the `nf-core-demo` pipeline again and use a custom parameter file to set the title of the MultiQC report to the name of your **favourite food**.
413+
Run the `nf-core-demo` pipeline again and use a custom **parameter file** to set the title of the MultiQC report to the name of your **favourite food**.
413414

414415
??? success "Solution"
415416

@@ -450,24 +451,24 @@ This allows you to customise the configuration of the Nextflow run beyond the de
450451
Custom configuration files follow the same structure as the configuration file included in the workflow directory. Configuration properties are organized into [scopes](https://docs.seqera.io/nextflow/config#syntax) by dot-prefixing the property names with a scope identifier or grouping the properties in the same scope using the curly brackets notation. For example:
451452

452453
```groovy
453-
alpha.x = 1
454-
alpha.y = 'string value..'
454+
process.cpus = 1
455+
process.memory = '4 GB'
455456
```
456457

457458
Is equivalent to:
458459

459460
```groovy
460-
alpha {
461-
x = 1
462-
y = 'string value..'
461+
process {
462+
cpus = 1
463+
memory = '4 GB'
463464
}
464465
```
465466

466467
Scopes allow you to quickly configure settings required to deploy a workflow on different infrastructure using different software management. For example, the `executor` scope can be used to provide settings for the deployment of a workflow on a HPC cluster. Similarly, the `singularity` scope controls how Singularity containers are executed by Nextflow. Multiple scopes can be included in the same `.config` file using a mix of dot prefixes and curly brackets. See the [Nextflow documentation](https://docs.seqera.io/nextflow/config) for more information on writing configuration files, as well as [this list of configuration options](https://docs.seqera.io/nextflow/reference/config).
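
To make the mix of notations concrete, here is a hedged sketch of a custom config for an HPC cluster. The scheduler name, queue size, and cache path are assumptions chosen for illustration and would need to match your own system:

```groovy
// Dot-prefix notation: a single property in the process scope
process.executor = 'slurm'        // assumed scheduler

// Curly-bracket notation: several properties grouped in one scope
executor {
    queueSize = 50                // assumed limit on jobs queued at once
}

singularity {
    enabled    = true
    autoMounts = true
    cacheDir   = '/path/to/singularity_cache'   // placeholder path
}
```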
467468

468469
!!! example "Exercise 1.3.6.2"
469470

470-
Run the `nf-core-demo` pipeline again and use a custom config file to set the title of the MultiQC report to the name of your **favourite colour**.
471+
Run the `nf-core-demo` pipeline again and use a custom **config file** to set the title of the MultiQC report to the name of your **favourite colour**.
471472

472473
??? success "Solution"
473474

@@ -536,7 +537,7 @@ Scopes allow you to quickly configure settings required to deploy a workflow on
536537

537538
!!! warning "Why did this fail?"
538539

539-
When using nf-core pipelines, you **can not** use the `params` scope in custom configuration files. Parameters can **only** be configured using either a parameter file and the `-params-file` option, or using the `--<parameter_name>` syntax directly on the command line.
540+
When using nf-core pipelines, you **cannot** use the `params` scope in custom configuration files. Parameters can **only** be configured using either a parameter file and the `-params-file` option, or using the `--<parameter_name>` syntax directly on the command line.
540541

541542
!!! example "Exercise 1.3.6.3"
542543

@@ -590,7 +591,7 @@ process {
590591

591592
!!! tip
592593

593-
Note that `withName` selectors take precedence over `withLabel` selector. In the above example, if the process `MYPROCESS` also had the label `BIG_JOB`, it would only be assigned 4 CPUs and 8GB of memory as per the `withName` configuration.
594+
Note that `withName` selectors take precedence over `withLabel` selectors. In the above example, if the process `MYPROCESS` also had the label `BIG_JOB`, it would only be assigned 4 CPUs and 8 GB of memory as per the `withName` configuration.
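
As a sketch of the precedence described in this tip, consider a config along the following lines. The `BIG_JOB` resource values are assumptions for illustration and are not taken from the tutorial's own example:

```groovy
process {
    withLabel: 'BIG_JOB' {
        cpus   = 16        // assumed value
        memory = '64 GB'   // assumed value
    }
    withName: 'MYPROCESS' {
        cpus   = 4
        memory = '8 GB'
    }
}
// A process named MYPROCESS that also carries the label BIG_JOB
// is assigned 4 CPUs and 8 GB, because withName outranks withLabel.
```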
594595

595596
Most tools have lots of parameters that can be set on the command line. A workflow might expose the most important and frequently used of these as *workflow parameters* that can be directly configured via the `--<parameter_name>` syntax or in a parameter file.
596597

@@ -784,7 +785,7 @@ profiles {
784785

785786
!!! note "Key points"
786787

787-
- nf-core workflows follow a similar structure.
788-
- nf-core workflows are configured using multiple configuration sources.
789-
- Configuration sources are ranked to decide which settings to apply.
790-
- Workflow parameters must be passed via the command line (`--<parameter>`) or Nextflow `-params-file` option.
788+
- nf-core workflows follow a similar structure
789+
- nf-core workflows are configured using multiple configuration sources
790+
- Configuration sources are ranked to decide which settings to apply
791+
- Workflow parameters must be passed via the command line (`--<parameter>`) or Nextflow `-params-file` option
