Lab 04: Structuring Projects and Inputs

Ryan edited this page Jan 8, 2024 · 9 revisions

Structuring Projects and Inputs

Inspect project organization

Go to the exercise directory 04_structure_and_input, then change into the pipeline directory.

This project is organized as follows:

.
├── fasta_seqs
│   ├── seqs_1.fna
│   ├── seqs_2.fna
│   ├── seqs_3.fna
│   ├── seqs_4.fna
│   └── seqs_5.fna
├── pipeline
│   ├── main.nf
│   ├── modules
│   │   └── fasta_utils.nf
│   └── nextflow.config
└── run_01.sh

Notice that we now have a modules subdirectory containing "fasta_utils.nf".

We also have a directory of five FASTA sequence files to practice with.

Importing processes

Previously, we defined our processes directly in the "main.nf" file. This works, but the file quickly becomes crowded as processes are added, so we will now practice writing our Nextflow processes in separate files and importing them into main.nf:

include { GET_HEADERS } from "./modules/fasta_utils.nf"

Here we import the process "GET_HEADERS" from the file "fasta_utils.nf", which makes it available within the scope of the main.nf workflow. The path is given relative to the script that contains the include statement.
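A module file is an ordinary Nextflow script that defines one or more processes. The contents of fasta_utils.nf are not shown on this page, so the following is only a sketch of what a GET_HEADERS process could look like, assuming it collects the header lines (those starting with ">") from each FASTA file. The input name, the output file name, and the grep command are illustrative assumptions, not the lab's actual code.

```nextflow
// modules/fasta_utils.nf (hypothetical contents; the real module may differ)
process GET_HEADERS {
    input:
    path fasta                              // one FASTA file per task

    output:
    path "${fasta.baseName}_headers.txt"    // assumed output naming scheme

    script:
    """
    # Keep only the header lines, which start with '>'
    grep '^>' ${fasta} > ${fasta.baseName}_headers.txt
    """
}
```

Keeping the process in its own file means main.nf only needs the include line and the workflow logic, and the same process can be reused by other pipelines.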

Workflow conditionals

This "main.nf" workflow contains a new type of syntax: an if statement. If statements follow the Groovy language convention of if (condition) { code to run when the condition is true }. For example:

if (params.fasta_seqs) {
    ch_fastas = Channel.fromPath("${params.fasta_seqs}/*fna", checkIfExists: true)
    GET_HEADERS(ch_fastas)
}

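This "if block" first tests whether a parameter called "fasta_seqs" (from the "params" scope) has been set. Groovy treats null, false, and empty strings as false, so the block is skipped when the parameter has no value. If the parameter is set, a channel called ch_fastas is created with the fromPath channel factory, emitting one item per file in that directory whose name ends in "fna", and this channel is passed into the GET_HEADERS process we included earlier. The checkIfExists: true option makes the run fail with an error if no matching files are found.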
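The sketch below places the same conditional inside a complete workflow block and adds an else branch for illustration. The else branch and its log message are assumptions and are not part of the lab's main.nf.

```nextflow
include { GET_HEADERS } from "./modules/fasta_utils.nf"

workflow {
    if (params.fasta_seqs) {
        // One channel item per file ending in "fna" in the given directory
        ch_fastas = Channel.fromPath("${params.fasta_seqs}/*fna", checkIfExists: true)
        GET_HEADERS(ch_fastas)
    } else {
        // Hypothetical fallback: the original workflow has no else branch
        log.info "params.fasta_seqs is not set; skipping GET_HEADERS"
    }
}
```

The parameter can be given a value in the params scope of nextflow.config or on the command line, for example `nextflow run main.nf --fasta_seqs ../fasta_seqs` when run from the pipeline directory (the relative path is inferred from the directory tree above and may differ from what run_01.sh uses).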