CHANGELOG.md (+1: 1 addition & 0 deletions)
@@ -9,6 +9,7 @@ Initial release of nf-core/proteinannotator, created with the [nf-core](https://

 ### `Added`

+- [#62](https://github.com/nf-core/proteinannotator/pull/62) - Added the option to download and use the latest FunFam HMM library (or use path to an existing one) for domain annotation. (by @vagkaratzas)
 - [#61](https://github.com/nf-core/proteinannotator/pull/61) - Added nf-core modules `ARIA2` and `HMMER_HMMSEARCH` to download latest Pfam HMM library (or use path to existing one) and match domains to input sequences. (by @vagkaratzas)
 - [#60](https://github.com/nf-core/proteinannotator/pull/60) - Added nf-core module `S4PRED_RUNMODEL` for secondary structure prediction (i.e., α-helix, a β-strand or a coil). (by @vagkaratzas)
 - [#59](https://github.com/nf-core/proteinannotator/pull/59) - Added nf-core qc and pre-processing subworkflow for amino acid sequences `FAA_SEQFU_SEQKIT`. (by @vagkaratzas)
docs/output.md (+8 -5: 8 additions & 5 deletions)
@@ -6,8 +6,6 @@ This document describes the output produced by the pipeline. Most of the plots a

 The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.

-<!-- TODO nf-core: Write this documentation describing your workflow's output -->
-
 ## Pipeline overview

 The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps:
@@ -17,8 +15,8 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
   - [SeqKit](#seqkit) for preprocessing input amino acid sequences (i.e., gap removal, convert to upper case, validate, filter by length, replace special characters such as `/`, and remove duplicate sequences)

 - [Domain annotation](#domain-annotation) Annotate proteins with domains from established repositories.
-  - [aria2](#aria2) - To optionally download the latest Pfam database through the pipeline.
-  - [hmmer](#hmmer) - To optionally match the input sequence to known Pfam domains through `hmmer/hmmsearch`
+  - [aria2](#aria2) - To optionally download the latest Pfam and/or FunFam databases through the pipeline.
+  - [hmmer](#hmmer) - To optionally match the input sequence to known Pfam and/or FunFam domains through `hmmer/hmmsearch`

 - [Functional annotation](#functional-annotation) Annotate proteins with functional domains
   - [InterProScan](#Interproscan) - Search the InterPro database for functional domains
@@ -73,9 +71,12 @@ The `seqkit` module is used for initial preprocessing (i.e., gap removal, conver

 - `downloaded_dbs/`
   - `Pfam-A*.hmm.gz`: (optional) The latest full, or a minimal test, Pfam-A HMM database that can be downloaded through the pipeline.
+  - `funfam-hmm3-v4_3_0*.lib.gz`: (optional) The latest (v4_3_0) full, or a minimal test, FunFam HMM database that can be downloaded through the pipeline.

 </details>

+If the `skip_*` flags (e.g., `skip_pfam`, `skip_funfam`) for each domain annotation database is set to `true`, or the `*_db` parameter paths (e.g., `pfam_db`, `funfam_db`) are set (i.e., not `null`), or the run is resumed after a successful database download, then the respective database will not be (re)downloaded. The full database links can be found in the main `nextflow.config` file, while minimal test versions can be found in the `test` and `test_full` profiles (i.e., `conf/test.config`, `conf/test_full.config`).
+
 [aria2](https://github.com/aria2/aria2/) is a lightweight multi-protocol & multi-source, cross platform download utility operated in command-line. It supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink.

 #### hmmer
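
Review note on the hunk above: the added paragraph lists three conditions under which a database is not (re)downloaded. As a reading aid, they can be summarized as a small decision function. This is a hedged Python sketch, not the pipeline's Nextflow code: `should_download_db` and its arguments are hypothetical names, and the `already_downloaded` flag merely stands in for Nextflow's resume cache.

```python
from typing import Optional


def should_download_db(
    skip: bool,
    db_path: Optional[str],
    already_downloaded: bool = False,
) -> bool:
    """Return True only if none of the three documented conditions applies.

    All names here are hypothetical; they mirror the documented
    parameters (skip_pfam / skip_funfam, pfam_db / funfam_db).
    """
    if skip:
        # The annotation step for this database is skipped entirely.
        return False
    if db_path is not None:
        # The user supplied a path to an existing HMM library.
        return False
    if already_downloaded:
        # A resumed run reuses the earlier successful download.
        return False
    return True


if __name__ == "__main__":
    # Default case: nothing skipped, no path given, fresh run.
    print(should_download_db(skip=False, db_path=None))
```

Under these assumptions, a download happens only in the default case where the database is neither skipped nor supplied and no earlier download can be reused.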
@@ -86,10 +87,12 @@ The `seqkit` module is used for initial preprocessing (i.e., gap removal, conver
 - `domain_annotation/`
   - `pfam/`
     - `<samplename>.domtbl.gz`: `hmmer/hmmsearch` results along parameters info.
+  - `funfam/`
+    - `<samplename>.domtbl.gz`: `hmmer/hmmsearch` results along parameters info.

 </details>

-The `domain_annotation/pfam` folder contains a `.domtbl.gz` annotation file per input sample.
+Each of the `domain_annotation/` subfolders (e.g., `pfam`, `funfam`) contain a `.domtbl.gz` annotation file per input sample, depending on which domain annotation databases were used in the pipeline execution.

 [hmmer](https://github.com/EddyRivasLab/hmmer) is a fast and flexible alignment trimming tool that keeps phylogenetically informative sites and removes others.
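
Review note on the hunk above: for readers who want to inspect the per-sample `.domtbl.gz` files, the following is a minimal Python sketch. It assumes the files are standard gzipped `hmmsearch --domtblout` tables (whitespace-delimited, `#` comment lines, 22 fixed columns followed by a free-text description). The function name `read_domtbl` and the selection of fields are illustrative choices, not part of the pipeline.

```python
import gzip
from typing import Dict, Iterator, Union

Row = Dict[str, Union[str, int, float]]


def read_domtbl(path: str) -> Iterator[Row]:
    """Yield one dict per domain hit from a gzipped domtblout file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip header/footer comments and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            # 22 fixed columns; the 23rd (description) may contain spaces.
            fields = line.rstrip("\n").split(maxsplit=22)
            yield {
                "target": fields[0],            # input sequence name
                "query": fields[3],             # HMM (domain) name
                "query_accession": fields[4],   # HMM accession, or "-"
                "i_evalue": float(fields[12]),  # independent E-value of this domain
                "domain_score": float(fields[13]),
                "env_from": int(fields[19]),    # envelope start on the sequence
                "env_to": int(fields[20]),      # envelope end on the sequence
            }
```

A typical use would be to iterate over `read_domtbl("<samplename>.domtbl.gz")` and keep hits whose `i_evalue` falls below a threshold of your choosing.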