|
23 | 23 | "@type": "Dataset", |
24 | 24 | "creativeWorkStatus": "InProgress", |
25 | 25 | "datePublished": "2026-04-30T13:33:06+00:00", |
26 | | - "description": "<h1>\n <picture>\n <source media=\"(prefers-color-scheme: dark)\" srcset=\"docs/images/nf-core-proteinannotator_logo_dark.png\">\n <img alt=\"nf-core/proteinannotator\" src=\"docs/images/nf-core-proteinannotator_logo_light.png\">\n </picture>\n</h1>\n\n[](https://github.com/codespaces/new/nf-core/proteinannotator)\n[](https://github.com/nf-core/proteinannotator/actions/workflows/nf-test.yml)\n[](https://github.com/nf-core/proteinannotator/actions/workflows/linting.yml)[](https://nf-co.re/proteinannotator/results)[](https://doi.org/10.5281/zenodo.XXXXXXX)\n[](https://www.nf-test.com)\n\n[](https://www.nextflow.io/)\n[](https://github.com/nf-core/tools/releases/tag/4.0.2)\n[](https://docs.conda.io/en/latest/)\n[](https://www.docker.com/)\n[](https://sylabs.io/docs/)\n[](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/proteinannotator)\n\n[](https://nfcore.slack.com/channels/proteinannotator)[](https://bsky.app/profile/nf-co.re)[](https://mstdn.science/@nf_core)[](https://www.youtube.com/c/nf-core)\n\n## Introduction\n\n**nf-core/proteinannotator** is a bioinformatics pipeline that ...\n\n<!-- TODO nf-core:\n Complete this sentence with a 2-3 sentence summary of what types of data the pipeline ingests, a brief overview of the\n major pipeline sections and the types of output it produces. You're giving an overview to someone new\n to nf-core here, in 15-20 seconds. For an example, see https://github.com/nf-core/rnaseq/blob/master/README.md#introduction\n-->\n\n<!-- TODO nf-core: Include a figure that guides the user through the major workflow steps. Many nf-core\n workflows use the \"tube map\" design for that. See https://nf-co.re/docs/community/brand/workflow-schematics#examples for examples. -->\n<!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->2. 
Present QC for raw reads ([`MultiQC`](http://multiqc.info/))\n\n## Usage\n\n> [!NOTE]\n> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/get_started/environment_setup/overview) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/get_started/run-your-first-pipeline) with `-profile test` before running the workflow on actual data.\n\n<!-- TODO nf-core: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.\n Explain what rows and columns represent. For instance (please edit as appropriate):\n\nFirst, prepare a samplesheet with your input data that looks as follows:\n\n`samplesheet.csv`:\n\n```csv\nsample,fastq_1,fastq_2\nCONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz\n```\n\nEach row represents a fastq file (single-end) or a pair of fastq files (paired end).\n\n-->\n\nNow, you can run the pipeline using:\n\n<!-- TODO nf-core: update the following command to include all required parameters for a minimal example -->\n\n```bash\nnextflow run nf-core/proteinannotator \\\n -profile <docker/singularity/.../institute> \\\n --input samplesheet.csv \\\n --outdir <OUTDIR>\n```\n\n> [!WARNING]\n> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. 
Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/running/run-pipelines#using-parameter-files).\n\nFor more details and further functionality, please refer to the [usage documentation](https://nf-co.re/proteinannotator/usage) and the [parameter documentation](https://nf-co.re/proteinannotator/parameters).\n\n## Pipeline output\n\nTo see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/proteinannotator/results) tab on the nf-core website pipeline page.\nFor more details about the output files and reports, please refer to the\n[output documentation](https://nf-co.re/proteinannotator/output).\n\n## Credits\n\nnf-core/proteinannotator was originally written by Olga Botvinnik, Evangelos Karatzas.\n\nWe thank the following people for their extensive assistance in the development of this pipeline:\n\n<!-- TODO nf-core: If applicable, make list of people who have also contributed -->\n\n## Contributions and Support\n\nIf you would like to contribute to this pipeline, please see the [contributing guidelines](docs/CONTRIBUTING.md).\n\nFor further information or help, don't hesitate to get in touch on the [Slack `#proteinannotator` channel](https://nfcore.slack.com/channels/proteinannotator) (you can join with [this invite](https://nf-co.re/join/slack)).\n\n## Citations\n\n<!-- TODO nf-core: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file. 
-->\n<!-- If you use nf-core/proteinannotator for your analysis, please cite it using the following doi: [10.5281/zenodo.XXXXXX](https://doi.org/10.5281/zenodo.XXXXXX) -->\n\n<!-- TODO nf-core: Add bibliography of tools and data used in your pipeline -->\n\nAn extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.\n\nYou can cite the `nf-core` publication as follows:\n\n> **The nf-core framework for community-curated bioinformatics pipelines.**\n>\n> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.\n>\n> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).\n", |
| 26 | + "description": "<h1>\n <picture>\n <source media=\"(prefers-color-scheme: dark)\" srcset=\"docs/images/nf-core-proteinannotator_logo_dark.png\">\n <img alt=\"nf-core/proteinannotator\" src=\"docs/images/nf-core-proteinannotator_logo_light.png\">\n </picture>\n</h1>\n\n[](https://github.com/codespaces/new/nf-core/proteinannotator)\n[](https://github.com/nf-core/proteinannotator/actions/workflows/nf-test.yml)\n[](https://github.com/nf-core/proteinannotator/actions/workflows/linting.yml)[](https://nf-co.re/proteinannotator/results)[](https://doi.org/10.5281/zenodo.18547735)\n[](https://www.nf-test.com)\n\n[](https://www.nextflow.io/)\n[](https://github.com/nf-core/tools/releases/tag/4.0.2)\n[](https://docs.conda.io/en/latest/)\n[](https://www.docker.com/)\n[](https://sylabs.io/docs/)\n[](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/proteinannotator)\n\n[](https://nfcore.slack.com/channels/proteinannotator)[](https://bsky.app/profile/nf-co.re)[](https://mstdn.science/@nf_core)[](https://www.youtube.com/c/nf-core)\n\n## Introduction\n\n**nf-core/proteinannotator** is a bioinformatics pipeline that computes statistics for protein FASTA inputs and produces protein annotations based on predicted sequence features, including conserved domains, functions, and secondary structure.\n\n<p>\n <picture>\n <source media=\"(prefers-color-scheme: dark)\" srcset=\"docs/images/proteinannotator_metromap_dark.png\">\n <img alt=\"nf-core/proteinannotator\" src=\"docs/images/proteinannotator_metromap_light.png\">\n </picture>\n</p>\n\n### Check quality and pre-process\n\nGenerate input amino acid sequence statistics with ([`SeqFu`](https://github.com/telatin/seqfu2/)) and pre-process them (i.e., gap removal, convert to upper case, validate, filter by length, replace special characters such as `/`, and remove duplicate sequences) with ([`SeqKit`](https://github.com/shenwei356/seqkit/))\n\n### Annotate sequences\n\n1. 
Conserved domain annotation with ([`hmmer`](https://github.com/EddyRivasLab/hmmer/)) against databases\n such as [Pfam](https://ftp.ebi.ac.uk/pub/databases/Pfam/), [FunFam](https://download.cathdb.info/cath/releases/all-releases/), and [NMPFams and metagRoot](https://pavlopoulos-lab.org/envofams/databases/hmmer/)\n2. Functional annotation:\n - ([`InterProScan`](https://interproscan-docs.readthedocs.io/en/v5/)) a software tool used to analyze protein sequences by scanning them against the signatures of protein families, domains, and sites in the [InterPro](https://www.ebi.ac.uk/interpro/) database, helping to identify their functional characteristics.\n3. Predict secondary structure compositional features such as \u03b1-helices, \u03b2-strands and coils with ([`s4pred`](https://github.com/psipred/s4pred))\n4. Present QC stats for input sequences before and after initial pre-processing with ([`MultiQC`](http://multiqc.info/))\n\n## Usage\n\n> [!NOTE]\n> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/get_started/environment_setup/overview) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/get_started/run-your-first-pipeline) with `-profile test` before running the workflow on actual data.\n\nFirst, prepare a samplesheet with your input data that looks as follows:\n\n`samplesheet.csv`:\n\n```csv\nid,fasta\nspecies1,species1_proteins.fasta\nspecies2,species2_proteins.fasta\n```\n\nEach row represents a FASTA file of proteins from a single species.\n\nNow, you can run the pipeline using:\n\n```bash\nnextflow run nf-core/proteinannotator \\\n -profile <docker/singularity/.../institute> \\\n --input samplesheet.csv \\\n --outdir <OUTDIR>\n```\n\n> [!WARNING]\n> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. 
Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/running/run-pipelines#using-parameter-files).\n\nFor more details and further functionality, please refer to the [usage documentation](https://nf-co.re/proteinannotator/usage) and the [parameter documentation](https://nf-co.re/proteinannotator/parameters).\n\n## Pipeline output\n\nTo see the results of an example test run with a full size dataset refer to the [results](https://nf-co.re/proteinannotator/results) tab on the nf-core website pipeline page.\nFor more details about the output files and reports, please refer to the\n[output documentation](https://nf-co.re/proteinannotator/output).\n\n## Credits\n\nnf-core/proteinannotator was originally written by Olga Botvinnik and Evangelos Karatzas.\n\nWe thank the following people for their extensive assistance in the development of this pipeline:\n\n- [Michael L Heuer](https://github.com/heuermh)\n- [Edmund Miller](https://github.com/edmundmiller)\n- [Eric Wei](https://github.com/eweizy)\n- [Martin Beracochea](https://github.com/mberacochea)\n\n## Contributions and Support\n\nIf you would like to contribute to this pipeline, please see the [contributing guidelines](docs/CONTRIBUTING.md).\n\nFor further information or help, don't hesitate to get in touch on the [Slack `#proteinannotator` channel](https://nfcore.slack.com/channels/proteinannotator) (you can join with [this invite](https://nf-co.re/join/slack)).\n\n## Citations\n\nIf you use nf-core/proteinannotator for your analysis, please cite it using the following doi: [10.5281/zenodo.18547735](https://doi.org/10.5281/zenodo.18547735)\n\nAn extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.\n\nYou can cite the `nf-core` publication as follows:\n\n> **The nf-core framework for community-curated bioinformatics 
pipelines.**\n>\n> Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.\n>\n> _Nat Biotechnol._ 2020 Feb 13. doi: [10.1038/s41587-020-0439-x](https://dx.doi.org/10.1038/s41587-020-0439-x).\n", |
27 | 27 | "hasPart": [ |
28 | 28 | { |
29 | 29 | "@id": "main.nf" |
|
121 | 121 | }, |
122 | 122 | { |
123 | 123 | "@id": "main.nf", |
124 | | - "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"], |
| 124 | + "@type": [ |
| 125 | + "File", |
| 126 | + "SoftwareSourceCode", |
| 127 | + "ComputationalWorkflow" |
| 128 | + ], |
125 | 129 | "contributor": [ |
126 | 130 | { |
127 | 131 | "@id": "https://orcid.org/0000-0003-4412-7970" |
|
133 | 137 | "dateCreated": "", |
134 | 138 | "dateModified": "2026-04-30T13:33:06Z", |
135 | 139 | "dct:conformsTo": "https://bioschemas.org/profiles/ComputationalWorkflow/1.0-RELEASE/", |
136 | | - "keywords": ["nf-core", "nextflow", "annotation", "proteomics"], |
137 | | - "license": ["MIT"], |
138 | | - "name": ["nf-core/proteinannotator"], |
| 140 | + "keywords": [ |
| 141 | + "nf-core", |
| 142 | + "nextflow", |
| 143 | + "annotation", |
| 144 | + "proteomics" |
| 145 | + ], |
| 146 | + "license": [ |
| 147 | + "MIT" |
| 148 | + ], |
| 149 | + "name": [ |
| 150 | + "nf-core/proteinannotator" |
| 151 | + ], |
139 | 152 | "programmingLanguage": { |
140 | 153 | "@id": "https://w3id.org/workflowhub/workflow-ro-crate#nextflow" |
141 | 154 | }, |
142 | 155 | "sdPublisher": { |
143 | 156 | "@id": "https://nf-co.re/" |
144 | 157 | }, |
145 | | - "url": ["https://github.com/nf-core/proteinannotator", "https://nf-co.re/proteinannotator/dev/"], |
146 | | - "version": ["1.1.0dev"] |
| 158 | + "url": [ |
| 159 | + "https://github.com/nf-core/proteinannotator", |
| 160 | + "https://nf-co.re/proteinannotator/dev/" |
| 161 | + ], |
| 162 | + "version": [ |
| 163 | + "1.1.0dev" |
| 164 | + ] |
147 | 165 | }, |
148 | 166 | { |
149 | 167 | "@id": "https://w3id.org/workflowhub/workflow-ro-crate#nextflow", |
|
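The array changes in the second hunk (`@type`, `keywords`, `license`, `name`, `url`, `version`) appear to be whitespace-only reformatting. A minimal Python sketch of how that could be checked, assuming abridged copies of two of the changed fields rather than the full metadata file (the `same_json` helper is hypothetical, not part of nf-core tooling):

```python
import json

# Inline form, abridged from the removed (-) lines of the diff.
before = '{"@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"], "license": ["MIT"]}'

# Expanded form, abridged from the added (+) lines of the diff.
after = """{
    "@type": [
        "File",
        "SoftwareSourceCode",
        "ComputationalWorkflow"
    ],
    "license": [
        "MIT"
    ]
}"""


def same_json(a: str, b: str) -> bool:
    """Return True when both strings parse to equal JSON values."""
    return json.loads(a) == json.loads(b)


print(same_json(before, after))
```

Comparing parsed values instead of raw text ignores indentation and line breaks, so only real content changes (such as the updated `description` string) would register as differences.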