Commit e9b6248

chore: rebuild [skip ci]
1 parent 3d8de68 commit e9b6248

11 files changed

Lines changed: 75884 additions & 52076 deletions

data/enpen/collection.json

Lines changed: 2 additions & 1 deletion
@@ -23,6 +23,7 @@
 ]
 },
 "dataset_order": [
-"enpen/enterovirus/ev-d68"
+"enpen/enterovirus/ev-d68",
+"enpen/enterovirus/cva16"
 ]
 }
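
The hunk above appends one entry to `dataset_order` and adds the comma that the previous last entry now needs. A minimal sketch of how such a list could be sanity-checked follows. It assumes only the `dataset_order` key visible in the diff; the fragment, the helper name `check_dataset_order`, and the `enpen/` prefix rule are illustrative assumptions, not part of the repository's tooling.

```python
import json

# Fragment mirroring the post-commit state of "dataset_order" shown in the diff.
# The rest of collection.json is not visible there and is omitted here.
COLLECTION_FRAGMENT = """
{
  "dataset_order": [
    "enpen/enterovirus/ev-d68",
    "enpen/enterovirus/cva16"
  ]
}
"""


def check_dataset_order(collection, prefix="enpen/"):
    """Return a list of human-readable problems found in dataset_order."""
    order_problems = []
    seen = set()
    for path in collection.get("dataset_order", []):
        if path in seen:
            order_problems.append(f"duplicate entry: {path}")
        seen.add(path)
        # Assumption: every dataset in this collection lives under the prefix.
        if not path.startswith(prefix):
            order_problems.append(f"unexpected prefix: {path}")
    return order_problems


# json.loads raises json.JSONDecodeError if the comma between entries is missing.
collection = json.loads(COLLECTION_FRAGMENT)
order_problems = check_dataset_order(collection)
print(order_problems or "dataset_order looks consistent")
```
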
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+## Unreleased
+
+Initial release of a Coxsackievirus A16 dataset for lineage classification!
+
+Read more about Nextclade datasets in the documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+# Coxsackievirus A16 dataset
+
+| Key | Value |
+|----------------------|-----------------------------------------------------------------------|
+| authors | [Nadia Neuner-Jehle](https://eve-lab.org/people/nadia-neuner-jehle/), [Alejandra González-Sánchez](https://www.vallhebron.com/en/professionals/alejandra-gonzalez-sanchez), [Emma B. Hodcroft](https://eve-lab.org/people/emma-hodcroft/), [ENPEN](https://escv.eu/european-non-polio-enterovirus-network-enpen/) |
+| name | Coxsackievirus A16 |
+| reference | [Static Inferred Ancestor](https://github.com/enterovirus-phylo/nextclade_a16/blob/master/resources/inferred-root.fasta) |
+| workflow | https://github.com/enterovirus-phylo/nextclade_a16 |
+| path | `enpen/enterovirus/cva16` |
+| clade definitions | A–F |
+
+## Scope of this dataset
+
+This dataset uses the [Static Inferred Ancestor](https://github.com/enterovirus-phylo/nextclade_a16/blob/master/resources/inferred-root.fasta) instead of the historical G-10 prototype sequence ([U05876.1](https://www.ncbi.nlm.nih.gov/nuccore/U05876)). It is intended for broad subgenogroup classification, mutation quality control, and phylogenetic analysis of CVA16 diversity.
+
+*Note: The G-10 reference differs substantially from currently circulating strains.* This is common for enterovirus datasets, in contrast to some other virus datasets (e.g., seasonal influenza), where the reference is updated more frequently to reflect recent lineages.
+
+To address this, the dataset is *rooted* on a Static Inferred Ancestor, a phylogenetically reconstructed ancestral sequence near the tree root. This provides a stable reference point that can be used as an alternative for mutation calling.
+
+## Features
+
+This dataset supports:
+
+- Assignment of subgenogroups
+- Phylogenetic placement
+- Sequence quality control (QC)
+
+## Subgenogroups of Coxsackievirus A16
+
+Subgenogroups B1a, B1b and B1c represent the major phylogenetic divisions of CVA16 and are commonly used in virological surveillance and the literature. They are defined based on phylogenetic clustering and do not necessarily reflect antigenic differences.
+
+In recent years, additional recombinant forms have been identified and labeled C–F (also referred to as B2, B3, and D). These recombinant forms cluster with the prototype strain (clade A).
+
+Overall, these designations are based on phylogenetic structure and characteristic mutations, and are widely used in molecular epidemiology, similar to subgenotype systems for other enteroviruses. Unlike influenza (H1N1, H3N2) or SARS-CoV-2, there is no universally standardized global lineage nomenclature for enteroviruses; naming instead follows conventions established in published studies and surveillance practices.
+
+## Related Enteroviruses
+
+CVA16 is closely related to other Enterovirus A (EV-A) viruses, including EV-A71, EV-A120, and CVA5. If you are not certain that your sequences contain only CVA16, we recommend using the "[Multiple Datasets](https://docs.nextstrain.org/projects/nextclade/en/stable/user/nextclade-web/getting-started.html#multi-dataset-mode)" tab instead of "Single Dataset". We are currently working on improving multiple virus assignment.
+
+Using the "Multiple Datasets" tab prevents Nextclade from forcing sequences to align to the CVA16 reference tree. In "Single Dataset" mode, for example, EV-A71 sequences may still align and receive a clade assignment (often near recombinant forms).
+
+Please be cautious when working with short genes or fragments (e.g., 5'UTR sequences). These regions can be highly conserved across EV-A viruses, making genogroup and subgenogroup assignment prone to errors. In addition, such fragments may originate from recombinant genomes. Recombination is common in enteroviruses, and when only a fragment is analyzed, it may go undetected.
+
+If you are unsure how to proceed, please contact us. We are happy to assist.
+
+## Reference types
+
+This dataset includes several reference points used in analyses:
+- *Static Inferred Ancestor:* Reconstructed ancestral sequence inferred with an outgroup, representing the likely founder of CVA16. Serves as a stable reference.
+
+- *Parent:* The nearest ancestral node of a sample in the tree, used to infer branch-specific mutations.
+
+- *Clade founder:* The inferred ancestral node defining a clade (e.g., B1a, B2). Mutations "since clade founder" describe changes that define that clade.
+
+- *Reference:* RefSeq or similarly established prototype sequence. Here, G-10 (U05876.1).
+
+- *Tree root:* Corresponds to the root of the tree; it may change in future updates as more data become available.
+
+All references use the coordinate system of the G-10 sequence.
+
+## Issues & Contact
+- For questions or suggestions, please [open an issue](https://github.com/enterovirus-phylo/nextclade_a16/issues) or email: eve-group[at]swisstph.ch
+
+## What is a Nextclade dataset?
+
+A Nextclade dataset includes the reference sequence, genome annotations, tree, clade definitions, and QC rules. Learn more in the [Nextclade documentation](https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html).
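
For readers who prefer the command line to the web interface described in the README, the sketch below assembles the two Nextclade CLI invocations that would typically fetch a dataset and run it on a FASTA file. It only builds and prints the commands. The dataset path comes from the README table; the file and directory names are hypothetical, the flags reflect the Nextclade v3 CLI as commonly documented, and because the changelog marks this dataset as "Unreleased", the name may not resolve until the dataset is published.

```python
import shlex

# Dataset path taken from the README table in the diff above.
DATASET_NAME = "enpen/enterovirus/cva16"


def build_commands(fasta, dataset_dir="cva16_dataset", out_tsv="nextclade.tsv"):
    """Build (but do not run) the download and analysis commands.

    fasta, dataset_dir and out_tsv are hypothetical local paths.
    """
    get_cmd = [
        "nextclade", "dataset", "get",
        "--name", DATASET_NAME,
        "--output-dir", dataset_dir,
    ]
    run_cmd = [
        "nextclade", "run",
        "--input-dataset", dataset_dir,
        "--output-tsv", out_tsv,
        fasta,
    ]
    return get_cmd, run_cmd


get_cmd, run_cmd = build_commands("my_sequences.fasta")
for cmd in (get_cmd, run_cmd):
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Keeping the commands as argument lists means they can later be passed to `subprocess.run` unchanged, without shell quoting issues.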
1.51 MB
Binary file not shown.
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
+##gff-version 3
+#!gff-spec-version 1.21
+#!processor NCBI annotwriter
+##sequence-region U05876.1 1 7413
+##species https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=31704
+U05876.1 Genbank region 1 7413 . + . ID=U05876.1:1..7413;Dbxref=taxon:31704;gb-acronym=CV-A16;gbkey=Src;mol_type=genomic RNA;nat-host=Homo sapiens;strain=G-10
+U05876.1 Genbank CDS 751 957 . + . Name=VP4;gbkey=Prot;product=VP4;ID=id-AAA50478.1:1..69
+U05876.1 Genbank CDS 958 1719 . + . Name=VP2;gbkey=Prot;product=VP2;ID=id-AAA50478.1:70..323
+U05876.1 Genbank CDS 1720 2445 . + . Name=VP3;gbkey=Prot;product=VP3;ID=id-AAA50478.1:324..565
+U05876.1 Genbank CDS 2446 3336 . + . Name=VP1;gbkey=Prot;product=VP1;ID=id-AAA50478.1:566..862
+U05876.1 Genbank CDS 3337 3786 . + . Name=2A;product=2A;gbkey=Prot;ID=id-AAA50478.1:863..1012
+U05876.1 Genbank CDS 3787 4083 . + . Name=2B;product=2B;gbkey=Prot;ID=id-AAA50478.1:1013..1111
+U05876.1 Genbank CDS 4084 5070 . + . Name=2C;product=2C;gbkey=Prot;ID=id-AAA50478.1:1112..1440
+U05876.1 Genbank CDS 5071 5328 . + . Name=3A;product=3A;gbkey=Prot;ID=id-AAA50478.1:1441..1526
+U05876.1 Genbank CDS 5329 5394 . + . Name=3B;product=3B;gbkey=Prot;ID=id-AAA50478.1:1527..1548
+U05876.1 Genbank CDS 5395 5943 . + . Name=3C;product=3C;gbkey=Prot;ID=id-AAA50478.1:1549..1731
+U05876.1 Genbank CDS 5944 7329 . + . Name=3D;product=3D;gbkey=Prot;ID=id-AAA50478.1:1732..2193
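
The CDS rows in the annotation above tile a single polyprotein, so two properties can be checked mechanically: each CDS length should be a multiple of three, and each CDS should start on the nucleotide after the previous one ends. The following is a minimal sketch of such a check, not part of the dataset workflow. The rows are copied from the diff, where columns appear separated by spaces (a real GFF3 file uses tabs), and the helper names are illustrative.

```python
# CDS rows copied from the annotation diff; columns are whitespace-separated
# here, whereas an actual GFF3 file separates them with tabs.
GFF_TEXT = """\
U05876.1 Genbank CDS 751 957 . + . Name=VP4;gbkey=Prot;product=VP4;ID=id-AAA50478.1:1..69
U05876.1 Genbank CDS 958 1719 . + . Name=VP2;gbkey=Prot;product=VP2;ID=id-AAA50478.1:70..323
U05876.1 Genbank CDS 1720 2445 . + . Name=VP3;gbkey=Prot;product=VP3;ID=id-AAA50478.1:324..565
U05876.1 Genbank CDS 2446 3336 . + . Name=VP1;gbkey=Prot;product=VP1;ID=id-AAA50478.1:566..862
U05876.1 Genbank CDS 3337 3786 . + . Name=2A;product=2A;gbkey=Prot;ID=id-AAA50478.1:863..1012
U05876.1 Genbank CDS 3787 4083 . + . Name=2B;product=2B;gbkey=Prot;ID=id-AAA50478.1:1013..1111
U05876.1 Genbank CDS 4084 5070 . + . Name=2C;product=2C;gbkey=Prot;ID=id-AAA50478.1:1112..1440
U05876.1 Genbank CDS 5071 5328 . + . Name=3A;product=3A;gbkey=Prot;ID=id-AAA50478.1:1441..1526
U05876.1 Genbank CDS 5329 5394 . + . Name=3B;product=3B;gbkey=Prot;ID=id-AAA50478.1:1527..1548
U05876.1 Genbank CDS 5395 5943 . + . Name=3C;product=3C;gbkey=Prot;ID=id-AAA50478.1:1549..1731
U05876.1 Genbank CDS 5944 7329 . + . Name=3D;product=3D;gbkey=Prot;ID=id-AAA50478.1:1732..2193
"""


def parse_cds(text):
    """Return (name, start, end) for every CDS row, in file order."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        # Split into at most 9 columns; the attributes column may contain spaces.
        cols = line.split(None, 8)
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        attrs = dict(item.split("=", 1) for item in cols[8].split(";") if "=" in item)
        records.append((attrs["Name"], int(cols[3]), int(cols[4])))
    return records


def check_cds(records):
    """Flag CDS features that break the reading frame or leave gaps/overlaps."""
    cds_problems = []
    for i, (name, start, end) in enumerate(records):
        length = end - start + 1  # GFF coordinates are 1-based and inclusive
        if length % 3 != 0:
            cds_problems.append(f"{name}: length {length} is not a multiple of 3")
        if i > 0 and start != records[i - 1][2] + 1:
            cds_problems.append(f"{name}: does not start right after {records[i - 1][0]}")
    return cds_problems


cds = parse_cds(GFF_TEXT)
cds_problems = check_cds(cds)
total_codons = sum(end - start + 1 for _, start, end in cds) // 3
print(len(cds), "CDS features,", total_codons, "codons;", cds_problems or "no problems found")
```
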