Commit 6878e53

Update readme.md to gff annot
1 parent 95c5fcb commit 6878e53

1 file changed

Lines changed: 26 additions & 0 deletions

readme.md

@@ -164,6 +164,32 @@ pathotypr classify \
--genome-fasta <GENOMES_FASTA> \
[OPTIONS]
```
#### Functional Annotation with GFF
The `classify` command can translate SNPs into amino acid changes when it is given a GFF3 annotation file.

How to provide GFF files:
- For a single FASTA input (`--input`): use the `--gff` flag to specify the one GFF file that corresponds to the sequences in the FASTA file.
- For multiple genomes via a list (`--input-list`): add an optional third column to your tab-separated (TSV) file containing the path to the GFF file for each genome.

Example input-list.tsv:
```text
SampleA path/to/sampleA.fasta path/to/sampleA.gff3
SampleB path/to/sampleB.fasta path/to/sampleB.gff3
SampleC path/to/sampleC.fasta # No GFF for this sample
```
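
To check an input list before running `classify`, a small script can read the three-column layout described above. This is only a sketch and not pathotypr's own parser: the names `Sample` and `read_input_list` are hypothetical, and it assumes tab-delimited columns, no header row, and no inline comments in the file.

```python
import csv
import io
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    name: str
    fasta: str
    gff: Optional[str]  # None when the optional third column is absent


def read_input_list(handle):
    """Parse a tab-separated input list with an optional third GFF column.

    Hypothetical helper for illustration; the real tool may parse differently.
    """
    samples = []
    for row in csv.reader(handle, delimiter="\t"):
        fields = [field.strip() for field in row]
        # Skip blank lines.
        if not any(fields):
            continue
        if len(fields) < 2 or not fields[0] or not fields[1]:
            raise ValueError(f"expected a sample name and a FASTA path, got: {row!r}")
        gff = fields[2] if len(fields) > 2 and fields[2] else None
        samples.append(Sample(fields[0], fields[1], gff))
    return samples


# In-memory stand-in for input-list.tsv, using the paths from the example above.
example = (
    "SampleA\tpath/to/sampleA.fasta\tpath/to/sampleA.gff3\n"
    "SampleB\tpath/to/sampleB.fasta\tpath/to/sampleB.gff3\n"
    "SampleC\tpath/to/sampleC.fasta\n"
)
samples = read_input_list(io.StringIO(example))
for sample in samples:
    print(sample.name, sample.gff or "(no GFF)")
```

Samples without a third column come back with `gff` set to `None`, which makes it easy to see which genomes will be reported without annotation columns.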
Output Columns:
When annotation is performed, the output file contains three additional columns:

- Gene: The ID of the gene in which the SNP is located.
- AA_Pos: The position of the affected amino acid within the gene.
- AA_Change: The resulting amino acid (3-letter code).

Example Output:
```text
genome k-mer k-merPOS SNPgenome SNPreference lineage Gene AA_Pos AA_Change
G0000_contig_1 GGCGGCGCCGCCTGGGTGGAG 1854184 1854194 1859559 L4 Rv1649 276 Gly
G0000_contig_1 GACCCCGAGGCCCGGGCCGGC 4296504 4296514 4313128 L4 gyrA 95 Ser
```
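
For downstream analysis, the annotated table can be loaded with Python's standard `csv` module. The sketch below is illustrative only: `read_annotated_rows` is a hypothetical helper, and it assumes the output is tab-delimited with the header shown above and that `AA_Pos` may be empty for rows without annotation.

```python
import csv
import io

ANNOTATION_COLUMNS = ("Gene", "AA_Pos", "AA_Change")


def read_annotated_rows(handle):
    """Load classify output rows, converting AA_Pos to an integer.

    Hypothetical helper; assumes a tab-delimited table with a header line.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [column for column in ANNOTATION_COLUMNS if column not in header]
    if missing:
        raise ValueError(f"annotation columns not found: {missing}")
    rows = []
    for row in reader:
        # Assumption: an empty AA_Pos means the row carries no annotation.
        row["AA_Pos"] = int(row["AA_Pos"]) if row["AA_Pos"] else None
        rows.append(row)
    return rows


# In-memory copy of the example output above, with tabs between columns.
example = (
    "genome\tk-mer\tk-merPOS\tSNPgenome\tSNPreference\tlineage\tGene\tAA_Pos\tAA_Change\n"
    "G0000_contig_1\tGGCGGCGCCGCCTGGGTGGAG\t1854184\t1854194\t1859559\tL4\tRv1649\t276\tGly\n"
    "G0000_contig_1\tGACCCCGAGGCCCGGGCCGGC\t4296504\t4296514\t4313128\tL4\tgyrA\t95\tSer\n"
)
rows = read_annotated_rows(io.StringIO(example))
for row in rows:
    print(row["Gene"], row["AA_Pos"], row["AA_Change"])
```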

#### `split-fastq`
Perform ultra-fast, alignment-free genotyping of SNPs, MNVs, and both small and large structural variants (Indels/SVs) directly from raw FASTQ reads.
