Usage guides, CLI options, and output formats for each command.

| Command | Description |
|---|---|
| train.md | Train a Random Forest classifier from labeled genomes |
| predict.md | Predict lineages using a trained model |
| classify.md | Call known SNP markers in assembled genomes |
| split-fastq.md | Alignment-free genotyping from raw FASTQ reads |
| match.md | Find the closest reference genome for a sample |

| Document | Description |
|---|---|
| input-formats.md | Format specifications for FASTA, marker TSV, and input lists |
| gui.md | Building and using the Tauri desktop application |
| benchmarks.md | Performance benchmarks: speed, throughput, memory, and comparisons |

In-depth descriptions of the algorithms, data structures, and design decisions behind each module.

| Document | Topic |
|---|---|
| algorithms/feature-hashing.md | The hashing trick: k-mers → fixed-size sparse vectors |
| algorithms/random-forest.md | Sparse CART trees with bootstrap aggregation and OOB accuracy |
| algorithms/training.md | End-to-end pipeline: vectorize → CV/split → train → serialize |
| algorithms/prediction.md | Streaming batch prediction with majority voting |
| algorithms/marker-genotyping.md | Diagnostic k-mers + Bloom filter for FASTQ scanning |
| algorithms/reference-matching.md | K-mer containment scoring with streaming batches |
| algorithms/assembly-classification.md | Marker calling on FASTA assemblies with GFF annotation |