Algorithm Documentation

This directory describes the algorithms and data structures behind each pathotypr module. For CLI usage and options, see the command docs.

| Module | Document | Core Idea |
| --- | --- | --- |
| Feature Hashing | feature-hashing.md | The hashing trick: k-mers → fixed-size sparse vectors |
| Random Forest | random-forest.md | Sparse CART trees with bootstrap aggregation |
| Training Pipeline | training.md | End-to-end: vectorize → evaluate → train → OOB → export |
| Prediction | prediction.md | Streaming batch prediction with majority voting |
| Marker Genotyping | marker-genotyping.md | Diagnostic k-mers + Bloom filter for FASTQ scanning |
| Reference Matching | reference-matching.md | K-mer containment scoring with streaming batches |
| Assembly Classification | assembly-classification.md | Marker calling on FASTA assemblies with GFF annotation |
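
As a quick orientation for the Feature Hashing entry, the hashing trick can be sketched in a few lines of Python. This is a generic illustration, not pathotypr's implementation: the function names, the use of CRC32 as the hash, and the values of `k` and `n_buckets` are all assumptions chosen for the example. See feature-hashing.md for the actual design.

```python
import zlib
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmers(seq, k=5, n_buckets=16):
    """Count k-mers into a sparse vector of fixed size n_buckets.

    Each k-mer is hashed (CRC32 here, an assumed stand-in) and reduced
    modulo n_buckets, so the vector length is independent of how many
    distinct k-mers exist. Only non-zero buckets are stored.
    """
    vec = Counter()
    for kmer in kmers(seq.upper(), k):
        idx = zlib.crc32(kmer.encode("ascii")) % n_buckets
        vec[idx] += 1
    return dict(vec)


if __name__ == "__main__":
    # Sparse representation: {bucket_index: count}
    print(hash_kmers("ACGTACGTAC", k=4, n_buckets=8))
```

Distinct k-mers can collide in the same bucket; the trade-off is a bounded feature dimension regardless of k.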