Algorithm Documentation

This directory describes the algorithms and data structures behind each pathotypr module. For CLI usage and options, see the command docs.

| Module | Document | Core Idea |
| --- | --- | --- |
| Feature Hashing | feature-hashing.md | The hashing trick: k-mers → fixed-size sparse vectors |
| Random Forest | random-forest.md | Sparse CART trees with bootstrap aggregation |
| Training Pipeline | training.md | End-to-end: vectorize → evaluate → train → OOB → export |
| Prediction | prediction.md | Streaming batch prediction with majority voting |
| Marker Genotyping | marker-genotyping.md | Diagnostic k-mers + Bloom filter for FASTQ scanning |
| Reference Matching | reference-matching.md | K-mer containment scoring with streaming batches |
| Assembly Classification | assembly-classification.md | Marker calling on FASTA assemblies with GFF annotation |
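
As a rough illustration of the feature-hashing idea listed above (k-mers mapped into a fixed-size sparse vector), here is a minimal Python sketch. It is not pathotypr's implementation: the function names `kmers` and `hash_kmers`, the use of CRC32 as the hash, and the `k` and `n_buckets` parameters are all assumptions chosen for illustration. See feature-hashing.md for what the module actually does.

```python
import zlib
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of an upper-cased sequence."""
    seq = seq.upper()
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def hash_kmers(seq, k=5, n_buckets=1024):
    """Return a sparse count vector as {bucket_index: count}.

    Hypothetical helper: each k-mer is hashed with CRC32 and reduced
    modulo n_buckets, so the vector length is fixed no matter how many
    distinct k-mers exist. K-mers containing non-ACGT symbols are skipped.
    """
    counts = Counter()
    for kmer in kmers(seq, k):
        if set(kmer) <= set("ACGT"):
            idx = zlib.crc32(kmer.encode("ascii")) % n_buckets
            counts[idx] += 1
    return dict(counts)


# Only non-zero buckets are stored, which keeps the vector sparse.
vec = hash_kmers("ACGTACGTAC", k=4, n_buckets=16)
```

Distinct k-mers can collide in the same bucket; that is the usual trade-off of the hashing trick, accepted in exchange for a fixed dimensionality that does not depend on the k-mer vocabulary.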