@@ -61,14 +61,15 @@ cargo build --release
 # 1. Train a Random Forest model
 pathotypr train --input training_genomes.fasta --output my_species.model.gz
 
-# 2. Predict the class of a new genome
-pathotypr predict --input new_genome.fasta --model my_species.model.gz --output prediction.tsv
+# 2. Predict the class of a new genome (with debug logging)
+pathotypr predict --input new_genome.fasta --model my_species.model.gz --output prediction.tsv -v
 
 # 3. Genotype variants in an assembled genome
-pathotypr classify --markers variants.tsv --reference ref.fasta --genome-fasta my_genome.fasta --output classified_variants.tsv
+pathotypr classify --markers variants.tsv --reference ref.fasta --input my_genome.fasta --output-prefix classified_variants
 
 # 4. Genotype variants directly from raw reads
 pathotypr split-fastq --markers variants.tsv --reference ref.fasta -i sample_R1.fq.gz -i sample_R2.fq.gz --paired --output-prefix sample_genotyping
+
 ```
 
 ## Documentation
@@ -85,6 +86,7 @@ Builds and trains a Random Forest model from a multifasta file where headers are
 | --kmer-size | -k | The size of the k-mers to generate from sequences. | 6 |
 | --test-split | -s | Proportion of the data to use for the test set. | 0.2 (20%) |
 | --threads | -t | Number of CPU threads to use. | All available |
+| --verbose | -v | Set the verbosity level. Use -v for debug, -vv for trace. | Off |
 
 _The tool will warn you if it detects a strong class imbalance in your training data._
 
@@ -106,6 +108,7 @@ Classifies new genomes using a model file generated by `train`.
 | --model | -m | Path to the unified model file created by the train command. | Required |
 | --output | -o | Path for the output file where predictions will be written in TSV format. | Required |
 | --threads | -t | Number of CPU threads to use. | All available |
+| --verbose | -v | Set the verbosity level. Use -v for debug, -vv for trace. | Off |
 
 **Usage**:
 ```bash
@@ -155,6 +158,7 @@ Both commands use the same flexible TSV format for defining variants:
 | --gff | | Optional GFF file for annotation when using --input. | Optional |
 | --kmer-size | -k | The size of the diagnostic k-mers to use. | 21 |
 | --threads | -t | Number of CPU threads to use. | All available |
+| --verbose | -v | Set the verbosity level. Use -v for debug, -vv for trace. | Off |
 
 **Usage**:
 ```bash
@@ -207,6 +211,7 @@ Perform ultra-fast, alignment-free genotyping of SNPs, MNVs, and both small and
 | --min-depth | | Minimum read depth required to call a variant. | 10 |
 | --min-alt-percent | | Minimum frequency of the alternate allele to call a variant (%). | 95 |
 | --threads | -t | Number of CPU threads to use. | All available |
+| --verbose | -v | Set the verbosity level. Use -v for debug, -vv for trace. | Off |
 
 **Usage**:
 ```bash
@@ -223,11 +228,13 @@ pathotypr split-fastq \
 ```
 pathotypr/
 ├── src/
-│   ├── main.rs                   # CLI handling
-│   ├── train.rs                  # Model training logic
-│   ├── predict.rs                # Model prediction logic
-│   ├── classify.rs               # Variant detection in assemblies
-│   ├── classify_split_fastq.rs   # Variant detection in reads
+│   ├── main.rs                   # CLI handling and dispatch
+│   ├── errors.rs                 # Custom error types
+│   ├── common.rs                 # Shared code (model bundle, kmerize)
+│   ├── train.rs                  # `train` subcommand logic
+│   ├── predict.rs                # `predict` subcommand logic
+│   ├── classify.rs               # `classify` subcommand logic
+│   ├── classify_split_fastq.rs   # `split-fastq` subcommand logic
 │   └── split_kmer.rs             # Core dynamic k-mer engine
 └── Cargo.toml
 