diff --git a/use_case_examples/MAG_fishing/README.md b/use_case_examples/MAG_fishing/README.md
new file mode 100644
index 00000000..41e8d3dd
--- /dev/null
+++ b/use_case_examples/MAG_fishing/README.md
@@ -0,0 +1,77 @@
+# YACHT for MAG Fishing
+
+## Description of use-case-example
+
+Metagenome-assembled genome (MAG) fishing is the process of reporting the assembled genomes within a metagenomic sample.
+
+Metagenomics has been an important field in exploring the microbial communities of specific environments, especially for environments that contain unculturable microbes. However, the persistent underrepresentation of genomes makes it difficult to produce a high-resolution taxonomic profile. Consequently, many microbial communities are still understudied. Efforts have been made to increase the knowledge of these environments, such as the study we highlight here. One of the goals of the study by Banchi and colleagues (link to paper) was to unveil a more resolved taxonomic composition of marine sediments in the Venice Lagoon for further functional analyses of these microbial communities. The dataset from this study (NCBI accession: PRJNA924243) has 58 MAGs and serves as a use case example of using YACHT to resolve taxonomic composition.
+
+According to their study, we should expect YACHT to report species from the phylum Proteobacteria, under the classes Alphaproteobacteria, Gammaproteobacteria, and Deltaproteobacteria.
+
+Banchi, E., Corre, E., Del Negro, P., Celussi, M., & Malfatti, F. (2024). Genome-resolved metagenomics of Venice Lagoon surface sediment bacteria reveals high biosynthetic potential and metabolic plasticity as successful strategies in an impacted environment. Marine Life Science & Technology, 6(1), 126-142.
+
+## Install the following programs
+
+**datasets**
+
+For more information, see [datasets](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/download-and-install/).
+
+**YACHT**
+
+For more information, see [YACHT](https://github.com/KoslickiLab/YACHT).
+
+
+## Download MAG samples
+
+The following command was used to download the MAG sample of interest:
+
+```
+datasets download genome accession PRJNA924243
+```
+
+Downloading this MAG project produces a directory with a separate path for each FASTA file in the project, but `yacht sketch sample` expects a single FASTA file. If it receives more, YACHT reports the following error message:
+
+```
+ValueError: Please provide either one file for single-end reads or two files for paired-end reads.
+```
+
+A workaround is to run the following commands:
+
+```
+cd MAG_data
+cp data/ncbi_dataset/data/GCA_02928*/*fna MAG_data/.
+```
+
+## yacht sketch MAG of interest
+
+```
+yacht sketch sample --infile MAG_sample.fna --kmer 31 --scaled 1000 --outfile sample.sig.zip
+```
+
+Be aware that `yacht sketch sample` creates a sketch with more than one signature, whereas `yacht run` expects a sample with a single signature and will direct you to merge the signatures using `sourmash sig merge`. Run the following command:
+
+```
+sourmash sig merge sample.sig.zip -k 31 -o sample_merge.sig.zip
+```
+
+## Using yacht download pretrained_ref_db
+
+Trying to download the pretrained data did not download anything:
+
+```
+yacht download pretrained_ref_db --database gtdb --db_version rs214 --k 31 --ani_thresh 0.9995 --outfolder ./
+```
+
+Running the following command appeared to download everything:
+
+```
+yacht download default_ref_db --database gtdb --db_version rs214 --gtdb_type reps --k 31 --outfolder ./
+```
+
+It seems that `yacht download pretrained_ref_db` does not show any progress. It never really completed on its own, but it completed once I ran `yacht download default_ref_db`.
+
+## yacht run
+```
+yacht run --json 'gtdb-rs214-reps.k31_0.9995_pretrained/gtdb-rs214-reps.k31_0.9995_config.json' --sample_file 'sample_merge.sig.zip' --num_threads 32 --keep_raw --significance 0.99 --min_coverage_list 1 0.5 0.1 0.05 0.01 --out ./result.xlsx
+```
+
diff --git a/use_case_examples/MAG_fishing/result.xlsx b/use_case_examples/MAG_fishing/result.xlsx
new file mode 100644
index 00000000..d1ea1bdb
Binary files /dev/null and b/use_case_examples/MAG_fishing/result.xlsx differ
diff --git a/use_case_examples/contamination_detection_example/1_before_starting.sh b/use_case_examples/contamination_detection_example/1_before_starting.sh
new file mode 100644
index 00000000..05078e0c
--- /dev/null
+++ b/use_case_examples/contamination_detection_example/1_before_starting.sh
@@ -0,0 +1,17 @@
+# Download SRR25626360 which represents WGS of Haemophilus influenzae
+nohup fastq-dump --fasta 60 SRR25626360 2>&1 &
+
+### Download SRR24210460 which represents WGS of Mycoplasma pneumoniae from library MDY
+nohup fastq-dump --fasta 60 SRR24210460 2>&1 &
+
+### Download SRR7217470 which represents WGS of Chlamydia pneumoniae
+nohup fastq-dump --fasta 60 SRR7217470 2>&1 &
+
+### Download SRR5962942 which represents WGS of Streptococcus pneumoniae
+nohup fastq-dump --fasta 60 SRR5962942 2>&1 &
+
+### Download SRR26202532 which represents WGS of Bordetella pertussis
+nohup fastq-dump --fasta 60 SRR26202532 2>&1 &
+
+### Download SRR2830253, reads of a healthy human lung microbiome
+nohup fastq-dump --fasta 60 SRR2830253 2>&1 &
diff --git a/use_case_examples/contamination_detection_example/2_before_starting.sh b/use_case_examples/contamination_detection_example/2_before_starting.sh
new file mode 100644
index 00000000..3bccd1ba
--- /dev/null
+++ b/use_case_examples/contamination_detection_example/2_before_starting.sh
@@ -0,0 +1,30 @@
+# Create the example sample data for a patient with respiratory symptoms who seeks to find out which pathogen is causing these symptoms.
+
+# Before moving on, make sure the reads needed to create the sample dataset are available. Please reference create_reference_database.md
+
+# Create samples that will be loaded to the 96-well tray
+
+# Negative control, so just reads from a healthy lung
+cat SRR2830253.fasta > negative_control_well_11.fasta
+
+# Positive control with H. influenzae
+cat SRR25626360.fasta SRR2830253.fasta > positive_control_well_23.fasta
+
+# Sample 1
+cat SRR25626360.fasta SRR2830253.fasta SRR25626360.fasta > positive_control_well_64.fasta
+
+# Sample 2
+cat SRR24210460.fasta SRR2830253.fasta SRR25626360.fasta > sample_well_80.fasta
+
+# I check one of my negative controls, which is a healthy lung example and we should not detect any bacteria here
+# no contamination
+
+# I check one of my positive controls for M. pneumoniae which should not have H. influenzae
+# no contamination
+
+# I check one of my positive controls for H. influenzae which should not have M. pneumoniae
+# contamination
+
+# I check one of my samples for H. influenzae which should not have M. pneumoniae
+# contamination
+
diff --git a/use_case_examples/contamination_detection_example/Picture1.png b/use_case_examples/contamination_detection_example/Picture1.png
new file mode 100644
index 00000000..008b4d42
Binary files /dev/null and b/use_case_examples/contamination_detection_example/Picture1.png differ
diff --git a/use_case_examples/contamination_detection_example/contamination_detection_example.md b/use_case_examples/contamination_detection_example/contamination_detection_example.md
new file mode 100644
index 00000000..646e547d
--- /dev/null
+++ b/use_case_examples/contamination_detection_example/contamination_detection_example.md
@@ -0,0 +1,95 @@
+# Contamination Detection Example
+A study is being conducted on how microbial communities are shaped across different types of respiratory diseases. Samples were collected from two patients: one is M. pneumoniae positive and the other is H. influenzae positive. To save time and money, only one 96-well tray will be used for both samples. Before downstream analysis can be performed, we want to know whether cross-contamination between samples occurred during the loading of the 96-well tray, so we randomly choose wells 11, 23, 64, and 80 to check.
+
+Make sure all bacterial reads needed to create your reference dataset (also known as a training dataset) are available.
+```bash
+bash 1_before_starting.sh
+```
+```bash
+bash 2_before_starting.sh
+```
+
+### Sketch your training dataset and sample to your preference.
+
+#### Using k=31
+Note: training and sample datasets are required to have the same ksize. Since we are sketching from a list of genomes, we can use the following sourmash sketch command:
+```bash
+sourmash sketch fromfile genome_list.csv -p dna,k=31,scaled=1000,abund -o training_database.k31.sig.zip
+```
+
+Sketch the negative control reads from well 11
+```bash
+yacht sketch sample --infile ./negative_control_well_11.fasta --kmer 31 --scaled 1000 --outfile negative_control_well_11.k31.sig.zip
+```
+
+Sketch the positive control from well 23
+```bash
+yacht sketch sample --infile ./positive_control_well_23.fasta --kmer 31 --scaled 1000 --outfile positive_control_well_23.k31.sig.zip
+```
+
+Sketch the positive control from well 64
+```bash
+yacht sketch sample --infile ./positive_control_well_64.fasta --kmer 31 --scaled 1000 --outfile positive_control_well_64.k31.sig.zip
+```
+
+Sketch the sample from well 80
+```bash
+yacht sketch sample --infile ./sample_well_80.fasta --kmer 31 --scaled 1000 --outfile sample_well_80.k31.sig.zip
+```
+
+### Make training data for k=31
+```bash
+yacht train --ref_file training_database.k31.sig.zip --ksize 31 --num_threads 64 --ani_thresh 0.95 --prefix 'training_database.k31' --outdir ./ --force
+```
+
+### Identify whether the patient has an infection and which pathogen is causing the disease.
+```bash
+yacht run --json training_database.k31_config.json --sample_file negative_control_well_11.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./negative_control_well_11_k31_result.xlsx
+```
+
+```bash
+yacht run --json training_database.k31_config.json --sample_file positive_control_well_23.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./positive_control_well_23_k31_result.xlsx
+```
+
+```bash
+yacht run --json training_database.k31_config.json --sample_file positive_control_well_64.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./positive_control_well_64_k31_result.xlsx
+```
+
+```bash
+yacht run --json training_database.k31_config.json --sample_file sample_well_80.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./sample_well_80.xlsx
+```
+
+### Results
+Using a ksize of 31 at ANI 0.95, YACHT finds XYZ
+
+## Let's decrease ANI to 0.50
+
+### Make training data for k=31
+```bash
+yacht train --ref_file training_database.k31.sig.zip --ksize 31 --num_threads 64 --ani_thresh 0.50 --prefix 'training_database.k31_ani0.50' --outdir ./ --force
+```
+
+### Pathogen Detection using YACHT
+Identify whether the patient has an infection and which pathogen is causing the disease.
+```bash
+yacht run --json training_database.k31_ani0.50_config.json --sample_file negative_control_well_11.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./k31_ani0.50_result_negative_control_well_11.xlsx
+```
+
+Identify whether the patient has an infection and which pathogen is causing the disease.
+```bash
+yacht run --json training_database.k31_ani0.50_config.json --sample_file positive_control_well_23.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./k31_ani0.50_result_positive_control_well_23.xlsx
+```
+
+Identify whether the patient has an infection and which pathogen is causing the disease.
+```bash
+yacht run --json training_database.k31_ani0.50_config.json --sample_file positive_control_well_64.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./k31_ani0.50_result_positive_control_well_64.xlsx
+```
+
+Identify whether the patient has an infection and which pathogen is causing the disease.
+```bash
+yacht run --json training_database.k31_ani0.50_config.json --sample_file sample_well_80.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./k31_ani0.50_result_sample_well_80.xlsx
+```
+
+
+### Results
+Decreasing ANI to 0.50 and using a ksize of 31, YACHT finds XYZ
\ No newline at end of file
diff --git a/use_case_examples/contamination_detection_example/genome_list.csv b/use_case_examples/contamination_detection_example/genome_list.csv
new file mode 100644
index 00000000..cc965d4f
--- /dev/null
+++ b/use_case_examples/contamination_detection_example/genome_list.csv
@@ -0,0 +1,6 @@
+0,name,genome_filename,protein_filename
+1,SRR25626360,SRR25626360.fasta,
+2,SRR24210460,SRR24210460.fasta,
+3,SRR7217470,SRR7217470.fasta,
+4,SRR5962942,SRR5962942.fasta,
+5,SRR26202532,SRR26202532.fasta,
diff --git a/use_case_examples/pathogen_detection_example/1_before_starting.sh b/use_case_examples/pathogen_detection_example/1_before_starting.sh
new file mode 100644
index 00000000..05078e0c
--- /dev/null
+++ b/use_case_examples/pathogen_detection_example/1_before_starting.sh
@@ -0,0 +1,17 @@
+# Download SRR25626360 which represents WGS of Haemophilus influenzae
+nohup fastq-dump --fasta 60 SRR25626360 2>&1 &
+
+### Download SRR24210460 which represents WGS of Mycoplasma pneumoniae from library MDY
+nohup fastq-dump --fasta 60 SRR24210460 2>&1 &
+
+### Download SRR7217470 which represents WGS of Chlamydia pneumoniae
+nohup fastq-dump --fasta 60 SRR7217470 2>&1 &
+
+### Download SRR5962942 which represents WGS of Streptococcus pneumoniae
+nohup fastq-dump --fasta 60 SRR5962942 2>&1 &
+
+### Download SRR26202532 which represents WGS of Bordetella pertussis
+nohup fastq-dump --fasta 60 SRR26202532 2>&1 &
+
+### Download SRR2830253, reads of a healthy human lung microbiome
+nohup fastq-dump --fasta 60 SRR2830253 2>&1 &
diff --git a/use_case_examples/pathogen_detection_example/2_before_starting.sh b/use_case_examples/pathogen_detection_example/2_before_starting.sh
new file mode 100644
index 00000000..8a682fec
--- /dev/null
+++ b/use_case_examples/pathogen_detection_example/2_before_starting.sh
@@ -0,0 +1,11 @@
+# Create the example sample data for a patient with respiratory symptoms who seeks to find out which pathogen is causing these symptoms.
+
+# Before moving on, make sure the reads needed to create the sample dataset are available. Please reference create_reference_database.md
+
+# Sketch sample to your preference. Note: training and sample datasets are required to have the same ksize.
+
+## Using k=31
+nohup sourmash sketch fromfile lung_list.csv -p dna,k=31,scaled=1000,abund -o lung_sample.k31.sig.zip > k31_sample.log 2>&1 &
+
+## Using k=15
+nohup sourmash sketch fromfile lung_list.csv -p dna,k=15,scaled=1000,abund -o lung_sample.k15.sig.zip > k15_sample.log 2>&1 &
diff --git a/use_case_examples/pathogen_detection_example/genome_list.csv b/use_case_examples/pathogen_detection_example/genome_list.csv
new file mode 100644
index 00000000..cc965d4f
--- /dev/null
+++ b/use_case_examples/pathogen_detection_example/genome_list.csv
@@ -0,0 +1,6 @@
+0,name,genome_filename,protein_filename
+1,SRR25626360,SRR25626360.fasta,
+2,SRR24210460,SRR24210460.fasta,
+3,SRR7217470,SRR7217470.fasta,
+4,SRR5962942,SRR5962942.fasta,
+5,SRR26202532,SRR26202532.fasta,
diff --git a/use_case_examples/pathogen_detection_example/lung_list.csv b/use_case_examples/pathogen_detection_example/lung_list.csv
new file mode 100644
index 00000000..db273129
--- /dev/null
+++ b/use_case_examples/pathogen_detection_example/lung_list.csv
@@ -0,0 +1,3 @@
+0,name,genome_filename,protein_filename
+1,SRR24210460,SRR24210460.fasta,
+2,SRR26202532,SRR26202532.fasta,
diff --git a/use_case_examples/pathogen_detection_example/pathogen_detection_example.md b/use_case_examples/pathogen_detection_example/pathogen_detection_example.md
new file mode 100644
index 00000000..6b43ce0e
--- /dev/null
+++ b/use_case_examples/pathogen_detection_example/pathogen_detection_example.md
@@ -0,0 +1,77 @@
+# Pathogen Detection Example
+A patient with respiratory symptoms seeks to find out which pathogen is causing these symptoms.
+
+Make sure all bacterial reads needed to create your reference dataset (also known as a training dataset) are available.
+```bash
+bash 1_before_starting.sh
+```
+```bash
+bash 2_before_starting.sh
+```
+
+### Sketch your training dataset and sample to your preference.
+
+#### Using k=31
+Note: training and sample datasets are required to have the same ksize. Since we are sketching from a list of genomes, we can use the following sourmash sketch command:
+```bash
+sourmash sketch fromfile genome_list.csv -p dna,k=31,scaled=1000,abund -o training_database.k31.sig.zip
+```
+
+Sketch your sample fasta file
+```bash
+yacht sketch sample --infile ./lung_sample.fasta --kmer 31 --scaled 1000 --outfile lung_sample.k31.sig.zip
+```
+
+### Make training data for k=31
+```bash
+yacht train --ref_file training_database.k31.sig.zip --ksize 31 --num_threads 64 --ani_thresh 0.95 --prefix 'training_database.k31' --outdir ./ --force
+```
+
+### Identify whether the patient has an infection and which pathogen is causing the disease.
+```bash
+yacht run --json training_database.k31_config.json --sample_file lung_sample.k31.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./k31_result.xlsx
+```
+
+### Results
+Using a ksize of 31, YACHT finds that M. pneumoniae is present in the lung sample.
+
+## What if we decrease ksize to 15?
+If we use a small ksize such as 15, we would expect not to find that the patient is infected by M. pneumoniae. Let's set up the experiment. Note that a ksize below 7 may not produce results and is not recommended.
+
+### Sketch Lung Sample using k=15
+```bash
+sourmash sketch fromfile genome_list.csv -p dna,k=15,scaled=1000,abund -o training_database.k15.sig.zip
+```
+
+Sketch your sample fasta file
+```bash
+yacht sketch sample --infile ./lung_sample.fasta --kmer 15 --scaled 1000 --outfile lung_sample.k15.sig.zip
+```
+
+### Make training data for k=15
+```bash
+yacht train --ref_file training_database.k15.sig.zip --ksize 15 --num_threads 64 --ani_thresh 0.95 --prefix 'training_database.k15' --outdir ./ --force
+```
+
+### Pathogen Detection using YACHT
+Identify whether the patient has an infection and which pathogen is causing the disease.
+```bash
+yacht run --json training_database.k15_config.json --sample_file lung_sample.k15.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./k15_result.xlsx
+```
+### Results
+Using a ksize of 15, YACHT finds/does not find that M. pneumoniae is present in the lung sample.
+
+## Let's decrease ANI to 0.85
+
+### Make training data for k=15
+```bash
+yacht train --ref_file training_database.k15.sig.zip --ksize 15 --num_threads 64 --ani_thresh 0.85 --prefix 'training_database.k15_ani0.85' --outdir ./ --force
+```
+
+### Pathogen Detection using YACHT
+Identify whether the patient has an infection and which pathogen is causing the disease.
+```bash
+yacht run --json training_database.k15_ani0.85_config.json --sample_file lung_sample.k15.sig.zip --significance 0.99 --num_threads 64 --min_coverage_list 1 0.6 0.2 0.1 --out ./k15_ani0.85_result.xlsx
+```
+### Results
+Using a ksize of 15, YACHT finds/does not find that M. pneumoniae is present in the lung sample.
\ No newline at end of file
diff --git a/use_case_examples/use-case-examples.md b/use_case_examples/use-case-examples.md
new file mode 100644
index 00000000..71874e36
--- /dev/null
+++ b/use_case_examples/use-case-examples.md
@@ -0,0 +1,69 @@
+An advantage of using YACHT is that you can analyze either FASTA or FASTQ files.
+
+# Biological Application
+
+Does YACHT provide quantitative data?
+
+How sensitive is YACHT in detecting viruses?
+
+Could we use YACHT to identify the amount of host DNA?
+
+Is sequencing depth an issue for YACHT?
+
+Can YACHT be used to identify the amount of contamination that there is in a sample?
+
+## Contamination Detection
+
+Identifying contamination of samples is an important step for downstream analysis.
+
+Failure to detect sample contaminants can bias community diversity and strain sharing identification, which can lead to false claims in research. Additionally, contaminant detection is a challenge for low-biomass data.
+
+There are two types of contamination that YACHT can detect: external contamination and cross-contamination. External contamination includes any microbial DNA that comes from a researcher's native microbiome, experimental kits, and surrounding areas. Cross-contamination can come from DNA extractions, sequencing index switching, and sample bleeding. Further, cross-contamination can complicate contamination detection.
+
+Here, we explore how we can use YACHT for contaminant detection with the following tutorials:
+* External Contamination Detection
+* Cross Contamination Detection
+* Contamination in Low-biomass Datasets
+
+Lou YC, Hoff J, Olm MR, West-Roberts J, Diamond S, Firek BA, Morowitz MJ, Banfield JF. Using strain-resolved analysis to identify contamination in metagenomics data. Microbiome. 2023 Mar 2;11(1):36. doi: 10.1186/s40168-023-01477-2. PMID: 36864482; PMCID: PMC9979413.
+
+### Contaminant Detection Examples
+
+#### External Contamination Detection
+
+Data:
+
+Commands:
+
+Interpretation:
+
+#### Cross Contamination Detection
+
+Data:
+
+Commands:
+
+Interpretation:
+
+#### Contamination in Low-biomass Datasets
+
+## Pathogenic Detection
+
+Early and accurate pathogen detection is vital for rapid diagnosis, clinical intervention, pathogen discovery, pathogen surveillance, and outbreak investigations. Some microorganisms are hard to cultivate (e.g., viruses) and/or take a long time to grow (e.g., mycobacteria and molds). Additionally, the origin of the sample can decrease pathogen detection sensitivity; purulent samples, for example, contain human DNA noise. These challenges can lead to extended hospitalizations, readmissions, and increased mortality.
+
+Research in pathogen detection has led to improvements on the experimental side, such as differential lysis of human cells. Efforts have also been made to improve computational tools and pipelines for pathogen detection, such as real-time pathogen detection (EPI2ME, SURPI-RT, etc.).
+
+Methods for pathogen detection include PCR, multiplex PCR, broad-range PCR, antigen detection, MALDI-TOF MS, and PNA-FISH. However, PCR-based methods are limited by false negatives and the need for a priori knowledge, and/or are limited to viral detection. Further, antigen detection is limited by the time antibodies take to develop, which can be 1-2 weeks. Finally, methods such as mass spectrometry and in situ hybridization have advanced bacterial and fungal identification but still require bacterial cultivation, do not provide quantitative results (in the case of MALDI-TOF MS), or are limited to tissue samples (in the case of PNA-FISH).
+
+Using clinical metagenomic data, we can detect any pathogenic infection (e.g., respiratory, bloodstream, or central nervous system infections) without prior knowledge from the millions of reads produced by shotgun metagenomic sequencing. However, pathogen detection can be difficult depending on the sample type; purulent samples, for example, contain a lot of human DNA noise.
+
+Computationally, pathogen detection is challenging because alignment/classification algorithms become overwhelmed by large data, reads are sparse (making de novo assembly difficult), novel pathogens lack genome representation, and highly divergent novel pathogens are hard to detect. Computational subtraction is a popular approach for pathogen detection; tools such as PathSeq and SURPI use it. These approaches rely on alignment, and although alignment-based methods have been shown to slow down analyses, SURPI has been shown to detect pathogens and generate results in a clinically appropriate time frame, from minutes to hours.
+
+Gu W, Deng X, Lee M, Sucu YD, Arevalo S, Stryke D, Federman S, Gopez A, Reyes K, Zorn K, Sample H, Yu G, Ishpuniani G, Briggs B, Chow ED, Berger A, Wilson MR, Wang C, Hsu E, Miller S, DeRisi JL, Chiu CY. Rapid pathogen detection by metagenomic next-generation sequencing of infected body fluids. Nat Med. 2021 Jan;27(1):115-124. doi: 10.1038/s41591-020-1105-z. Epub 2020 Nov 9. PMID: 33169017; PMCID: PMC9020267.
+
+Batool M, Galloway-Peña J. Clinical metagenomics-challenges and future prospects. Front Microbiol. 2023 Jun 28;14:1186424. doi: 10.3389/fmicb.2023.1186424. PMID: 37448579; PMCID: PMC10337830.
+
+Naccache SN, Federman S, Veeraraghavan N, Zaharia M, Lee D, Samayoa E, Bouquet J, Greninger AL, Luk KC, Enge B, Wadford DA, Messenger SL, Genrich GL, Pellegrino K, Grard G, Leroy E, Schneider BS, Fair JN, Martínez MA, Isa P, Crump JA, DeRisi JL, Sittler T, Hackett J Jr, Miller S, Chiu CY. A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples. Genome Res. 2014 Jul;24(7):1180-92. doi: 10.1101/gr.171934.113. Epub 2014 Jun 4. PMID: 24899342; PMCID: PMC4079973.
+
+### Pathogenic Detection Examples
+