DNAnalyzer’s CLI now ships with a self-contained polygenic risk scoring (PRS) pipeline. This guide walks through the supported inputs, how to run a score, and how to interpret the console output.
- 23andMe raw data export (TXT). The pipeline expects the standard tab-delimited layout: comment lines beginning with
#, followed by a header rowrsid\tchromosome\tposition\tgenotypeand genotype calls such asrs12345 1 123456 AG. - PRS weight table (CSV). Each record must include three columns: SNP identifier (
rsID), risk allele (single base or two-character genotype), and the numeric weight. Example files are shipped underassets/risk/.
# Example 23andMe snippet
rsid chromosome position genotype
rs6025 1 1000 AA
rs1799983 7 2000 GT
# Example weight table
SNP,RiskAllele,Weight
rs6025,A,1.5
rs1799983,T,1.2
Invoke the Gradle task (or the shaded JAR) with the genotype file and at least one PRS CSV:
./gradlew run --args='--23andme my_genotype.txt --prs assets/risk/heart_disease_prs.csv'You can provide multiple weight tables by comma-separating them:
./gradlew run --args='--23andme my_genotype.txt --prs assets/risk/heart_disease_prs.csv,assets/risk/sample_weights.csv'An optional FASTA may be supplied as the final positional argument if you want the traditional sequence summary alongside the PRS output.
The CLI prints a summary per weight table:
- Variants - Total variants in the weight table, how many were matched in the genotype file, and coverage percentage.
- Raw score - Sum of (allele dosage × weight) across all SNPs.
- Normalised - Raw score divided by the theoretical maximum magnitude (
Σ|weight| × 2). Use this to compare scores across traits.
Each SNP contribution is displayed as a row showing the genotype call, requested risk allele, matched dosage (0-2), the weight, and the contribution added to the raw score. Missing genotypes or uncallable entries are flagged with a -- genotype and a note such as “No genotype call”.
- Coverage < 100% means the shipped weights reference SNPs that are not present in your raw file. Treat scores derived from sparse coverage as exploratory only.
- Uncallable genotypes (
--) occur when 23andMe marks a site as “no call”. These variants are excluded from the raw score.
- Scores are deterministic; no statistical normalisation against population cohorts is performed. Future releases may add reference distributions where ethically and legally possible.
- Weight files are bundled for demonstration only. Replace them with weights sourced from peer-reviewed GWAS studies before making any decision-oriented interpretations.
- Always validate against your own datasets. Run
GRADLE_USER_HOME=./.gradle ./gradlew testto execute the included JUnit coverage once you have network connectivity to fetch the Gradle wrapper.
For questions or to contribute additional trait panels, open an issue or pull request in the repository.