Varcode today predicts variant effects against a single reference genome, treating each variant as an independent ref→alt change. For many real-world use cases this is incorrect: patients have their own germline background, variants can be phased together on a single haplotype, and interactions between nearby variants can produce joint effects very different from the sum of their independent predictions.
This umbrella issue tracks the work to move varcode from reference-vs-variant predictions to haplotype-vs-haplotype predictions — treating each patient's actual genome sequences as the input to effect prediction.
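To make the difference concrete, here is a minimal sketch in plain Python (it does not use varcode's API) of two SNVs that fall in the same codon. The toy coding sequence, the `translate` and `apply_snvs` helpers, and the `(offset, ref, alt)` tuple layout are all assumptions made for illustration. Predicted independently, the first SNV looks like a stop gain and the second like a missense change; applied together on one haplotype, they produce a different missense change and no premature stop.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def apply_snvs(cds, snvs):
    """Apply (offset, ref, alt) SNVs to a coding sequence (hypothetical helper)."""
    seq = list(cds)
    for offset, ref, alt in snvs:
        if seq[offset] != ref:
            raise ValueError(f"ref mismatch at offset {offset}")
        seq[offset] = alt
    return "".join(seq)


# Toy reference CDS: ATG CAG GGT TAA -> M Q G (stop)
reference_cds = "ATGCAGGGTTAA"
snv_a = (3, "C", "T")  # CAG -> TAG: a stop codon when considered alone
snv_b = (4, "A", "G")  # CAG -> CGG: Gln -> Arg when considered alone

reference_protein = translate(reference_cds)
protein_a_only = translate(apply_snvs(reference_cds, [snv_a]))
protein_b_only = translate(apply_snvs(reference_cds, [snv_b]))
# Both SNVs on the same haplotype: CAG -> TGG, Gln -> Trp, no premature stop.
protein_joint = translate(apply_snvs(reference_cds, [snv_a, snv_b]))

print(reference_protein, protein_a_only, protein_b_only, protein_joint)
# MQG M MRG MWG
```

Neither independent prediction matches the joint result, which is why the haplotype sequence, not the individual variant, has to be the unit of prediction.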
Motivation
Neoantigen prediction: the peptide presented by MHC is generated from the patient's transcript, not the reference transcript. Germline SNPs in flanking regions change which 9-mers get cleaved, even when the somatic variant of interest is untouched.
Clinical interpretation: compound heterozygosity (two loss-of-function variants in trans) can cause recessive disease, whereas the same two variants in cis leave one functional copy and are typically benign. A report cannot be correct without phase.
Research pipelines: pipelines that sum independent-variant scores miss epistasis between nearby variants. Accurate joint-effect prediction is the foundation for modeling those interactions properly.
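The cis/trans distinction in the clinical case can be sketched as follows. This is an illustrative helper in plain Python, not part of varcode; `classify_phase` and its arguments are hypothetical names. It assumes VCF-style genotype strings, where `|` separates phased alleles and `/` separates unphased ones, and an optional phase-set identifier (the VCF `PS` field) for each variant. It also assumes both variants are heterozygous calls in a diploid sample.

```python
def alt_haplotypes(genotype):
    """Return the haplotype indices carrying a non-reference allele.

    Returns None when the genotype is unphased (no '|' separator).
    """
    if "|" not in genotype:
        return None
    alleles = genotype.split("|")
    return {index for index, allele in enumerate(alleles)
            if allele not in ("0", ".")}


def classify_phase(genotype_a, genotype_b, phase_set_a=None, phase_set_b=None):
    """Classify two heterozygous variants as 'cis', 'trans', or 'unknown'."""
    haps_a = alt_haplotypes(genotype_a)
    haps_b = alt_haplotypes(genotype_b)
    # Unphased calls carry no haplotype assignment.
    if haps_a is None or haps_b is None:
        return "unknown"
    # Haplotype indices are only comparable within one phase set.
    if phase_set_a != phase_set_b:
        return "unknown"
    # Sharing a haplotype means at least one copy carries both variants.
    return "cis" if haps_a & haps_b else "trans"


print(classify_phase("0|1", "1|0", 100, 100))  # trans
print(classify_phase("0|1", "0|1", 100, 100))  # cis
print(classify_phase("0/1", "0|1", 100, 100))  # unknown
print(classify_phase("0|1", "1|0", 100, 200))  # unknown
```

Returning `"unknown"` rather than guessing keeps unphased or cross-phase-set pairs from being reported as either compound heterozygous or benign.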
#259 (Incorporate RNA-level evidence for variant effects) introduced the idea that a single DNA event can produce multiple protein outcomes (a set of possibilities). Phasing is the other axis of the same picture: multiple DNA events together produce one haplotype-level outcome.
Exacto (Add loader for Exacto output formats #260) provides phased long-read calls with both DNA and RNA. Once the loader lands, varcode can consume phased haplotypes directly — skipping the inference step.
#271 (MutantTranscript refactor) is the consolidation point for both this umbrella and the SV roadmap (#261). Landing it first makes #267/#268/#269 significantly simpler and opens the door to Isovar-driven phasing evidence.
Summary
Sub-issues
How this connects to the SV/RNA roadmap (#261)
The haplotype-aware work and the SV work rhyme:
Suggested order of work
Foundational dependency