Personal genome / haplotype-aware effect prediction (umbrella) #270

@iskandr

Summary

Varcode today predicts variant effects against a single reference genome, treating each variant as an independent ref→alt change. For many real-world use cases this is incorrect: patients have their own germline background, variants can be phased together on a single haplotype, and interactions between nearby variants can produce joint effects very different from the sum of their independent predictions.

This umbrella issue tracks the work to move varcode from reference-vs-variant predictions to haplotype-vs-haplotype predictions — treating each patient's actual genome sequences as the input to effect prediction.
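To make the "joint effects" point in the summary concrete, here is a minimal sketch in plain Python (it does not use varcode's API). It applies two SNVs that fall in the same codon, first independently and then together on one haplotype. The `codon_effect` helper and the `(offset, ref, alt)` tuple layout are hypothetical names chosen for illustration; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, laid out in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_effect(codon, snvs):
    """Apply (offset, ref, alt) SNVs to one codon and return the amino acid.

    Hypothetical helper for illustration only; not part of varcode.
    """
    bases = list(codon)
    for offset, ref, alt in snvs:
        if bases[offset] != ref:
            raise ValueError(f"expected {ref} at offset {offset}, found {bases[offset]}")
        bases[offset] = alt
    return CODON_TABLE["".join(bases)]


REF_CODON = "AAA"            # Lys
SNV_1 = (0, "A", "G")        # alone: GAA, Glu
SNV_2 = (1, "A", "G")        # alone: AGA, Arg

independent = [codon_effect(REF_CODON, [SNV_1]), codon_effect(REF_CODON, [SNV_2])]
joint = codon_effect(REF_CODON, [SNV_1, SNV_2])  # same haplotype: GGA, Gly
```

Predicted independently, the two variants report K>E and K>R. On the same haplotype the codon becomes GGA, so the actual change is K>G, which neither independent prediction reports.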

Motivation

  • Neoantigen prediction: the peptide presented by MHC is generated from the patient's transcript, not the reference transcript. Germline SNPs in flanking regions change which 9-mers are produced, even when the somatic variant of interest is untouched.
  • Clinical interpretation: compound heterozygosity (two LoF variants in trans) is causative for recessive disease; the same two variants in cis leave one functional copy and are typically benign. Reporting cannot be correct without phase.
  • Research pipelines: pipelines that sum independent per-variant scores ignore epistasis between nearby variants. Accurate joint-effect prediction is the foundation for modeling those interactions properly.
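The cis/trans distinction in the clinical-interpretation bullet can be sketched from VCF-style genotype strings. This is a simplified illustration in plain Python, not varcode's API: `alt_haplotypes` and `classify_pair` are hypothetical helpers. It assumes diploid calls with `|` marking phased genotypes, treats any non-`0` allele as the variant, and treats two genotypes as comparable only when their phase-set (PS) values match (with both missing counted as a match).

```python
def alt_haplotypes(gt):
    """Return the haplotype indices carrying a non-reference allele.

    Returns None for unphased genotypes (no '|' separator), since the
    haplotype assignment is then unknown.
    """
    if "|" not in gt:
        return None
    return {i for i, allele in enumerate(gt.split("|")) if allele not in ("0", ".")}


def classify_pair(gt_a, gt_b, ps_a=None, ps_b=None):
    """Classify two variants in one sample as 'cis', 'trans', 'absent', or 'unknown'.

    Hypothetical helper for illustration only.
    """
    haps_a, haps_b = alt_haplotypes(gt_a), alt_haplotypes(gt_b)
    if haps_a is None or haps_b is None or ps_a != ps_b:
        return "unknown"   # unphased, or phased in different phase sets
    if not haps_a or not haps_b:
        return "absent"    # at least one variant is not carried by the sample
    # Any shared haplotype means both variants sit on the same copy.
    return "cis" if haps_a & haps_b else "trans"


same_copy = classify_pair("0|1", "0|1")                          # cis
both_copies_hit = classify_pair("0|1", "1|0")                    # trans
no_phase = classify_pair("0/1", "0|1")                           # unknown
different_blocks = classify_pair("0|1", "1|0", ps_a=100, ps_b=200)  # unknown
```

For two LoF variants, only the `trans` case corresponds to compound heterozygosity; `unknown` is the case where a report cannot be made correct without additional phasing evidence.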

Sub-issues

Issue | Description
#267 | Genotype / zygosity representation: preserve GT data as a first-class attribute, support het/hom queries and multi-sample access
#268 | Germline-aware effect prediction: treat germline as the "before" state for somatic variants
#269 | Phasing: cis/trans-aware effect prediction, with joint effects for variants on the same haplotype, using phase sets or read-derived evidence
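As a rough illustration of treating germline as the "before" state (#268), and of the neoantigen motivation above, the sketch below works at the protein level in plain Python. Everything in it is an assumption made for illustration: the helper names, the `(position, ref, alt)` tuples with 0-based positions, and the made-up toy sequence. It does not reflect how varcode represents transcripts or effects.

```python
def apply_substitutions(seq, subs):
    """Apply (position, ref, alt) single-residue substitutions to a protein string."""
    residues = list(seq)
    for pos, ref, alt in subs:
        if residues[pos] != ref:
            raise ValueError(f"expected {ref} at {pos}, found {residues[pos]}")
        residues[pos] = alt
    return "".join(residues)


def novel_kmers(reference, germline_subs, somatic_sub, k=9):
    """Return k-mers spanning the somatic change that are absent from the 'before' sequence.

    'Before' is the reference with germline substitutions applied; 'after'
    adds the somatic substitution on top of that background.
    """
    before = apply_substitutions(reference, germline_subs)
    after = apply_substitutions(before, [somatic_sub])
    before_kmers = {before[i:i + k] for i in range(len(before) - k + 1)}
    pos = somatic_sub[0]
    first = max(0, pos - k + 1)          # leftmost window containing pos
    last = min(pos, len(after) - k)      # rightmost window containing pos
    windows = [after[s:s + k] for s in range(first, last + 1)]
    return [w for w in windows if w not in before_kmers]


TOY_PROTEIN = "MKTAYIAKQRQISFVKSHFSRQ"   # made-up sequence, not a real protein
SOMATIC = (10, "Q", "L")
GERMLINE = [(13, "F", "Y")]              # germline change 3 residues downstream

vs_reference = novel_kmers(TOY_PROTEIN, [], SOMATIC)
vs_germline = novel_kmers(TOY_PROTEIN, GERMLINE, SOMATIC)
shared = set(vs_reference) & set(vs_germline)
```

In this toy case both calls yield nine candidate 9-mers, but only three are shared: the six windows that also span the germline position differ between the reference background and the patient background, even though the somatic variant itself is unchanged.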

How this connects to the SV/RNA roadmap (#261)

The haplotype-aware work and the SV work rhyme:

Suggested order of work

  1. #267 (genotype): foundation; can land independently and is useful on its own.
  2. #269 (phasing): depends on #267; enables joint effects for nearby variants.
  3. #268 (germline-aware): depends on #267 and benefits from #269; transforms the effect model to haplotype-vs-haplotype.

Foundational dependency

#271 (MutantTranscript refactor) is the consolidation point for both this umbrella and the SV roadmap (#261). Landing it first makes #267/#268/#269 significantly simpler and opens the door to Isovar-driven phasing evidence.
