stop_lost falsely added to frameshift_variant when frameshift extends past original stop codon #115

@mwiewior

Summary

vepyr adds stop_lost alongside frameshift_variant when a frameshift extends past the original stop codon position. Ensembl VEP reports only frameshift_variant in these cases — it does not add stop_lost for frameshifts that naturally displace the stop.

Scale: 26 Consequence mismatches across chr1-22 (HG002 GRCh38 benchmark).

Root cause: The suppression block in transcript_consequence.rs (lines ~1630-1635) sets classification.stop_lost = false when frameshift_variant is present, but some code paths bypass it: the stop_lost check can run after the suppression block, and classify_coding_change_deletion() adds stop_lost independently. #90 sub-pattern C identified 3 cases; the full chr1-22 run reveals 26.
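One possible shape for a fix is to run the suppression as a single final pass, after every classification path has had its chance to set flags. The sketch below is illustrative only: the `Classification` struct, `suppress_stop_lost_on_frameshift`, and `finalize` are hypothetical names, not vepyr's actual types or functions. Only the two flag names are taken from the description above.

```rust
/// Hypothetical stand-in for the per-transcript classification flags.
/// The real struct in transcript_consequence.rs likely has many more fields.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Classification {
    pub frameshift_variant: bool,
    pub stop_lost: bool,
}

impl Classification {
    /// Clears stop_lost whenever frameshift_variant is set, mirroring the
    /// VEP behaviour described in this issue (frameshift_variant only).
    pub fn suppress_stop_lost_on_frameshift(&mut self) {
        if self.frameshift_variant {
            self.stop_lost = false;
        }
    }
}

/// Hypothetical final step: called once, after all code paths
/// (including any deletion-specific classifier) have set their flags,
/// so that no later check can re-add stop_lost.
pub fn finalize(mut c: Classification) -> Classification {
    c.suppress_stop_lost_on_frameshift();
    c
}
```

Running the suppression last makes the result independent of the order in which the individual checks execute, which is the failure mode suspected here.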

Test cases (10 variants from different chromosomes)

All are frameshifts (deletions or insertions; the ten listed here are deletions). Expected: frameshift_variant only. Actual: frameshift_variant&stop_lost.

# chrom  pos         ref                                             alt
chr2     20598246    CT                                              C
chr3     37984259    CA                                              C
chr5     1428787     AG                                              A
chr8     22619565    GGCAGTCC                                        G
chr9     36003431    CTT                                             C
chr13    77017068    CT                                              C
chr15    40354319    CT                                              C
chr20    145669      ACC                                             A
chr21    30541662    AG                                              A
chr22    19029764    GC                                              G
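As a quick sanity check on the table above, each REF/ALT pair changes the sequence length by a number of bases that is not a multiple of three. A minimal sketch of that length test follows; `is_frameshift_length` is a hypothetical helper, and it assumes the whole change falls inside the coding sequence, which a real classifier would have to verify against the transcript.

```rust
/// Returns true when the net length change between REF and ALT is not a
/// multiple of three. This is a simplification: it ignores whether the
/// variant overlaps the CDS at all.
pub fn is_frameshift_length(reference: &str, alternate: &str) -> bool {
    let diff = reference.len().abs_diff(alternate.len());
    diff % 3 != 0
}
```

For example, `is_frameshift_length("GGCAGTCC", "G")` is true because seven bases are deleted, while a three-base deletion such as `("ACGT", "A")` would be in-frame.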

Detailed examples

chr2:20598246 CT>C

vepyr:  frameshift_variant&stop_lost
VEP:    frameshift_variant

chr8:22619565 GGCAGTCC>G (2 transcripts)

vepyr:  frameshift_variant&stop_lost
VEP:    frameshift_variant

chr3:7691039 AT>A (with splice_region_variant + NMD)

vepyr:  frameshift_variant&stop_lost&splice_region_variant&NMD_transcript_variant
VEP:    frameshift_variant&splice_region_variant&NMD_transcript_variant
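For a regression test, the comparisons above reduce to asking which terms vepyr emits that VEP does not. A small sketch, assuming consequence strings are `&`-joined as shown in these examples; `extra_terms` is a hypothetical helper and not part of vepyr.

```rust
use std::collections::BTreeSet;

/// Returns the terms present in `actual` but absent from `expected`,
/// preserving their order in `actual`. Both inputs are '&'-joined
/// consequence strings.
pub fn extra_terms(actual: &str, expected: &str) -> Vec<String> {
    let expected: BTreeSet<&str> = expected.split('&').collect();
    actual
        .split('&')
        .filter(|term| !expected.contains(term))
        .map(str::to_string)
        .collect()
}
```

Applied to the three examples, the only extra term in each case is `stop_lost`, so a test could assert that `extra_terms(vepyr, vep)` is empty once the fix is in place.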

Related issues
