Open
Labels: bug (Something isn't working)
Description
Describe the bug
The variant mapper fails with an IndexError when converting a large duplication from cDNA to protein coordinates.
To Reproduce
import hgvs.dataproviders.uta
import hgvs.parser
import hgvs.variantmapper


def reproduce_bug():
    """Reproduce the IndexError with a large insertion in PTEN."""
    # Connect to the public UTA database (requires internet access)
    print("Connecting to UTA database...")
    hdp = hgvs.dataproviders.uta.connect()

    # Create parser and mapper
    parser = hgvs.parser.Parser()
    mapper = hgvs.variantmapper.VariantMapper(hdp)

    # cDNA variant: large (127 bp) insertion in PTEN transcript NM_000314.8.
    # The insertion is at c.1086_1087, near the end of the coding sequence.
    cdna_hgvs = (
        "NM_000314.8:c.1086_1087insACTTCTGTAACACCAGATGTTAGTGACAATGAACCTGATCATTATAGATATTCTGACACCACTGACTCTGATCCAGAGAATGAACCTTTTGATGAAGATCAGCATACACAAATTACAAAAGTCTGAA"
    )

    print(f"\nParsing cDNA variant: {cdna_hgvs[:80]}...")
    var_c = parser.parse_hgvs_variant(cdna_hgvs)
    print(f"Parsed successfully: {var_c}")

    print("\nConverting to protein coordinates (this will fail)...")
    try:
        var_p = mapper.c_to_p(var_c)
        print(f"Protein variant: {var_p}")
    except IndexError as e:
        print("\n✓ Successfully reproduced IndexError!")
        print(f"Error message: {e}")
        raise


reproduce_bug()
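As a quick sanity check on the variant used above (plain Python, no hgvs calls), the sketch below counts the inserted bases and checks the reading frame. It assumes the usual c. numbering in which c.1 is the A of the start codon; the names `INSERTED` and `FIRST_BASE_AFTER_INSERTION` are only illustrative. It does not diagnose the bug, it only shows how the failing input differs from an in-frame one.

```python
# Inserted sequence copied from the reproduction script, split for readability.
INSERTED = (
    "ACTTCTGTAACACCAGATGTTAGTGACAATGAACCTGATC"
    "ATTATAGATATTCTGACACCACTGACTCTGATCCAGAGAA"
    "TGAACCTTTTGATGAAGATCAGCATACACAAATTACAAAA"
    "GTCTGAA"
)

# The insertion sits between c.1086 and c.1087 (assumption: c.1 is the A of ATG).
FIRST_BASE_AFTER_INSERTION = 1087

insert_length = len(INSERTED)
frame_remainder = insert_length % 3  # 0 would mean the reading frame is preserved
codon_number = (FIRST_BASE_AFTER_INSERTION - 1) // 3 + 1  # codon containing c.1087

print(f"Inserted bases: {insert_length}")
print(f"Length modulo 3: {frame_remainder}")
print(f"Codon containing c.{FIRST_BASE_AFTER_INSERTION}: {codon_number}")
```

The insertion is 127 bases long, one more than a multiple of three, and c.1087 is the first base of codon 363. An insertion one base shorter (126 bases, 42 codons) keeps the reading frame, which is consistent with the Thr363_Ter404 duplication reported for that case.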
Expected behavior
If there is a good reason for this variant to fail, it should fail more gracefully. Ideally, though, it should produce something like
"p.(Thr363_Ter404dup)",
which is the result for an insertion one base shorter.