AUTHOR=Corcuff Mélanie , Garibal Marc , Desvignes Jean-Pierre , Guien Céline , Grattepanche Coralie , Collod-Béroud Gwenaëlle , Ménoret Estelle , Salgado David , Béroud Christophe
TITLE=Protein domains provide a new layer of information for classifying human variations in rare diseases
JOURNAL=Frontiers in Bioinformatics
VOLUME=3
YEAR=2023
URL=https://www.frontiersin.org/journals/bioinformatics/articles/10.3389/fbinf.2023.1127341
DOI=10.3389/fbinf.2023.1127341
ISSN=2673-7647
ABSTRACT=
Introduction: When the ACMG-AMP guidelines are used to interpret sequence variants, it remains difficult to meet the criterion associated with the protein domain, PM1, which is assigned in only about 10% of cases, whereas the criteria related to variant frequency, PM2/BA1/BS1, are reported in 50% of cases. To improve the classification of human missense variants using protein domain information, we developed the DOLPHIN system (https://dolphin.mmg-gbit.eu).
Methods: We used Pfam alignments of eukaryotes to define DOLPHIN scores that identify protein domain residues and variants with a significant impact. In parallel, we enriched gnomAD variant frequencies for each domain residue. These were validated using ClinVar data.
Results: We applied this method to all potential variants of human transcripts: 30.0% were assigned a PM1 label, whereas 33.2% were eligible for a new benign support criterion, BP8. We also showed that DOLPHIN provides an extrapolated frequency for 31.8% of the variants, whereas an original frequency is available in gnomAD for only 7.6% of them.
Discussion: Overall, DOLPHIN allows a simplified use of the PM1 criterion, an expanded application of the PM2/BS1 criteria and the creation of a new BP8 criterion. DOLPHIN could facilitate the classification of amino acid substitutions in protein domains that cover nearly 40% of proteins and represent the sites of most pathogenic variants.