1
Mohammed EEA, Fayez AG, Abdelfattah NM, Fateen E. Novel gene-specific Bayesian Gaussian mixture model to predict the missense variants pathogenicity of Sanfilippo syndrome. Sci Rep 2024; 14:12148. [PMID: 38802532 PMCID: PMC11130188 DOI: 10.1038/s41598-024-62352-0] [Received: 10/26/2023] [Accepted: 05/16/2024] [Indexed: 05/29/2024]
Abstract
Mucopolysaccharidosis type III (MPS III, Sanfilippo syndrome) is an autosomal recessive lysosomal storage disease caused mainly by missense variants in the NAGLU, GNS, HGSNAT, and SGSH genes. The pathogenicity interpretation of missense variants is still challenging. We aimed to develop unsupervised clustering-based pathogenicity predictor scores using extracted features from eight in silico predictors to predict the impact of novel missense variants of Sanfilippo syndrome. The model was trained on a dataset consisting of 415 NAGLU missense variants of uncertain significance (VUS). The performance of the SanfilippoPred tool was evaluated on validation and test datasets consisting of 197 labelled NAGLU missense variants, and it was compared with individual pathogenicity predictors using receiver operating characteristic (ROC) analysis. Moreover, we tested the SanfilippoPred tool on an additional 427 labelled missense variants to assess its specificity and sensitivity threshold. Application of the trained machine learning (ML) model to the test dataset of labelled NAGLU missense variants showed that SanfilippoPred has an accuracy of 0.93 (95% CI, 0.86-0.97), sensitivity of 0.93, and specificity of 0.92. SanfilippoPred performed better (AUC = 0.908) than the individual predictors SIFT (AUC = 0.756), PolyPhen-2 (AUC = 0.788), CADD (AUC = 0.568), REVEL (AUC = 0.548), MetaLR (AUC = 0.751), and AlphaMissense (AUC = 0.885). Testing on high-confidence labelled NAGLU variants showed that SanfilippoPred has an 85.7% sensitivity threshold. The poor correlation between the Sanfilippo syndrome phenotype and genotype creates a demand for a new tool to classify its missense variants. This study provides a tool for preventing the misinterpretation of missense variants of the Sanfilippo syndrome-relevant genes. Finally, ML-based pathogenicity predictors and Sanfilippo syndrome-specific prediction tools could be feasible and efficient pathogenicity predictors in the future.
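The accuracy, sensitivity, and specificity quoted in this abstract are standard confusion-matrix quantities. The following is a minimal sketch of how such figures are derived from binary calls; the function name and the toy labels are illustrative assumptions and are not data or code from the study.

```python
def confusion_metrics(y_true, y_pred):
    """Return sensitivity, specificity and accuracy for binary labels.

    Labels are assumed to be 1 (pathogenic) or 0 (benign).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),          # fraction of pathogenic variants detected
        "specificity": tn / (tn + fp),          # fraction of benign variants cleared
        "accuracy": (tp + tn) / len(pairs),     # fraction of all calls that are correct
    }


# Toy example (hypothetical labels, not taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1]
metrics = confusion_metrics(y_true, y_pred)
print(metrics)
```

The sketch assumes both classes are present; an empty class would make the corresponding ratio undefined.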
Affiliation(s)
- Eman E A Mohammed
  - Medical Molecular Genetics Department, Human Genetics and Genome Research Institute, National Research Centre, Giza, Egypt.
- Alaaeldin G Fayez
  - Molecular Genetics and Enzymology Department, Human Genetics and Genome Research Institute, National Research Centre, Giza, Egypt
- Ekram Fateen
  - Biochemical Genetics Department, Human Genetics and Genome Research Institute, National Research Centre, Giza, Egypt
2
Brock DC, Wang M, Hussain HMJ, Rauch DE, Marra M, Pennesi ME, Yang P, Everett L, Ajlan RS, Colbert J, Porto FBO, Matynia A, Gorin MB, Koenekoop RK, Lopez I, Sui R, Zou G, Li Y, Chen R. Comparative analysis of in-silico tools in identifying pathogenic variants in dominant inherited retinal diseases. Hum Mol Genet 2024; 33:945-957. [PMID: 38453143 PMCID: PMC11102593 DOI: 10.1093/hmg/ddae028] [Received: 01/08/2024] [Revised: 02/16/2024] [Accepted: 02/19/2024] [Indexed: 03/09/2024]
Abstract
Inherited retinal diseases (IRDs) are a group of rare genetic eye conditions that cause blindness. Despite progress in identifying genes associated with IRDs, improvements are necessary for classifying rare autosomal dominant (AD) disorders. AD diseases are highly heterogeneous, with causal variants being restricted to specific amino acid changes within certain protein domains, making AD conditions difficult to classify. Here, we aim to determine the top-performing in-silico tools for predicting the pathogenicity of AD IRD variants. We annotated variants from ClinVar and benchmarked 39 variant classifier tools on IRD genes, split by inheritance pattern. Using area-under-the-curve (AUC) analysis, we determined the top-performing tools and defined thresholds for variant pathogenicity. Top-performing tools were assessed using genome sequencing on a cohort of participants with IRDs of unknown etiology. MutScore achieved the highest accuracy within AD genes, yielding an AUC of 0.969. When filtering for AD gain-of-function and dominant negative variants, BayesDel had the highest accuracy with an AUC of 0.997. Five participants with variants in NR2E3, RHO, GUCA1A, and GUCY2D were confirmed to have dominantly inherited disease based on pedigree, phenotype, and segregation analysis. We identified two uncharacterized variants in GUCA1A (c.428T>A, p.Ile143Thr) and RHO (c.631C>G, p.His211Asp) in three participants. Our findings support using a multi-classifier approach composed of new missense classifier tools to identify pathogenic variants in participants with AD IRDs. Our results provide a foundation for improved genetic diagnosis for people with IRDs.
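As an illustration of the AUC analysis and threshold definition mentioned in this abstract, the sketch below computes a ROC AUC as the probability that a pathogenic variant outscores a benign one, then picks a cut-off by maximizing Youden's J. The abstract does not state how thresholds were chosen, so Youden's J is an assumption here, and the scores and labels are invented toy values.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (pathogenic, benign) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A variant is called pathogenic when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        j = tp / n_pos - fp / n_neg  # equals sensitivity - (1 - specificity)
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Toy classifier scores (hypothetical): 1 = pathogenic, 0 = benign.
scores = [0.9, 0.8, 0.7, 0.3, 0.5, 0.4, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(scores, labels)
threshold, j_stat = youden_threshold(scores, labels)
print(auc, threshold, j_stat)
```

Other criteria, such as fixing a minimum specificity, would give different cut-offs from the same ROC curve.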
Affiliation(s)
- Daniel C Brock
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
  - Medical Scientist Training Program, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Meng Wang
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Hafiz Muhammad Jafar Hussain
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- David E Rauch
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Molly Marra
  - Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
- Mark E Pennesi
  - Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
- Paul Yang
  - Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
- Lesley Everett
  - Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
- Radwan S Ajlan
  - Department of Ophthalmology, University of Kansas School of Medicine, 3901 Rainbow Blvd, Kansas City, KS 66160, United States
- Jason Colbert
  - Department of Ophthalmology, University of Kansas School of Medicine, 3901 Rainbow Blvd, Kansas City, KS 66160, United States
- Fernanda Belga Ottoni Porto
  - INRET Clínica e Centro de Pesquisa, Rua dos Otoni, 735/507 - Santa Efigênia, Belo Horizonte, MG 30150270, Brazil
  - Department of Ophthalmology, Santa Casa de Misericórdia de Belo Horizonte, Av. Francisco Sales, 1111 - Santa Efigênia, Belo Horizonte, MG 30150221, Brazil
  - Centro Oftalmológico de Minas Gerais, R. Santa Catarina, 941 - Lourdes, Belo Horizonte, MG 30180070, Brazil
- Anna Matynia
  - College of Optometry, University of Houston, 4401 Martin Luther King Boulevard, Houston, TX 77004, United States
- Michael B Gorin
  - Jules Stein Eye Institute, University of California Los Angeles, 100 Stein Plaza, Los Angeles, CA 90095, United States
  - Department of Ophthalmology, University of California Los Angeles David Geffen School of Medicine, 10833 Le Conte Ave, Los Angeles, CA 90095, United States
- Robert K Koenekoop
  - McGill Ocular Genetics Laboratory and Centre, Department of Paediatric Surgery, Human Genetics, and Ophthalmology, McGill University Health Centre, 5252 Boul de Maisonneuve ouest, Montreal, QC H4A 3S5, Canada
- Irma Lopez
  - McGill Ocular Genetics Laboratory and Centre, Department of Paediatric Surgery, Human Genetics, and Ophthalmology, McGill University Health Centre, 5252 Boul de Maisonneuve ouest, Montreal, QC H4A 3S5, Canada
- Ruifang Sui
  - Department of Ophthalmology, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, WC67+HW Dongcheng, Beijing 100005, China
- Gang Zou
  - Department of Ophthalmology, Ningxia Eye Hospital, People's Hospital of Ningxia Hui Autonomous Region, First Affiliated Hospital of Northwest University for Nationalities, Ningxia Clinical Research Center on Diseases of Blindness in Eye, F4RJ+43 Xixia District, Yinchuan, Ningxia, China
- Yumei Li
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
- Rui Chen
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
3
Delinière A, Jaupart L, Janin A, Millat G, Boulin T, Andrini O, Chevalier P. Functional and clinical characterization of a novel homozygous KCNH2 missense variant in the pore region of Kv11.1 leading to a viable but severe long-QT syndrome. Gene 2024; 897:148076. [PMID: 38086455 DOI: 10.1016/j.gene.2023.148076] [Received: 09/29/2023] [Revised: 11/22/2023] [Accepted: 12/08/2023] [Indexed: 12/24/2023]
Abstract
BACKGROUND: Among KCNH2 missense loss of function (LOF) variants, homozygosity at any position in the Kv11.1/hERG channel is very rare and generally leads to intrauterine death, while heterozygous variants in the pore are responsible for severe Type 2 long-QT syndrome (LQTS). We report a novel homozygous p.Gly603Ser missense variant in the pore of Kv11.1/hERG (KCNH2 c.1807G>A) discovered in the context of a severe LQTS. METHODS: We carried out a phenotypic family study combined with a functional analysis of mutated and wild-type (WT) Kv11.1 by two-electrode voltage-clamp using the Xenopus laevis oocyte heterologous expression system. RESULTS: The variant resulted in a severe LQTS phenotype (very prolonged corrected QT interval, T-wave alternans, multiple Torsades de pointes) with a delayed clinical expression in later childhood in the homozygous state, and in a Type 2 LQTS phenotype in the heterozygous state. Expression of KCNH2 p.Gly603Ser cRNA alone elicited detectable current in Xenopus oocytes. Inactivation kinetics and voltage dependence of activation were not significantly affected by the variant. The macroscopic slope conductance of the variant was about three-fold lower than that of the WT (18.5 ± 9.01 vs 54.7 ± 17.2 μS, p < 0.001). CONCLUSIONS: We characterized the novel p.Gly603Ser KCNH2 missense LOF variant in the pore region of Kv11.1/hERG leading to a severe but viable LQTS in the homozygous state and an attenuated Type 2 LQTS in heterozygous carriers. To our knowledge, we provide the first description of a homozygous variant in the pore-forming region of Kv11.1 with a functional impact but a delayed clinical expression.
Affiliation(s)
- Antoine Delinière
  - National Reference Center for Inherited Arrhythmias of Lyon, Department of Cardiac Electrophysiology, Louis Pradel Hospital, Hospices Civils de Lyon, Lyon, France; University of Lyon, Claude Bernard Lyon 1 University, MeLiS, CNRS UMR 5284, INSERM U1314, Institut NeuroMyoGène, Lyon 69008, France
- Laureen Jaupart
  - University of Lyon, Claude Bernard Lyon 1 University, MeLiS, CNRS UMR 5284, INSERM U1314, Institut NeuroMyoGène, Lyon 69008, France
- Alexandre Janin
  - University of Lyon, Claude Bernard Lyon 1 University, MeLiS, CNRS UMR 5284, INSERM U1314, Institut NeuroMyoGène, Lyon 69008, France; Laboratoire de cardiogénétique moléculaire, Centre de biologie et pathologie Est, Hospices Civils de Lyon, Lyon, France
- Gilles Millat
  - University of Lyon, Claude Bernard Lyon 1 University, MeLiS, CNRS UMR 5284, INSERM U1314, Institut NeuroMyoGène, Lyon 69008, France; Laboratoire de cardiogénétique moléculaire, Centre de biologie et pathologie Est, Hospices Civils de Lyon, Lyon, France
- Thomas Boulin
  - University of Lyon, Claude Bernard Lyon 1 University, MeLiS, CNRS UMR 5284, INSERM U1314, Institut NeuroMyoGène, Lyon 69008, France
- Olga Andrini
  - University of Lyon, Claude Bernard Lyon 1 University, MeLiS, CNRS UMR 5284, INSERM U1314, Institut NeuroMyoGène, Lyon 69008, France.
- Philippe Chevalier
  - National Reference Center for Inherited Arrhythmias of Lyon, Department of Cardiac Electrophysiology, Louis Pradel Hospital, Hospices Civils de Lyon, Lyon, France; University of Lyon, Claude Bernard Lyon 1 University, MeLiS, CNRS UMR 5284, INSERM U1314, Institut NeuroMyoGène, Lyon 69008, France.
4
Zeng B, Liu DC, Huang JG, Xia XB, Qin B. PdmIRD: missense variants pathogenicity prediction for inherited retinal diseases in a disease-specific manner. Hum Genet 2024; 143:331-342. [PMID: 38478153 DOI: 10.1007/s00439-024-02645-6] [Received: 11/07/2023] [Accepted: 01/17/2024] [Indexed: 04/25/2024]
Abstract
Accurate discrimination of pathogenic and nonpathogenic variation remains an enormous challenge in clinical genetic testing of patients with inherited retinal diseases (IRDs). Computational methods for predicting variant pathogenicity are the main solutions for this dilemma. The majority of the state-of-the-art variant pathogenicity prediction tools disregard the differences in characteristics among different genes and treat all types of mutations equally. Since missense variants are the most common type of variation in the coding region of the human genome, we developed a novel missense mutation pathogenicity prediction tool, named Prediction of Deleterious Missense Mutation for IRDs (PdmIRD), in this study. PdmIRD was tailored for IRD-related genes and constructed with the conditional random forest model. Population frequencies and a newly available prediction tool were incorporated into PdmIRD to improve the performance of the model. The evaluation of PdmIRD demonstrated its superior performance over nonspecific tools (areas under the curves, 0.984 and 0.910) and an existing eye abnormalities-specific tool (areas under the curves, 0.975 and 0.891). We also demonstrated that a submodel using a smaller gene panel slightly improved performance further. Our study provides evidence that a disease-specific model can enhance the prediction of missense mutation pathogenicity, especially when new and important features are considered. Additionally, this study provides guidance for exploring the characteristics and functions of the mutated proteins in a greater number of Mendelian disorders.
Affiliation(s)
- Bing Zeng
  - Shenzhen Aier Eye Hospital, Aier Eye Hospital, Jinan University, Shenzhen, 518031, Guangdong, China
  - Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, 518031, Guangdong, China
  - Department of Ophthalmology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- Dong Cheng Liu
  - Shenzhen Aier Eye Hospital, Aier Eye Hospital, Jinan University, Shenzhen, 518031, Guangdong, China
  - Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, 518031, Guangdong, China
- Jian Guo Huang
  - Shenzhen Aier Eye Hospital, Aier Eye Hospital, Jinan University, Shenzhen, 518031, Guangdong, China
  - Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, 518031, Guangdong, China
- Xiao Bo Xia
  - Eye Center of Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China.
- Bo Qin
  - Shenzhen Aier Eye Hospital, Aier Eye Hospital, Jinan University, Shenzhen, 518031, Guangdong, China.
  - Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, 518031, Guangdong, China.
  - Aier School of Ophthalmology, Central South University, Changsha, Hunan, China.
5
Moldenhauer HJ, Tammen K, Meredith AL. Structural mapping of patient-associated KCNMA1 gene variants. Biophys J 2023:S0006-3495(23)04120-6. [PMID: 38042986 DOI: 10.1016/j.bpj.2023.11.3404] [Received: 09/08/2023] [Revised: 11/30/2023] [Accepted: 11/30/2023] [Indexed: 12/04/2023]
Abstract
KCNMA1-linked channelopathy is a neurological disorder characterized by seizures, motor abnormalities, and neurodevelopmental disabilities. The disease mechanisms are predicted to result from alterations in KCNMA1-encoded BK K+ channel activity; however, only a subset of the patient-associated variants have been functionally studied. The localization of these variants within the tertiary structure or evaluation by pathogenicity algorithms has not been systematically assessed. In this study, 82 nonsynonymous patient-associated KCNMA1 variants were mapped within the BK channel protein. Fifty-three variants localized within cryoelectron microscopy-resolved structures, including 21 classified as either gain of function (GOF) or loss of function (LOF) in BK channel activity. Clusters of LOF variants were identified in the pore, the AC region (RCK1), and near the Ca2+ bowl (RCK2), overlapping with sites of pharmacological or endogenous modulation. However, no clustering was found for GOF variants. To further understand variants of uncertain significance (VUSs), assessments by multiple standard pathogenicity algorithms were compared, and new thresholds for sensitivity and specificity were established from confirmed GOF and LOF variants. An ensemble algorithm was constructed (KCNMA1 meta score (KMS)), consisting of a weighted summation of this trained dataset combined with a structural component derived from the Ca2+-bound and unbound BK channels. KMS assessment differed from the highest-performing individual algorithm (REVEL) at 10 VUS residues, and a subset were studied further by electrophysiology in HEK293 cells. M578T, E656A, and D965V (KMS+;REVEL-) were confirmed to alter BK channel properties in voltage-clamp recordings, and D800Y (KMS-;REVEL+) was assessed as benign under the test conditions. However, KMS failed to accurately assess K457E. These combined results reveal the distribution of potentially disease-causing KCNMA1 variants within BK channel functional domains and pathogenicity evaluation for VUSs, suggesting strategies for improving channel-level predictions in future studies by building on ensemble algorithms such as KMS.
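The abstract describes the ensemble only as a weighted summation of algorithm assessments combined with a structural component, without giving the formula. The following is a rough sketch of one way such a meta score could be assembled; the class, the function, the normalization, and all thresholds and weights are hypothetical and should not be read as the published KMS definition.

```python
from dataclasses import dataclass


@dataclass
class AlgorithmCall:
    name: str         # placeholder algorithm label
    score: float      # raw score reported by the algorithm
    threshold: float  # hypothetical cut-off above which the call is "pathogenic"
    weight: float     # hypothetical weight, e.g. reflecting benchmark performance


def meta_score(calls, structural_component, structural_weight=1.0):
    """Weighted vote of binary calls plus a structural term, scaled to [0, 1].

    structural_component is assumed to be a value in [0, 1] summarizing
    structure-derived evidence for the residue.
    """
    weighted_votes = sum(
        c.weight * (1.0 if c.score >= c.threshold else 0.0) for c in calls
    )
    total_weight = sum(c.weight for c in calls) + structural_weight
    return (weighted_votes + structural_weight * structural_component) / total_weight


# Toy inputs (hypothetical values for a single variant).
calls = [
    AlgorithmCall("algorithm_a", score=0.8, threshold=0.5, weight=2.0),
    AlgorithmCall("algorithm_b", score=0.2, threshold=0.5, weight=1.0),
    AlgorithmCall("algorithm_c", score=0.9, threshold=0.7, weight=1.0),
]
score = meta_score(calls, structural_component=0.5)
print(score)
```

Dividing by the total weight keeps the result comparable across variants even if an algorithm is missing for some of them, which is one possible design choice among several.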
Affiliation(s)
- Hans J Moldenhauer
  - Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland
- Kelly Tammen
  - Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland
- Andrea L Meredith
  - Department of Physiology, University of Maryland School of Medicine, Baltimore, Maryland.
6
Huang S, Wu Z, Wang T, Yu R, Song Z, Wang H. MmisAT and MmisP: an efficient and accurate suite of variant analysis toolkit for primary mitochondrial diseases. Hum Genomics 2023; 17:108. [PMID: 38012712 PMCID: PMC10683248 DOI: 10.1186/s40246-023-00557-6] [Received: 02/26/2023] [Accepted: 11/22/2023] [Indexed: 11/29/2023]
Abstract
Recent advances in next-generation sequencing (NGS) technology have greatly increased the need for efficient annotation to accurately interpret clinically relevant genetic variants in human diseases. Therefore, it is crucial to develop appropriate analytical tools to improve the interpretation of disease variants. Given the unique genetic characteristics of mitochondria, including haplogroup, heteroplasmy, and maternal inheritance, we developed a suite of variant analysis toolkits specifically designed for primary mitochondrial diseases: the Mitochondrial Missense Variant Annotation Tool (MmisAT) and the Mitochondrial Missense Variant Pathogenicity Predictor (MmisP). MmisAT can handle protein-coding variants from both nuclear DNA and mtDNA and generate 349 annotation types across six categories. It processes 4.78 million variants in 76 min, making it a valuable resource for clinical and research applications. Additionally, MmisP provides pathogenicity scores to predict the pathogenicity of genetic variations in mitochondrial disease. It has been validated using cross-validation and external datasets and demonstrated higher overall discriminant accuracy, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.94, outperforming existing pathogenicity predictors. In conclusion, MmisAT is an efficient tool that greatly facilitates the process of variant annotation, expanding the scope of variant annotation information. Furthermore, the development of MmisP provides valuable insights into the creation of disease-specific, phenotype-specific, and even gene-specific predictors of pathogenicity, further advancing our understanding of specific fields.
Affiliation(s)
- Shuangshuang Huang
  - Department of Clinical Laboratory, Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China
- Zhaoyu Wu
  - Department of Clinical Laboratory, The Affiliated Hospital of Guangdong Medical University, Zhanjiang, China
- Tong Wang
  - Department of Clinical Laboratory, Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China
- Rui Yu
  - Department of Ophthalmology, Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China
- Zhijian Song
  - OrigiMed, 5th Floor, Building 3, No.115 Xin Jun Huan Road, Minhang District, Shanghai, China.
- Hao Wang
  - Department of Clinical Laboratory, Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China.
7
Moldenhauer HJ, Tammen K, Meredith AL. Structural mapping of patient-associated KCNMA1 gene variants. bioRxiv 2023:2023.07.27.550850. [PMID: 37546746 PMCID: PMC10402178 DOI: 10.1101/2023.07.27.550850] [Indexed: 08/08/2023]
Abstract
KCNMA1-linked channelopathy is a neurological disorder characterized by seizures, motor abnormalities, and neurodevelopmental disabilities. The disease mechanisms are predicted to result from alterations in KCNMA1-encoded BK K+ channel activity; however, only a subset of the patient-associated variants have been functionally studied. The localization of these variants within the tertiary structure or evaluation by pathogenicity algorithms has not been systematically assessed. In this study, 82 nonsynonymous patient-associated KCNMA1 variants were mapped within the BK channel protein. Fifty-three variants localized within cryo-EM resolved structures, including 21 classified as either gain-of-function (GOF) or loss-of-function (LOF) in BK channel activity. Clusters of LOF variants were identified in the pore, the AC region (RCK1), and near the Ca2+ bowl (RCK2), overlapping with sites of pharmacological or endogenous modulation. However, no clustering was found for GOF variants. To further understand variants of uncertain significance (VUS), assessments by multiple standard pathogenicity algorithms were compared, and new thresholds for sensitivity and specificity were established from confirmed GOF and LOF variants. An ensemble algorithm was constructed (KCNMA1 Meta Score, KMS), consisting of a weighted summation of this trained dataset combined with a structural component derived from the Ca2+-bound and unbound BK channels. KMS assessment differed from the highest performing individual algorithm (REVEL) at 10 VUS residues, and a subset were studied further by electrophysiology in HEK293 cells. M578T, E656A, and D965V (KMS+;REVEL-) were confirmed to alter BK channel properties in voltage-clamp recordings, and D800Y (KMS-;REVEL+) was assessed as benign under the test conditions. However, KMS failed to accurately assess K457E. These combined results reveal the distribution of potentially disease-causing KCNMA1 variants within BK channel functional domains and pathogenicity evaluation for VUS, suggesting strategies for improving channel-level predictions in future studies by building on ensemble algorithms such as KMS.
8
Palmieri G, D’Ambrosio MF, Correale M, Brunetti ND, Santacroce R, Iacoviello M, Margaglione M. The Role of Genetics in the Management of Heart Failure Patients. Int J Mol Sci 2023; 24:15221. [PMID: 37894902 PMCID: PMC10607512 DOI: 10.3390/ijms242015221] [Received: 09/15/2023] [Revised: 10/09/2023] [Accepted: 10/13/2023] [Indexed: 10/29/2023]
Abstract
Over the last decades, the relevance of genetics in cardiovascular diseases has expanded, especially in the context of cardiomyopathies. Its relevance extends to the management of patients diagnosed with heart failure (HF), given its capacity to provide invaluable insights into the etiology of cardiomyopathies and identify individuals at a heightened risk of poor outcomes. Notably, the identification of an etiological genetic variant necessitates a comprehensive evaluation of the family lineage of the affected patients. In the future, these genetic variants hold potential as therapeutic targets with the capability to modify gene expression. In this complex setting, collaboration among cardiologists, specifically those specializing in cardiomyopathies and HF, and geneticists becomes paramount to improving individual and family health outcomes, as well as therapeutic clinical results. This review is intended to offer geneticists and cardiologists an updated perspective on the value of genetic research in HF and its implications in clinical practice.
Affiliation(s)
- Gianpaolo Palmieri
  - School of Cardiology, Department of Medical and Surgical Sciences, University of Foggia, 70122 Foggia, Italy; (G.P.); (M.C.); (N.D.B.)
- Maria Francesca D’Ambrosio
  - Medical Genetics, Department of Clinical and Experimental Medicine, University of Foggia, 70122 Foggia, Italy; (M.F.D.); (R.S.); (M.M.)
- Michele Correale
  - School of Cardiology, Department of Medical and Surgical Sciences, University of Foggia, 70122 Foggia, Italy; (G.P.); (M.C.); (N.D.B.)
- Natale Daniele Brunetti
  - School of Cardiology, Department of Medical and Surgical Sciences, University of Foggia, 70122 Foggia, Italy; (G.P.); (M.C.); (N.D.B.)
- Rosa Santacroce
  - Medical Genetics, Department of Clinical and Experimental Medicine, University of Foggia, 70122 Foggia, Italy; (M.F.D.); (R.S.); (M.M.)
- Massimo Iacoviello
  - University Cardiology Unit, Polyclinic Hospital of Bari, 70124 Bari, Italy
- Maurizio Margaglione
  - Medical Genetics, Department of Clinical and Experimental Medicine, University of Foggia, 70122 Foggia, Italy; (M.F.D.); (R.S.); (M.M.)
9
Shirvanizadeh N, Vihinen M. VariBench, new variation benchmark categories and data sets. Frontiers in Bioinformatics 2023; 3:1248732. [PMID: 37795169 PMCID: PMC10546188 DOI: 10.3389/fbinf.2023.1248732] [Received: 06/27/2023] [Accepted: 09/08/2023] [Indexed: 10/06/2023]
Affiliation(s)
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, Lund, Sweden
10
Doh CY, Kampourakis T, Campbell KS, Stelzer JE. Basic science methods for the characterization of variants of uncertain significance in hypertrophic cardiomyopathy. Front Cardiovasc Med 2023; 10:1238515. [PMID: 37600050 PMCID: PMC10432852 DOI: 10.3389/fcvm.2023.1238515] [Received: 06/11/2023] [Accepted: 07/20/2023] [Indexed: 08/22/2023]
Abstract
With the advent of next-generation whole genome sequencing, many variants of uncertain significance (VUS) have been identified in individuals suffering from inheritable hypertrophic cardiomyopathy (HCM). Unfortunately, this classification of a genetic variant results in ambiguity in interpretation, risk stratification, and clinical practice. Here, we aim to review some basic science methods to gain a more accurate characterization of VUS in HCM. Currently, many genomic data-based computational methods have been developed and validated against each other to provide a robust set of resources for researchers. With the continual improvement in computing speed and accuracy, in silico molecular dynamic simulations can also be applied in mutational studies and provide valuable mechanistic insights. In addition, high throughput in vitro screening can provide more biologically meaningful insights into the structural and functional effects of VUS. Lastly, multi-level mathematical modeling can predict how the mutations could cause clinically significant organ-level dysfunction. We discuss emerging technologies that will aid in better VUS characterization and offer a possible basic science workflow for exploring the pathogenicity of VUS in HCM. Although the focus of this mini review was on HCM, these basic science methods can be applied to research in dilated cardiomyopathy (DCM), restrictive cardiomyopathy (RCM), arrhythmogenic cardiomyopathy (ACM), or other genetic cardiomyopathies.
Affiliation(s)
- Chang Yoon Doh
  - School of Medicine, Case Western Reserve University, Cleveland, OH, United States
- Thomas Kampourakis
  - Randall Centre for Cell and Molecular Biophysics, and British Heart Foundation Centre of Research Excellence, King’s College London, London, United Kingdom
- Kenneth S. Campbell
  - Division of Cardiovascular Medicine, University of Kentucky, Lexington, KY, United States
- Julian E. Stelzer
  - Department of Physiology and Biophysics, School of Medicine, Case Western Reserve University, Cleveland, OH, United States
11
Kang M, Kim S, Lee DB, Hong C, Hwang KB. Gene-specific machine learning for pathogenicity prediction of rare BRCA1 and BRCA2 missense variants. Sci Rep 2023; 13:10478. [PMID: 37380723 DOI: 10.1038/s41598-023-37698-6] [Received: 03/21/2023] [Accepted: 06/26/2023] [Indexed: 06/30/2023]
Abstract
Machine learning-based pathogenicity prediction helps interpret rare missense variants of BRCA1 and BRCA2, which are associated with hereditary cancers. Recent studies have shown that classifiers trained using variants of a specific gene or a set of genes related to a particular disease perform better than those trained using all variants, due to their higher specificity, despite the smaller training dataset size. In this study, we further investigated the advantages of "gene-specific" machine learning compared to "disease-specific" machine learning. We used 1068 rare (gnomAD minor allele frequency (MAF) < 0.005) missense variants of 28 genes associated with hereditary cancers for our investigation. Popular machine learning classifiers were employed: regularized logistic regression, extreme gradient boosting, random forests, support vector machines, and deep neural networks. As features, we used MAFs from multiple populations, functional prediction and conservation scores, and positions of variants. The disease-specific training dataset included the gene-specific training dataset and was more than 7 times larger. However, we observed that gene-specific training variants were sufficient to produce the optimal pathogenicity predictor if a suitable machine learning classifier was employed. Therefore, we recommend gene-specific over disease-specific machine learning as an efficient and effective method for predicting the pathogenicity of rare BRCA1 and BRCA2 missense variants.
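To make the distinction between the two training sets concrete, the sketch below applies the rarity filter from the abstract (MAF < 0.005) and then derives a disease-specific set (all panel genes) and a gene-specific subset (BRCA1 and BRCA2 only). The record layout, the helper names, and the toy variants are assumptions for illustration and are not taken from the study's data.

```python
MAF_CUTOFF = 0.005  # rarity cut-off quoted in the abstract

# Toy variant records (hypothetical values, not real annotations).
variants = [
    {"gene": "BRCA1", "maf": 0.0001, "label": "pathogenic"},
    {"gene": "BRCA2", "maf": 0.0100, "label": "benign"},  # too common, filtered out
    {"gene": "ATM", "maf": 0.0004, "label": "benign"},
    {"gene": "BRCA2", "maf": 0.0020, "label": "pathogenic"},
]


def rare_variants(records, cutoff=MAF_CUTOFF):
    """Keep variants whose minor allele frequency is below the cut-off."""
    return [v for v in records if v["maf"] < cutoff]


def gene_specific(records, genes):
    """Keep variants located in one of the target genes."""
    return [v for v in records if v["gene"] in genes]


# Disease-specific set: every rare variant in the gene panel.
disease_set = rare_variants(variants)
# Gene-specific set: the subset of those variants in the genes of interest.
gene_set = gene_specific(disease_set, {"BRCA1", "BRCA2"})
print(len(disease_set), len(gene_set))
```

By construction the gene-specific set is contained in the disease-specific set, which mirrors the relationship described in the abstract.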
Affiliation(s)
- Moonjong Kang
  - Research Center, Software Division, NGeneBio, Seoul, 08390, Korea
- Seonhwa Kim
  - Research Center, Software Division, NGeneBio, Seoul, 08390, Korea
- Da-Bin Lee
  - Department of Computer Science and Engineering, Graduate School, Soongsil University, Seoul, 06978, Korea
- Changbum Hong
  - Research Center, Software Division, NGeneBio, Seoul, 08390, Korea.
- Kyu-Baek Hwang
  - Department of Computer Science and Engineering, Graduate School, Soongsil University, Seoul, 06978, Korea.
12
Wang L, Sun J, Ma S, Xia J, Li X. PredDSMC: A predictor for driver synonymous mutations in human cancers. Front Genet 2023; 14:1164593. [PMID: 37051593 PMCID: PMC10083435 DOI: 10.3389/fgene.2023.1164593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 03/09/2023] [Indexed: 03/29/2023]
Abstract
Introduction: Driver mutations play a critical role in the occurrence and development of human cancers. Most studies have focused on missense mutations that function as drivers in cancer. However, accumulating experimental evidence indicates that synonymous mutations can also act as driver mutations. Methods: Here, we proposed a computational method called PredDSMC to accurately predict driver synonymous mutations in human cancers. We first systematically explored four categories of multimodal features, including sequence features, splicing features, conservation scores, and functional scores. Further feature selection was carried out to remove redundant features and improve the model performance. Finally, we utilized the random forest classifier to build PredDSMC. Results: The results of two independent test sets indicated that PredDSMC outperformed the state-of-the-art methods in differentiating driver synonymous mutations from passenger mutations. Discussion: In conclusion, we expect that PredDSMC, as a driver synonymous mutation prediction method, will be a valuable method for gaining a deeper understanding of synonymous mutations in human cancers.
13
O'Neill MJ, Sala L, Denjoy I, Wada Y, Kozek K, Crotti L, Dagradi F, Kotta MC, Spazzolini C, Leenhardt A, Salem JE, Kashiwa A, Ohno S, Tao R, Roden DM, Horie M, Extramiana F, Schwartz PJ, Kroncke BM. Continuous Bayesian variant interpretation accounts for incomplete penetrance among Mendelian cardiac channelopathies. Genet Med 2023; 25:100355. [PMID: 36496179 PMCID: PMC9992222 DOI: 10.1016/j.gim.2022.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/11/2022] [Revised: 12/05/2022] [Accepted: 12/05/2022] [Indexed: 12/12/2022]
Abstract
PURPOSE The congenital Long QT Syndrome (LQTS) and Brugada Syndrome (BrS) are Mendelian autosomal dominant diseases that frequently precipitate fatal cardiac arrhythmias. Incomplete penetrance is a barrier to clinical management of heterozygotes harboring variants in the major implicated disease genes KCNQ1, KCNH2, and SCN5A. We apply and evaluate a Bayesian penetrance estimation strategy that accounts for this phenomenon. METHODS We generated Bayesian penetrance models for KCNQ1-LQT1 and SCN5A-LQT3 using variant-specific features and clinical data from the literature, international arrhythmia genetic centers, and population controls. We analyzed the distribution of posterior penetrance estimates across 4 genotype-phenotype relationships and compared continuous estimates with ClinVar annotations. Posterior estimates were mapped onto protein structure. RESULTS Bayesian penetrance estimates of KCNQ1-LQT1 and SCN5A-LQT3 are empirically equivalent to 10 and 5 clinically phenotyped heterozygotes, respectively. Posterior penetrance estimates were bimodal for KCNQ1-LQT1 and KCNH2-LQT2, with a higher fraction of missense variants with high penetrance among KCNQ1 variants. There was a wide distribution of variant penetrance estimates among identical ClinVar categories. Structural mapping revealed heterogeneity among "hot spot" regions and featured high penetrance estimates for KCNQ1 variants in contact with calmodulin and the S6 domain. CONCLUSIONS Bayesian penetrance estimates provide a continuous framework for variant interpretation.
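One way to read an estimate that is "empirically equivalent" to a given number of heterozygotes is as a Beta prior carrying that many pseudo-observations, updated with observed carriers. The following is a minimal sketch under that assumption; the function name, the prior mean of 0.3, and the carrier counts are hypothetical and are not taken from the study.

```python
def posterior_penetrance(prior_mean, prior_strength, affected, total):
    """Posterior mean penetrance under a Beta-binomial model (illustrative only).

    prior_strength is the number of pseudo-observations the prior is worth,
    e.g. 10 for a prior treated as equivalent to 10 phenotyped heterozygotes.
    """
    if not 0.0 <= prior_mean <= 1.0:
        raise ValueError("prior_mean must lie in [0, 1]")
    if affected < 0 or affected > total:
        raise ValueError("affected must lie between 0 and total")
    alpha = prior_mean * prior_strength          # pseudo-count of affected carriers
    beta = (1.0 - prior_mean) * prior_strength   # pseudo-count of unaffected carriers
    return (alpha + affected) / (alpha + beta + total)


# Hypothetical variant: prior mean 0.3 worth 10 carriers, then 4 of 5 observed carriers affected.
estimate = posterior_penetrance(prior_mean=0.3, prior_strength=10, affected=4, total=5)
print(round(estimate, 3))
```

With no observed carriers the function returns the prior mean, and the prior's influence shrinks as the number of observed carriers grows beyond the prior strength.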
Affiliation(s)
- Matthew J O'Neill
  - Vanderbilt University School of Medicine, Medical Scientist Training Program, Vanderbilt University, Nashville, TN
- Luca Sala
  - IRCCS, Istituto Auxologico Italiano, Center for Cardiac Arrhythmias of Genetic Origin and Laboratory of Cardiovascular Genetics, Milano, Italy
- Isabelle Denjoy
  - Department of Cardiovascular Medicine, Hôpital Bichat, APHP, Université de Paris Cité, Paris, France
- Yuko Wada
  - Vanderbilt Center for Arrhythmia Research and Therapeutics (VanCART), Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Krystian Kozek
  - Vanderbilt University School of Medicine, Medical Scientist Training Program, Vanderbilt University, Nashville, TN
- Lia Crotti
  - IRCCS, Istituto Auxologico Italiano, Center for Cardiac Arrhythmias of Genetic Origin and Laboratory of Cardiovascular Genetics, Milano, Italy
- Federica Dagradi
  - IRCCS, Istituto Auxologico Italiano, Center for Cardiac Arrhythmias of Genetic Origin and Laboratory of Cardiovascular Genetics, Milano, Italy
- Maria-Christina Kotta
  - IRCCS, Istituto Auxologico Italiano, Center for Cardiac Arrhythmias of Genetic Origin and Laboratory of Cardiovascular Genetics, Milano, Italy
- Carla Spazzolini
  - IRCCS, Istituto Auxologico Italiano, Center for Cardiac Arrhythmias of Genetic Origin and Laboratory of Cardiovascular Genetics, Milano, Italy
- Antoine Leenhardt
  - Department of Cardiovascular Medicine, Hôpital Bichat, APHP, Université de Paris Cité, Paris, France
- Joe-Elie Salem
  - Department of Cardiovascular Medicine, Hôpital Bichat, APHP, Université de Paris Cité, Paris, France
- Asami Kashiwa
  - Department of Cardiovascular Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
- Seiko Ohno
  - Department of Bioscience and Genetics, National Cerebral and Cardiovascular Center, Osaka, Japan
- Ran Tao
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN
- Dan M Roden
  - Vanderbilt Center for Arrhythmia Research and Therapeutics (VanCART), Departments of Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Minoru Horie
  - Department of Cardiovascular Medicine, Shiga University of Medical Science, Shiga, Japan
- Fabrice Extramiana
  - Department of Cardiovascular Medicine, Hôpital Bichat, APHP, Université de Paris Cité, Paris, France
- Peter J Schwartz
  - IRCCS, Istituto Auxologico Italiano, Center for Cardiac Arrhythmias of Genetic Origin and Laboratory of Cardiovascular Genetics, Milano, Italy
- Brett M Kroncke
  - Vanderbilt Center for Arrhythmia Research and Therapeutics (VanCART), Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN.
14
Zhang H, Xu MS, Fan X, Chung WK, Shen Y. Predicting functional effect of missense variants using graph attention neural networks. Nat Mach Intell 2022; 4:1017-1028. [PMID: 37484202 PMCID: PMC10361701 DOI: 10.1038/s42256-022-00561-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/09/2021] [Accepted: 10/07/2022] [Indexed: 11/16/2022]
Abstract
Accurate prediction of damaging missense variants is critically important for interpreting a genome sequence. Although many methods have been developed, their performance has been limited. Recent advances in machine learning and the availability of large-scale population genomic sequencing data provide new opportunities to considerably improve computational predictions. Here we describe the graphical missense variant pathogenicity predictor (gMVP), a new method based on graph attention neural networks. Its main component is a graph with nodes that capture predictive features of amino acids and edges weighted by co-evolution strength, enabling effective pooling of information from the local protein context and functionally correlated distal positions. Evaluation of deep mutational scan data shows that gMVP outperforms other published methods in identifying damaging variants in TP53, PTEN, BRCA1 and MSH2. Furthermore, it achieves the best separation of de novo missense variants in neurodevelopmental disorder cases from those in controls. Finally, the model supports transfer learning to optimize gain- and loss-of-function predictions in sodium and calcium channels. In summary, we demonstrate that gMVP can improve interpretation of missense variants in clinical testing and genetic studies.
Affiliation(s)
- Haicang Zhang
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Xiao Fan
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Department of Pediatrics, Columbia University, New York, NY, USA
- Wendy K. Chung
  - Department of Pediatrics, Columbia University, New York, NY, USA
  - Department of Medicine, Columbia University, New York, NY, USA
- Yufeng Shen
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
  - JP Sulzberger Columbia Genome Center, Columbia University, New York, NY, USA
15
The Cancermuts software package for the prioritization of missense cancer variants: a case study of AMBRA1 in melanoma. Cell Death Dis 2022; 13:872. [PMID: 36243772 PMCID: PMC9569343 DOI: 10.1038/s41419-022-05318-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/25/2022] [Revised: 09/27/2022] [Accepted: 10/03/2022] [Indexed: 11/07/2022]
Abstract
Cancer genomics and cancer mutation databases have made available a wealth of information about missense mutations found in cancer patient samples. Contextualizing these mutations by means of annotation and predicting the effect of the amino acid change helps identify which ones are more likely to have a pathogenic impact. Those can be validated by means of experimental approaches that assess the impact of protein mutations on cellular functions or their tumorigenic potential. Here, we propose the integrative bioinformatic approach Cancermuts, implemented as a Python package. Cancermuts is able to gather known missense cancer mutations from databases such as cBioPortal and COSMIC, and annotate them with the pathogenicity score REVEL as well as information on their source. It is also able to add annotations about the protein context these mutations are found in, such as post-translational modification sites, structured/unstructured regions, presence of short linear motifs, and more. We applied Cancermuts to the intrinsically disordered protein AMBRA1, a key regulator of many cellular processes frequently deregulated in cancer. By these means, we classified mutations of AMBRA1 in melanoma, where AMBRA1 is highly mutated and displays a tumor-suppressive role. Next, based on REVEL score, position along the sequence, and their local context, we applied cellular and molecular approaches to validate the predicted pathogenicity of a subset of mutations in an in vitro melanoma model. By doing so, we have identified two AMBRA1 mutations which show enhanced tumorigenic potential and are worth further investigation, highlighting the usefulness of the tool. Cancermuts can be used on any protein target starting from minimal information, and it is available at https://www.github.com/ELELAB/cancermuts as free software.
16
Newey PJ. Approach to the patient with a variant of uncertain significance on genetic testing. Clin Endocrinol (Oxf) 2022; 97:400-408. [PMID: 35996232 DOI: 10.1111/cen.14818] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/23/2022] [Revised: 08/12/2022] [Accepted: 08/13/2022] [Indexed: 11/29/2022]
Abstract
Establishing a genetic diagnosis may lead to major health benefits for the patient and their wider family, but is dependent on the accurate interpretation of test results. The processes of variant interpretation are by their nature imprecise, such that the potential for uncertain test results (i.e., variant(s) of uncertain significance [VUS]) is an inevitable consequence of genomic testing. With an increased responsibility for diagnostic testing in the hands of the specialty physician (e.g., endocrinologist) rather than the clinical geneticist, it is essential that they are familiar with the possible outcomes of testing, including an understanding of the VUS category. While uncertainty is endemic to many aspects of clinical medicine, receiving a VUS result may pose a considerable challenge to both the clinician and the patient. In this article, a framework to support decision-making when confronted with a VUS is provided, focusing on the key components of the genetic testing pathway. This highlights the importance of assessing the VUS result in the context of the clinical presentation and genetic testing strategy, the value of multidisciplinary team working, and ensuring good communication with the patient.
Affiliation(s)
- Paul J Newey
  - Division of Molecular and Clinical Medicine, Ninewells Hospital & Medical School, University of Dundee, Dundee, Scotland, UK
17
Liu Y, Yeung WSB, Chiu PCN, Cao D. Computational approaches for predicting variant impact: An overview from resources, principles to applications. Front Genet 2022; 13:981005. [PMID: 36246661 PMCID: PMC9559863 DOI: 10.3389/fgene.2022.981005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022]
Abstract
One objective of human genetics is to unveil the variants that contribute to human diseases. With the rapid development and wide use of next-generation sequencing (NGS), massive genomic sequence data have been created, making personal genetic information available. Conventional experimental evidence is critical in establishing the relationship between sequence variants and phenotype, but it is obtained with low efficiency. Because comprehensive databases and resources presenting clinical and experimental evidence on the genotype-phenotype relationship are lacking, and because variants found by NGS keep accumulating, many computational tools that predict the impact of variants on phenotype have been developed to bridge the gap. In this review, we present a brief introduction and discussion of the computational approaches for variant impact prediction. We mainly focus on approaches for predicting the impact of non-synonymous variants (nsSNVs) and categorize them into six classes. Their underlying rationale and constraints, together with the concerns and remedies raised by comparative studies, are discussed. We also present how the predictive approaches are employed in different research settings. Although diverse constraints exist, computational predictive approaches are indispensable in exploring the genotype-phenotype relationship.
Affiliation(s)
- Ye Liu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- William S. B. Yeung
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Philip C. N. Chiu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
  - *Correspondence: Philip C. N. Chiu; Dandan Cao
- Dandan Cao
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - *Correspondence: Philip C. N. Chiu; Dandan Cao
18
Zhang B, Fan T. Knowledge structure and emerging trends in the application of deep learning in genetics research: A bibliometric analysis [2000–2021]. Front Genet 2022; 13:951939. [PMID: 36081985 PMCID: PMC9445221 DOI: 10.3389/fgene.2022.951939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 07/13/2022] [Indexed: 11/13/2022]
Abstract
Introduction: Deep learning technology has been widely used in genetic research because of its characteristics of computability, statistical analysis, and predictability. Herein, we aimed to summarize standardized knowledge and potentially innovative approaches for deep learning applications of genetics by evaluating publications to encourage more research. Methods: The Science Citation Index Expanded (SCIE) database was searched for publications on deep learning applications in genomics. Original articles and reviews were considered. In this study, we derived a clustered network from 69,806 references that were cited by the 1,754 related manuscripts identified. We used CiteSpace and VOSviewer to identify countries, institutions, journals, co-cited references, keywords, subject evolution, path, current characteristics, and emerging topics. Results: We assessed the rapidly increasing number of publications on deep learning applications in genomics and identified 1,754 articles focusing on this subject. A total of 101 countries and 2,487 institutes contributed publications. The United States of America had the most publications (728/1,754) and the highest h-index, and it has collaborated closely with China and Germany. The reference clusters of SCI articles fell into seven categories: deep learning, logic regression, variant prioritization, random forests, scRNA-seq (single-cell RNA-seq), genomic regulation, and recombination. The keywords representing the research frontiers by year were prediction (2016–2021), sequence (2017–2021), mutation (2017–2021), and cancer (2019–2021). Conclusion: Here, we summarized the current literature related to the status of deep learning for genetics applications and analyzed the current research characteristics and future trajectories in this field. This work aims to provide resources for further intensive exploration and to encourage more researchers to advance research on deep learning applications in genetics.
Affiliation(s)
- Bijun Zhang
  - Department of Clinical Genetics, Shengjing Hospital of China Medical University, Shenyang, China
- Ting Fan
  - Department of Computer, School of Intelligent Medicine, China Medical University, Shenyang, China
  - *Correspondence: Ting Fan
19
Barbosa P, Ribeiro M, Carmo-Fonseca M, Fonseca A. Clinical significance of genetic variation in hypertrophic cardiomyopathy: comparison of computational tools to prioritize missense variants. Front Cardiovasc Med 2022; 9:975478. [PMID: 36061567 PMCID: PMC9433717 DOI: 10.3389/fcvm.2022.975478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022]
Abstract
Hypertrophic cardiomyopathy (HCM) is a common heart disease associated with sudden cardiac death. Early diagnosis is critical to identify patients who may benefit from implantable cardioverter defibrillator therapy. Although genetic testing is an integral part of the clinical evaluation and management of patients with HCM and their families, in many cases the genetic analysis fails to identify a disease-causing mutation. This is in part due to difficulties in classifying newly detected rare genetic variants as well as variants of unknown significance (VUS). Multiple computational algorithms have been developed to predict the potential pathogenicity of genetic variants, but their relative performance in HCM has not been comprehensively assessed. Here, we compared the performance of 39 currently available prediction tools in distinguishing between high-confidence HCM-causing missense variants and benign variants, and we developed an easy-to-use tool (VETA) to perform variant prediction benchmarks based on annotated VCF files. Our results show that tool performance increases after HCM-specific calibration of thresholds. After excluding potential biases due to type I circularity, we identified ClinPred, MISTIC, FATHMM, MPC and MetaLR as the five best-performing tools in discriminating HCM-associated variants. We propose combining these tools in order to prioritize unknown HCM missense variants that should be closely followed up in the clinic.
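The abstract does not say how thresholds were calibrated. One common choice, shown here purely as an illustration, is to pick the score cutoff that maximizes Youden's J (sensitivity + specificity - 1) on disease-specific labelled variants. The function name and the six scores below are made up and do not come from VETA or the study.

```python
def calibrate_threshold(scores, labels):
    """Return (threshold, J) maximizing Youden's J = sensitivity + specificity - 1.

    scores: predictor outputs, higher meaning more likely pathogenic.
    labels: 1 for pathogenic, 0 for benign.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one pathogenic and one benign variant")
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        # A variant is called pathogenic when its score is >= threshold.
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Made-up scores for six labelled variants (three pathogenic, three benign).
scores = [0.91, 0.78, 0.66, 0.50, 0.35, 0.20]
labels = [1, 0, 1, 1, 0, 0]
threshold, j = calibrate_threshold(scores, labels)
print(threshold, round(j, 3))
```

A cutoff chosen this way depends on the labelled set used, so it should be evaluated on variants that were not used for calibration.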
Affiliation(s)
- Pedro Barbosa
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina da Universidade de Lisboa, Lisboa, Portugal
- Marta Ribeiro
  - Department of Bioengineering and iBB-Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
- Maria Carmo-Fonseca
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina da Universidade de Lisboa, Lisboa, Portugal
  - *Correspondence: Maria Carmo-Fonseca
- Alcides Fonseca
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Lisboa, Portugal
  - GenoMed - Diagnósticos de Medicina Molecular, Lisboa, Portugal
20
Ng CA, Ullah R, Farr J, Hill AP, Kozek KA, Vanags LR, Mitchell DW, Kroncke BM, Vandenberg JI. A massively parallel assay accurately discriminates between functionally normal and abnormal variants in a hotspot domain of KCNH2. Am J Hum Genet 2022; 109:1208-1216. [PMID: 35688148 PMCID: PMC9300756 DOI: 10.1016/j.ajhg.2022.05.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/10/2022] [Accepted: 05/03/2022] [Indexed: 01/09/2023]
Abstract
Many genes, including KCNH2, contain "hotspot" domains associated with a high density of variants associated with disease. This has led to the suggestion that variant location can be used as evidence supporting classification of clinical variants. However, it is not known what proportion of all potential variants in hotspot domains cause loss of function. Here, we have used a massively parallel trafficking assay to characterize all single-nucleotide variants in exon 2 of KCNH2, a known hotspot for variants that cause long QT syndrome type 2 and an increased risk of sudden cardiac death. Forty-two percent of KCNH2 exon 2 variants caused at least 50% reduction in protein trafficking, and 65% of these trafficking-defective variants exerted a dominant-negative effect when co-expressed with a WT KCNH2 allele as assessed using a calibrated patch-clamp electrophysiology assay. The massively parallel trafficking assay was more accurate (AUC of 0.94) than bioinformatic prediction tools (REVEL and CardioBoost, AUC of 0.81) in discriminating between functionally normal and abnormal variants. Interestingly, over half of variants in exon 2 were found to be functionally normal, suggesting a nuanced interpretation of variants in this "hotspot" domain is necessary. Our massively parallel trafficking assay can provide this information prospectively.
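The AUC figures quoted above can be interpreted as the probability that a randomly chosen functionally abnormal variant receives a higher score than a randomly chosen normal one. A minimal sketch of that pairwise definition follows; the function name and the toy scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (abnormal, normal) pairs ranked correctly; ties count 0.5."""
    abnormal = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not abnormal or not normal:
        raise ValueError("need at least one variant of each class")
    wins = 0.0
    for a in abnormal:
        for n in normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal) * len(normal))


# Toy data: 1 = functionally abnormal, 0 = functionally normal.
labels = [1, 1, 1, 0, 0, 0]
assay_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]      # hypothetical functional-assay scores
predictor_scores = [0.9, 0.3, 0.6, 0.5, 0.6, 0.1]  # hypothetical in silico scores
print(round(roc_auc(assay_scores, labels), 3))
print(round(roc_auc(predictor_scores, labels), 3))
```

The double loop is quadratic in the number of variants, which is fine for an illustration; a rank-based formulation gives the same value more efficiently on large sets.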
Affiliation(s)
- Chai-Ann Ng
  - Mark Cowley Lidwill Research Program in Cardiac Electrophysiology, Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia; School of Clinical Medicine, UNSW Sydney, Darlinghurst, NSW, Australia
- Rizwan Ullah
  - Vanderbilt Center for Arrhythmia Research and Therapeutics, Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Jessica Farr
  - Mark Cowley Lidwill Research Program in Cardiac Electrophysiology, Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia; School of Computer Science and Engineering, UNSW Sydney, Kensington, NSW, Australia
- Adam P Hill
  - Mark Cowley Lidwill Research Program in Cardiac Electrophysiology, Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia; School of Clinical Medicine, UNSW Sydney, Darlinghurst, NSW, Australia
- Krystian A Kozek
  - Vanderbilt Center for Arrhythmia Research and Therapeutics, Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Loren R Vanags
  - Vanderbilt Center for Arrhythmia Research and Therapeutics, Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Devyn W Mitchell
  - Vanderbilt Center for Arrhythmia Research and Therapeutics, Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Brett M Kroncke
  - Vanderbilt Center for Arrhythmia Research and Therapeutics, Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA.
- Jamie I Vandenberg
  - Mark Cowley Lidwill Research Program in Cardiac Electrophysiology, Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia; School of Clinical Medicine, UNSW Sydney, Darlinghurst, NSW, Australia.
21
Anderson CL, Munawar S, Reilly L, Kamp TJ, January CT, Delisle BP, Eckhardt LL. How Functional Genomics Can Keep Pace With VUS Identification. Front Cardiovasc Med 2022; 9:900431. [PMID: 35859585 PMCID: PMC9291992 DOI: 10.3389/fcvm.2022.900431] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/20/2022] [Accepted: 06/09/2022] [Indexed: 01/03/2023]
Abstract
Over the last two decades, an exponentially expanding number of genetic variants have been identified associated with inherited cardiac conditions. These tremendous gains also present challenges in deciphering the clinical relevance of unclassified variants or variants of uncertain significance (VUS). This review provides an overview of the advancements (and challenges) in functional and computational approaches to characterize variants and help keep pace with VUS identification related to inherited heart diseases.
Affiliation(s)
- Corey L. Anderson
  - Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
- Saba Munawar
  - Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
- Louise Reilly
  - Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
- Timothy J. Kamp
  - Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
- Craig T. January
  - Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
- Brian P. Delisle
  - Department of Physiology, University of Kentucky College of Medicine, Lexington, KY, United States
- Lee L. Eckhardt
  - Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
22
Fang M, Su Z, Abolhassani H, Itan Y, Jin X, Hammarström L. VIPPID: a gene-specific single nucleotide variant pathogenicity prediction tool for primary immunodeficiency diseases. Brief Bioinform 2022; 23:6590436. [PMID: 35598327 PMCID: PMC9487673 DOI: 10.1093/bib/bbac176] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/08/2022] [Revised: 04/05/2022] [Accepted: 04/18/2022] [Indexed: 01/04/2023]
Abstract
Distinguishing pathogenic variants from non-pathogenic ones remains a major challenge in clinical genetic testing of primary immunodeficiency (PID) patients. Most of the existing mutation pathogenicity prediction tools treat all mutations as homogeneous entities, ignoring the differences in characteristics of different genes, and use the same model for genes in different diseases. In this study, we developed a single nucleotide variant (SNV) pathogenicity prediction tool, Variant Impact Predictor for PIDs (VIPPID; https://mylab.shinyapps.io/VIPPID/), which was tailored for PID genes and used a specific model for each of the most prevalent known PID genes. It employed a Conditional Inference Forest model and utilized information from 85 features of SNVs and scores from 20 existing prediction tools. Evaluation of VIPPID showed that it had superior performance (area under the curve = 0.91) over non-specific conventional tools. In addition, we showed that the gene-specific model outperformed the non-gene-specific models. Our study demonstrated that disease-specific and gene-specific models can improve SNV pathogenicity prediction performance. This observation supports the notion that each feature of mutations in the model can potentially be used, in a new algorithm, to investigate the characteristics and function of the encoded proteins.
Affiliation(s)
- Mingyan Fang
  - BGI-Shenzhen, Shenzhen 518083, China
  - Division of Clinical Immunology at the Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden
  - BGI-Singapore, Singapore 138567, Singapore
- Zheng Su
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, The University of New South Wales, Sydney, New South Wales, Australia
  - GenieUs Genomics, 19A Boundary St, Darlinghurst NSW 2010, Australia
- Hassan Abolhassani
  - Division of Clinical Immunology at the Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden
  - Department of Biosciences and Nutrition, NEO, Karolinska Institutet, SE14183 Huddinge, Sweden
- Yuval Itan
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Xin Jin
  - BGI-Shenzhen, Shenzhen 518083, China
  - BGI-Singapore, Singapore 138567, Singapore
- Lennart Hammarström
  - BGI-Shenzhen, Shenzhen 518083, China
  - Division of Clinical Immunology at the Department of Laboratory Medicine, Karolinska Institutet at Karolinska University Hospital Huddinge, SE-141 86 Stockholm, Sweden
  - Department of Biosciences and Nutrition, NEO, Karolinska Institutet, SE14183 Huddinge, Sweden
23
DVPred: a disease-specific prediction tool for variant pathogenicity classification for hearing loss. Hum Genet 2022; 141:401-411. [PMID: 35182233 DOI: 10.1007/s00439-022-02440-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/12/2021] [Accepted: 02/06/2022] [Indexed: 02/08/2023]
Abstract
Numerous computational prediction tools have been introduced to estimate the functional impact of variants in the human genome based on evolutionary constraints and biochemical metrics. However, their implementation in diagnostic settings to classify variants has faced challenges with accuracy and validity. Most existing tools are pan-genome and pan-disease, neglecting gene- and disease-specific properties and limiting the accessibility of curated data. As a proof-of-concept, we developed a disease-specific prediction tool named Deafness Variant deleteriousness Prediction tool (DVPred) that focuses on the 157 genes reportedly causing genetic hearing loss (HL). DVPred applies the gradient boosting decision tree (GBDT) algorithm to a dataset consisting of expert-curated pathogenic and benign variants from a large in-house HL patient cohort and public databases. With the incorporation of variant-level and gene-level features, DVPred outperformed the existing universal tools. It achieved an area under the curve (AUC) of 0.98 and showed consistent performance (AUC = 0.985) in an independent assessment dataset. We further demonstrated that multiple gene-level metrics, including low complexity genomic regions and substitution intolerance scores, were the top features of the model. A comprehensive analysis of missense variants showed a gene-specific ratio of predicted deleterious to neutral variants, implying varied tolerance or intolerance to variation in different genes. DVPred explores the utility of a disease-specific strategy for improving deafness variant prediction. It can improve the prioritization of pathogenic variants among the massive number of variants identified by high-throughput sequencing of HL genes. It also sheds light on the development of variant prediction tools for other genetic disorders.
|
24
|
Kingdom R, Wright CF. Incomplete Penetrance and Variable Expressivity: From Clinical Studies to Population Cohorts. Front Genet 2022; 13:920390. [PMID: 35983412 PMCID: PMC9380816 DOI: 10.3389/fgene.2022.920390] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Received: 04/14/2022] [Accepted: 06/09/2022] [Indexed: 12/20/2022] Open
Abstract
The same genetic variant found in different individuals can cause a range of diverse phenotypes, from no discernible clinical phenotype to severe disease, even among related individuals. Such variants can be said to display incomplete penetrance, a binary phenomenon where the genotype either causes the expected clinical phenotype or it does not, or they can be said to display variable expressivity, in which the same genotype can cause a wide range of clinical symptoms across a spectrum. Both incomplete penetrance and variable expressivity are thought to be caused by a range of factors, including common variants, variants in regulatory regions, epigenetics, environmental factors, and lifestyle. Many thousands of genetic variants have been identified as the cause of monogenic disorders, mostly determined through small clinical studies, and thus, the penetrance and expressivity of these variants may be overestimated when compared to their effect on the general population. With the wealth of population cohort data currently available, the penetrance and expressivity of such genetic variants can be investigated across a much wider contingent, potentially helping to reclassify variants that were previously thought to be completely penetrant. Research into the penetrance and expressivity of such genetic variants is important for clinical classification, both for determining causative mechanisms of disease in the affected population and for providing accurate risk information through genetic counseling. A genotype-based definition of the causes of rare diseases incorporating information from population cohorts and clinical studies is critical for our understanding of incomplete penetrance and variable expressivity. 
This review examines our current knowledge of the penetrance and expressivity of genetic variants in rare disease and across populations, and looks into the potential causes of the variation seen, including genetic modifiers, mosaicism, and polygenic factors, among others. We also consider the challenges that come with investigating penetrance and expressivity.
Affiliation(s)
- Rebecca Kingdom
  - Institute of Biomedical and Clinical Science, Royal Devon & Exeter Hospital, University of Exeter Medical School, Exeter, United Kingdom
- Caroline F Wright
  - Institute of Biomedical and Clinical Science, Royal Devon & Exeter Hospital, University of Exeter Medical School, Exeter, United Kingdom
|
25
|
Computational prediction of protein subdomain stability in MYBPC3 enables clinical risk stratification in hypertrophic cardiomyopathy and enhances variant interpretation. Genet Med 2021; 23:1281-1287. [PMID: 33782553 PMCID: PMC8257482 DOI: 10.1038/s41436-021-01134-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/04/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 12/13/2022] Open
Abstract
Purpose: Variants in MYBPC3 causing loss of function are the most common cause of hypertrophic cardiomyopathy (HCM). However, a substantial number of patients carry missense variants of uncertain significance (VUS) in MYBPC3. We hypothesize that a structural-based algorithm, STRUM, which estimates the effect of missense variants on protein folding, will identify a subgroup of HCM patients with a MYBPC3 VUS associated with increased clinical risk.
Methods: Among 7,963 patients in the multicenter Sarcomeric Human Cardiomyopathy Registry (SHaRe), 120 unique missense VUS in MYBPC3 were identified. Variants were evaluated for their effect on subdomain folding and a stratified time-to-event analysis for an overall composite endpoint (first occurrence of ventricular arrhythmia, heart failure, all-cause mortality, atrial fibrillation, and stroke) was performed for patients with HCM and a MYBPC3 missense VUS.
Results: We demonstrated that patients carrying a MYBPC3 VUS predicted to cause subdomain misfolding (STRUM+, ΔΔG ≤ −1.2 kcal/mol) exhibited a higher rate of adverse events compared with those with a STRUM- VUS (hazard ratio = 2.29, P = 0.0282). In silico saturation mutagenesis of MYBPC3 identified 4,943/23,427 (21%) missense variants that were predicted to cause subdomain misfolding.
Conclusion: STRUM identifies patients with HCM and a MYBPC3 VUS who may be at higher clinical risk and provides supportive evidence for pathogenicity.
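The STRUM+ call described in this abstract is a threshold rule on the predicted folding free-energy change, which can be illustrated with a minimal Python sketch. Only the cutoff (ΔΔG ≤ −1.2 kcal/mol) and the saturation mutagenesis counts (4,943 of 23,427) come from the abstract. The class, the function names, the variant labels, and the ΔΔG values are hypothetical and chosen for illustration; this is not the STRUM algorithm itself, which computes ΔΔG from protein structure.

```python
from dataclasses import dataclass

# Cutoff quoted in the abstract: ΔΔG <= -1.2 kcal/mol is called STRUM+.
STRUM_CUTOFF_KCAL_PER_MOL = -1.2


@dataclass(frozen=True)
class MissenseVariant:
    label: str  # hypothetical variant label, for illustration only
    ddg: float  # predicted ΔΔG in kcal/mol (made-up values below)


def is_strum_positive(ddg: float, cutoff: float = STRUM_CUTOFF_KCAL_PER_MOL) -> bool:
    """Return True when the predicted ΔΔG is at or below the cutoff."""
    return ddg <= cutoff


def percent_positive(n_positive: int, n_total: int) -> float:
    """Share of variants called positive, as a percentage."""
    return 100.0 * n_positive / n_total


# Illustrative inputs only; these are not variants or scores from the study.
examples = [
    MissenseVariant("variant_A", -2.4),
    MissenseVariant("variant_B", -1.2),
    MissenseVariant("variant_C", -0.3),
]
calls = {v.label: is_strum_positive(v.ddg) for v in examples}

# Counts reported in the abstract for in silico saturation mutagenesis.
saturation_percent = percent_positive(4943, 23427)

print(calls)
print(f"{saturation_percent:.1f}% of possible missense variants called STRUM+")
```

The comparison is inclusive, so a variant sitting exactly at −1.2 kcal/mol is counted as STRUM+, matching the "≤" in the abstract. The percentage rounds to the 21% reported there.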
|