1
Livesey BJ, Marsh JA. Variant effect predictor correlation with functional assays is reflective of clinical classification performance. Genome Biol 2025; 26:104. [PMID: 40264194 PMCID: PMC12016141 DOI: 10.1186/s13059-025-03575-w] [Received: 05/12/2024] [Accepted: 04/11/2025] [Indexed: 04/24/2025] [Open Access]
Abstract
BACKGROUND: Understanding the relationship between protein sequence and function is crucial for accurate classification of missense variants. Variant effect predictors (VEPs) play a vital role in deciphering this complex relationship, yet evaluating their performance remains challenging for several reasons, including data circularity, where the same or related data are used for both training and assessment. High-throughput experimental strategies like deep mutational scanning (DMS) offer a promising solution. RESULTS: In this study, we extend our previous benchmarking approach, assessing the performance of 97 VEPs using missense DMS measurements from 36 different human proteins. In addition, a new pairwise, VEP-centric approach mitigates the impact of missing predictions on overall performance comparison. We observe a strong correspondence between VEP performance in DMS-based benchmarks and clinical variant classification, especially for predictors that have not been directly trained on human clinical variants. CONCLUSIONS: Our results suggest that comparing VEP performance against diverse functional assays represents a reliable strategy for assessing their relative performance in clinical variant classification. However, major challenges in clinical interpretation of VEP scores persist, highlighting the need for further research to fully leverage computational predictors for genetic diagnosis. We also address practical considerations for end users in terms of choice of methodology.
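The abstract does not spell out how the pairwise, VEP-centric comparison works, so the following is only a minimal sketch of the general idea under stated assumptions: each pair of predictors is compared on the variants that both of them score, so that missing predictions do not penalize either one. The names `pairwise_win_rates`, `min_shared`, `vep_a`, and `vep_b`, the use of absolute Spearman correlation, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector carries no ranking information
    return cov / (vx * vy) ** 0.5


def pairwise_win_rates(dms, predictions, min_shared=3):
    """Compare every pair of predictors on the variants both of them score.

    Assumption: absolute correlation is used because predictor scores and
    DMS scores may run in opposite directions.
    """
    wins = {name: 0.0 for name in predictions}
    games = {name: 0 for name in predictions}
    for a, b in combinations(predictions, 2):
        shared = sorted(set(dms) & set(predictions[a]) & set(predictions[b]))
        if len(shared) < min_shared:
            continue  # too few shared variants for a meaningful comparison
        truth = [dms[v] for v in shared]
        score_a = abs(spearman(truth, [predictions[a][v] for v in shared]))
        score_b = abs(spearman(truth, [predictions[b][v] for v in shared]))
        games[a] += 1
        games[b] += 1
        if score_a > score_b:
            wins[a] += 1
        elif score_b > score_a:
            wins[b] += 1
        else:  # a tie counts as half a win for each predictor
            wins[a] += 0.5
            wins[b] += 0.5
    return {n: wins[n] / games[n] for n in predictions if games[n]}


# Hypothetical toy data: vep_b has no prediction for variant D4E.
dms = {"A1V": 0.9, "A1G": 0.2, "L2P": 0.1, "K3R": 0.8, "D4E": 0.7}
predictions = {
    "vep_a": {"A1V": 0.1, "A1G": 0.8, "L2P": 0.9, "K3R": 0.2, "D4E": 0.3},
    "vep_b": {"A1V": 0.5, "A1G": 0.6, "L2P": 0.4, "K3R": 0.1},
}
rates = pairwise_win_rates(dms, predictions)
print(rates)
```

In this toy case the comparison is restricted to the four variants scored by both predictors, so the missing D4E prediction neither helps nor hurts `vep_b`.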
Affiliation(s)
- Benjamin J Livesey
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Joseph A Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
2
Jin F, Cheng N, Wang L, Ye B, Xia J. FDPSM: Feature-Driven Prediction Modeling of Pathogenic Synonymous Mutations. J Chem Inf Model 2025; 65:3064-3076. [PMID: 40082068 DOI: 10.1021/acs.jcim.4c02139] [Indexed: 03/16/2025]
Abstract
Synonymous mutations, once considered to be biologically neutral, are now recognized to affect protein expression and function by altering RNA splicing, stability, or translation efficiency. These effects can contribute to disease, making the prediction of their pathogenicity a crucial task. Computational methods have been developed to analyze the sequence features and biological functions of synonymous mutations, but existing methods face limitations, including scarcity of labeled data, reliance on other prediction tools, and insufficient representation of feature interrelationships. Here, we present FDPSM, a novel method specifically designed to predict pathogenic synonymous mutations. FDPSM was trained on a robust data set of 4251 positive and negative training samples to enhance predictive accuracy. The method leverages a comprehensive set of features, including genomic context, conservation, splicing effects, functional effects, and epigenomics, without relying on prediction scores from other mutation pathogenicity tools. Recognizing that the original features alone may not fully capture the distinctions between pathogenic and benign synonymous mutations, we enhanced the feature set by extracting effective information from the interactions and distribution of these features. The experimental results showed that FDPSM significantly outperformed existing methods in predicting the pathogenicity of synonymous mutations, offering a more accurate and reliable tool for this important task. FDPSM is available at https://github.com/xialab-ahu/FDPSM.
Affiliation(s)
- Fangfang Jin
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
- Na Cheng
  - School of Biomedical Engineering, Anhui Medical University, Hefei, Anhui 230032, China
- Lihua Wang
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
  - School of Information Engineering, Huangshan University, Huangshan, Anhui 245041, China
- Bin Ye
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
  - School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China
- Junfeng Xia
  - Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
3
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors. Hum Genomics 2024; 18:90. [PMID: 39198917 PMCID: PMC11360829 DOI: 10.1186/s40246-024-00663-z] [Received: 06/22/2024] [Accepted: 08/19/2024] [Indexed: 09/01/2024] [Open Access]
Abstract
BACKGROUND: Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). RESULTS: The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past three decades, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 190 VIPs, resulting in a total of 407 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. CONCLUSIONS: VIPdb version 2 summarizes 407 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. VIPdb is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
- Arul S Menon
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
  - Illumina, Foster City, CA, 94404, USA
- Steven E Brenner
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
  - Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
4
Zaka A, Yousaf M, Shahzad S, Rao HZ, Foo JN, Siddiqi S. Structural and functional insights into a novel homozygous missense pathogenic variant in CUL7 identified in consanguineous Pakistani family. J Biomol Struct Dyn 2024; 42:5092-5103. [PMID: 37345548 DOI: 10.1080/07391102.2023.2224889] [Received: 01/04/2023] [Accepted: 06/08/2023] [Indexed: 06/23/2023]
Abstract
3M syndrome is a rare familial genetic disorder characterized by short stature, growth retardation, facial dysmorphism, skeletal abnormalities, fleshy protruding heels, and normal intelligence, caused by mutations in the CUL7, OBSL1 and CCDC8 genes. In the present study, a novel homozygous missense variant of CUL7 (NP_001161842.1, c.4493T>C, p.L1498P) was identified in a consanguineous Pakistani family by whole exome sequencing. In silico structural evaluation, molecular docking and simulation studies of mutant CUL7 provide substantial evidence of its crucial role in the progression of the disorder. The newly discovered variant significantly altered the protein's three-dimensional structure, leading to abnormal interaction with binding proteins. This computational and experimental investigation provides useful information to drug developers for the synthesis of novel therapeutics against the disorder. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ayesha Zaka
  - Genomics Research Lab, Department of Biological Sciences, International Islamic University, Islamabad, Pakistan
  - Institute of Biomedical and Genetic Engineering (IBGE), Islamabad, Pakistan
- Maha Yousaf
  - Genomics Research Lab, Department of Biological Sciences, International Islamic University, Islamabad, Pakistan
- Shaheen Shahzad
  - Genomics Research Lab, Department of Biological Sciences, International Islamic University, Islamabad, Pakistan
- Hadi Zahid Rao
  - Department of Oral & Maxillofacial Surgery, Bahria University Medical and Dental College Karachi, Pakistan
- Jia Nee Foo
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- Saima Siddiqi
  - Institute of Biomedical and Genetic Engineering (IBGE), Islamabad, Pakistan
5
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors. bioRxiv 2024:2024.06.25.600283. [PMID: 38979289 PMCID: PMC11230257 DOI: 10.1101/2024.06.25.600283] [Indexed: 07/10/2024]
Abstract
Background: Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). Results: The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past 25 years, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 186 VIPs, resulting in a total of 403 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. Conclusions: VIPdb version 2 summarizes 403 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. Availability: VIPdb version 2 is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Arul S. Menon
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Currently at: Illumina, Foster City, California 94404, USA
- Steven E. Brenner
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - Center for Computational Biology, University of California, Berkeley, California 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
6
Zeng B, Liu DC, Huang JG, Xia XB, Qin B. PdmIRD: missense variants pathogenicity prediction for inherited retinal diseases in a disease-specific manner. Hum Genet 2024; 143:331-342. [PMID: 38478153 DOI: 10.1007/s00439-024-02645-6] [Received: 11/07/2023] [Accepted: 01/17/2024] [Indexed: 04/25/2024]
Abstract
Accurate discrimination of pathogenic and nonpathogenic variation remains an enormous challenge in clinical genetic testing of patients with inherited retinal diseases (IRDs). Computational methods for predicting variant pathogenicity are the main solution to this problem. The majority of state-of-the-art variant pathogenicity prediction tools disregard the differences in characteristics among genes and treat all types of mutations equally. Since missense variants are the most common type of variation in the coding region of the human genome, we developed a novel missense mutation pathogenicity prediction tool, named Prediction of Deleterious Missense Mutation for IRDs (PdmIRD), in this study. PdmIRD was tailored for IRD-related genes and constructed with the conditional random forest model. Population frequencies and a newly available prediction tool were incorporated into PdmIRD to improve the performance of the model. The evaluation of PdmIRD demonstrated its superior performance over nonspecific tools (areas under the curves, 0.984 and 0.910) and an existing eye abnormalities-specific tool (areas under the curves, 0.975 and 0.891). We also demonstrated that a submodel using a smaller gene panel slightly improved performance further. Our study provides evidence that a disease-specific model can enhance the prediction of missense mutation pathogenicity, especially when new and important features are considered. Additionally, this study provides guidance for exploring the characteristics and functions of the mutated proteins in a greater number of Mendelian disorders.
Affiliation(s)
- Bing Zeng
  - Shenzhen Aier Eye Hospital, Aier Eye Hospital, Jinan University, Shenzhen, 518031, Guangdong, China
  - Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, 518031, Guangdong, China
  - Department of Ophthalmology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- Dong Cheng Liu
  - Shenzhen Aier Eye Hospital, Aier Eye Hospital, Jinan University, Shenzhen, 518031, Guangdong, China
  - Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, 518031, Guangdong, China
- Jian Guo Huang
  - Shenzhen Aier Eye Hospital, Aier Eye Hospital, Jinan University, Shenzhen, 518031, Guangdong, China
  - Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, 518031, Guangdong, China
- Xiao Bo Xia
  - Eye Center of Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China
- Bo Qin
  - Shenzhen Aier Eye Hospital, Aier Eye Hospital, Jinan University, Shenzhen, 518031, Guangdong, China
  - Shenzhen Aier Ophthalmic Technology Institute, Shenzhen, 518031, Guangdong, China
  - Aier School of Ophthalmology, Central South University, Changsha, Hunan, China