1
Phogat A, Krishnan SR, Pandey M, Gromiha MM. ZFP-CanPred: Predicting the effect of mutations in zinc-finger proteins in cancers using protein language models. Methods 2025; 235:55-63. [PMID: 39909391 DOI: 10.1016/j.ymeth.2025.01.020] [Received: 07/13/2024] [Revised: 01/21/2025] [Accepted: 01/27/2025] [Indexed: 02/07/2025] Open
Abstract
Zinc-finger proteins (ZNFs) constitute the largest family of transcription factors and play crucial roles in various cellular processes. Missense mutations in ZNFs significantly alter protein-DNA interactions, potentially leading to the development of various types of cancers. This study presents ZFP-CanPred, a novel deep learning-based model for predicting cancer-associated driver mutations in ZNFs. The representations derived from protein language models (PLMs) from the structural neighbourhood of mutated sites were utilized to train ZFP-CanPred for differentiating between cancer-causing and neutral mutations. ZFP-CanPred achieved superior performance, with an accuracy of 0.72, F1-score of 0.79, and area under the Receiver Operating Characteristics (ROC) Curve (AUC) of 0.74, on an independent test set. In a comparative analysis against 11 existing prediction tools using a curated dataset of 331 mutations, ZFP-CanPred demonstrated the highest AU-ROC of 0.74, outperforming both generic and cancer-specific methods. The model's balanced performance across specificity and sensitivity addresses a significant limitation of current methodologies. The source code and other related files are available on GitHub at https://github.com/amitphogat/ZFP-CanPred.git. We envisage that the present study contributes to understanding the oncogenic processes and to developing targeted therapeutic strategies.
Affiliation(s)
- Amit Phogat
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036 India
- Sowmya Ramaswamy Krishnan
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036 India
- Medha Pandey
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036 India
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036 India; International Research Frontiers Initiative, School of Computing, Tokyo Institute of Technology, Yokohama 226-8501 Japan
2
Gharui S, Sengupta D. Molecular Interactions of the Pioneer Transcription Factor GATA3 With DNA. Proteins 2025; 93:555-566. [PMID: 39315643 DOI: 10.1002/prot.26749] [Received: 06/18/2024] [Revised: 08/15/2024] [Accepted: 08/30/2024] [Indexed: 09/25/2024]
Abstract
The GATA3 transcription factor is a pioneer transcription factor that is critical in the development, proliferation, and maintenance of several immune cell types. Identifying the detailed conformational dynamics and interactions of this transcription factor, as well as its clinically important population variants, will allow us to unravel its mode of action. In this study, we analyze the molecular interactions of the GATA3 transcription factor bound to dsDNA, as well as three clinically important population variants, by atomistic molecular dynamics simulations. We identify the effect of the variants on the DNA conformational dynamics and delineate the differences compared to the wildtype transcription factor that could be related to impaired function. We highlight the structural plasticity in the binding of the GATA3 transcription factor and identify important DNA-protein contacts. Although the DNA-protein contacts are persistent and appear to be stable, they exhibit nanosecond timescale fluctuations and several binding/unbinding events. Further, we identify differential DNA binding in the three variants and show that the N-terminal binding is reduced in two of the variants. Our results indicate that reduced minor groove width and DNA diameter are important hallmarks for the binding of GATA3. Our work is an important step towards understanding the functional dynamics of the GATA3 protein and its clinically significant population variants.
Affiliation(s)
- Sowmomita Gharui
- Physical and Materials Chemistry Division, CSIR-National Chemical Laboratory, Pune, India
- Durba Sengupta
- Physical and Materials Chemistry Division, CSIR-National Chemical Laboratory, Pune, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
3
Balakrishnan A, Winiarek G, Hołówka O, Godlewski J, Bronisz A. Unlocking the secrets of the immunopeptidome: MHC molecules, ncRNA peptides, and vesicles in immune response. Front Immunol 2025; 16:1540431. [PMID: 39944685 PMCID: PMC11814183 DOI: 10.3389/fimmu.2025.1540431] [Received: 12/05/2024] [Accepted: 01/13/2025] [Indexed: 05/09/2025] Open
Abstract
The immunopeptidome, a diverse set of peptides presented by Major Histocompatibility Complex (MHC) molecules, is a critical component of immune recognition and response. This review article delves into the mechanisms of peptide presentation by MHC molecules, particularly emphasizing the roles of ncRNA-derived peptides and extracellular vesicles (EVs) in shaping the immunopeptidome landscape. We explore established and emerging insights into MHC molecule interactions with peptides, including the dynamics of peptide loading, transport, and the influence of cellular and genetic variations. The article highlights novel research on non-coding RNA (ncRNA)-derived peptides, which challenge conventional views of antigen processing and presentation and the role of EVs in transporting these peptides, thereby modulating immune responses at remote body sites. This novel research not only challenges conventional views but also opens up new avenues for understanding immune responses. Furthermore, we discuss the implications of these mechanisms in developing therapeutic strategies, particularly for cancer immunotherapy. By conducting a comprehensive analysis of current literature and advanced methodologies in immunopeptidomics, this review aims to deepen the understanding of the complex interplay between MHC peptide presentation and the immune system, offering new perspectives on potential diagnostic and therapeutic applications. Additionally, the interactions between ncRNA-derived peptides and EVs provide a mechanism for the enhanced surface presentation of these peptides and highlight a novel pathway for their systemic distribution, potentially altering immune surveillance and therapeutic landscapes.
Affiliation(s)
- Arpita Balakrishnan
- Tumor Microenvironment Laboratory, Mossakowski Medical Research Institute, Polish Academy of Sciences, Warsaw, Poland
- Translational Medicine Doctoral School, Centre of Postgraduate Medical Education, Warsaw, Poland
- Gabriela Winiarek
- Tumor Microenvironment Laboratory, Mossakowski Medical Research Institute, Polish Academy of Sciences, Warsaw, Poland
- Olga Hołówka
- Tumor Microenvironment Laboratory, Mossakowski Medical Research Institute, Polish Academy of Sciences, Warsaw, Poland
- Jakub Godlewski
- Department of NeuroOncology, Mossakowski Medical Research Institute, Polish Academy of Sciences, Warsaw, Poland
- Agnieszka Bronisz
- Tumor Microenvironment Laboratory, Mossakowski Medical Research Institute, Polish Academy of Sciences, Warsaw, Poland
4
Tajane SV, Thakur A, Acharya S, Chakrabarti P, Dey S. On the abundance and importance of AXXXA sequence motifs in globular proteins and their involvement in Cβ···Cβ interaction. J Struct Biol 2024; 216:108129. [PMID: 39343152 DOI: 10.1016/j.jsb.2024.108129] [Received: 07/08/2024] [Revised: 09/22/2024] [Accepted: 09/25/2024] [Indexed: 10/01/2024]
Abstract
The AXXXA and GXXXG motifs are frequently observed in helices, especially in membrane proteins. The motif GXXXG is known to stabilize helix-helix association in membrane proteins via Cα-H···O bonding. The AXXXA sequence motif additionally stabilizes the folded state of proteins. We found 27,000 and 18,000 occurrences of AXXXA and GXXXG motifs, respectively, in a non-redundant set of 6000 obligate homodimeric (OD) complexes. Interestingly, this is less pronounced in transient homodimers (TD) and heterodimers (HetD). On average, each obligate homodimer contains four AXXXA motifs, compared with 2 and 3.5 for HetD and TD, respectively. Focusing on the binding surface, 27 % of the ODs contain at least one AXXXA motif at the interface, compared with 17 % and 15 % for HetD and TD, respectively. AXXXA predominantly stabilizes the OD quaternary structure via side-chain Cβ···Cβ interactions. This interaction is energetically favorable and is found to be a major driving force for OD quaternary structure stability. Cβ···Cβ interactions are observed ∼6 times more often than the known Cα-H···O interaction for helix-helix stabilization. Two additional new interactions, Cβ···O and O···O, are observed at the AXXXA-containing interface regions. The occurrence of the motif is drastically reduced if either of the terminal Ala residues is replaced by Gly. Our findings show the importance of AXXXA in providing stability to the quaternary structure through specific hydrophobic interactions and the specificity of the Ala residue at motif termini. The knowledge gained can be used for designing synthetic proteins of improved stability and for designing peptide-based therapeutics.
Affiliation(s)
- Surbhi Vilas Tajane
- Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, NH 62, Nagaur Road, Karwar 342030, Rajasthan, India
- Abhilasha Thakur
- Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, NH 62, Nagaur Road, Karwar 342030, Rajasthan, India
- Srijita Acharya
- Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, NH 62, Nagaur Road, Karwar 342030, Rajasthan, India
- Sucharita Dey
- Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, NH 62, Nagaur Road, Karwar 342030, Rajasthan, India
5
Pan Q, Parra GB, Myung Y, Portelli S, Nguyen TB, Ascher DB. AlzDiscovery: A computational tool to identify Alzheimer's disease-causing missense mutations using protein structure information. Protein Sci 2024; 33:e5147. [PMID: 39276018 PMCID: PMC11401060 DOI: 10.1002/pro.5147] [Received: 03/25/2024] [Revised: 07/14/2024] [Accepted: 07/31/2024] [Indexed: 09/16/2024]
Abstract
Alzheimer's disease (AD) is one of the most common forms of dementia and neurodegenerative diseases, characterized by the formation of neuritic plaques and neurofibrillary tangles. Many different proteins participate in this complicated pathogenic mechanism, and missense mutations can alter the folding and functions of these proteins, significantly increasing the risk of AD. However, many methods to identify AD-causing variants do not consider the effect of mutations from the perspective of the protein's three-dimensional environment. Here, we present a machine learning-based analysis to classify the AD-causing mutations from their benign counterparts in 21 AD-related proteins, leveraging both sequence- and structure-based features. Using computational tools to estimate the effect of mutations on protein stability, we first observed a bias of the pathogenic mutations with significant destabilizing effects on family AD-related proteins. Building on this insight, we built a generic predictive model and improved its performance by tuning the sample weights in the training process. Our final model achieved an area under the receiver operating characteristic curve of up to 0.95 in the blind test and 0.70 in an independent clinical validation, outperforming all the state-of-the-art methods. Feature interpretation indicated that the hydrophobic environment and polar interaction contacts were crucial to the decision on pathogenic phenotypes of missense mutations. Finally, we presented a user-friendly web server, AlzDiscovery, for researchers to browse the predicted phenotypes of all possible missense mutations on these 21 AD-related proteins. Our study will be a valuable resource for AD screening and the development of personalized treatment.
Affiliation(s)
- Qisheng Pan
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- Georgina Becerra Parra
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- Yoochan Myung
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- Stephanie Portelli
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- Thanh Binh Nguyen
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- David B. Ascher
- The Australian Centre for Ecogenomics, School of Chemistry and Molecular Bioscience, University of Queensland, Brisbane, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia
6
Ahmad RM, Ali BR, Al-Jasmi F, Al Dhaheri N, Al Turki S, Kizhakkedath P, Mohamad MS. AI-derived comparative assessment of the performance of pathogenicity prediction tools on missense variants of breast cancer genes. Hum Genomics 2024; 18:99. [PMID: 39256852 PMCID: PMC11389290 DOI: 10.1186/s40246-024-00667-9] [Received: 03/18/2024] [Accepted: 08/22/2024] [Indexed: 09/12/2024] Open
Abstract
Single nucleotide variants (SNVs) can exert substantial and extremely variable impacts on various cellular functions, making accurate predictions of their consequences challenging, albeit crucial, especially in clinical settings such as oncology. Laboratory-based experimental methods for assessing these effects are time-consuming and often impractical, highlighting the importance of in-silico tools for variant impact prediction. However, the performance metrics of currently available tools on breast cancer missense variants from benchmarking databases have not been thoroughly investigated, creating a knowledge gap in the accurate prediction of pathogenicity. In this study, the benchmarking datasets ClinVar and HGMD were used to evaluate 21 Artificial Intelligence (AI)-derived in-silico tools. Missense variants in breast cancer genes were extracted from ClinVar and HGMD professional v2023.1. The HGMD dataset focused on pathogenic variants only; to ensure balance, benign variants for the same genes were included from the ClinVar database. Interestingly, our analysis of both datasets revealed variants across genes with varying penetrance levels (low and moderate in addition to high), reinforcing the value of disease-specific tools. The top-performing tools identified on the ClinVar dataset were MutPred (Accuracy = 0.73), MetaRNN (Accuracy = 0.72), ClinPred (Accuracy = 0.71), and Meta-SVM, REVEL, and Fathmm-XF (Accuracy = 0.70). On the HGMD dataset, they were ClinPred (Accuracy = 0.72), MetaRNN (Accuracy = 0.71), CADD (Accuracy = 0.69), Fathmm-MKL (Accuracy = 0.68), and Fathmm-XF (Accuracy = 0.67). These findings offer clinicians and researchers valuable insights for selecting, improving, and developing effective in-silico tools for breast cancer pathogenicity prediction. Bridging this knowledge gap contributes to advancing precision medicine and enhancing diagnostic and therapeutic approaches for breast cancer patients, with potential implications for other conditions.
Affiliation(s)
- Rahaf M Ahmad
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Bassam R Ali
- Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Fatma Al-Jasmi
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Noura Al Dhaheri
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Saeed Al Turki
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Praseetha Kizhakkedath
- Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Mohd Saberi Mohamad
- Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Center for Engineering Computational Intelligence, Faculty of Engineering and Technology, Multimedia University, Melaka, Malaysia
7
Mótyán JA, Tőzsér J. The human retroviral-like aspartic protease 1 (ASPRV1): From in vitro studies to clinical correlations. J Biol Chem 2024; 300:107634. [PMID: 39098535 PMCID: PMC11402058 DOI: 10.1016/j.jbc.2024.107634] [Received: 03/21/2024] [Revised: 07/25/2024] [Accepted: 07/27/2024] [Indexed: 08/06/2024] Open
Abstract
The human retroviral-like aspartic protease 1 (ASPRV1) is a retroviral-like protein that was first identified in the skin due to its expression in the stratum granulosum layer of the epidermis. Accordingly, it is also referred to as skin-specific aspartic protease. Similar to the retroviral polyproteins, the full-length ASPRV1 also undergoes self-proteolysis; the processing of the precursor is necessary for the autoactivation of the protease domain. ASPRV1's functions are well-established at the level of the skin: it is part of the epidermal proteolytic network and contributes significantly to skin moisturization via the limited proteolysis of filaggrin, the only natural protein substrate identified so far. Filaggrin and ASPRV1 are also specific to mammals, and these proteins provide unique features for the skin of these species. The importance of filaggrin processing in hydration is shown by the fact that some ASPRV1 mutations are associated with skin diseases such as ichthyosis. ASPRV1 was also found to be expressed in macrophage-like neutrophil cells, indicating that its functions are not limited to the skin. In addition, differential expression of ASPRV1 was detected in many diseases, with as yet unknown significance. The currently known enzymatic characteristics (revealed mainly by in vitro studies) and correlations with pathogenic phenotypes imply potentially important functions in multiple cell types, which makes the protein a promising target of functional studies. In this review, we describe the currently available knowledge and future perspectives regarding ASPRV1.
Affiliation(s)
- János András Mótyán
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- József Tőzsér
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
8
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors. Hum Genomics 2024; 18:90. [PMID: 39198917 PMCID: PMC11360829 DOI: 10.1186/s40246-024-00663-z] [Received: 06/22/2024] [Accepted: 08/19/2024] [Indexed: 09/01/2024] Open
Abstract
BACKGROUND Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). RESULTS The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past three decades, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 190 VIPs, resulting in a total of 407 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. CONCLUSIONS VIPdb version 2 summarizes 407 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. VIPdb is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
- Arul S Menon
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
- Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
- Illumina, Foster City, CA, 94404, USA
- Steven E Brenner
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
- Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
9
Bromberg Y, Prabakaran R, Kabir A, Shehu A. Variant Effect Prediction in the Age of Machine Learning. Cold Spring Harb Perspect Biol 2024; 16:a041467. [PMID: 38621825 PMCID: PMC11216171 DOI: 10.1101/cshperspect.a041467] [Indexed: 04/17/2024]
Abstract
Over the years, many computational methods have been created for the analysis of the impact of single amino acid substitutions resulting from single-nucleotide variants in genome coding regions. Historically, all methods have been supervised and thus limited by the inadequate sizes of experimentally curated data sets and by the lack of a standardized definition of variant effect. The emergence of unsupervised, deep learning (DL)-based methods raised an important question: Can machines learn the language of life from the unannotated protein sequence data well enough to identify significant errors in the protein "sentences"? Our analysis suggests that some unsupervised methods perform as well as or better than existing supervised methods. Unsupervised methods are also faster and can, thus, be useful in large-scale variant evaluations. For all other methods, however, performance varies both by evaluation metric and by the type of variant effect being predicted. We also note that the evaluation of method performance is still lacking on less-studied, nonhuman proteins, where unsupervised methods hold the most promise.
Affiliation(s)
- Yana Bromberg
- Department of Biology, Emory University, Atlanta 30322, Georgia, USA
- Department of Computer Science, Emory University, Atlanta 30322, Georgia, USA
- R Prabakaran
- Department of Biology, Emory University, Atlanta 30322, Georgia, USA
- Anowarul Kabir
- Department of Computer Science, George Mason University, Fairfax 22030, Virginia, USA
- Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax 22030, Virginia, USA
10
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors. bioRxiv 2024:2024.06.25.600283. [PMID: 38979289 PMCID: PMC11230257 DOI: 10.1101/2024.06.25.600283] [Indexed: 07/10/2024]
Abstract
Background Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). Results The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past 25 years, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 186 VIPs, resulting in a total of 403 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. Conclusions VIPdb version 2 summarizes 403 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. Availability VIPdb version 2 is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Arul S. Menon
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Currently at: Illumina, Foster City, California 94404, USA
- Steven E. Brenner
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
11
Sun J, Qu J, Zhao C, Zhang X, Liu X, Wang J, Wei C, Liu X, Wang M, Zeng P, Tang X, Ling X, Qing L, Jiang S, Chen J, Chen TSR, Kuang Y, Gao J, Zeng X, Huang D, Yuan Y, Fan L, Yu H, Ding J. Precise prediction of phase-separation key residues by machine learning. Nat Commun 2024; 15:2662. [PMID: 38531854 DOI: 10.1038/s41467-024-46901-9] [Received: 01/02/2024] [Accepted: 03/13/2024] [Indexed: 03/28/2024] Open
Abstract
Understanding intracellular phase separation is crucial for deciphering transcriptional control, cell fate transitions, and disease mechanisms. However, the key residues that most strongly affect protein phase separation have remained elusive. We develop PSPHunter, which can precisely predict these key residues based on a machine learning scheme. In vivo and in vitro validations demonstrate that truncating just 6 key residues in GATA3 disrupts phase separation, enhancing tumor cell migration and inhibiting growth. Glycine and its motifs are enriched in spacer and key residues, as revealed by our comprehensive analysis. PSPHunter identifies nearly 80% of disease-associated phase-separating proteins, with frequently mutated pathological residues such as glycine and proline often residing among these key residues. PSPHunter thus emerges as a crucial tool to uncover key residues, facilitating insights into phase separation mechanisms governing transcriptional control, cell fate transitions, and disease development.
Affiliation(s)
- Jun Sun
- Department of Thoracic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Med-X Center for Informatics, Sichuan University, Chengdu, 610041, China
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Jiale Qu
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Cai Zhao
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Xinyao Zhang
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Xinyu Liu
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Jia Wang
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- GMU-GIBH Joint School of Life Sciences, Guangzhou Medical University, Guangzhou, 511436, China
| | - Chao Wei
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Xinyi Liu
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Mulan Wang
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Pengguihang Zeng
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Xiuxiao Tang
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Xiaoru Ling
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Li Qing
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Shaoshuai Jiang
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Jiahao Chen
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China
| | - Tara S R Chen
- Department of Rehabilitation Medicine, The Seventh Affiliated Hospital, Sun Yat-Sen University, Shenzhen, Guangdong, 518107, China
| | - Yalan Kuang
- Department of Thoracic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Med-X Center for Informatics, Sichuan University, Chengdu, 610041, China
| | - Jinhang Gao
- Department of Thoracic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Med-X Center for Informatics, Sichuan University, Chengdu, 610041, China
| | - Xiaoxi Zeng
- Department of Thoracic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Med-X Center for Informatics, Sichuan University, Chengdu, 610041, China
| | - Dongfeng Huang
- Department of Rehabilitation Medicine, The Seventh Affiliated Hospital, Sun Yat-Sen University, Shenzhen, Guangdong, 518107, China
| | - Yong Yuan
- Department of Thoracic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China.
- Med-X Center for Informatics, Sichuan University, Chengdu, 610041, China.
| | - Lili Fan
- Guangzhou Key Laboratory of Formula-Pattern of Traditional Chinese Medicine, School of Traditional Chinese Medicine, Jinan University, Guangzhou, Guangdong, China.
| | - Haopeng Yu
- Department of Thoracic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China.
- Med-X Center for Informatics, Sichuan University, Chengdu, 610041, China.
| | - Junjun Ding
- Department of Thoracic Surgery and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610041, China.
- Med-X Center for Informatics, Sichuan University, Chengdu, 610041, China.
- RNA Biomedical Institute, Sun Yat-sen Memorial Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China.
- Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China.
- Center for Stem Cell Biology and Tissue Engineering, Key Laboratory for Stem Cells and Tissue Engineering, Ministry of Education, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, Guangdong, China.
- Department of Rehabilitation Medicine, The Seventh Affiliated Hospital, Sun Yat-Sen University, Shenzhen, Guangdong, 518107, China.
|
12
|
Verma RK, Lokhande KB, Srivastava PK, Singh A. Elucidating B4GALNT1 as potential biomarker in hepatocellular carcinoma using machine learning models and mutational dynamics explored through MD simulation. INFORMATICS IN MEDICINE UNLOCKED 2024; 48:101514. [DOI: 10.1016/j.imu.2024.101514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025] Open
|
13
|
Colaco V, Goswami N, Goel VK, Srivastava SK, Lalrohlua P, Senthil Kumar N, Borah P, Baruah R, Varma AK. In silico and structure-based evaluation of deleterious mutations identified in human Chk1, Chk2, and Wee1 protein kinase. J Cell Biochem 2024; 125:89-99. [PMID: 38047473 DOI: 10.1002/jcb.30508] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/31/2023] [Revised: 11/16/2023] [Accepted: 11/20/2023] [Indexed: 12/05/2023]
Abstract
The checkpoint kinases Chk1, Chk2, and Wee1 play a key role in the DNA damage response and genomic integrity. Cancer-associated mutations identified in human Chk1, Chk2, and Wee1 were retrieved to understand the function associated with each mutation and any alterations in the folding pattern. Therefore, an attempt has been made to identify the deleterious effects of variants using in silico and structure-based approaches. Variants of uncertain significance for Chk1, Chk2, and Wee1 were retrieved from different databases, and four prediction servers were employed to predict the pathogenicity of mutations. Further, InterPro, I-Mutant 3.0, ConSurf, TM-align, and Have (y)our Protein Explained (HOPE) were used for a comprehensive study of the deleterious effects of variants. The sequences of Chk1, Chk2, and Wee1 were analyzed using Clustal Omega, and the three-dimensional structures of the proteins were aligned using TM-align. Molecular dynamics simulations were performed to explore differences in folding pattern between the wild-type and mutant Chk1, Chk2, and Wee1 proteins and to evaluate structural integrity. Thirty-six variants in Chk1, 250 variants in Chk2, and 29 in Wee1 were categorized as pathogenic using in silico prediction tools. Furthermore, 25 mutations in Chk1, 189 in Chk2, and 14 in Wee1 were at highly conserved positions, had a deleterious effect, and influenced protein structure and function. These identified mutations may provide underlying genetic intricacies to serve as potential targets for therapeutic inventions and clinical management.
Affiliation(s)
- Venessa Colaco
- Advanced Centre for Treatment, Research and Education in Cancer, Kharghar, Navi Mumbai, Maharashtra, India
| | - Nabajyoti Goswami
- Advanced Centre for Treatment, Research and Education in Cancer, Kharghar, Navi Mumbai, Maharashtra, India
| | - Vijay Kumar Goel
- School of Physical Sciences, Jawaharlal Nehru University, New Delhi, India
| | - Probodh Borah
- College of Veterinary Science, Assam Agricultural University, Khanapara, Guwahati, Assam, India
| | - Reshita Baruah
- Advanced Centre for Treatment, Research and Education in Cancer, Kharghar, Navi Mumbai, Maharashtra, India
| | - Ashok K Varma
- Advanced Centre for Treatment, Research and Education in Cancer, Kharghar, Navi Mumbai, Maharashtra, India
- Homi Bhabha National Institute, Training School Complex, Anushaktinagar, Mumbai, India
|
14
|
Pandey M, Shah SK, Gromiha MM. Computational approaches for identifying disease-causing mutations in proteins. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2023; 139:141-171. [PMID: 38448134 DOI: 10.1016/bs.apcsb.2023.11.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
Advancements in genome sequencing have expanded the scope of investigating mutations in proteins across different diseases. Amino acid mutations in a protein alter its structure, stability and function and some of them lead to diseases. Identification of disease-causing mutations is a challenging task and it will be helpful for designing therapeutic strategies. Hence, mutation data available in the literature have been curated and stored in several databases, which have been effectively utilized for developing computational methods to identify deleterious mutations (drivers), using sequence and structure-based properties of proteins. In this chapter, we describe the contents of specific databases that have information on disease-causing and neutral mutations followed by sequence and structure-based properties. Further, characteristic features of disease-causing mutations will be discussed along with computational methods for identifying cancer hotspot residues and disease-causing mutations in proteins.
Affiliation(s)
- Medha Pandey
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
| | - Suraj Kumar Shah
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
| | - M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India; International Research Frontiers Initiative, School of Computing, Tokyo Institute of Technology, Yokohama, Japan.
|
15
|
Aspatwar A, Supuran CT, Waheed A, Sly WS, Parkkila S. Mitochondrial carbonic anhydrase VA and VB: properties and roles in health and disease. J Physiol 2023; 601:257-274. [PMID: 36464834 PMCID: PMC10107955 DOI: 10.1113/jp283579] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 08/17/2022] [Accepted: 11/30/2022] [Indexed: 12/07/2022] Open
Abstract
Carbonic anhydrase V (CA V), a mitochondrial enzyme, was first isolated from guinea-pig liver and subsequently identified in mice and humans. Later, studies revealed that the mouse genome contains two mitochondrial CA sequences, named Car5A and Car5B. The CA VA enzyme is most highly expressed in the liver, whereas CA VB shows a broad tissue distribution. Car5A knockout mice demonstrated a predominant role for CA VA in ammonia detoxification, whereas the roles of CA VB in ureagenesis and gluconeogenesis were evident only in the absence of CA VA. Previous studies have suggested that CA VA is mainly involved in the provision of HCO₃⁻ for biosynthetic processes. In children, mutations in the CA5A gene led to reduced CA activity, and the enzyme was sensitive to increased temperature. The metabolic profiles of these children showed a reduced supply of HCO₃⁻ to the enzymes that take part in intermediary metabolism: carbamoylphosphate synthetase, pyruvate carboxylase, propionyl-CoA carboxylase and 3-methylcrotonyl-CoA carboxylase. Although the role of CA VB is still poorly understood, a recent study reported that it plays an essential role in human Sertoli cells, which sustain spermatogenesis. Metabolic disease associated with CA VA appears to be more common than other inborn errors of metabolism and responds well to treatment with N-carbamyl-l-glutamate. Therefore, early identification of hyperammonaemia will allow specific treatment with N-carbamyl-l-glutamate and prevent neurological sequelae. Carbonic anhydrase VA deficiency should therefore be considered a treatable condition in the differential diagnosis of hyperammonaemia in neonates and young children.
Affiliation(s)
- Ashok Aspatwar
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Fimlab Ltd and Tampere University Hospital, Tampere, Finland
| | - Claudiu T Supuran
- Neurofarba Department, Sezione di Chimica Farmaceutica e Nutraceutica, Università degli Studi di Firenze, Sesto Fiorentino, Firenze, Italy
| | - Abdul Waheed
- Department of Biochemistry and Molecular Biology, Edward A. Doisy Research Center, Saint Louis University School of Medicine, St Louis, MO, USA
| | - William S Sly
- Department of Biochemistry and Molecular Biology, Edward A. Doisy Research Center, Saint Louis University School of Medicine, St Louis, MO, USA
| | - Seppo Parkkila
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Fimlab Ltd and Tampere University Hospital, Tampere, Finland
|
16
|
Katsonis P, Wilhelm K, Williams A, Lichtarge O. Genome interpretation using in silico predictors of variant impact. Hum Genet 2022; 141:1549-1577. [PMID: 35488922 PMCID: PMC9055222 DOI: 10.1007/s00439-022-02457-6] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 08/02/2021] [Accepted: 04/17/2022] [Indexed: 02/06/2023]
Abstract
Estimating the effects of variants found in disease driver genes opens the door to personalized therapeutic opportunities. Clinical associations and laboratory experiments can only characterize a tiny fraction of all the available variants, leaving the majority as variants of unknown significance (VUS). In silico methods bridge this gap by providing instant estimates on a large scale, most often based on the numerous genetic differences between species. Despite concerns that these methods may lack reliability in individual subjects, their numerous practical applications over cohorts suggest they are already helpful and have a role to play in genome interpretation when used at the proper scale and context. In this review, we aim to gain insights into the training and validation of these variant effect predicting methods and illustrate representative types of experimental and clinical applications. Objective performance assessments using various datasets that are not yet published indicate the strengths and limitations of each method. These show that cautious use of in silico variant impact predictors is essential for addressing genome interpretation challenges.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
| | - Kevin Wilhelm
- Graduate School of Biomedical Sciences, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Biochemistry, Human Genetics and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Pharmacology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
|
17
|
Liu Y, Yeung WSB, Chiu PCN, Cao D. Computational approaches for predicting variant impact: An overview from resources, principles to applications. Front Genet 2022; 13:981005. [PMID: 36246661 PMCID: PMC9559863 DOI: 10.3389/fgene.2022.981005] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/29/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
One objective of human genetics is to unveil the variants that contribute to human diseases. With the rapid development and wide use of next-generation sequencing (NGS), massive genomic sequence data have been created, making personal genetic information available. Conventional experimental evidence is critical in establishing the relationship between sequence variants and phenotype, but it is obtained with low efficiency. Because comprehensive databases and resources presenting clinical and experimental evidence on the genotype-phenotype relationship are lacking, and because variants found by NGS keep accumulating, a range of computational tools that predict the impact of variants on phenotype has been developed to bridge the gap. In this review, we present a brief introduction and discussion about the computational approaches for variant impact prediction. Taking an innovative approach, we mainly focus on methods for predicting the impact of non-synonymous variants (nsSNVs) and categorize them into six classes. Their underlying rationale and constraints, together with the concerns and remedies raised by comparative studies, are discussed. We also present how the predictive approaches are employed in different areas of research. Although diverse constraints exist, the computational predictive approaches are indispensable in exploring the genotype-phenotype relationship.
Affiliation(s)
- Ye Liu
- Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
| | - William S. B. Yeung
- Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Philip C. N. Chiu
- Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Dandan Cao
- Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
|
18
|
Talli I, Dovrolis N, Oulas A, Stavrakaki S, Makedou K, Spyrou GM, Maroulakou I. Novel clinical, molecular and bioinformatics insights into the genetic background of autism. Hum Genomics 2022; 16:39. [PMID: 36117207 PMCID: PMC9482726 DOI: 10.1186/s40246-022-00415-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/06/2022] [Accepted: 09/12/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Clinical classification of autistic patients based on current WHO criteria provides a valuable but simplified depiction of the true nature of the disorder. Our goal is to determine the biology of the disorder and the ASD-associated genes that lead to differences in the severity and variability of clinical features, which can enhance the ability to predict clinical outcomes. Method: Novel Whole Exome Sequencing data from children (n = 33) with ASD were collected along with extended cognitive and linguistic assessments. A machine learning methodology and a literature-based approach took into consideration known effects of genetic variation on the translated proteins, linking them with specific ASD clinical manifestations, namely non-verbal IQ, memory, attention and oral language deficits. Results: Linear regression polygenic risk score results included the classification of severe and mild ASD samples with an 81.81% prediction accuracy. The literature-based approach revealed 14 genes present in all sub-phenotypes (independent of severity) and others which seem to impair individual ones, highlighting genetic profiles specific to mild and severe ASD, which concern non-verbal IQ, memory, attention and oral language skills. Conclusions: These genes can potentially contribute toward a diagnostic gene-set for determining ASD severity. However, due to the limited number of patients in this study, our classification approach is mostly centered on the prediction and verification of these genes and does not hold a diagnostic nature per se. Substantial further experimentation is required to validate their role as diagnostic markers. The use of these genes as input for functional analysis highlights important biological processes and bridges the gap between genotype and phenotype in ASD.
Affiliation(s)
- Ioanna Talli
- Department of Italian Language and Literature, School of Philosophy, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - Nikolas Dovrolis
- Laboratory of Biology, Department of Medicine, Democritus University of Thrace, Alexandroupolis, Greece
| | - Anastasis Oulas
- Bioinformatics Department, The Cyprus Institute of Neurology and Genetics, 6 International Airport Avenue, 2370 Nicosia, Cyprus, P.O. Box 23462, 1683, Nicosia, Cyprus; The Cyprus School of Molecular Medicine, 6 International Airport Avenue, 2370 Nicosia, Cyprus, P.O. Box 23462, 1683, Nicosia, Cyprus
| | - Stavroula Stavrakaki
- Department of Italian Language and Literature, School of Philosophy, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - Kali Makedou
- Laboratory of Biochemistry, School of Medicine, AHEPA General Hospital, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - George M Spyrou
- Bioinformatics Department, The Cyprus Institute of Neurology and Genetics, 6 International Airport Avenue, 2370 Nicosia, Cyprus, P.O. Box 23462, 1683, Nicosia, Cyprus; The Cyprus School of Molecular Medicine, 6 International Airport Avenue, 2370 Nicosia, Cyprus, P.O. Box 23462, 1683, Nicosia, Cyprus
| | - Ioanna Maroulakou
- Laboratory of Genetics, Department of Molecular Biology and Genetics, Democritus University of Thrace, 68100, Alexandroupolis, Greece.
|
19
|
Prediction of infectivity of SARS-CoV-2 virus based on Spike-hACE-2 interaction. Virusdisease 2022; 33:244-250. [PMID: 35965884 PMCID: PMC9362045 DOI: 10.1007/s13337-022-00781-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Accepted: 06/21/2022] [Indexed: 11/08/2022] Open
Abstract
The COVID-19 pandemic caused by SARS-CoV-2 has resulted in almost 3 million deaths worldwide and is still continuing despite the availability of several vaccines against the virus. One of the main reasons is the mutations that occur in the virus as it adapts to its environment. Detailed study of each component at the genomic and proteomic levels may help to combat the situation. The spike (S) protein, which covers the surface of the virus, helps in entry by engaging the host receptor human angiotensin-converting enzyme 2 (hACE-2), among other roles. In this study, we examine mutations in the receptor-binding domain (RBD) of the spike (S) protein, considering aspects such as the hACE-2 variants in human populations, to get an idea of the varying infectivity of different strains in different populations. Several other parameters affecting viral infectivity, including different disease conditions, were also studied, which may provide better insight for developing future therapeutics.
|
20
|
Yazar M, Ozbek P. Assessment of 13 in silico pathogenicity methods on cancer-related variants. Comput Biol Med 2022; 145:105434. [DOI: 10.1016/j.compbiomed.2022.105434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Revised: 03/04/2022] [Accepted: 03/20/2022] [Indexed: 11/03/2022]
|
21
|
Lansing F, Mukhametzyanova L, Rojo-Romanos T, Iwasawa K, Kimura M, Paszkowski-Rogacz M, Karpinski J, Grass T, Sonntag J, Schneider PM, Günes C, Hoersten J, Schmitt LT, Rodriguez-Muela N, Knöfler R, Takebe T, Buchholz F. Correction of a Factor VIII genomic inversion with designer-recombinases. Nat Commun 2022; 13:422. [PMID: 35058465 PMCID: PMC8776779 DOI: 10.1038/s41467-022-28080-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/26/2021] [Accepted: 12/22/2021] [Indexed: 01/16/2023] Open
Abstract
Despite advances in nuclease-based genome editing technologies, correcting human disease-causing genomic inversions remains a challenge. Here, we describe the potential use of a recombinase-based system to correct the 140 kb inversion of the F8 gene frequently found in patients diagnosed with severe Hemophilia A. Employing substrate-linked directed molecular evolution, we develop a coupled heterodimeric recombinase system (RecF8) achieving 30% inversion of the target sequence in human tissue culture cells. Transient RecF8 treatment of endothelial cells, differentiated from patient-derived induced pluripotent stem cells (iPSCs) of a hemophilic donor, results in 12% correction of the inversion and restores Factor VIII mRNA expression. In this work, we present designer-recombinases as an efficient and specific means towards treatment of monogenic diseases caused by large gene inversions. Correction of disease-causing large genomic inversions remains challenging. Here, the authors developed a dual designer-recombinase system (RecF8) that efficiently corrects a 140 kb inversion frequently found in patients with severe Hemophilia A.
|
22
|
Hauser AS. Personalized Medicine Through GPCR Pharmacogenomics. COMPREHENSIVE PHARMACOLOGY 2022:191-219. [DOI: 10.1016/b978-0-12-820472-6.00100-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/03/2025]
|
23
|
Currey L, Thor S, Piper M. TEAD family transcription factors in development and disease. Development 2021; 148:269158. [PMID: 34128986 DOI: 10.1242/dev.196675] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Indexed: 12/15/2022]
Abstract
The balance between stem cell potency and lineage specification entails the integration of both extrinsic and intrinsic cues, which ultimately influence gene expression through the activity of transcription factors. One example of this is provided by the Hippo signalling pathway, which plays a central role in regulating organ size during development. Hippo pathway activity is mediated by the transcriptional co-factors Yes-associated protein (YAP) and transcriptional co-activator with PDZ-binding motif (TAZ), which interact with TEA domain (TEAD) proteins to regulate gene expression. Although the roles of YAP and TAZ have been intensively studied, the roles played by TEAD proteins are less well understood. Recent studies have begun to address this, revealing that TEADs regulate the balance between progenitor self-renewal and differentiation throughout various stages of development. Furthermore, it is becoming apparent that TEAD proteins interact with other co-factors that influence stem cell biology. This Primer provides an overview of the role of TEAD proteins during development, focusing on their role in Hippo signalling as well as within other developmental, homeostatic and disease contexts.
Affiliation(s)
- Laura Currey
- The School of Biomedical Sciences, Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia
- Stefan Thor
- The School of Biomedical Sciences, Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia
- Michael Piper
- The School of Biomedical Sciences, Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia; Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia
|
24
|
Wang X, Zhang X, Peng C, Shi Y, Li H, Xu Z, Zhu W. D3DistalMutation: a Database to Explore the Effect of Distal Mutations on Enzyme Activity. J Chem Inf Model 2021; 61:2499-2508. [PMID: 33938221 DOI: 10.1021/acs.jcim.1c00318] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
Abstract
Enzyme activity is affected by amino acid mutations, particularly mutations near the active site. Increasing evidence has shown that distal mutations more than 10 Å away from the active site may significantly affect enzyme activity. However, it is difficult to study the enzyme regulation mechanism of distal mutations due to the lack of a systematic collection of three-dimensional (3D) structures highlighting distal mutation sites and the corresponding changes in enzyme activity. Therefore, we constructed a distal mutation database, namely, D3DistalMutation, which relates distal mutations to enzyme activity. As a result, we observed that approximately 80% of distal mutations could affect enzyme activity and 72.7% of distal mutations would decrease or abolish enzyme activity in D3DistalMutation. Only 6.6% of distal mutations in D3DistalMutation could increase enzyme activity; these have great potential for industrial applications. Among these mutations, the Y to F, S to D, and T to D mutations are most likely to increase enzyme activity, which sheds some light on industrial catalysis. Distal mutations decreasing enzyme activity in the allosteric pocket play an indispensable role in allosteric drug design. In addition, the pockets in the enzyme structures are provided to explore the enzyme regulation mechanism of distal mutations. D3DistalMutation is accessible free of charge at https://www.d3pharma.com/D3DistalMutation/index.php.
Affiliation(s)
- Xiaoyu Wang
- CAS Key Laboratory of Receptor Research; Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; College of Mathematics and Physics, Shanghai University of Electric Power, Shanghai 200090, China
- Xinben Zhang
- CAS Key Laboratory of Receptor Research; Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Cheng Peng
- CAS Key Laboratory of Receptor Research; Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Yulong Shi
- CAS Key Laboratory of Receptor Research; Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Huiyu Li
- College of Mathematics and Physics, Shanghai University of Electric Power, Shanghai 200090, China
- Zhijian Xu
- CAS Key Laboratory of Receptor Research; Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Weiliang Zhu
- CAS Key Laboratory of Receptor Research; Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
|
25
|
The Participation of the Intrinsically Disordered Regions of the bHLH-PAS Transcription Factors in Disease Development. Int J Mol Sci 2021; 22:ijms22062868. [PMID: 33799876 PMCID: PMC8001110 DOI: 10.3390/ijms22062868] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/19/2021] [Revised: 03/05/2021] [Accepted: 03/07/2021] [Indexed: 12/14/2022] Open
Abstract
The basic helix–loop–helix/Per-ARNT-SIM (bHLH-PAS) proteins are a family of transcription factors regulating expression of a wide range of genes involved in different functions, ranging from differentiation and development control by oxygen and toxins sensing to circadian clock setting. In addition to the well-preserved DNA-binding bHLH and PAS domains, bHLH-PAS proteins contain long intrinsically disordered C-terminal regions, responsible for regulation of their activity. Our aim was to analyze the potential connection between disordered regions of the bHLH-PAS transcription factors, post-transcriptional modifications and liquid-liquid phase separation, in the context of disease-associated missense mutations. Highly flexible disordered regions, enriched in short motifs which are more ordered, are responsible for a wide spectrum of interactions with transcriptional co-regulators. Based on our in silico analysis and taking into account the fact that the functions of transcription factors can be modulated by posttranslational modifications and spontaneous phase separation, we assume that the locations of missense mutations inducing disease states are clearly related to sequences directly undergoing these processes or to sequences responsible for their regulation.
|
26
|
Yazar M, Özbek P. In Silico Tools and Approaches for the Prediction of Functional and Structural Effects of Single-Nucleotide Polymorphisms on Proteins: An Expert Review. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2020; 25:23-37. [PMID: 33058752 DOI: 10.1089/omi.2020.0141] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/18/2022]
Abstract
Single-nucleotide polymorphisms (SNPs) are single-base variants that contribute to human biological variation and pathogenesis of many human diseases. Among all SNP types, nonsynonymous single-nucleotide polymorphisms (nsSNPs) can alter many structural, biochemical, and functional features of a protein such as folding characteristics, charge distribution, stability, dynamics, and interactions with other proteins/nucleotides. These modifications in the protein structure can lead nsSNPs to be closely associated with many multifactorial diseases such as cancer, diabetes, and neurodegenerative diseases. Predicting structural and functional effects of nsSNPs with experimental approaches can be time-consuming and costly; hence, computational prediction tools and algorithms are being widely and increasingly utilized in biology and medical research. This expert review examines the in silico tools and algorithms for the prediction of functional or structural effects of SNP variants, in addition to the description of the phenotypic effects of nsSNPs on protein structure, association between pathogenicity of variants, and functional or structural features of disease-associated variants. Finally, case studies investigating the functional and structural effects of nsSNPs on selected protein structures are highlighted. We conclude that creating a consistent workflow with a combination of in silico approaches or tools should be considered to increase the performance, accuracy, and precision of the biological and clinical predictions made in silico.
Affiliation(s)
- Metin Yazar
- Department of Bioengineering, Marmara University, Göztepe, İstanbul, Turkey; Department of Genetics and Bioengineering, Istanbul Okan University, Tuzla, Istanbul, Turkey
- Pemra Özbek
- Department of Bioengineering, Marmara University, Göztepe, İstanbul, Turkey
|
27
|
Chaudhari S, Naha R, Mukherjee S, Sharma A, Jayaram P, Mallya S, Chakrabarty S, Satyamoorthy K. DINAX- a comprehensive database of inherited ataxias. Comput Biol Med 2020; 126:104000. [PMID: 33007622 DOI: 10.1016/j.compbiomed.2020.104000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2020] [Revised: 08/29/2020] [Accepted: 08/29/2020] [Indexed: 10/23/2022]
Abstract
BACKGROUND Neurodegenerative disorders such as hereditary ataxia often manifest overlapping symptoms and are likely to be misdiagnosed based on clinical phenotypes. To identify the genes associated with such disorders for diagnostic purposes, geneticists often use high throughput technologies which generate an enormous amount of data on variants whose relevance can be unclear. Besides, analysis and interpretation of high throughput data require gleaning of several web-based resources, which can be laborious and time-consuming. To overcome these limitations, we have created a Database for Inherited Ataxia (DINAX), a repository of gene variants from publicly available information. METHODS DINAX is implemented as a MySQL relational database using the PHP scripting language. Web interfaces were developed using HTML, CSS, and JavaScript. Variant and phenotype information was collected and manually curated from published literature and primary databases such as OMIM and ClinVar. These were further subjected to expression and pathway analysis. RESULTS DINAX is an inventory of 7166 genomic variants (single nucleotide polymorphisms, deletions, insertions, and translocations) reported to date among the 185 genes associated with different subtypes of inherited ataxia. DINAX implements a dual search methodology for genes and phenotypes linking to ataxia-associated genes, variants, and their source. Pathway analysis confirmed their association with ataxia. CONCLUSION The database was created to provide a single web source for obtaining information about ataxia-related genes. Besides, the database facilitates easy identification of known and reported variants as well as novel or unreported variants. DINAX is freely available at http://slsdb.manipal.edu/dinax.
Affiliation(s)
- Sima Chaudhari
- Department of Cellular and Molecular Biology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, 576104, India
- Ritam Naha
- Department of Cellular and Molecular Biology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, 576104, India
- Sravasti Mukherjee
- Department of Cellular and Molecular Biology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, 576104, India
- Additya Sharma
- Department of Cellular and Molecular Biology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, 576104, India
- Pradyumna Jayaram
- Department of Cellular and Molecular Biology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, 576104, India
- Sandeep Mallya
- Department of Bioinformatics, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, 576104, India
- Sanjiban Chakrabarty
- Department of Cellular and Molecular Biology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, 576104, India
- Kapaettu Satyamoorthy
- Department of Cellular and Molecular Biology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, 576104, India
|
28
|
Insights into changes in binding affinity caused by disease mutations in protein-protein complexes. Comput Biol Med 2020; 123:103829. [DOI: 10.1016/j.compbiomed.2020.103829] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/13/2020] [Revised: 05/20/2020] [Accepted: 05/20/2020] [Indexed: 01/11/2023]
|
29
|
Rijensky NM, Blondheim Shraga NR, Barnea E, Peled N, Rosenbaum E, Popovtzer A, Stemmer SM, Livoff A, Shlapobersky M, Moskovits N, Perry D, Rubin E, Haviv I, Admon A. Identification of Tumor Antigens in the HLA Peptidome of Patient-derived Xenograft Tumors in Mouse. Mol Cell Proteomics 2020; 19:1360-1374. [PMID: 32451349 PMCID: PMC8015002 DOI: 10.1074/mcp.ra119.001876] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/18/2019] [Revised: 05/20/2020] [Indexed: 12/15/2022] Open
Abstract
Personalized cancer immunotherapy targeting patient-specific cancer/testis antigens (CTA) and neoantigens may benefit from large-scale tumor human leukocyte antigen (HLA) peptidome (immunopeptidome) analysis, which aims to accurately identify antigens presented by tumor cells. Although significant efforts have been invested in analyzing the HLA peptidomes of fresh tumors, it is often impossible to obtain sufficient volumes of tumor tissues for comprehensive HLA peptidome characterization. This work attempted to overcome some of these obstacles by using patient-derived xenograft tumors (PDX) in mice as the tissue sources for HLA peptidome analysis. PDX tumors provide a proxy for the expansion of the patient tumor by re-grafting them through several passages to immune-compromised mice. The HLA peptidomes of human biopsies were compared with those derived from PDX tumors. Larger HLA peptidomes were obtained from the significantly larger PDX tumors as compared with the patient biopsies. The HLA peptidomes of different PDX tumors derived from the same source tumor biopsy were very reproducible, even following subsequent passages to new naïve mice. Many CTA-derived HLA peptides were discovered, as well as several potential neoantigens/variant sequences. Taken together, the use of PDX tumors for HLA peptidome analysis serves as a highly expandable and stable source of reproducible and authentic peptidomes, opening up new opportunities for defining large HLA peptidomes when only small tumor biopsies are available. This approach provides a large source for tumor antigens identification, potentially useful for personalized immunotherapy.
Affiliation(s)
- Eilon Barnea
- Department of Biology, Technion-Israel Institute of Technology, Haifa, Israel
- Nir Peled
- Institute of Oncology, Davidoff Center, Rabin Medical Center and Sackler Faculty of Medicine, Tel-Aviv University, Petah Tikva, Israel
- Eli Rosenbaum
- Institute of Oncology, Davidoff Center, Rabin Medical Center and Sackler Faculty of Medicine, Tel-Aviv University, Petah Tikva, Israel
- Aron Popovtzer
- Institute of Oncology, Davidoff Center, Rabin Medical Center and Sackler Faculty of Medicine, Tel-Aviv University, Petah Tikva, Israel
- Solomon M Stemmer
- Davidoff Center, Rabin Medical Center, Beilinson Campus, Petach Tikva, and Felsenstein Medical Research Center, Petach Tikva, and Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Alejandro Livoff
- Institute of Pathology, Barzilai University Medical Center, Ashkelon, Israel
- Mark Shlapobersky
- Institute of Pathology, Barzilai University Medical Center, Ashkelon, Israel
- Neta Moskovits
- Davidoff Center, Rabin Medical Center, Beilinson Campus, Petach Tikva, and Felsenstein Medical Research Center, Petach Tikva, Israel
- Dafna Perry
- The Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
- Eitan Rubin
- Faculty of Health Sciences, Ben-Gurion University of the Negev, Beersheva, Israel; The Shraga Segal Department of Microbiology, Immunology and Genetics, Ben-Gurion University of the Negev, Beersheba, Israel
- Itzhak Haviv
- The Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel
- Arie Admon
- Department of Biology, Technion-Israel Institute of Technology, Haifa, Israel
|
30
|
Zaucha J, Heinzinger M, Kulandaisamy A, Kataka E, Salvádor ÓL, Popov P, Rost B, Gromiha MM, Zhorov BS, Frishman D. Mutations in transmembrane proteins: diseases, evolutionary insights, prediction and comparison with globular proteins. Brief Bioinform 2020; 22:5872174. [PMID: 32672331 DOI: 10.1093/bib/bbaa132] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/14/2020] [Revised: 05/26/2020] [Accepted: 05/28/2020] [Indexed: 12/18/2022] Open
Abstract
Membrane proteins are unique in that they interact with lipid bilayers, making them indispensable for transporting molecules and relaying signals between and across cells. Due to the significance of the protein's functions, mutations often have profound effects on the fitness of the host. This is apparent both from experimental studies, which implicated numerous missense variants in diseases, as well as from evolutionary signals that allow elucidating the physicochemical constraints that intermembrane and aqueous environments bring. In this review, we report on the current state of knowledge acquired on missense variants (referred to as single amino acid variants) affecting membrane proteins as well as the insights that can be extrapolated from data already available. This includes an overview of the annotations for membrane protein variants that have been collated within databases dedicated to the topic, bioinformatics approaches that leverage evolutionary information in order to shed light on previously uncharacterized membrane protein structures or interaction interfaces, tools for predicting the effects of mutations tailored specifically towards the characteristics of membrane proteins as well as two clinically relevant case studies explaining the implications of mutated membrane proteins in cancer and cardiomyopathy.
Affiliation(s)
- Jan Zaucha
- Department of Bioinformatics of the TUM School of Life Sciences Weihenstephan in Freising, Germany
- Michael Heinzinger
- Department of Informatics, Bioinformatics and Computational Biology of the TUM Faculty of Informatics in Garching, Germany
- A Kulandaisamy
- Department of Biotechnology of the IIT Bhupat and Jyoti Mehta School of BioSciences in Madras, India
- Evans Kataka
- Department of Bioinformatics of the TUM School of Life Sciences Weihenstephan in Freising, Germany
- Óscar Llorian Salvádor
- Department of Informatics, Bioinformatics and Computational Biology of the TUM Faculty of Informatics in Garching, Germany
- Petr Popov
- Center for Computational and Data-Intensive Science and Engineering of the Skolkovo Institute of Science and Technology in Moscow, Russia
- Burkhard Rost
- Department of Informatics, Bioinformatics and Computational Biology at the TUM Faculty of Informatics in Garching, Germany
- Boris S Zhorov
- Department of Biochemistry and Biomedical Sciences, McMaster University in Hamilton, Canada
- Dmitrij Frishman
- Department of Bioinformatics at the TUM School of Life Sciences Weihenstephan in Freising, Germany
|
31
|
Laskowski RA, Stephenson JD, Sillitoe I, Orengo CA, Thornton JM. VarSite: Disease variants and protein structure. Protein Sci 2020; 29:111-119. [PMID: 31606900 PMCID: PMC6933866 DOI: 10.1002/pro.3746] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Received: 07/01/2019] [Revised: 10/04/2019] [Accepted: 10/07/2019] [Indexed: 12/20/2022]
Abstract
VarSite is a web server mapping known disease-associated variants from UniProt and ClinVar, together with natural variants from gnomAD, onto protein 3D structures in the Protein Data Bank. The analyses are primarily image-based and provide both an overview for each human protein, as well as a report for any specific variant of interest. The information can be useful in assessing whether a given variant might be pathogenic or benign. The structural annotations for each position in the protein include protein secondary structure, interactions with ligand, metal, DNA/RNA, or other protein, and various measures of a given variant's possible impact on the protein's function. The 3D locations of the disease-associated variants can be viewed interactively via the 3dmol.js JavaScript viewer, as well as in RasMol and PyMOL. Users can search for specific variants, or sets of variants, by providing the DNA coordinates of the base change(s) of interest. Additionally, various agglomerative analyses are given, such as the mapping of disease and natural variants onto specific Pfam or CATH domains. The server is freely accessible to all at: https://www.ebi.ac.uk/thornton-srv/databases/VarSite.
Affiliation(s)
- Roman A. Laskowski
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- James D. Stephenson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Wellcome Trust Sanger Institute, Cambridge, UK
- Ian Sillitoe
- Institute of Structural and Molecular Biology, University College London, London, UK
- Christine A. Orengo
- Institute of Structural and Molecular Biology, University College London, London, UK
- Janet M. Thornton
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
|
32
|
Kulandaisamy A, Zaucha J, Sakthivel R, Frishman D, Michael Gromiha M. Pred‐MutHTP: Prediction of disease‐causing and neutral mutations in human transmembrane proteins. Hum Mutat 2019; 41:581-590. [DOI: 10.1002/humu.23961] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/01/2019] [Revised: 11/05/2019] [Accepted: 11/20/2019] [Indexed: 12/24/2022]
Affiliation(s)
- A. Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamilnadu, India
- Jan Zaucha
- Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Freising, Germany
- Ramasamy Sakthivel
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamilnadu, India
- Dmitrij Frishman
- Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Freising, Germany
- Department of Bioinformatics, Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russian Federation
- M. Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamilnadu, India
- Advanced Computational Drug Discovery Unit, Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Japan
|
33
|
Standage-Beier K, Tekel SJ, Brookhouser N, Schwarz G, Nguyen T, Wang X, Brafman DA. A transient reporter for editing enrichment (TREE) in human cells. Nucleic Acids Res 2019; 47:e120. [PMID: 31428784 PMCID: PMC6821290 DOI: 10.1093/nar/gkz713] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 04/07/2019] [Revised: 08/01/2019] [Accepted: 08/05/2019] [Indexed: 12/21/2022] Open
Abstract
Current approaches to identify cell populations that have been modified with deaminase base editing technologies are inefficient and rely on downstream sequencing techniques. In this study, we utilized a blue fluorescent protein (BFP) that converts to green fluorescent protein (GFP) upon a C-to-T substitution as an assay to report directly on base editing activity within a cell. Using this assay, we optimize various base editing transfection parameters and delivery strategies. Moreover, we utilize this assay in conjunction with flow cytometry to develop a transient reporter for editing enrichment (TREE) to efficiently purify base-edited cell populations. Compared to conventional cell enrichment strategies that employ reporters of transfection (RoT), TREE significantly improved the editing efficiency at multiple independent loci, with efficiencies approaching 80%. We also employed the BFP-to-GFP conversion assay to optimize base editor vector design in human pluripotent stem cells (hPSCs), a cell type that is resistant to genome editing and in which modification via base editors has not been previously reported. Finally, using these optimized vectors in the context of TREE allowed for the highly efficient editing of hPSCs. We envision TREE as a readily adoptable method to facilitate base editing applications in synthetic biology, disease modeling, and regenerative medicine.
Affiliation(s)
- Kylie Standage-Beier
- School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ 85287, USA
- Molecular and Cellular Biology graduate program, Arizona State University, Tempe, AZ 85287, USA
- Stefan J Tekel
- School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ 85287, USA
- Nicholas Brookhouser
- School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ 85287, USA
- Graduate Program in Clinical Translational Sciences, University of Arizona College of Medicine-Phoenix, Phoenix, AZ 85004, USA
- Grace Schwarz
- School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ 85287, USA
- Toan Nguyen
- School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ 85287, USA
- Xiao Wang
- School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ 85287, USA
- David A Brafman
- School of Biological and Health Systems Engineering, Arizona State University, Tempe, AZ 85287, USA
|
34
|
Flores MA, Lazar IM. XMAn v2-a database of Homo sapiens mutated peptides. Bioinformatics 2019; 36:1311-1313. [PMID: 31539018 PMCID: PMC8215914 DOI: 10.1093/bioinformatics/btz693] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/31/2019] [Revised: 08/12/2019] [Accepted: 09/03/2019] [Indexed: 01/31/2023] Open
Abstract
SUMMARY The 'Unknown Mutation Analysis (XMAn)' database is a compilation of Homo sapiens mutated peptides in FASTA format, that was constructed for facilitating the identification of protein sequence alterations by tandem mass spectrometry detection. The database comprises 2 539 031 non-redundant mutated entries from 17 599 proteins, of which 2 377 103 are missense and 161 928 are nonsense mutations. It can be used in conjunction with search engines that seek the identification of peptide amino acid sequences by matching experimental tandem mass spectrometry data to theoretical sequences from a database. AVAILABILITY AND IMPLEMENTATION XMAn v2 can be accessed from github.com/lazarlab/XMAnv2. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|