1
Fallatah MMJ, Demir Ö, Law F, Lauinger L, Baronio R, Hall L, Bournique E, Srivastava A, Metzen LT, Norman Z, Buisson R, Amaro RE, Kaiser P. Pyrimidine Triones as Potential Activators of p53 Mutants. Biomolecules 2024; 14:967. [PMID: 39199355 PMCID: PMC11352488 DOI: 10.3390/biom14080967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 07/22/2024] [Accepted: 08/05/2024] [Indexed: 09/01/2024] Open
Abstract
p53 is a crucial tumor suppressor in vertebrates that is frequently mutated in human cancers. Most mutations are missense mutations that render p53 inactive in suppressing tumor initiation and progression. Developing small-molecule drugs to convert mutant p53 into an active, wild-type-like conformation is a significant focus for personalized cancer therapy. Prior research indicates that reactivating p53 suppresses cancer cell proliferation and tumor growth in animal models. Early clinical evidence with a compound selectively targeting p53 mutants with substitutions of tyrosine 220 suggests potential therapeutic benefits of reactivating p53 in patients. This study identifies and examines the UCI-1001 compound series as a potential corrector for several p53 mutations. The findings indicate that UCI-1001 treatment in p53 mutant cancer cell lines inhibits growth and reinstates wild-type p53 activities, including DNA binding, target gene activation, and induction of cell death. Cellular thermal shift assays, conformation-specific immunofluorescence staining, and differential scanning fluorometry suggest that UCI-1001 interacts with and alters the conformation of mutant p53 in cancer cells. These initial results identify pyrimidine trione derivatives of the UCI-1001 series as candidates for p53 corrector drug development.
Affiliation(s)
- Özlem Demir
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, CA 92093, USA
- Fiona Law
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
- Linda Lauinger
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
- Roberta Baronio
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
- Linda Hall
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
- Elodie Bournique
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
- Ambuj Srivastava
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, CA 92093, USA
- Landon Tyler Metzen
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
- Zane Norman
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
- Rémi Buisson
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
- Rommie E. Amaro
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, CA 92093, USA
- Peter Kaiser
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA 92697, USA
2
Patil MR, Bihari A. A comprehensive study of p53 protein. J Cell Biochem 2022; 123:1891-1937. [PMID: 36183376 DOI: 10.1002/jcb.30331] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 04/15/2022] [Revised: 09/02/2022] [Accepted: 09/13/2022] [Indexed: 01/10/2023]
Abstract
The protein p53 has been extensively investigated since it was found 43 years ago and has become a "guardian of the genome" that regulates the division of cells by preventing the growth of cells and dividing them, that is, inhibits the development of tumors. Initial proof of protein existence by researchers in the mid-1970s was found by altering and regulating the SV40 big T antigen termed the A protein. Researchers demonstrated how viruses play a role in cancer by employing viruses' ability to create T-antigens complex with viral tumors, which was discovered in 1979 following a viral analysis and cancer analog research. Researchers later in the year 1989 explained that in Murine Friend, a virus-caused erythroleukemia, commonly found that p53 was inactivated to suggest that p53 could be a "tumor suppressor gene." The TP53 gene, encoding p53, is one of human cancer's most frequently altered genes. The protein-regulated biological functions of all p53s include cell cycles, apoptosis, senescence, metabolism of the DNA, angiogenesis, cell differentiation, and immunological response. We tried to unfold the history of the p53 protein, which was discovered long back in 1979, that is, 43 years of research on p53, and how p53's function has been developed through time in this article.
Affiliation(s)
- Manisha R Patil
  - Department of Computer-Applications, School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Anand Bihari
  - Department of Computational Intelligence, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
3
Keshavarz-Rahaghi F, Pleasance E, Kolisnik T, Jones SJM. A p53 transcriptional signature in primary and metastatic cancers derived using machine learning. Front Genet 2022; 13:987238. [PMID: 36134028 PMCID: PMC9483853 DOI: 10.3389/fgene.2022.987238] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/05/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] Open
Abstract
The tumor suppressor gene, TP53, has the highest rate of mutation among all genes in human cancer. This transcription factor plays an essential role in the regulation of many cellular processes. Mutations in TP53 result in loss of wild-type p53 function in a dominant negative manner. Although TP53 is a well-studied gene, the transcriptome modifications caused by the mutations in this gene have not yet been explored in a pan-cancer study using both primary and metastatic samples. In this work, we used a random forest model to stratify tumor samples based on TP53 mutational status and detected a p53 transcriptional signature. We hypothesize that the existence of this transcriptional signature is due to the loss of wild-type p53 function and is universal across primary and metastatic tumors as well as different tumor types. Additionally, we showed that the algorithm successfully detected this signature in samples with apparent silent mutations that affect correct mRNA splicing. Furthermore, we observed that most of the highly ranked genes contributing to the classification extracted from the random forest have known associations with p53 within the literature. We suggest that other genes found in this list including GPSM2, OR4N2, CTSL2, SPERT, and RPE65 protein coding genes have yet undiscovered linkages to p53 function. Our analysis of time on different therapies also revealed that this signature is more effective than the recorded TP53 status in detecting patients who can benefit from platinum therapies and taxanes. Our findings delineate a p53 transcriptional signature, expand the knowledge of p53 biology and further identify genes important in p53 related pathways.
Affiliation(s)
- Faeze Keshavarz-Rahaghi
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
  - Department of Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Erin Pleasance
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
- Tyler Kolisnik
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
  - School of Natural and Computational Sciences, Massey University, Auckland, New Zealand
- Steven J. M. Jones
  - Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Vancouver, BC, Canada
  - *Correspondence: Steven J. M. Jones,
4
Tuning hyperparameters of machine learning algorithms and deep neural networks using metaheuristics: A bioinformatics study on biomedical and biological cases. Comput Biol Chem 2022; 97:107619. [DOI: 10.1016/j.compbiolchem.2021.107619] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/21/2021] [Revised: 08/23/2021] [Accepted: 12/17/2021] [Indexed: 12/14/2022]
5
Petković M, Škrlj B, Kocev D, Simidjievski N. Fuzzy Jaccard Index: A robust comparison of ordered lists. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107849] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/23/2022]
6
Arora I, Tollefsbol TO. Computational methods and next-generation sequencing approaches to analyze epigenetics data: Profiling of methods and applications. Methods 2021; 187:92-103. [PMID: 32941995 PMCID: PMC7914156 DOI: 10.1016/j.ymeth.2020.09.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/22/2020] [Revised: 09/08/2020] [Accepted: 09/10/2020] [Indexed: 12/20/2022] Open
Abstract
Epigenetics is mainly comprised of features that regulate genomic interactions thereby playing a crucial role in a vast array of biological processes. Epigenetic mechanisms such as DNA methylation and histone modifications influence gene expression by modulating the packaging of DNA in the nucleus. A plethora of studies have emphasized the importance of analyzing epigenetics data through genome-wide studies and high-throughput approaches, thereby providing key insights towards epigenetics-based diseases such as cancer. Recent advancements have been made towards translating epigenetics research into a high throughput approach such as genome-scale profiling. Amongst all, bioinformatics plays a pivotal role in achieving epigenetics-related computational studies. Despite significant advancements towards epigenomic profiling, it is challenging to understand how various epigenetic modifications such as chromatin modifications and DNA methylation regulate gene expression. Next-generation sequencing (NGS) provides accurate and parallel sequencing thereby allowing researchers to comprehend epigenomic profiling. In this review, we summarize different computational methods such as machine learning and other bioinformatics tools, publicly available databases and resources to identify key modifications associated with epigenetic machinery. Additionally, the review also focuses on understanding recent methodologies related to epigenome profiling using NGS methods ranging from library preparation, different sequencing platforms and analytical techniques to evaluate various epigenetic modifications such as DNA methylation and histone modifications. We also provide detailed information on bioinformatics tools and computational strategies responsible for analyzing large scale data in epigenetics.
Affiliation(s)
- Itika Arora
  - Department of Biology, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA.
- Trygve O Tollefsbol
  - Department of Biology, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA; Comprehensive Center for Healthy Aging, University of Alabama Birmingham, 1530 3rd Avenue South, Birmingham, AL 35294, USA; Comprehensive Cancer Center, University of Alabama Birmingham, 1802 6th Avenue South, Birmingham, AL 35294, USA; Nutrition Obesity Research Center, University of Alabama Birmingham, 1675 University Boulevard, Birmingham, AL 35294, USA; Comprehensive Diabetes Center, University of Alabama Birmingham, 1825 University Boulevard, Birmingham, AL 35294, USA.
7
|
|
8
Gárate-Escamilla AK, El Hassani AH, Andres E. Big data execution time based on Spark Machine Learning Libraries. Proceedings of the 2019 3rd International Conference on Cloud and Big Data Computing 2019. [DOI: 10.1145/3358505.3358519] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 08/30/2023]
Affiliation(s)
- Emmanuel Andres
  - Service de Médecine Interne, Diabète et Maladies métaboliques de la Clinique Médicale B, CHRU de Strasbourg, Strasbourg
9
Unsupervised dimensionality reduction versus supervised regularization for classification from sparse data. Data Min Knowl Discov 2019. [DOI: 10.1007/s10618-019-00616-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 10/27/2022]
10
Feldbauer R, Flexer A. A comprehensive empirical comparison of hubness reduction in high-dimensional spaces. Knowl Inf Syst 2018; 59:137-166. [PMID: 32647403 PMCID: PMC7327987 DOI: 10.1007/s10115-018-1205-y] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/16/2017] [Revised: 02/07/2018] [Accepted: 05/06/2018] [Indexed: 11/25/2022]
Abstract
Hubness is an aspect of the curse of dimensionality related to the distance concentration effect. Hubs occur in high-dimensional data spaces as objects that are particularly often among the nearest neighbors of other objects. Conversely, other data objects become antihubs, which are rarely or never nearest neighbors to other objects. Many machine learning algorithms rely on nearest neighbor search and some form of measuring distances, which are both impaired by high hubness. Degraded performance due to hubness has been reported for various tasks such as classification, clustering, regression, visualization, recommendation, retrieval and outlier detection. Several hubness reduction methods based on different paradigms have previously been developed. Local and global scaling as well as shared neighbors approaches aim at repairing asymmetric neighborhood relations. Global and localized centering try to eliminate spatial centrality, while the related global and local dissimilarity measures are based on density gradient flattening. Additional methods and alternative dissimilarity measures that were argued to mitigate detrimental effects of distance concentration also influence the related hubness phenomenon. In this paper, we present a large-scale empirical evaluation of all available unsupervised hubness reduction methods and dissimilarity measures. We investigate several aspects of hubness reduction as well as its influence on data semantics which we measure via nearest neighbor classification. Scaling and density gradient flattening methods improve evaluation measures such as hubness and classification accuracy consistently for data sets from a wide range of domains, while centering approaches achieve the same only under specific settings.
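To make the notion of hubs and antihubs concrete, here is a minimal sketch in plain Python. It assumes the widely used measure of hubness as the skewness of the k-occurrence distribution (how often each object appears in other objects' k-nearest-neighbor lists); the function names, the Gaussian toy data, and the choice of Euclidean distance are illustrative assumptions, not taken from the paper.

```python
import random


def k_occurrence(points, k):
    """Count how often each point appears in the k-NN lists of the other points."""
    n = len(points)
    counts = [0] * n
    for i in range(n):
        # Squared Euclidean distance from point i to every other point j.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
            for j in range(n)
            if j != i
        )
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def skewness(values):
    """Third standardized moment; positive values indicate a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def hubness(n_points, dim, k=5, seed=0):
    """Skewness of the k-occurrence counts for random Gaussian data (toy data)."""
    rng = random.Random(seed)
    pts = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_points)]
    return skewness(k_occurrence(pts, k))


if __name__ == "__main__":
    for dim in (2, 50):
        print(dim, round(hubness(200, dim), 3))
```

A strongly right-skewed k-occurrence distribution means a few objects (hubs) appear in many neighbor lists while many others (antihubs) appear in few or none, which is the effect the reduction methods in the abstract try to repair.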
Affiliation(s)
- Roman Feldbauer
  - Austrian Research Institute for Artificial Intelligence, Freyung 6/6/7, 1010 Vienna, Austria
- Arthur Flexer
  - Austrian Research Institute for Artificial Intelligence, Freyung 6/6/7, 1010 Vienna, Austria
11
Tiberti M, Pandini A, Fraternali F, Fornili A. In silico identification of rescue sites by double force scanning. Bioinformatics 2018; 34:207-214. [PMID: 28961796 PMCID: PMC5860198 DOI: 10.1093/bioinformatics/btx515] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/15/2017] [Revised: 06/23/2017] [Accepted: 08/10/2017] [Indexed: 01/03/2023] Open
Abstract
Motivation: A deleterious amino acid change in a protein can be compensated by a second-site rescue mutation. These compensatory mechanisms can be mimicked by drugs. In particular, the location of rescue mutations can be used to identify protein regions that can be targeted by small molecules to reactivate a damaged mutant.
Results: We present the first general computational method to detect rescue sites. By mimicking the effect of mutations through the application of forces, the double force scanning (DFS) method identifies the second-site residues that make the protein structure most resilient to the effect of pathogenic mutations. We tested DFS predictions against two datasets containing experimentally validated and putative evolutionary-related rescue sites. A remarkably good agreement was found between predictions and experimental data. Indeed, almost half of the rescue sites in p53 was correctly predicted by DFS, with 65% of remaining sites in contact with DFS predictions. Similar results were found for other proteins in the evolutionary dataset.
Availability and implementation: The DFS code is available under GPL at https://fornililab.github.io/dfs/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matteo Tiberti
  - School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
- Alessandro Pandini
  - Department of Computer Science, College of Engineering, Design and Physical Sciences and Synthetic Biology Theme, Institute of Environment, Health and Societies, Brunel University London, Uxbridge, London, UK
- Franca Fraternali
  - Randall Division of Cell and Molecular Biophysics, King's College London, London, UK
  - The Francis Crick Institute, London, UK
  - The Thomas Young Centre for Theory and Simulation of Materials, London, UK
- Arianna Fornili
  - School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
  - The Thomas Young Centre for Theory and Simulation of Materials, London, UK
12
Viegas F, Rocha L, Gonçalves M, Mourão F, Sá G, Salles T, Andrade G, Sandin I. A Genetic Programming approach for feature selection in highly dimensional skewed data. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.08.050] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Indexed: 11/29/2022]
13
Treder MS. Improving SNR and Reducing Training Time of Classifiers in Large Datasets via Kernel Averaging. Brain Inform 2018. [DOI: 10.1007/978-3-030-05587-5_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022] Open
14
Maalouf M, Homouz D, Trafalis TB. Logistic regression in large rare events and imbalanced data: A performance comparison of prior correction and weighting methods. Comput Intell 2017. [DOI: 10.1111/coin.12123] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 11/28/2022]
Affiliation(s)
- Maher Maalouf
  - Industrial and Systems Engineering; Khalifa University; Abu Dhabi United Arab Emirates
- Dirar Homouz
  - Applied Mathematics and Science; Khalifa University; Abu Dhabi United Arab Emirates
15
Abstract
Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realize this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease and novel ML approaches to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems with ML to develop a more efficient feature selection process and address the imbalance problem in all genomic data sets. The power of this novel ML approach and our ability to predict epigenetic phenomena and associated disease is suggested. The current approach requires extensive computation of features over the genome. A promising new approach is to introduce Deep Learning (DL) for the generation and simultaneous computation of novel genomic features tuned to the classification task. This approach can be used with any genomic or biological data set applied to medicine. The application of molecular epigenetic data in advanced machine learning analysis to medicine is the focus of this review.
Affiliation(s)
- Lawrence B Holder
  - School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA
- M Muksitul Haque
  - School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA
  - Center for Reproductive Biology, School of Biological Sciences, Washington State University, Pullman, WA, USA
- Michael K Skinner
  - Center for Reproductive Biology, School of Biological Sciences, Washington State University, Pullman, WA, USA
16
Small Random Forest Models for Effective Chemogenomic Active Learning. Journal of Computer Aided Chemistry 2017. [DOI: 10.2751/jcac.18.124] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022]
17
Naik AW, Kangas JD, Sullivan DP, Murphy RF. Active machine learning-driven experimentation to determine compound effects on protein patterns. eLife 2016; 5:e10047. [PMID: 26840049 PMCID: PMC4798950 DOI: 10.7554/elife.10047] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 07/13/2015] [Accepted: 01/28/2016] [Indexed: 12/03/2022] Open
Abstract
High throughput screening determines the effects of many conditions on a given biological target. Currently, to estimate the effects of those conditions on other targets requires either strong modeling assumptions (e.g. similarities among targets) or separate screens. Ideally, data-driven experimentation could be used to learn accurate models for many conditions and targets without doing all possible experiments. We have previously described an active machine learning algorithm that can iteratively choose small sets of experiments to learn models of multiple effects. We now show that, with no prior knowledge and with liquid handling robotics and automated microscopy under its control, this learner accurately learned the effects of 48 chemical compounds on the subcellular localization of 48 proteins while performing only 29% of all possible experiments. The results represent the first practical demonstration of the utility of active learning-driven biological experimentation in which the set of possible phenotypes is unknown in advance.
DOI: http://dx.doi.org/10.7554/eLife.10047.001
Biomedical scientists have invested significant effort into making it easy to perform lots of experiments quickly and cheaply. These “high throughput” methods are the workhorses of modern “systems biology” efforts. However, we simply cannot perform an experiment for every possible combination of different cell type, genetic mutation and other conditions. In practice this has led researchers to either exhaustively test a few conditions or targets, or to try to pick the experiments that best allow a particular problem to be explored. But which experiments should we pick? The ones we think we can predict the outcome of accurately, the ones for which we are uncertain what the results will be, or a combination of the two? Humans are not particularly well suited for this task because it requires reasoning about many possible outcomes at the same time.
However, computers are much better at handling statistics for many experiments, and machine learning algorithms allow computers to “learn” how to make predictions and decisions based on the data they’ve previously processed. Previous computer simulations showed that a machine learning approach termed “active learning” could do a good job of picking a series of experiments to perform in order to efficiently learn a model that predicts the results of experiments that were not done. Now, Naik et al. have performed cell biology experiments in which experiments were chosen by an active learning algorithm and then performed using liquid handling robots and an automated microscope. The key idea behind the approach is that you learn more from an experiment you can’t predict (or that you predicted incorrectly) than from just confirming your confident predictions. The results of the robot-driven experiments showed that the active learning approach outperforms strategies a human might use, even when the potential outcomes of individual experiments are not known beforehand. The next challenge is to apply these methods to reduce the cost of achieving the goals of large projects, such as The Cancer Genome Atlas.
DOI: http://dx.doi.org/10.7554/eLife.10047.002
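The key idea stated above, that an experiment whose outcome cannot be predicted is more informative than one that confirms a confident prediction, can be illustrated with a toy sketch. The example below is not the authors' algorithm (which models many compounds and targets jointly); it assumes a hypothetical one-dimensional pool of candidate experiments whose outcome flips from negative to positive at an unknown threshold, and a hypothetical `oracle` function that stands in for running an experiment.

```python
def active_learn_threshold(pool, oracle):
    """Uncertainty sampling for a 1-D threshold model.

    pool   -- sorted list of candidate experiments (numbers)
    oracle -- callable returning True/False for one queried point;
              outcomes are assumed to be monotone along the pool
    Returns (index of the first positive point, number of queries made).
    """
    # Invariant: pool[:lo] is known negative, pool[hi:] is known positive.
    lo, hi = 0, len(pool)
    queries = 0
    while lo < hi:
        # The middle of the undecided region is the point whose outcome
        # the current model is least able to predict, so query it next.
        mid = (lo + hi) // 2
        queries += 1
        if oracle(pool[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo, queries


if __name__ == "__main__":
    pool = list(range(1000))
    boundary, used = active_learn_threshold(pool, lambda x: x >= 637)
    print(boundary, used, "queries out of", len(pool))
```

In this toy setting the learner recovers the full labelling of the pool after querying only a small fraction of it, whereas testing every candidate would require one experiment per point.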
Affiliation(s)
- Armaghan W Naik
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, United States
  - Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, United States
- Joshua D Kangas
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, United States
  - Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, United States
- Devin P Sullivan
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, United States
  - Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, United States
- Robert F Murphy
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, United States
  - Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, United States
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
  - Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, United States
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, United States
  - Freiburg Institute for Advanced Studies, Albert Ludwig University of Freiburg, Freiburg, Germany
  - Faculty of Biology, Albert Ludwig University of Freiburg, Freiburg, Germany
18
Lang T, Flachsenberg F, von Luxburg U, Rarey M. Feasibility of Active Machine Learning for Multiclass Compound Classification. J Chem Inf Model 2016; 56:12-20. [DOI: 10.1021/acs.jcim.5b00332] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 11/29/2022]
Affiliation(s)
- Ulrike von Luxburg
  - Department of Computer Science, University of Tübingen, 72076 Tübingen, Germany
19
Abstract
Deleterious or 'disease-associated' mutations are mutations that lead to disease with high phenotype penetrance: they are inherited in a simple Mendelian manner, or, in the case of cancer, accumulate in somatic cells leading directly to disease. However, in some cases, the amino acid that is substituted resulting in disease is the wild-type native residue in the functionally equivalent protein in another species. Such examples are known as 'compensated pathogenic deviations' (CPDs) because, somewhere in the second species, there must be compensatory mutations that allow the protein to function normally despite having a residue which would cause disease in the first species. Depending on the nature of the mutations, compensation can occur in the same protein, or in a different protein with which it interacts. In principle, compensation can be achieved by a single mutation (most probably structurally close to the CPD), or by the cumulative effect of several mutations. Although it is clear that these effects occur in proteins, compensatory mutations are also important in RNA potentially having an impact on disease. As a much simpler molecule, RNA provides an interesting model for understanding mechanisms of compensatory effects, both by looking at naturally occurring RNA molecules and as a means of computational simulation. This review surveys the rather limited literature that has explored these effects. Understanding the nature of CPDs is important in understanding traversal along fitness landscape valleys in evolution. It could also have applications in treating diseases that result from such mutations.
20
Reker D, Schneider G. Active-learning strategies in computer-assisted drug discovery. Drug Discov Today 2015; 20:458-65. [DOI: 10.1016/j.drudis.2014.12.004] [Citation(s) in RCA: 100] [Impact Index Per Article: 10.0] [Received: 09/15/2014] [Revised: 11/13/2014] [Accepted: 12/02/2014] [Indexed: 12/20/2022]
21
Hino H, Fujiki J. Adherently Penalized Linear Discriminant Analysis. Journal Japanese Society of Computational Statistics 2015. [DOI: 10.5183/jjscs.1412001_219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
Affiliation(s)
- Hideitsu Hino
  - Graduate School of Systems and Information Engineering, University of Tsukuba
- Jun Fujiki
  - Department of Applied Mathematics, Fukuoka University
22
Zeng XQ, Li GZ. Dimension reduction for p53 protein recognition by using incremental partial least squares. IEEE Trans Nanobioscience 2014; 13:73-9. [PMID: 24893361 DOI: 10.1109/tnb.2014.2319234] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
Abstract
p53 is an important tumor suppressor protein; mutated p53 is found in many kinds of human cancers, and restoring active p53 would lead to tumor regression. In recent years, more and more data have been extracted from biophysical simulations, so the modelling of mutant p53 transcriptional activity suffers from a huge number of instances and high feature dimension. Incremental feature extraction is effective in facilitating the analysis of large-scale data. However, most current incremental feature extraction methods are not suitable for processing big data with high feature dimension. Partial Least Squares (PLS) has been demonstrated to be an effective dimension reduction technique for classification. In this paper, we design a highly efficient and powerful algorithm named Incremental Partial Least Squares (IPLS), which conducts a two-stage extraction process. In the first stage, the PLS target function is adapted to be incremental by updating the historical mean to extract the leading projection direction. In the last stage, the other projection directions are calculated through the equivalence between the PLS vectors and the Krylov sequence. We compare IPLS with some state-of-the-art incremental feature extraction methods such as Incremental Principal Component Analysis, Incremental Maximum Margin Criterion and Incremental Inter-class Scatter on real p53 protein data. Empirical results show that IPLS performs better than the other methods in terms of balanced classification accuracy.
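The first stage described in this abstract, extracting a leading projection direction while updating a historical mean, can be sketched as follows. This is an illustrative sketch only, not the published IPLS algorithm: it assumes a single response variable, uses the standard result that the first PLS weight vector is proportional to the centered cross-covariance between features and response, and updates that quantity one sample at a time with a Welford-style running mean. The class and method names are hypothetical, and the second (Krylov-sequence) stage is not shown.

```python
import math


class IncrementalLeadingDirection:
    """Running estimate of the first PLS direction for a single response y."""

    def __init__(self, n_features):
        self.n = 0
        self.mean_x = [0.0] * n_features
        self.mean_y = 0.0
        # Co-moment: sum over samples of (x - mean_x) * (y - mean_y).
        self.cxy = [0.0] * n_features

    def update(self, x, y):
        """Fold one sample into the running means and the co-moment."""
        self.n += 1
        # Deviations from the OLD feature means.
        dx = [xi - mi for xi, mi in zip(x, self.mean_x)]
        self.mean_x = [mi + di / self.n for mi, di in zip(self.mean_x, dx)]
        self.mean_y += (y - self.mean_y) / self.n
        # Multiply by the deviation from the NEW response mean.
        dy = y - self.mean_y
        self.cxy = [c + di * dy for c, di in zip(self.cxy, dx)]

    def direction(self):
        """Unit vector proportional to the centered cross-covariance."""
        norm = math.sqrt(sum(c * c for c in self.cxy))
        return [c / norm for c in self.cxy] if norm > 0 else list(self.cxy)


if __name__ == "__main__":
    model = IncrementalLeadingDirection(2)
    for x, y in [([1, 2], 0), ([2, 1], 0), ([3, 5], 1), ([4, 3], 1)]:
        model.update(x, y)
    print(model.direction())
```

Because each update touches only the running means and one co-moment vector, memory use is independent of the number of instances, which is the property that makes an incremental formulation attractive for large simulation-derived data sets.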
|
23
|
Kangas JD, Naik AW, Murphy RF. Efficient discovery of responses of proteins to compounds using active learning. BMC Bioinformatics 2014; 15:143. [PMID: 24884564 PMCID: PMC4030446 DOI: 10.1186/1471-2105-15-143] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 11/26/2013] [Accepted: 05/07/2014] [Indexed: 11/13/2022] Open
Abstract
Background: Drug discovery and development has been aided by high throughput screening methods that detect compound effects on a single target. However, when using focused initial screening, undesirable secondary effects are often detected late in the development process after significant investment has been made. An alternative approach would be to screen against undesired effects early in the process, but the number of possible secondary targets makes this prohibitively expensive.
Results: This paper describes methods for making this global approach practical by constructing predictive models for many target responses to many compounds and using them to guide experimentation. We demonstrate for the first time that by jointly modeling targets and compounds using descriptive features and using active machine learning methods, accurate models can be built by doing only a small fraction of possible experiments. The methods were evaluated by computational experiments using a dataset of 177 assays and 20,000 compounds constructed from the PubChem database.
Conclusions: An average of nearly 60% of all hits in the dataset were found after exploring only 3% of the experimental space, which suggests that active learning can be used to enable more complete characterization of compound effects than otherwise affordable. The methods described are also likely to find widespread application outside drug discovery, such as for characterizing the effects of a large number of compounds or inhibitory RNAs on a large number of cell or tissue phenotypes.
Affiliation(s)
- Robert F Murphy
- Lane Center for Computational Biology, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA.
|
24
|
Naik AW, Kangas JD, Langmead CJ, Murphy RF. Efficient modeling and active learning discovery of biological responses. PLoS One 2013; 8:e83996. [PMID: 24358322 PMCID: PMC3866149 DOI: 10.1371/journal.pone.0083996] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 08/31/2013] [Accepted: 11/11/2013] [Indexed: 11/18/2022] Open
Abstract
High throughput and high content screening involve determination of the effect of many compounds on a given target. As currently practiced, screening for each new target typically makes little use of information from screens of prior targets. Further, choices of compounds to advance to drug development are made without significant screening against off-target effects. The overall drug development process could be made more effective, as well as less expensive and time consuming, if potential effects of all compounds on all possible targets could be considered, yet the cost of such full experimentation would be prohibitive. In this paper, we describe a potential solution: probabilistic models that can be used to predict results for unmeasured combinations, and active learning algorithms for efficiently selecting which experiments to perform in order to build those models and determining when to stop. Using simulated and experimental data, we show that our approaches can produce powerful predictive models without exhaustive experimentation and can learn them much faster than by selecting experiments at random.
Affiliation(s)
- Armaghan W. Naik
- Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Joshua D. Kangas
- Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Christopher J. Langmead
- Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Robert F. Murphy
- Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Departments of Biological Sciences, Biomedical Engineering and Machine Learning, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Freiburg Institute for Advanced Studies and Faculty of Biology, Albert Ludwig University of Freiburg, Freiburg, Germany
|
25
|
Danziger SA, Ratushny AV, Smith JJ, Saleem RA, Wan Y, Arens CE, Armstrong AM, Sitko K, Chen WM, Chiang JH, Reiss DJ, Baliga NS, Aitchison JD. Molecular mechanisms of system responses to novel stimuli are predictable from public data. Nucleic Acids Res 2013; 42:1442-60. [PMID: 24185701 PMCID: PMC3919619 DOI: 10.1093/nar/gkt938] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 12/20/2022] Open
Abstract
Systems scale models provide the foundation for an effective iterative cycle between hypothesis generation, experiment and model refinement. Such models also enable predictions facilitating the understanding of biological complexity and the control of biological systems. Here, we demonstrate the reconstruction of a globally predictive gene regulatory model from public data: a model that can drive rational experiment design and reveal new regulatory mechanisms underlying responses to novel environments. Specifically, using ~1500 publicly available genome-wide transcriptome data sets from Saccharomyces cerevisiae, we have reconstructed an environment and gene regulatory influence network that accurately predicts regulatory mechanisms and gene expression changes on exposure of cells to completely novel environments. Focusing on transcriptional networks that induce peroxisome biogenesis, the model-guided experiments allow us to expand a core regulatory network to include novel transcriptional influences and linkage across signaling and transcription. Thus, the approach and model provide a multi-scalar picture of gene dynamics and are powerful resources for exploiting extant data to rationally guide experimentation. The techniques outlined here are generally applicable to any biological system, which is especially important when experimental systems are challenging and samples are difficult and expensive to obtain, a common problem in laboratory animal and human studies.
Affiliation(s)
- Samuel A Danziger
- Seattle Biomedical Research Institute, Seattle, WA 98109-5219 USA, Institute for Systems Biology, Seattle, WA 98109-5240 USA, The Key Laboratory of Developmental Genes and Human Disease, Ministry of Education, Institute of Life Science, Southeast University, Nanjing 210096, China and Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 704, Taiwan
|
26
|
Odell AF, Odell LR, Askham JM, Alogheli H, Ponnambalam S, Hollstein M. A novel p53 mutant found in iatrogenic urothelial cancers is dysfunctional and can be rescued by a second-site global suppressor mutation. J Biol Chem 2013; 288:16704-16714. [PMID: 23612969 DOI: 10.1074/jbc.m112.443168] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022] Open
Abstract
Exposure to herbal remedies containing the carcinogen aristolochic acid (AA) has been widespread in some regions of the world. Rare A→T TP53 mutations were recently discovered in AA-associated urothelial cancers. The near absence of these mutations among all other sequenced human tumors suggests that they could be biologically silent. There are no cell banks with established lines derived from human tumors with which to explore the influence of the novel mutants on p53 function and cellular behavior. To investigate their impact, we generated isogenic mutant clones by integrase-mediated cassette exchange at the p53 locus of platform (null) murine embryonic fibroblasts and kidney epithelial cells. Common tumor mutants (R248W, R273C) were compared with the AA-associated mutants N131Y, R249W, and Q104L. Assays of cell proliferation, migration, growth in soft agar, apoptosis, senescence, and gene expression revealed contrasting outcomes on cellular behavior following introduction of N131Y or Q104L. The N131Y mutant demonstrated a phenotype akin to common tumor mutants, whereas Q104L clone behavior resembled that of cells with wild-type p53. Wild-type p53 responses were restored in double-mutant cells harboring N131Y and N239Y, a second-site rescue mutation, suggesting that pharmaceutical reactivation of p53 function in tumors expressing N131Y could have therapeutic benefit. N131Y is likely to contribute directly to tumor phenotype and is a promising candidate biomarker of AA exposure and disease. Rare mutations thus do not necessarily point to sites where amino acid exchanges are phenotypically neutral. Encounter with mutagenic insults targeting cryptic sites can reveal specific signature hotspots.
Affiliation(s)
- Adam F Odell
- Faculty of Medicine and Health, University of Leeds, Leeds LS2 9JT, United Kingdom.
- Luke R Odell
- Division of Organic Pharmaceutical Chemistry, Department of Medicinal Chemistry, Uppsala University, 75123 Uppsala, Sweden
- Jon M Askham
- Faculty of Medicine and Health, University of Leeds, Leeds LS2 9JT, United Kingdom
- Hiba Alogheli
- Division of Organic Pharmaceutical Chemistry, Department of Medicinal Chemistry, Uppsala University, 75123 Uppsala, Sweden
- Sreenivasan Ponnambalam
- School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9JT, United Kingdom
- Monica Hollstein
- Faculty of Medicine and Health, University of Leeds, Leeds LS2 9JT, United Kingdom; Department C016, German Cancer Research Centre, 69120 Heidelberg, Germany.
|
27
|
Geetha Ramani R, Jacob SG. Prediction of P53 mutants (multiple sites) transcriptional activity based on structural (2D&3D) properties. PLoS One 2013; 8:e55401. [PMID: 23468845 PMCID: PMC3572112 DOI: 10.1371/journal.pone.0055401] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 11/19/2012] [Accepted: 12/21/2012] [Indexed: 01/05/2023] Open
Abstract
Prediction of secondary site mutations that reinstate mutated p53 to normalcy has been the focus of intense research in the recent past, because p53 mutants have been implicated in more than half of all human cancers and restoration of p53 causes tumor regression. However, laboratory investigations are often laborious and resource intensive, whereas computational techniques could surmount these drawbacks. In view of this, we formulated a novel approach utilizing computational techniques to predict the transcriptional activity of multiple site (one-site to five-site) p53 mutants. The optimal MCC values obtained by the proposed approach on prediction of one-site, two-site, three-site, four-site and five-site mutants were 0.775, 0.341, 0.784, 0.916 and 0.655, respectively, the highest reported thus far in the literature. We have also demonstrated that 2D and 3D features generate higher prediction accuracy of p53 activity, and our findings revealed the optimal results for prediction of p53 status reported to date. We believe detection of the secondary site mutations that suppress tumor growth may facilitate better understanding of the relationship between p53 structure and function and further knowledge on the molecular mechanisms and biological activity of p53, a targeted source for cancer therapy. We expect that our prediction methods and reported results may provide useful insights on p53 functional mechanisms and generate more avenues for utilizing computational techniques in biological data analysis.
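For readers unfamiliar with the MCC values quoted in this abstract, the Matthews correlation coefficient for a binary classifier can be computed from the four confusion-matrix counts. The sketch below is a generic illustration of the standard formula, not code from the cited study; the function name and the convention of returning 0.0 when the denominator vanishes are assumptions made here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. Returns a value in [-1, 1]; 0.0 is returned when any
    marginal total is zero (a common convention, assumed here).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0
```

A value of 1 corresponds to perfect prediction, 0 to chance-level prediction, and -1 to total disagreement, which is why MCC is often preferred over plain accuracy when active and inactive mutants are unevenly represented.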
Affiliation(s)
- R. Geetha Ramani
- Department of Information Science and Technology, College of Engineering, Guindy, Anna University, Chennai, Tamilnadu, India
- Shomona Gracia Jacob
- Faculty of Information and Communication Engineering, Anna University, Chennai, Tamilnadu, India
|
28
|
Wassman CD, Baronio R, Demir Ö, Wallentine BD, Chen CK, Hall LV, Salehi F, Lin DW, Chung BP, Wesley Hatfield G, Richard Chamberlin A, Luecke H, Lathrop RH, Kaiser P, Amaro RE. Computational identification of a transiently open L1/S3 pocket for reactivation of mutant p53. Nat Commun 2013; 4:1407. [PMID: 23360998 PMCID: PMC3562459 DOI: 10.1038/ncomms2361] [Citation(s) in RCA: 173] [Impact Index Per Article: 14.4] [Received: 06/06/2012] [Accepted: 12/06/2012] [Indexed: 12/22/2022] Open
Abstract
The tumour suppressor p53 is the most frequently mutated gene in human cancer. Reactivation of mutant p53 by small molecules is an exciting potential cancer therapy. Although several compounds restore wild-type function to mutant p53, their binding sites and mechanisms of action are elusive. Here computational methods identify a transiently open binding pocket between loop L1 and sheet S3 of the p53 core domain. Mutation of residue Cys124, located at the centre of the pocket, abolishes p53 reactivation of mutant R175H by PRIMA-1, a known reactivation compound. Ensemble-based virtual screening against this newly revealed pocket selects stictic acid as a potential p53 reactivation compound. In human osteosarcoma cells, stictic acid exhibits dose-dependent reactivation of p21 expression for mutant R175H more strongly than does PRIMA-1. These results indicate the L1/S3 pocket as a target for pharmaceutical reactivation of p53 mutants.
Affiliation(s)
- Christopher D. Wassman
- Department of Computer Science, University of California, Irvine, Irvine, California 92697, USA
- Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697, USA
- These authors contributed equally to this work
- Present address: Google Inc., 1600 Amphitheatre Parkway Mountain View, California 94043, USA
- Roberta Baronio
- Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697, USA
- Department of Biological Chemistry, University of California, Irvine, Irvine, California 92697, USA
- These authors contributed equally to this work
- Özlem Demir
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA
- These authors contributed equally to this work
- Present addresses: Department of Chemistry and Biochemistry, University of California, San Diego; La Jolla, California 92093, USA
- Brad D. Wallentine
- Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California 92697, USA
- Chiung-Kuang Chen
- Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California 92697, USA
- Linda V. Hall
- Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697, USA
- Department of Biological Chemistry, University of California, Irvine, Irvine, California 92697, USA
- Faezeh Salehi
- Department of Computer Science, University of California, Irvine, Irvine, California 92697, USA
- Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697, USA
- Da-Wei Lin
- Department of Biological Chemistry, University of California, Irvine, Irvine, California 92697, USA
- Benjamin P. Chung
- Department of Biological Chemistry, University of California, Irvine, Irvine, California 92697, USA
- G. Wesley Hatfield
- Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697, USA
- Department of Microbiology and Molecular Genetics, University of California, Irvine, Irvine, California 92697, USA
- Department of Chemical Engineering and Materials Science, University of California, Irvine, Irvine, California 92697, USA
- A. Richard Chamberlin
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, USA
- Chao Family Comprehensive Cancer Center, University of California, Irvine, Irvine, California 92697, USA
- Hartmut Luecke
- Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697, USA
- Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, California 92697, USA
- Chao Family Comprehensive Cancer Center, University of California, Irvine, Irvine, California 92697, USA
- Department of Physiology and Biophysics, University of California, Irvine, Irvine, California 92697, USA
- Center for Biomembrane Systems, University of California, Irvine, Irvine, California 92697, USA
- Richard H. Lathrop
- Department of Computer Science, University of California, Irvine, Irvine, California 92697, USA
- Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697, USA
- Chao Family Comprehensive Cancer Center, University of California, Irvine, Irvine, California 92697, USA
- Department of Biomedical Engineering, University of California, Irvine, Irvine, California 92697, USA
- Peter Kaiser
- Institute for Genomics and Bioinformatics, University of California, Irvine, Irvine, California 92697, USA
- Department of Biological Chemistry, University of California, Irvine, Irvine, California 92697, USA
- Chao Family Comprehensive Cancer Center, University of California, Irvine, Irvine, California 92697, USA
- Rommie E. Amaro
- Department of Computer Science, University of California, Irvine, Irvine, California 92697, USA
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, USA
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, USA
- Present addresses: Department of Chemistry and Biochemistry, University of California, San Diego; La Jolla, California 92093, USA
|
29
|
Romero PA, Stone E, Lamb C, Chantranupong L, Krause A, Miklos AE, Hughes RA, Fechtel B, Ellington AD, Arnold FH, Georgiou G. SCHEMA-designed variants of human Arginase I and II reveal sequence elements important to stability and catalysis. ACS Synth Biol 2012; 1:221-8. [PMID: 22737599 DOI: 10.1021/sb300014t] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 01/08/2023]
Abstract
Arginases catalyze the divalent cation-dependent hydrolysis of L-arginine to urea and L-ornithine. There is significant interest in using arginase as a therapeutic antineogenic agent against L-arginine auxotrophic tumors and in enzyme replacement therapy for treating hyperargininemia. Both therapeutic applications require enzymes with sufficient stability under physiological conditions. To explore sequence elements that contribute to arginase stability we used SCHEMA-guided recombination to design a library of chimeric enzymes composed of sequence fragments from the two human isozymes Arginase I and II. We then developed a novel active learning algorithm that selects sequences from this library that are both highly informative and functional. Using high-throughput gene synthesis and our two-step active learning algorithm, we were able to rapidly create a small but highly informative set of seven enzymatically active chimeras that had an average variant distance of 40 mutations from the closest parent arginase. Within this set of sequences, linear regression was used to identify the sequence elements that contribute to the long-term stability of human arginase under physiological conditions. This approach revealed a striking correlation between the isoelectric point and the long-term stability of the enzyme to deactivation under physiological conditions.
Affiliation(s)
- Andreas Krause
- Department of Computer Science, Swiss Federal Institute of Technology, Zurich, Switzerland
|
30
|
Ensemble-based computational approach discriminates functional activity of p53 cancer and rescue mutants. PLoS Comput Biol 2011; 7:e1002238. [PMID: 22028641 PMCID: PMC3197647 DOI: 10.1371/journal.pcbi.1002238] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Received: 05/31/2011] [Accepted: 09/05/2011] [Indexed: 11/19/2022] Open
Abstract
The tumor suppressor protein p53 can lose its function upon single-point missense mutations in the core DNA-binding domain (“cancer mutants”). Activity can be restored by second-site suppressor mutations (“rescue mutants”). This paper relates the functional activity of p53 cancer and rescue mutants to their overall molecular dynamics (MD), without focusing on local structural details. A novel global measure of protein flexibility for the p53 core DNA-binding domain, the number of clusters at a certain RMSD cutoff, was computed by clustering over 0.7 µs of explicitly solvated all-atom MD simulations. For wild-type p53 and a sample of p53 cancer or rescue mutants, the number of clusters was a good predictor of in vivo p53 functional activity in cell-based assays. This number-of-clusters (NOC) metric was strongly correlated (r² = 0.77) with reported values of experimentally measured ΔΔG protein thermodynamic stability. Interpreting the number of clusters as a measure of protein flexibility: (i) p53 cancer mutants were more flexible than wild-type protein, (ii) second-site rescue mutations decreased the flexibility of cancer mutants, and (iii) negative controls of non-rescue second-site mutants did not. This new method reflects the overall stability of the p53 core domain and can discriminate which second-site mutations restore activity to p53 cancer mutants.

p53 is a tumor suppressor protein that controls a central apoptotic pathway (programmed cell death). Thus, it is the most-mutated gene in human cancers. Due to the marginal stability of p53, a single mutation can abolish p53 function (“cancer mutants”), while a second mutation (or several) can restore it (“rescue mutants”). Restoring p53 function is a promising therapeutic goal that has been strongly supported by recent experimental results on mice. Understanding of the effects of p53 cancer and rescue mutations would be helpful for designing drugs that are able to achieve the same goal. The challenge is that cancer and rescue mutations are distributed widely in the protein, and experimental testing of all possible combinations of mutations is not feasible. This paper describes a simple computational metric that reflects the overall stability of the p53 core domain and can discriminate which second-site mutations restore activity to p53 cancer mutants.
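The number-of-clusters idea described above (count how many clusters an MD trajectory splits into at a fixed RMSD cutoff) can be sketched as follows. The clustering algorithm used in the cited work is not stated in this listing, so the sketch substitutes a simple greedy neighbor-count scheme as a stand-in: repeatedly take the frame with the most unassigned neighbors within the cutoff, remove it together with those neighbors, and count how many rounds are needed. The function name and the assumption of a precomputed pairwise RMSD matrix are hypothetical.

```python
import numpy as np


def number_of_clusters(rmsd, cutoff):
    """Count clusters in a symmetric pairwise RMSD matrix at a given cutoff.

    rmsd:   (n_frames, n_frames) array of pairwise RMSD values, zero diagonal.
    cutoff: RMSD threshold in the same units as `rmsd`.
    Uses greedy neighbor-count clustering (an assumed stand-in algorithm).
    """
    rmsd = np.asarray(rmsd, dtype=float)
    unassigned = np.ones(rmsd.shape[0], dtype=bool)
    count = 0
    while unassigned.any():
        # Neighbor relation restricted to frames that are still unassigned.
        within = (rmsd <= cutoff) & unassigned[None, :] & unassigned[:, None]
        # The frame with the most remaining neighbors becomes the cluster center.
        center = int(np.argmax(within.sum(axis=1)))
        # Remove the center and all of its remaining neighbors.
        unassigned[within[center]] = False
        count += 1
    return count
```

Under this reading, a more flexible protein visits more structurally distinct conformations and therefore yields a larger count at the same cutoff.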
|
31
|
Huang T, Niu S, Xu Z, Huang Y, Kong X, Cai YD, Chou KC. Predicting transcriptional activity of multiple site p53 mutants based on hybrid properties. PLoS One 2011; 6:e22940. [PMID: 21857971 PMCID: PMC3152557 DOI: 10.1371/journal.pone.0022940] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 04/23/2011] [Accepted: 07/01/2011] [Indexed: 11/26/2022] Open
Abstract
p53 is an important tumor suppressor protein; mutated p53 is found in many kinds of human cancers, and restoring active p53 would lead to tumor regression. In this work, we developed a new computational method to predict the transcriptional activity of one-, two-, three- and four-site p53 mutants. With the approach from the general form of pseudo amino acid composition, we used eight types of features to represent the mutation and then selected the optimal prediction features based on the maximum relevance, minimum redundancy, and incremental feature selection methods. The Matthews correlation coefficients (MCC) obtained by using the nearest neighbor algorithm and jackknife cross validation for one-, two-, three- and four-site p53 mutants were 0.678, 0.314, 0.705, and 0.907, respectively. Further analysis of the optimal feature set revealed that the 2D (two-dimensional) structure features composed the largest part of the optimal feature set and may have played the most important roles in all four types of p53 mutant active status prediction. The optimal feature sets, especially those at the top level, also demonstrated that the 3D structure features, conservation, and physicochemical and biochemical properties of amino acids near the mutation site played quite important roles for p53 mutant active status prediction. Our study has provided a new and promising approach for finding functionally important sites and the relevant features for in-depth study of p53 protein and its action mechanism.
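The evaluation protocol named in this abstract, a nearest neighbor classifier scored by jackknife (leave-one-out) cross validation, can be sketched as follows. The sketch assumes Euclidean distance and a plain 1-nearest-neighbor rule, neither of which is specified in this listing, and the function name is hypothetical.

```python
import numpy as np


def jackknife_1nn_predictions(X, y):
    """Leave-one-out predictions from a 1-nearest-neighbor classifier.

    X: (n_samples, n_features) feature matrix.
    y: (n_samples,) class labels.
    Each sample is predicted from all OTHER samples, so no sample votes
    for itself. Euclidean distance is an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Exclude each sample from its own neighbor search.
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)
    return y[nearest]
```

Comparing these held-out predictions with the true labels gives the confusion-matrix counts from which a score such as the MCC is derived.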
Affiliation(s)
- Tao Huang
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Shanghai Center for Bioinformation Technology, Shanghai, People's Republic of China
- Shen Niu
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Zhongping Xu
- Key Laboratory of Stem Cell Biology, Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai, People's Republic of China
- Yun Huang
- Key Laboratory of Stem Cell Biology, Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai, People's Republic of China
- Xiangyin Kong
- Key Laboratory of Stem Cell Biology, Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai, People's Republic of China
- State Key Laboratory of Medical Genomics, Ruijin Hospital, Shanghai Jiaotong University, Shanghai, People's Republic of China
- Yu-Dong Cai
- Institute of Systems Biology, Shanghai University, Shanghai, People's Republic of China
- Centre for Computational Systems Biology, Fudan University, Shanghai, People's Republic of China
- Gordon Life Science Institute, San Diego, California, United States of America
- Kuo-Chen Chou
- Gordon Life Science Institute, San Diego, California, United States of America
|
32
|
Affiliation(s)
- Robert F Murphy
- Lane Center for Computational Biology and the Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.
|
33
|
Baronio R, Danziger SA, Hall LV, Salmon K, Hatfield GW, Lathrop RH, Kaiser P. All-codon scanning identifies p53 cancer rescue mutations. Nucleic Acids Res 2010; 38:7079-88. [PMID: 20581117 PMCID: PMC2978351 DOI: 10.1093/nar/gkq571] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
In vitro scanning mutagenesis strategies are valuable tools to identify critical residues in proteins and to generate proteins with modified properties. We describe the fast and simple All-Codon Scanning (ACS) strategy that creates a defined gene library wherein each individual codon within a specific target region is changed into all possible codons with only a single codon change per mutagenesis product. ACS is based on a multiplexed overlapping mutagenesis primer design that saturates only the targeted gene region with single codon changes. We have used ACS to produce single amino-acid changes in small and large regions of the human tumor suppressor protein p53 to identify single amino-acid substitutions that can restore activity to inactive p53 found in human cancers. Single-tube reactions were used to saturate defined 30-nt regions with all possible codon changes. The same technique was used in 20 parallel reactions to scan the 600-bp fragment encoding the entire p53 core domain. Identification of several novel p53 cancer rescue mutations demonstrated the utility of the ACS approach. ACS is a fast, simple and versatile method, which is useful for protein structure–function analyses and protein design or evolution problems.
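As a rough size check on the library described in this abstract, assume that "all possible codons" means the 63 alternatives to the original codon among the 64 codons of the standard genetic code. Under that assumption a 30-nt region (10 codons) would yield 630 single-codon variants, and the 600-bp core-domain fragment (200 codons, covered by 20 reactions of 30 nt each) would yield 12,600. The helper below is a hypothetical illustration of that arithmetic, not part of the published method.

```python
def acs_library_size(region_nt, codons_total=64):
    """Number of single-codon variants for a scanned region.

    region_nt:    length of the targeted region in nucleotides
                  (must be a whole number of codons).
    codons_total: size of the codon alphabet; 64 for the standard code.
    Assumes every position is replaced by each codon other than the
    original one, with exactly one codon changed per product.
    """
    if region_nt <= 0 or region_nt % 3 != 0:
        raise ValueError("region_nt must be a positive multiple of 3")
    n_codons = region_nt // 3
    return n_codons * (codons_total - 1)
```

For example, `acs_library_size(30)` covers one single-tube reaction and `acs_library_size(600)` the full core-domain scan under the stated assumption.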
Affiliation(s)
- Roberta Baronio
- Institute for Genomics and Bioinformatics, Department of Biomedical Engineering, University of California, Irvine, CA 92697, USA
|