1. Huh E, Agosto MA, Wensel TG, Lichtarge O. Coevolutionary signals in metabotropic glutamate receptors capture residue contacts and long-range functional interactions. J Biol Chem 2023; 299:103030. [PMID: 36806686 PMCID: PMC10060750 DOI: 10.1016/j.jbc.2023.103030] [Received: 08/30/2022] [Revised: 02/09/2023] [Accepted: 02/10/2023]
Abstract
Upon ligand binding to a G protein-coupled receptor, extracellular signals are transmitted into a cell through sets of residue interactions that translate ligand binding into structural rearrangements. These interactions needed for functions impose evolutionary constraints so that, on occasion, mutations in one position may be compensated by other mutations at functionally coupled positions. To quantify the impact of amino acid substitutions in the context of major evolutionary divergence in the G protein-coupled receptor subfamily of metabotropic glutamate receptors (mGluRs), we combined two phylogenetic-based algorithms, Evolutionary Trace and covariation Evolutionary Trace, to infer potential structure-function couplings and roles in mGluRs. We found a subset of evolutionarily important residues at known functional sites and evidence of coupling among distinct structural clusters in mGluR. In addition, experimental mutagenesis and functional assays confirmed that some highly covariant residues are coupled, revealing their synergy. Collectively, these findings inform a critical step toward understanding the molecular and structural basis of amino acid variation patterns within mGluRs and provide insight for drug development, protein engineering, and analysis of naturally occurring variants.
Affiliation(s)
- Eunna Huh
  - Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, Texas, USA
- Melina A Agosto
  - Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA; Retina and Optic Nerve Research Laboratory, Department of Physiology and Biophysics, Dalhousie University, Halifax, Canada
- Theodore G Wensel
  - Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA
- Olivier Lichtarge
  - Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, Texas, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA

2. Recurrent high-impact mutations at cognate structural positions in class A G protein-coupled receptors expressed in tumors. Proc Natl Acad Sci U S A 2021; 118:2113373118. [PMID: 34916293 PMCID: PMC8713800 DOI: 10.1073/pnas.2113373118] [Accepted: 11/01/2021]
Abstract
GPCRs and GPCR pathways are increasingly being implicated in human malignancies, placing them among the most promising cancer drug candidates. Our results reveal enrichment of highly impactful, recurrent GPCR mutations within cancers. We found that cognate mutations in selected class A GPCRs have deleterious effects on signaling function. The results also suggest that olfactory receptors, often considered inconsequential, display a nonrandom mutation pattern in tumors in which they are expressed. These findings support the idea that protein paralogs can act in parallel as members of an onco-group. G protein-coupled receptors (GPCRs) are the largest family of human proteins. They have a common structure and, signaling through a much smaller set of G proteins, arrestins, and effectors, activate downstream pathways that often modulate hallmark mechanisms of cancer. Because there are many more GPCRs than effectors, mutations in different receptors could perturb signaling similarly so as to favor a tumor. We hypothesized that somatic mutations in tumor samples may not be enriched within a single gene but rather that cognate mutations with similar effects on GPCR function are distributed across many receptors. To test this possibility, we systematically aggregated somatic cancer mutations across class A GPCRs and found a nonrandom distribution of positions with variant amino acid residues. Individual cancer types were enriched for highly impactful, recurrent mutations at selected cognate positions of known functional motifs. We also discovered that no single receptor drives this pattern, but rather multiple receptors contain amino acid substitutions at a few cognate positions. Phenotypic characterization suggests these mutations induce perturbation of G protein activation and/or β-arrestin recruitment. These data suggest that recurrent impactful oncogenic mutations perturb different GPCRs to subvert signaling and promote tumor growth or survival. 
The possibility that multiple different GPCRs could moonlight as drivers or enablers of a given cancer through mutations located at cognate positions across GPCR paralogs opens a window into cancer mechanisms and potential approaches to therapeutics.

3. Novikov IB, Wilkins AD, Lichtarge O. An Evolutionary Trace method defines functionally important bases and sites common to RNA families. PLoS Comput Biol 2020; 16:e1007583. [PMID: 32208421 PMCID: PMC7092961 DOI: 10.1371/journal.pcbi.1007583] [Received: 11/07/2018] [Accepted: 11/27/2019]
Abstract
Functional non-coding (fnc)RNAs are nucleotide sequences of varied lengths, structures, and mechanisms that ubiquitously influence gene expression and translation, genome stability and dynamics, and human health and disease. Here, to shed light on their functional determinants, we seek to exploit the evolutionary record of variation and divergence read from sequence comparisons. The approach follows the phylogenetic Evolutionary Trace (ET) paradigm, first developed and extensively validated on proteins. We assigned a relative rank of importance to every base in a study of 1070 functional RNAs, including the ribosome, and observed evolutionary patterns strikingly similar to those seen in proteins, namely, (1) the top-ranked bases clustered in secondary and tertiary structures. (2) In turn, these clusters mapped functional regions for catalysis, binding proteins and drugs, post-transcriptional modification, and deleterious mutations. (3) Moreover, the quantitative quality of these clusters correlated with the identification of functional regions. (4) As a result of this correlation, smoother structural distributions of evolutionarily important nucleotides improved functional site predictions. Thus, in practice, phylogenetic analysis can broadly identify functional determinants in RNA sequences and functional sites in RNA structures, and reveal details on the basis of RNA molecular functions. As an example of application, we report several previously undocumented and potentially functional ET nucleotide clusters in the ribosome. This work is broadly relevant to studies of structure-function in ribonucleic acids. Additionally, this generalization of ET shows that evolutionary constraints among sequence, structure, and function are similar in structured RNA and proteins. RNA ET is currently available as part of the ET command-line package, and will be available as a web-server.
Traditionally, RNA has been relegated to the role of an intermediate between DNA and proteins. However, we now recognize that RNAs are broadly functional beyond their role in translation, and that a number of diverse classes exist. Because functional, non-coding RNAs are prevalent in biology and impact human health, it is important to better understand their functional determinants. However, the classical solution to this problem, targeted mutagenesis, is time-consuming and scales poorly. We propose an alternative computational approach to this problem, the Evolutionary Trace method. Previously developed and validated for proteins, Evolutionary Trace examines the evolutionary history of a molecule and predicts evolutionarily important residues in the sequence. We apply Evolutionary Trace to a set of diverse RNAs, and find that the evolutionarily important nucleotides cluster on the three-dimensional structure, and that these clusters closely overlap functional sites. We also find that the clustering property can be used to refine and improve predictions. These findings are in close agreement with our observations of Evolutionary Trace in proteins, and suggest that structured functional RNAs and proteins evolve under similar constraints. In practice, the approach is to be used by RNA researchers seeking insight into their molecule of interest, and the Evolutionary Trace program, along with a working example, is available at https://github.com/LichtargeLab/RNA_ET_ms.
Affiliation(s)
- Ilya B. Novikov
  - Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, United States of America
- Angela D. Wilkins
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America

4. Cancer-associated mutations in the ribosomal protein L5 gene dysregulate the HDM2/p53-mediated ribosome biogenesis checkpoint. Oncogene 2020; 39:3443-3457. [PMID: 32108164 DOI: 10.1038/s41388-020-1231-6] [Received: 08/05/2019] [Revised: 02/14/2020] [Accepted: 02/17/2020]
Abstract
Perturbations in ribosome biogenesis have been associated with cancer. Such aberrations activate p53 through the RPL5/RPL11/5S rRNA complex-mediated inhibition of HDM2. Studies using animal models have suggested that this signaling pathway might constitute an important anticancer barrier. To gain a deeper insight into this issue in humans, here we analyze somatic mutations in RPL5 and RPL11 coding regions, reported in The Cancer Genome Atlas and International Cancer Genome Consortium databases. Using a combined computational and statistical approach, complemented by a range of biochemical and functional analyses in human cancer cell models, we demonstrate the existence of several mechanisms by which RPL5 mutations may impair wild-type p53 upregulation and ribosome biogenesis. Unexpectedly, the same approach provides only modest evidence for a similar role of RPL11, suggesting that RPL5 represents a preferred target during human tumorigenesis in cancers with wild-type p53. Furthermore, we find that several functional cancer-associated RPL5 somatic mutations occur as rare germline variants in general population. Our results shed light on the so-far enigmatic role of cancer-associated mutations in genes encoding ribosomal proteins, with implications for our understanding of the tumor suppressive role of the RPL5/RPL11/5S rRNA complex in human malignancies.

5. Combinatorial inhibition of PTPN12-regulated receptors leads to a broadly effective therapeutic strategy in triple-negative breast cancer. Nat Med 2018; 24:505-511. [PMID: 29578538 DOI: 10.1038/nm.4507] [Received: 12/05/2016] [Accepted: 01/29/2018]
Abstract
Triple-negative breast cancer (TNBC) is an aggressive subtype of breast cancer diagnosed in more than 200,000 women each year and is recalcitrant to targeted therapies. Although TNBCs harbor multiple hyperactive receptor tyrosine kinases (RTKs), RTK inhibitors have been largely ineffective in TNBC patients thus far. We developed a broadly effective therapeutic strategy for TNBC that is based on combined inhibition of receptors that share the negative regulator PTPN12. Previously, we and others identified the tyrosine phosphatase PTPN12 as a tumor suppressor that is frequently inactivated in TNBC. PTPN12 restrains several RTKs, suggesting that PTPN12 deficiency leads to aberrant activation of multiple RTKs and a co-dependency on these receptors. This in turn leads to the therapeutic hypothesis that PTPN12-deficient TNBCs may be responsive to combined RTK inhibition. However, the repertoire of RTKs that are restrained by PTPN12 in human cells has not been systematically explored. By methodically identifying the suite of RTK substrates (MET, PDGFRβ, EGFR, and others) inhibited by PTPN12, we rationalized a combination RTK-inhibitor therapy that induced potent tumor regression across heterogeneous models of TNBC. Orthogonal approaches revealed that PTPN12 was recruited to and inhibited these receptors after ligand stimulation, thereby serving as a feedback mechanism to limit receptor signaling. Cancer-associated mutation of PTPN12 or reduced PTPN12 protein levels diminished this feedback mechanism, leading to aberrant activity of these receptors. Restoring PTPN12 protein levels restrained signaling from RTKs, including PDGFRβ and MET, and impaired TNBC survival. In contrast with single agents, combined inhibitors targeting the PDGFRβ and MET receptors induced apoptosis in TNBC cells in vitro and in vivo. This therapeutic strategy resulted in tumor regressions in chemo-refractory patient-derived TNBC models. Notably, response correlated with PTPN12 deficiency, suggesting that impaired receptor feedback may establish a combined addiction to these proto-oncogenic receptors. Taken together, our data provide a rationale for combining RTK inhibitors in TNBC and other malignancies that lack receptor-activating mutations.

6. Nemoto W, Saito A, Oikawa H. Recent advances in functional region prediction by using structural and evolutionary information - Remaining problems and future extensions. Comput Struct Biotechnol J 2013; 8:e201308007. [PMID: 24688747 PMCID: PMC3962155 DOI: 10.5936/csbj.201308007] [Received: 08/27/2013] [Revised: 11/12/2013] [Accepted: 11/13/2013]
Abstract
Structural genomics projects have solved many new structures with unknown functions. One strategy to investigate the function of a structure is to computationally find the functionally important residues or regions on it. Therefore, the development of functional region prediction methods has become an important research subject. An effective approach is to use a method employing structural and evolutionary information, such as the evolutionary trace (ET) method. ET ranks the residues of a protein structure by calculating the scores for relative evolutionary importance, and locates functionally important sites by identifying spatial clusters of highly ranked residues. After ET was developed, numerous ET-like methods were subsequently reported, and many of them are in practical use, although they require certain conditions. In this mini review, we first introduce the remaining problems and the recent improvements in the methods using structural and evolutionary information. We then summarize the recent developments of the methods. Finally, we conclude by describing possible extensions of the evolution- and structure-based methods.
Affiliation(s)
- Wataru Nemoto
  - Division of Life Science and Engineering, School of Science and Engineering, Tokyo Denki University (TDU), Ishizaka, Hatoyama-cho, Hiki-gun, Saitama, 350-0394, Japan
- Akira Saito
  - Division of Life Science and Engineering, School of Science and Engineering, Tokyo Denki University (TDU), Ishizaka, Hatoyama-cho, Hiki-gun, Saitama, 350-0394, Japan
- Hayato Oikawa
  - Division of Life Science and Engineering, School of Science and Engineering, Tokyo Denki University (TDU), Ishizaka, Hatoyama-cho, Hiki-gun, Saitama, 350-0394, Japan

7. Wilkins AD, Venner E, Marciano DC, Erdin S, Atri B, Lua RC, Lichtarge O. Accounting for epistatic interactions improves the functional analysis of protein structures. Bioinformatics 2013; 29:2714-21. [PMID: 24021383 PMCID: PMC3799481 DOI: 10.1093/bioinformatics/btt489]
Abstract
Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure, which we term pair-interaction Evolutionary Trace, yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Angela D Wilkins
  - Department of Molecular and Human Genetics, CIBR Center for Computational and Integrative Biomedical Research and Program in Structural and Computational Biology & Molecular Biophysics, Baylor College of Medicine, Houston, TX 77030 and Center for Human Genetic Research, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA

8. Anusuya S, Natarajan J. Multi-targeted therapy for leprosy: insilico strategy to overcome multi drug resistance and to improve therapeutic efficacy. Infection, Genetics and Evolution 2012; 12:1899-910. [PMID: 22981928 DOI: 10.1016/j.meegid.2012.08.013] [Received: 04/07/2012] [Revised: 08/01/2012] [Accepted: 08/17/2012]
Abstract
Leprosy remains a major public health problem, since single and multi-drug resistance has been reported worldwide over the last two decades. In the present study, we report a novel multi-targeted therapy for leprosy to overcome multi-drug resistance and to improve therapeutic efficacy. If multiple enzymes of an essential metabolic pathway of a bacterium were targeted, then the therapy would become more effective and could prevent the occurrence of drug resistance. The MurC, MurD, MurE and MurF enzymes of the peptidoglycan biosynthetic pathway were selected for multi-targeted therapy. The conserved or class-specific active site residues important for function or stability were predicted using evolutionary trace analysis and site-directed mutagenesis studies. Ten such residues, each present in at least three of the four Mur enzymes (MurC, MurD, MurE and MurF), were identified. Among the ten residues, G125, K126, T127 and G293 (numbered based on their position in MurC) were found to be conserved in all four Mur enzymes of the entire bacterial kingdom. In addition, K143, T144, T166, G168, H234 and Y329 (numbered based on their position in MurE) were significant in binding substrates and/or co-factors needed for the functional events in any three of the Mur enzymes. These are the probable residues for designing newer anti-leprosy drugs in an attempt to reduce drug resistance.
Affiliation(s)
- Shanmugam Anusuya
  - Department of Bioinformatics, VMKV Engineering College, Vinayaka Missions University, Salem 636 308, India

9. Li B, Kihara D. Protein docking prediction using predicted protein-protein interface. BMC Bioinformatics 2012; 13:7. [PMID: 22233443 PMCID: PMC3287255 DOI: 10.1186/1471-2105-13-7] [Received: 08/26/2011] [Accepted: 01/10/2012]
Abstract
Background: Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. Results: We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. Conclusion: We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Affiliation(s)
- Bin Li
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA

10.
Abstract
The evolutionary trace (ET) is the single most validated approach to identify protein functional determinants and to target mutational analysis, protein engineering and drug design to the most relevant sites of a protein. It applies to the entire proteome; its predictions come with a reliability score; and its results typically reach significance in most protein families with 20 or more sequence homologs. In order to identify functional hot spots, ET scans a multiple sequence alignment for residue variations that correlate with major evolutionary divergences. In case studies this enables the selective separation, recoding, or mimicry of functional sites and, on a large scale, this enables specific function predictions based on motifs built from select ET-identified residues. ET is therefore an accurate, scalable and efficient method to identify the molecular determinants of protein function and to direct their rational perturbation for therapeutic purposes. Public ET servers are located at: http://mammoth.bcm.tmc.edu/.
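The idea of scanning an alignment for residue variations that track evolutionary divergence can be illustrated with a toy example. The sketch below is not the ET implementation: ET derives its groups from a phylogenetic tree and produces ranks, whereas here the groups are supplied by hand and each column simply receives one of three labels. The function name `classify_positions`, the sequences, and the group labels are all hypothetical.

```python
from collections import defaultdict


def classify_positions(alignment, groups):
    """Label each alignment column as conserved, class-specific, or variable.

    alignment: dict of sequence name -> aligned sequence (all the same length)
    groups:    dict of sequence name -> group label (assumed given; a real
               analysis would take the groups from branches of a tree)
    """
    length = len(next(iter(alignment.values())))
    labels = []
    for col in range(length):
        # Collect the residue types seen in this column, per group.
        residues_by_group = defaultdict(set)
        for name, seq in alignment.items():
            residues_by_group[groups[name]].add(seq[col])
        all_residues = set().union(*residues_by_group.values())
        if len(all_residues) == 1:
            # Same residue in every sequence.
            labels.append("conserved")
        elif all(len(res) == 1 for res in residues_by_group.values()):
            # Invariant within each group, but differing between groups.
            labels.append("class-specific")
        else:
            # Varies inside at least one group.
            labels.append("variable")
    return labels


# Hypothetical four-sequence alignment split into two groups.
alignment = {
    "seqA1": "MKTLD",
    "seqA2": "MKTLE",
    "seqB1": "MRTVD",
    "seqB2": "MRTVG",
}
groups = {"seqA1": "A", "seqA2": "A", "seqB1": "B", "seqB2": "B"}
labels = classify_positions(alignment, groups)
```

In this toy alignment, the second and fourth columns change only between the two groups, which is the kind of pattern the abstract describes as variation correlating with a major divergence.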

11. Determinants, discriminants, conserved residues--a heuristic approach to detection of functional divergence in protein families. PLoS One 2011; 6:e24382. [PMID: 21931701 PMCID: PMC3171465 DOI: 10.1371/journal.pone.0024382] [Received: 04/15/2011] [Accepted: 08/08/2011]
Abstract
In this work, belonging to the field of comparative analysis of protein sequences, we focus on detection of functional specialization on the residue level. As the input, we take a set of sequences divided into groups of orthologues, each group known to be responsible for a different function. This provides two independent pieces of information: within group conservation and overlap in amino acid type across groups. We build our discussion around the set of scoring functions that keep the two separated and make the signal easy to trace back to its source. We propose a heuristic description of functional divergence that includes residue type exchangeability, both in the conservation and in the overlap measure, and does not make any assumptions on the rate of evolution in the groups other than the one under consideration. Residue types acceptable at a certain position within an orthologous group are described as a distribution which evolves in time, starting from a single ancestral type, and is subject to constraints that can be inferred only indirectly. To estimate the strength of the constraints, we compare the observed degrees of conservation and overlap with those expected in the hypothetical case of a freely evolving distribution. Our description matches the experiment well, but we also conclude that any attempt to capture the evolutionary behavior of specificity determining residues in terms of a scalar function will be tentative, because no single model can cover the variety of evolutionary behavior such residues exhibit. In particular, models expecting the same type of evolutionary behavior across functionally divergent groups tend to miss a portion of information otherwise retrievable by the conservation and overlap measures they use.

12. Häberle J, Shchelochkov OA, Wang J, Katsonis P, Hall L, Reiss S, Eeds A, Willis A, Yadav M, Summar S, Lichtarge O, Rubio V, Wong LJ, Summar M. Molecular defects in human carbamoy phosphate synthetase I: mutational spectrum, diagnostic and protein structure considerations. Hum Mutat 2011; 32:579-89. [PMID: 21120950 DOI: 10.1002/humu.21406] [Received: 07/21/2010] [Accepted: 10/27/2010]
Abstract
Deficiency of carbamoyl phosphate synthetase I (CPSI) results in hyperammonemia ranging from neonatally lethal to environmentally induced adult-onset disease. Over 24 years, analysis of tissue and DNA samples from 205 unrelated individuals diagnosed with CPSI deficiency (CPSID) detected 192 unique CPS1 gene changes, of which 130 are reported here for the first time. Pooled with the already reported mutations, they constitute a total of 222 changes, including 136 missense, 15 nonsense, 50 changes of other types resulting in enzyme truncation, and 21 other changes causing in-frame alterations. Only ∼10% of the mutations recur in unrelated families, predominantly affecting CpG dinucleotides, further complicating the diagnosis because of the "private" nature of such mutations. Missense changes are unevenly distributed along the gene, highlighting the existence of CPSI regions having greater functional importance than other regions. We exploit the crystal structure of the CPSI allosteric domain to rationalize the effects of mutations affecting it. Comparative modeling is used to create a structural model for the remainder of the enzyme. Missense changes are found to correlate directly with the one-residue evolutionary importance and inversely with the solvent accessibility of the mutated residue. This is the first large-scale report of CPS1 mutations spanning a wide variety of molecular defects highlighting important regions in this protein.
Affiliation(s)
- Johannes Häberle
  - University Children's Hospital Zurich, Division of Metabolism, Zurich, Switzerland

13. Lua RC, Lichtarge O. PyETV: a PyMOL evolutionary trace viewer to analyze functional site predictions in protein complexes. Bioinformatics 2010; 26:2981-2. [PMID: 20929911 DOI: 10.1093/bioinformatics/btq566]
Abstract
Summary: PyETV is a PyMOL plugin for viewing, analyzing and manipulating predictions of evolutionarily important residues and sites in protein structures and their complexes. It seamlessly captures the output of the Evolutionary Trace server, namely ranked importance of residues, for multiple chains of a complex. It then yields a high resolution graphical interface showing their distribution and clustering throughout a quaternary structure, including at interfaces. Together with other tools in the popular PyMOL viewer, PyETV thus provides a novel tool to integrate evolutionary forces into the design of experiments targeting the most functionally relevant sites of a protein. Availability: The PyETV module is written in Python. Installation instructions and video demonstrations may be found at the URL http://mammoth.bcm.tmc.edu/traceview/HelpDocs/PyETVHelp/pyInstructions.html. Contact: lichtarge@bcm.tmc.edu.
Affiliation(s)
- Rhonald C Lua
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA

14. Wilkins AD, Lua R, Erdin S, Ward RM, Lichtarge O. Sequence and structure continuity of evolutionary importance improves protein functional site discovery and annotation. Protein Sci 2010; 19:1296-311. [PMID: 20506260 DOI: 10.1002/pro.406]
Abstract
Protein functional sites control most biological processes and are important targets for drug design and protein engineering. To characterize them, the evolutionary trace (ET) ranks the relative importance of residues according to their evolutionary variations. Generally, top-ranked residues cluster spatially to define evolutionary hotspots that predict functional sites in structures. Here, various functions that measure the physical continuity of ET ranks among neighboring residues in the structure, or in the sequence, are shown to inform sequence selection and to improve functional site resolution. This is shown first, in 110 proteins, for which the overlap between top-ranked residues and actual functional sites rose by 8% in significance. Then, on a structural proteomic scale, optimized ET led to better 3D structure-function motifs (3D templates) and, in turn, to enzyme function prediction by the Evolutionary Trace Annotation (ETA) method with better sensitivity (40% to 53%) and positive predictive value (93% to 94%). This suggests that the similarity of evolutionary importance among neighboring residues in the sequence and in the structure is a universal feature of protein evolution. In practice, this yields a tool for optimizing sequence selections for comparative analysis and, via ET, for better predictions of functional sites and function. This should prove useful for the efficient mutational redesign of protein function and for pharmaceutical targeting.
Affiliation(s)
- A D Wilkins
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
|
15
|
Evolution: a guide to perturb protein function and networks. Curr Opin Struct Biol 2010; 20:351-9. [PMID: 20444593 DOI: 10.1016/j.sbi.2010.04.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Received: 04/06/2010] [Accepted: 04/08/2010] [Indexed: 12/11/2022]
Abstract
Protein interactions give rise to networks that control cell fate in health and disease; selective means to probe these interactions are therefore of wide interest. We discuss here Evolutionary Tracing (ET), a comparative method to identify protein functional sites and to guide experiments that selectively block, recode, or mimic their amino acid determinants. These studies suggest, in principle, a scalable approach to perturb individual links in protein networks.
|
16
|
Kalman M, Ben-Tal N. Quality assessment of protein model-structures using evolutionary conservation. Bioinformatics 2010; 26:1299-307. [PMID: 20385730 PMCID: PMC2865859 DOI: 10.1093/bioinformatics/btq114] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 11/13/2022]
Abstract
Motivation: Programs that evaluate the quality of a protein structural model are important both for validating the structure determination procedure and for guiding the model-building process. Such programs are based on properties of native structures that are generally not expected for faulty models. One such property, which is rarely used for automatic structure quality assessment, is the tendency for conserved residues to be located at the structural core and for variable residues to be located at the surface. Results: We present ConQuass, a novel quality assessment program based on the consistency between the model structure and the protein's conservation pattern. We show that it can identify problematic structural models, and that the scores it assigns to the server models in CASP8 correlate with the similarity of the models to the native structure. We also show that when the conservation information is reliable, the method's performance is comparable and complementary to that of the other single-structure quality assessment methods that participated in CASP8 and that do not use additional structural information from homologs. Availability: A Perl implementation of the method, as well as the various Perl and R scripts used for the analysis, are available at http://bental.tau.ac.il/ConQuass/. Contact: nirb@tauex.tau.ac.il Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matan Kalman
- Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
|
17
|
Evolution-guided discovery and recoding of allosteric pathway specificity determinants in psychoactive bioamine receptors. Proc Natl Acad Sci U S A 2010; 107:7787-92. [PMID: 20385837 DOI: 10.1073/pnas.0914877107] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.6] [Indexed: 11/18/2022] Open
Abstract
G protein-coupled receptors for dopamine and serotonin control signaling pathways targeted by many psychoactive drugs. A puzzle is how receptors with similar functions and nearly identical binding site structures, such as D2 dopamine receptors and 5-HT2A serotonin receptors, could evolve a mechanism that discriminates stringently in their cellular responses between endogenous neurotransmitters. We used the Difference Evolutionary Trace (Difference-ET) and residue-swapping to uncover two distinct sets of specificity-determining sequence positions. One at the ligand-binding pocket determines the relative affinities for these two ligands, and a distinct, surprising set of positions outside the binding site determines whether a bound ligand can trigger the conformational rearrangement leading to G protein activation. Thus one site specifies affinity while the other encodes a filter for efficacy. These findings demonstrate that allosteric pathways linking distant interactions via alternate conformational states enforce specificity independently of the ligand-binding site, such that either one may be rationally rekeyed to different ligands. The conversion of a dopamine receptor effectively into a serotonin receptor illustrates the plasticity of GPCR signaling during evolution, or in pathological states, and suggests new approaches to drug discovery, targeting both classes of sites.
|
18
|
Baameur F, Morgan DH, Yao H, Tran TM, Hammitt RA, Sabui S, McMurray JS, Lichtarge O, Clark RB. Role for the regulator of G-protein signaling homology domain of G protein-coupled receptor kinases 5 and 6 in beta 2-adrenergic receptor and rhodopsin phosphorylation. Mol Pharmacol 2010; 77:405-15. [PMID: 20038610 PMCID: PMC2835418 DOI: 10.1124/mol.109.058115] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Received: 05/28/2009] [Accepted: 12/28/2009] [Indexed: 11/22/2022] Open
Abstract
Phosphorylation of G protein-coupled receptors (GPCRs) by GPCR kinases (GRKs) is a major mechanism of desensitization of these receptors. GPCR activation of GRKs involves an allosteric site on GRKs distinct from the catalytic site. Although recent studies have suggested an important role of the N- and C-termini and domains surrounding the kinase active site in allosteric activation, the nature of that site and the relative roles of the RH domain in particular remain unknown. Based on evolutionary trace analysis of both the RH and kinase domains of the GRK family, we identified an important cluster encompassing helices 3, 9, and 10 in the RH domain in addition to sites in the kinase domain. To define its function, a panel of GRK5 and -6 mutants was generated and screened by intact-cell assay of constitutive GRK phosphorylation of the β2-adrenergic receptor (β2AR), in vitro GRK phosphorylation of light-activated rhodopsin, and basal catalytic activity measured by tubulin phosphorylation and autophosphorylation. A number of double mutations within helices 3, 9, and 10 reduced phosphorylation of the β2AR and rhodopsin by 50 to 90% relative to wild-type GRK, as well as autophosphorylation and tubulin phosphorylation. Based on these results, helix 9 peptide mimetics were designed, and several were found to inhibit rhodopsin phosphorylation by GRK5 with an IC50 of approximately 30 μM. In summary, our studies have uncovered previously unrecognized functionally important sites in the regulator of G-protein signaling homology domain of GRK5 and -6 and identified a peptide inhibitor with potential for specific blockade of GRK-mediated phosphorylation of receptors.
Affiliation(s)
- Faiza Baameur
- Department of Integrative Biology and Pharmacology, University of Texas Health Science Center, Medical School, 6431 Fannin St, Houston, TX 77030, USA
|
19
|
Nimrod G, Schushan M, Steinberg DM, Ben-Tal N. Detection of functionally important regions in "hypothetical proteins" of known structure. Structure 2009; 16:1755-63. [PMID: 19081051 DOI: 10.1016/j.str.2008.10.017] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Received: 04/16/2008] [Revised: 10/16/2008] [Accepted: 10/19/2008] [Indexed: 10/21/2022]
Abstract
Structural genomics initiatives provide ample structures of "hypothetical proteins" (i.e., proteins of unknown function) at an ever increasing rate. However, without function annotation, this structural goldmine is of little use to biologists who are interested in particular molecular systems. To this end, we used (an improved version of) the PatchFinder algorithm for the detection of functional regions on the protein surface, which could mediate its interactions with, e.g., substrates, ligands, and other proteins. Examination, using a data set of annotated proteins, showed that PatchFinder outperforms similar methods. We collected 757 structures of hypothetical proteins and their predicted functional regions in the N-Func database. Inspection of several of these regions demonstrated that they are useful for function prediction. For example, we suggested an interprotein interface and a putative nucleotide-binding site. A web-server implementation of PatchFinder and the N-Func database are available at http://patchfinder.tau.ac.il/.
Affiliation(s)
- Guy Nimrod
- Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, 69978 Tel Aviv, Israel
|
20
|
Rajagopalan L, Pereira FA, Lichtarge O, Brownell WE. Identification of functionally important residues/domains in membrane proteins using an evolutionary approach coupled with systematic mutational analysis. Methods Mol Biol 2009; 493:287-97. [PMID: 18839354 PMCID: PMC2673147 DOI: 10.1007/978-1-59745-523-7_17] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 02/12/2023]
Abstract
Structure-function studies of membrane proteins present a unique challenge to researchers due to the numerous technical difficulties associated with their expression, purification and structural characterization. In the absence of structural information, rational identification of putative functionally important residues/regions is difficult. Phylogenetic relationships could provide valuable information about the functional significance of a particular residue or region of a membrane protein. Evolutionary Trace (ET) analysis is a method developed to utilize this phylogenetic information to predict functional sites in proteins. In this method, residues are ranked according to conservation or divergence through evolution, based on the hypothesis that mutations at key positions should coincide with functional evolutionary divergences. This information can be used as the basis for a systematic mutational analysis of identified residues, leading to the identification of functionally important residues and/or domains in membrane proteins, in the absence of structural information apart from the primary amino acid sequence. This approach is potentially useful in the context of the auditory system, as several key processes in audition involve the action of membrane proteins, many of which are novel and not well characterized structurally or functionally to date.
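The ranking idea described in this abstract can be sketched as follows. This is a minimal, simplified illustration of an integer-valued trace, assuming the phylogenetic tree has already been cut into nested partitions of sequences (coarse to fine). The function name `trace_rank` and the input layout are hypothetical and do not reproduce the published ET software.

```python
from typing import Dict, List, Sequence

def trace_rank(alignment: Dict[str, str],
               partitions: Sequence[Sequence[Sequence[str]]]) -> List[int]:
    """Rank each alignment column by the coarsest partition at which the
    column is invariant within every group (lower rank = more important).

    `partitions[k]` is assumed to split the sequence names into k + 1
    groups, ordered from the tree root toward the leaves.
    """
    length = len(next(iter(alignment.values())))
    ranks: List[int] = []
    for col in range(length):
        # Default: the column never becomes invariant within groups.
        rank = len(partitions) + 1
        for level, groups in enumerate(partitions, start=1):
            invariant = all(
                len({alignment[name][col] for name in group}) == 1
                for group in groups
            )
            if invariant:
                rank = level
                break
        ranks.append(rank)
    return ranks

# Toy alignment with four sequences and three columns.
toy_alignment = {"a": "ACD", "b": "ACE", "c": "AGF", "d": "AGF"}
toy_partitions = [
    [["a", "b", "c", "d"]],
    [["a", "b"], ["c", "d"]],
    [["a"], ["b"], ["c", "d"]],
]
toy_ranks = trace_rank(toy_alignment, toy_partitions)
```

In the toy data, the first column is fully conserved, the second varies only between the two major branches, and the third varies within a branch, so the ranks increase from left to right.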
Affiliation(s)
- Lavanya Rajagopalan
- Bobby R. Alford Department of Otolaryngology- Head and Neck Surgery, Baylor College of Medicine, Houston, Texas 77030, USA
- Fred A. Pereira
- Bobby R. Alford Department of Otolaryngology- Head and Neck Surgery, Baylor College of Medicine, Houston, Texas 77030, USA, Huffington Center on Aging and Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas 77030, USA
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
- William E. Brownell
- Bobby R. Alford Department of Otolaryngology- Head and Neck Surgery, Baylor College of Medicine, Houston, Texas 77030, USA
|
21
|
Fasnacht M, Zhu J, Honig B. Local quality assessment in homology models using statistical potentials and support vector machines. Protein Sci 2007; 16:1557-68. [PMID: 17600147 PMCID: PMC2203356 DOI: 10.1110/ps.072856307] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 10/23/2022]
Abstract
In this study, we address the problem of local quality assessment in homology models. As a prerequisite for the evaluation of methods for predicting local model quality, we first examine the problem of measuring local structural similarities between a model and the corresponding native structure. Several local geometric similarity measures are evaluated. Two methods based on structural superposition are found to best reproduce local model quality assessments by human experts. We then examine the performance of state-of-the-art statistical potentials in predicting local model quality on three qualitatively distinct data sets. The best statistical potential, DFIRE, is shown to perform on par with the best current structure-based method in the literature, ProQres. A combination of different statistical potentials and structural features using support vector machines is shown to provide somewhat improved performance over published methods.
Affiliation(s)
- Marc Fasnacht
- Howard Hughes Medical Institute at Columbia University, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, New York, New York 10032, USA
|
22
|
Abstract
Protein-protein interactions create the macromolecular assemblies and sequential signaling pathways essential for cell function. Their number far exceeds the number of proteins themselves and their experimental characterization, while improving, remains relatively slow. For these reasons, novel computational methods have important roles to play in understanding the physical basis of protein interactions, and in constraining the molecular basis of their specificity. This paper discusses methods based on multiple sequence alignments of protein homologues and phylogenetic trees.
Affiliation(s)
- Ivica Res
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
|
23
|
Yao H, Mihalek I, Lichtarge O. Rank information: a structure-independent measure of evolutionary trace quality that improves identification of protein functional sites. Proteins 2006; 65:111-23. [PMID: 16894615 DOI: 10.1002/prot.21101] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 12/13/2022]
Abstract
Protein functional sites are key targets for drug design and protein engineering, but their large-scale experimental characterization remains difficult. The evolutionary trace (ET) is a computational approach to this problem that has been useful in a variety of case studies, but its proteomic scale application is partially hindered because automated retrieval of input sequences from databases often includes some with errors that degrade functional site identification. To recognize and purge these sequences, this study introduces a novel and structure-free measure of ET quality called rank information (RI). It is shown that RI decreases in response to errors in sequences, alignments, or functional classifications. Conversely, an automated procedure to increase RI by selectively removing sequences improves functional site identification so as to nearly match manually curated traces in kinases and in a test set of 79 diverse proteins. Thus we conclude that RI partially reflects the evolutionary consistency of sequence, structure, and function. In practice, as the size of the proteome continues to grow exponentially, it provides a novel and structure-free measure of ET quality that increases its accuracy for large-scale automated annotation of protein functional sites.
Affiliation(s)
- Hui Yao
- Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas 77030, USA
|
24
|
Raviscioni M, He Q, Salicru EM, Smith CL, Lichtarge O. Evolutionary identification of a subtype specific functional site in the ligand binding domain of steroid receptors. Proteins 2006; 64:1046-57. [PMID: 16835908 DOI: 10.1002/prot.21074] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
Abstract
Nuclear receptors are ubiquitous eukaryotic ligand-activated transcription factors that modulate gene expression through varied interactions. However, the highly conserved functional sites known today seem insufficient to explain receptor-specific recruitment of different coactivator and corepressor proteins and regulation of transcription. To search for new receptor-subtype specific functional sites, we applied difference evolutionary trace (difference ET) analysis to the ligand binding domain of steroid receptors, a subgroup of the nuclear receptor (NR) family. This computational approach identified a new functional site located on a surface opposite to currently known protein-protein interaction sites and distinct from the ligand binding pocket. Strikingly, the literature shows that in vivo variations at residues in the new site are linked to androgen resistance and leukemia, and our own targeted mutations to this site lower but do not eradicate transcriptional activation by estrogen receptor alpha (ERα), with reduced ligand binding affinity and SRC-1 interaction. Thus, these data demonstrate that this evolutionarily important surface can function as an allosteric site that modulates some but not all receptor binding interactions. Evolutionary analysis further shows that this allosteric regulatory site is shared among all NRs from groups 2 (HNF4-like) and 4 (NGFIB-like), suggesting a role among many nuclear receptors. Its concave structure, hydrophobic composition, and residue variability among nuclear receptors further suggest that it would be amenable to specific drug design. This highlights the power of evolutionary information for the identification of new functional sites even in a protein family as well studied as NRs.
Affiliation(s)
- Michele Raviscioni
- W. M. Keck Center for Computational and Structural Biology, Baylor College of Medicine, Houston, Texas 77030, USA
|
25
|
Morgan DH, Kristensen DM, Mittelman D, Lichtarge O. ET viewer: an application for predicting and visualizing functional sites in protein structures. Bioinformatics 2006; 22:2049-50. [PMID: 16809388 DOI: 10.1093/bioinformatics/btl285] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.0] [Indexed: 11/13/2022] Open
Abstract
Summary: The Evolutionary Trace Viewer (ETV) provides a one-stop environment in which to run, visualize and interpret Evolutionary Trace (ET) predictions of functional sites in protein structures. ETV is implemented using Java to run across different operating systems using Java Web Start technology. Availability: The ETV is available for download from our website at http://mammoth.bcm.tmc.edu/traceview/index.html. This webpage also links to sample trace results and a user manual that describes ET Viewer functions in detail.
Affiliation(s)
- Daniel H Morgan
- Department of Molecular and Human Genetics, One Baylor Plaza, Houston, TX 77030, USA
|
26
|
Abstract
In this report, we demonstrate that phylogenetic motifs, sequence regions conserving the overall familial phylogeny, represent a promising approach to protein functional site prediction. Across our structurally and functionally heterogeneous data set, phylogenetic motifs consistently correspond to functional sites defined by both surface loops and active site clefts. Additionally, the partially buried prosthetic group regions of cytochrome P450 and succinate dehydrogenase are identified as phylogenetic motifs. In nearly all instances, phylogenetic motifs are structurally clustered, despite little overall sequence proximity, around key functional site features. Based on calculated false-positive expectations and standard motif identification methods, we show that phylogenetic motifs are generally conserved in sequence. This result implies that they can be considered motifs in the traditional sense as well. However, there are instances where phylogenetic motifs are not (overall) well conserved in sequence. This point is enticing, because it implies that phylogenetic motifs are able to identify key sequence regions that traditional motif-based approaches would not. Further, phylogenetic motif results are also shown to be consistent with evolutionary trace results, and bootstrapping is used to demonstrate tree significance.
Affiliation(s)
- David La
- Department of Biological Sciences, California State Polytechnic University, Pomona, California 91768, USA
|
27
|
Mihalek I, Res I, Lichtarge O. Evolutionary and structural feedback on selection of sequences for comparative analysis of proteins. Proteins 2006; 63:87-99. [PMID: 16397893 DOI: 10.1002/prot.20866] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]
Abstract
It has been noted that slowly evolving protein residues have two properties: (a) they tend to cluster in the native fold, and (b) they delineate functional surfaces, that is, parts of the surface through which the protein interacts with other proteins or small ligands. Herein, we demonstrate that the two are coupled sufficiently strongly that one effect, when observed, statistically implies the other. Detection of both can be accomplished in multiple sequence alignment related methods by the careful selection of relevant sequences. For the demonstration, we use two sets of protein families: a small set of diverse proteins with diverse functional surfaces, and a large set of homodimerizing enzymes. A practical outcome of our considerations is a simple prescriptive rule for the selection of homologous sequences for the comparative analysis of proteins: in order to optimize the detection of (potentially unknown) functional surfaces, it is sufficient to select sequences in such a way that the residues observed at any level of evolutionary divergence, as implied by the alignment, cluster on the folded protein.
Affiliation(s)
- I Mihalek
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.
|
28
|
Kristensen DM, Chen BY, Fofanov VY, Ward RM, Lisewski AM, Kimmel M, Kavraki LE, Lichtarge O. Recurrent use of evolutionary importance for functional annotation of proteins based on local structural similarity. Protein Sci 2006; 15:1530-6. [PMID: 16672239 PMCID: PMC2242527 DOI: 10.1110/ps.062152706] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Indexed: 10/24/2022]
Abstract
The annotation of protein function has not kept pace with the exponential growth of raw sequence and structure data. An emerging solution to this problem is to identify 3D motifs or templates in protein structures that are necessary and sufficient determinants of function. Here, we demonstrate the recurrent use of evolutionary trace information to construct such 3D templates for enzymes, search for them in other structures, and distinguish true from spurious matches. Serine protease templates built from evolutionarily important residues distinguish between proteases and other proteins nearly as well as the classic Ser-His-Asp catalytic triad. In 53 enzymes spanning 33 distinct functions, an automated pipeline identifies functionally related proteins with an average positive predictive power of 62%, including correct matches to proteins with the same function but with low sequence identity (the average identity for some templates is only 17%). Although these template building, searching, and match classification strategies are not yet optimized, their sequential implementation demonstrates a functional annotation pipeline which does not require experimental information, but only local molecular mimicry among a small number of evolutionarily important residues.
Affiliation(s)
- David M Kristensen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
|
29
|
Muppirala UK, Li Z. A simple approach for protein structure discrimination based on the network pattern of conserved hydrophobic residues. Protein Eng Des Sel 2006; 19:265-75. [PMID: 16565147 DOI: 10.1093/protein/gzl009] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022] Open
Abstract
Evolutionarily conserved hydrophobic residues at the core of protein structures are generally assumed to play a structural role in protein folding and stability. Recent studies have suggested that their importance to protein structures is uneven, with a few of them being crucial and the rest being secondary. In this work, we explored the possibility of employing this feature of native structures for discriminating non-native structures from native ones. First, we developed a network tool to quantitatively measure the structural contributions of individual amino acid residues. We systematically applied this method to diverse fold-type sets of native proteins. It was confirmed that this method could grasp the essential structural features of native proteins. Next, we applied it to a number of decoy sets of proteins. The results indicate that such an approach indeed identified non-native structures in most test cases. This finding should be of help for the investigation of the fundamental problem of protein structure prediction.
Affiliation(s)
- Usha K Muppirala
- Bioinformatics Program, University of the Sciences in Philadelphia, Philadelphia, PA 19104, USA
|
30
|
Bowman BR, Welschhans RL, Jayaram H, Stow ND, Preston VG, Quiocho FA. Structural characterization of the UL25 DNA-packaging protein from herpes simplex virus type 1. J Virol 2006; 80:2309-17. [PMID: 16474137 PMCID: PMC1395411 DOI: 10.1128/jvi.80.5.2309-2317.2006] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.4] [Received: 10/25/2005] [Accepted: 12/08/2005] [Indexed: 11/20/2022] Open
Abstract
Herpesviruses replicate their double-stranded DNA genomes as high-molecular-weight concatemers which are subsequently cleaved into unit-length genomes by a complex mechanism that is tightly coupled to DNA insertion into a preformed capsid structure, the procapsid. The herpes simplex virus type 1 UL25 protein is incorporated into the capsid during DNA packaging, and previous studies of a null mutant have demonstrated that its function is essential at the late stages of the head-filling process, either to allow packaging to proceed to completion or for retention of the viral genome within the capsid. We have expressed and purified an N-terminally truncated form of the 580-residue UL25 protein and have determined the crystallographic structure of the region corresponding to amino acids 134 to 580 at 2.1-Å resolution. This structure, the first for any herpesvirus protein involved in processing and packaging of viral DNA, reveals a novel fold, a distinctive electrostatic distribution, and a unique "flexible" architecture in which numerous flexible loops emanate from a stable core. Evolutionary trace analysis of UL25 and its homologues in other herpesviruses was used to locate potentially important amino acids on the surface of the protein, leading to the identification of four putative docking regions for protein partners.
Affiliation(s)
- Brian R Bowman
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA
|
31
|
Mihalek I, Res I, Lichtarge O. A structure and evolution-guided Monte Carlo sequence selection strategy for multiple alignment-based analysis of proteins. Bioinformatics 2005; 22:149-56. [PMID: 16303797 DOI: 10.1093/bioinformatics/bti791] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]
Abstract
Motivation: Various multiple sequence alignment-based methods have been proposed to detect functional surfaces in proteins, such as active sites or protein interfaces. The effect that the choice of sequences has on the conclusions of such analysis has seldom been discussed. In particular, no method has been discussed in terms of its ability to optimize the sequence selection for the reliable detection of functional surfaces. Results: Here we propose, for the case of proteins with known structure, a heuristic Metropolis Monte Carlo strategy to select sequences from a large set of homologues, in order to improve detection of functional surfaces. The quantity guiding the optimization is the clustering of residues which are under increased evolutionary pressure, according to the sample of sequences under consideration. We show that we can either improve the overlap of our prediction with known functional surfaces in comparison with the sequence similarity criteria of selection or match the quality of prediction obtained through more elaborate non-structure-based methods of sequence selection. For the purpose of demonstration we use a set of 50 homodimerizing enzymes which were co-crystallized with their substrates and cofactors.
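The Metropolis Monte Carlo selection described in this abstract can be sketched in general terms as below. This is a minimal sketch under stated assumptions: the `score` callable stands in for the paper's clustering measure, which is not reproduced here, and the names `metropolis_select` and `toy_score`, the toggle move, and the temperature value are all hypothetical choices for illustration.

```python
import math
import random
from typing import Callable, FrozenSet, List, Tuple

def metropolis_select(pool: List[str],
                      score: Callable[[FrozenSet[str]], float],
                      steps: int = 1000,
                      temperature: float = 0.1,
                      seed: int = 0) -> Tuple[FrozenSet[str], float]:
    """Search for a subset of `pool` that maximizes `score`.

    Each step toggles one sequence in or out of the current selection and
    accepts the move with the Metropolis criterion. `score` is a
    placeholder for a structure-based clustering measure.
    """
    rng = random.Random(seed)
    current = frozenset(pool)
    current_score = score(current)
    best, best_score = current, current_score
    for _ in range(steps):
        name = rng.choice(pool)
        proposal = current ^ {name}  # toggle membership of one sequence
        if not proposal:
            continue  # never evaluate an empty selection
        delta = score(proposal) - current_score
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = proposal, current_score + delta
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score

def toy_score(selection: FrozenSet[str]) -> float:
    """Toy objective: reward names starting with 's', penalize the rest."""
    good = sum(1 for n in selection if n.startswith("s"))
    return float(good - (len(selection) - good))

toy_pool = ["s1", "s2", "s3", "bad1", "bad2"]
toy_best, toy_best_score = metropolis_select(toy_pool, toy_score)
```

Tracking the best selection seen so far, rather than returning the final state, keeps the result from degrading if the chain happens to end on a temporarily worse selection.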
Affiliation(s)
- I Mihalek
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
|
32
|
Res I, Mihalek I, Lichtarge O. An evolution based classifier for prediction of protein interfaces without using protein structures. Bioinformatics 2005; 21:2496-501. [PMID: 15728113 DOI: 10.1093/bioinformatics/bti340] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.1] [Indexed: 11/12/2022] Open
Abstract
Motivation: The number of available protein structures still lags far behind the number of known protein sequences. This makes it important to predict which residues participate in protein-protein interactions using only sequence information. Few studies have tackled this problem until now. Results: We applied support vector machines to sequences in order to generate a classification of all protein residues into those that are part of a protein interface and those that are not. For the first time evolutionary information was used as one of the attributes, and this inclusion of evolutionary importance rankings improves the classification. Leave-one-out cross-validation experiments show that prediction accuracy reaches 64%.
Affiliation(s)
- I Res
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
|
33
|
Chakravarty S, Hutson AM, Estes MK, Prasad BVV. Evolutionary trace residues in noroviruses: importance in receptor binding, antigenicity, virion assembly, and strain diversity. J Virol 2005; 79:554-68. [PMID: 15596848 PMCID: PMC538680 DOI: 10.1128/jvi.79.1.554-568.2005] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.7] [Received: 01/20/2004] [Accepted: 08/30/2004] [Indexed: 11/20/2022]
Abstract
Noroviruses cause major epidemic gastroenteritis in humans. A large number of strains of these single-stranded RNA viruses have been reported. Due to the absence of infectious clones of noroviruses and the high sequence variability in their capsids, it has not been possible to identify functionally important residues in these capsids. Consequently, norovirus strain diversity is not understood on the basis of capsid functions, and the development of therapeutic compounds has been hampered. To determine functionally important residues in noroviruses, we have analyzed a number of norovirus capsid sequences in the context of the Norwalk virus capsid crystal structure by using the evolutionary trace method. This analysis has identified capsid protein residues that uniquely characterize different norovirus strains and provide new insights into capsid assembly and disassembly pathways and the strain diversity of these viruses. Such residues form specific three-dimensional clusters that may be of functional importance in noroviruses. One of these clusters includes residues known to participate in the proteolytic cleavage of these viruses at high pH. Other clusters are formed in capsid regions known to be important in the binding of antibodies to noroviruses, thereby indicating residues that may be important in the antigenicity of these viruses. The highly variable region of the capsid shows a distinct cluster whose residues may participate in norovirus-receptor interactions.
Affiliation(s)
- Sugoto Chakravarty
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030-3498, USA
|
34
|
Sadreyev RI, Grishin NV. Estimates of statistical significance for comparison of individual positions in multiple sequence alignments. BMC Bioinformatics 2004; 5:106. [PMID: 15296518 PMCID: PMC516024 DOI: 10.1186/1471-2105-5-106] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 04/01/2004] [Accepted: 08/05/2004] [Indexed: 11/17/2022]
Abstract
Background: Profile-based analysis of multiple sequence alignments (MSA) allows for accurate comparison of protein families. Here, we address the problems of detecting statistically confident dissimilarities between (1) MSA position and a set of predicted residue frequencies, and (2) between two MSA positions. These problems are important for (i) evaluation and optimization of methods predicting residue occurrence at protein positions; (ii) detection of potentially misaligned regions in automatically produced alignments and their further refinement; and (iii) detection of sites that determine functional or structural specificity in two related families. Results: For problems (1) and (2), we propose analytical estimates of P-value and apply them to the detection of significant positional dissimilarities in various experimental situations. (a) We compare structure-based predictions of residue propensities at a protein position to the actual residue frequencies in the MSA of homologs. (b) We evaluate our method by the ability to detect erroneous position matches produced by an automatic sequence aligner. (c) We compare MSA positions that correspond to residues aligned by automatic structure aligners. (d) We compare MSA positions that are aligned by high-quality manual superposition of structures. Detected dissimilarities reveal shortcomings of the automatic methods for residue frequency prediction and alignment construction. For the high-quality structural alignments, the dissimilarities suggest sites of potential functional or structural importance. Conclusion: The proposed computational method is of significant potential value for the analysis of protein families.
Affiliation(s)
- Ruslan I Sadreyev
- Howard Hughes Medical Institute, and Department of Biochemistry, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Nick V Grishin
- Howard Hughes Medical Institute, and Department of Biochemistry, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
|
35
|
Mihalek I, Res I, Lichtarge O. A family of evolution-entropy hybrid methods for ranking protein residues by importance. J Mol Biol 2004; 336:1265-82. [PMID: 15037084 DOI: 10.1016/j.jmb.2003.12.078] [Citation(s) in RCA: 231] [Impact Index Per Article: 11.0] [Received: 07/25/2003] [Revised: 12/05/2003] [Accepted: 12/19/2003] [Indexed: 11/17/2022]
Abstract
In order to identify the amino acids that determine protein structure and function it is useful to rank them by their relative importance. Previous approaches belong to two groups: those that rely on statistical inference, and those that focus on phylogenetic analysis. Here, we introduce a class of hybrid methods that combine evolutionary and entropic information from multiple sequence alignments. A detailed analysis in insulin receptor kinase domain and tests on proteins that are well-characterized experimentally show the hybrids' greater robustness with respect to the input choice of sequences, as well as improved sensitivity and specificity of prediction. This is a further step toward proteome scale analysis of protein structure and function.
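The entropic ingredient mentioned in this abstract can be illustrated with a minimal sketch that ranks alignment columns by Shannon entropy. This shows only the plain entropy part; the phylogenetic weighting that makes the published methods hybrids is omitted, and the function names `column_entropy` and `rank_columns` and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in nats) of the residue frequencies in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log(n / total) for n in counts.values())

def rank_columns(alignment):
    """Return column indices ordered from lowest to highest entropy, plus the entropies."""
    columns = list(zip(*alignment))  # transpose rows (sequences) into columns
    entropies = [column_entropy(col) for col in columns]
    order = sorted(range(len(columns)), key=lambda i: entropies[i])
    return order, entropies

# Toy alignment of four equal-length sequences (gaps are not handled here).
alignment = ["ACDE", "ACDF", "ACEG", "ATEH"]
order, entropies = rank_columns(alignment)
```

Low-entropy columns are the most conserved, so the first indices in `order` are the candidate important positions under this simplified criterion.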
Affiliation(s)
- I Mihalek
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza T921, Houston, TX 77030, USA
|
36
|
Innis CA, Anand AP, Sowdhamini R. Prediction of functional sites in proteins using conserved functional group analysis. J Mol Biol 2004; 337:1053-68. [PMID: 15033369 DOI: 10.1016/j.jmb.2004.01.053] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.9] [Received: 08/26/2003] [Revised: 01/20/2004] [Accepted: 01/28/2004] [Indexed: 11/21/2022]
Abstract
A detailed knowledge of a protein's functional site is an absolute prerequisite for understanding its mode of action at the molecular level. However, the rapid pace at which sequence and structural information is being accumulated for proteins greatly exceeds our ability to determine their biochemical roles experimentally. As a result, computational methods are required which allow for the efficient processing of the evolutionary information contained in this wealth of data, in particular that related to the nature and location of functionally important sites and residues. The method presented here, referred to as conserved functional group (CFG) analysis, relies on a simplified representation of the chemical groups found in amino acid side-chains to identify functional sites from a single protein structure and a number of its sequence homologues. We show that CFG analysis can fully or partially predict the location of functional sites in approximately 96% of the 470 cases tested and that, unlike other methods available, it is able to tolerate wide variations in sequence identity. In addition, we discuss its potential in a structural genomics context, where automation, scalability and efficiency are critical, and an increasing number of protein structures are determined with no prior knowledge of function. This is exemplified by our analysis of the hypothetical protein Ydde_Ecoli, whose structure was recently solved by members of the North East Structural Genomics consortium. Although the proposed active site for this protein needs to be validated experimentally, this example illustrates the scope of CFG analysis as a general tool for the identification of residues likely to play an important role in a protein's biochemical function. Thus, our method offers a convenient solution to rapidly and automatically process the vast amounts of data that are beginning to emerge from structural genomics projects.
Affiliation(s)
- C Axel Innis
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS-GKVK Campus, Bellary Road, Bangalore 560065, India.
|
37
|
Madabushi S, Gross AK, Philippi A, Meng EC, Wensel TG, Lichtarge O. Evolutionary trace of G protein-coupled receptors reveals clusters of residues that determine global and class-specific functions. J Biol Chem 2003; 279:8126-32. [PMID: 14660595 DOI: 10.1074/jbc.m312671200] [Citation(s) in RCA: 152] [Impact Index Per Article: 6.9] [Indexed: 11/06/2022]
Abstract
G protein-coupled receptor (GPCR) activation mediated by ligand-induced structural reorganization of its helices is poorly understood. To determine the universal elements of this conformational switch, we used evolutionary tracing (ET) to identify residue positions commonly important in diverse GPCRs. When mapped onto the rhodopsin structure, these trace residues cluster into a network of contacts from the retinal binding site to the G protein-coupling loops. Their roles in a generic transduction mechanism were verified by 211 of 239 published mutations that caused functional defects. When grouped according to the nature of the defects, these residues sub-divided into three striking sub-clusters: a trigger region, where mutations mostly affect ligand binding, a coupling region near the cytoplasmic interface to the G protein, where mutations affect G protein activation, and a linking core in between where mutations cause constitutive activity and other defects. Differential ET analysis of the opsin family revealed an additional set of opsin-specific residues, several of which form part of the retinal binding pocket, and are known to cause functional defects upon mutation. To test the predictive power of ET, we introduced novel mutations in bovine rhodopsin at a globally important position, Leu-79, and at an opsin-specific position, Trp-175. Both were functionally critical, causing constitutive G protein activation of the mutants and rapid loss of regeneration after photobleaching. These results define in GPCRs a canonical signal transduction mechanism where ligand binding induces conformational changes propagated through adjacent trigger, linking core, and coupling regions.
Affiliation(s)
- Srinivasan Madabushi
- Program in Structural and Computational Biology and Molecular Biophysics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
|