1
Usmanova DR, Plata G, Vitkup D. Functional Optimization in Distinct Tissues and Conditions Constrains the Rate of Protein Evolution. Mol Biol Evol 2024; 41:msae200. [PMID: 39431545 PMCID: PMC11523136 DOI: 10.1093/molbev/msae200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 07/29/2024] [Accepted: 08/05/2024] [Indexed: 10/22/2024]
Abstract
Understanding the main determinants of protein evolution is a fundamental challenge in biology. Despite many decades of active research, the molecular and cellular mechanisms underlying the substantial variability of evolutionary rates across cellular proteins are not currently well understood. It also remains unclear how protein molecular function is optimized in the context of multicellular species and why many proteins, such as enzymes, are only moderately efficient on average. Our analysis of genomics and functional datasets reveals in multiple organisms a strong inverse relationship between the optimality of protein molecular function and the rate of protein evolution. Furthermore, we find that highly expressed proteins tend to be substantially more functionally optimized. These results suggest that cellular expression costs lead to more pronounced functional optimization of abundant proteins and that the purifying selection to maintain high levels of functional optimality significantly slows protein evolution. We observe that in multicellular species both the rate of protein evolution and the degree of protein functional efficiency are primarily affected by expression in several distinct cell types and tissues, specifically, in developed neurons with upregulated synaptic processes in animals and in young and fast-growing tissues in plants. Overall, our analysis reveals how various constraints from the molecular, cellular, and species' levels of biological organization jointly affect the rate of protein evolution and the level of protein functional adaptation.
Affiliation(s)
- Dinara R Usmanova
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
- Germán Plata
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - BiomEdit, Fishers, IN 46037, USA
- Dennis Vitkup
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
2
Man J, Harrington TA, Lally K, Bartlett ME. Asymmetric Evolution of Protein Domains in the Leucine-Rich Repeat Receptor-Like Kinase Family of Plant Signaling Proteins. Mol Biol Evol 2023; 40:msad220. [PMID: 37787619 PMCID: PMC10588794 DOI: 10.1093/molbev/msad220] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/29/2023] [Revised: 08/29/2023] [Accepted: 09/26/2023] [Indexed: 10/04/2023]
Abstract
The coding sequences of developmental genes are expected to be deeply conserved, with cis-regulatory change driving the modulation of gene function. In contrast, proteins with roles in defense are expected to evolve rapidly, in molecular arms races with pathogens. However, some gene families include both developmental and defense genes. In these families, does the tempo and mode of evolution differ between genes with divergent functions, despite shared ancestry and structure? The leucine-rich repeat receptor-like kinase (LRR-RLKs) protein family includes members with roles in plant development and defense, thus providing an ideal system for answering this question. LRR-RLKs are receptors that traverse plasma membranes. LRR domains bind extracellular ligands; RLK domains initiate intracellular signaling cascades in response to ligand binding. In LRR-RLKs with roles in defense, LRR domains evolve faster than RLK domains. To determine whether this asymmetry extends to LRR-RLKs that function primarily in development, we assessed evolutionary rates and tested for selection acting on 11 subfamilies of LRR-RLKs, using deeply sampled protein trees. To assess functional evolution, we performed heterologous complementation assays in Arabidopsis thaliana (Arabidopsis). We found that the LRR domains of all tested LRR-RLK proteins evolved faster than their cognate RLK domains. All tested subfamilies of LRR-RLKs had strikingly similar patterns of molecular evolution, despite divergent functions. Heterologous transformation experiments revealed that multiple mechanisms likely contribute to the evolution of LRR-RLK function, including escape from adaptive conflict. Our results indicate specific and distinct evolutionary pressures acting on LRR versus RLK domains, despite diverse organismal roles for LRR-RLK proteins.
Affiliation(s)
- Jarrett Man
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01002, USA
- T A Harrington
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01002, USA
- Kyra Lally
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01002, USA
- Madelaine E Bartlett
  - Department of Biology, University of Massachusetts Amherst, Amherst, MA 01002, USA
3
Ma R, Sun L, Chen X, Mei B, Chang G, Wang M, Zhao D. Proteomic Analyses Provide Novel Insights into Plant Growth and Ginsenoside Biosynthesis in Forest Cultivated Panax ginseng (F. Ginseng). Frontiers in Plant Science 2016; 7:1. [PMID: 26858731 PMCID: PMC4726751 DOI: 10.3389/fpls.2016.00001] [Citation(s) in RCA: 91] [Impact Index Per Article: 10.1] [Received: 09/16/2016] [Accepted: 01/05/2016] [Indexed: 05/18/2023]
Abstract
F. Ginseng (Panax ginseng) is planted in the forest to enhance the natural ginseng resources, which have an immense medicinal and economic value. The morphology of the cultivated plants becomes similar to that of wild growing ginseng (W. Ginseng) over the years. So far, there have been no studies highlighting the physiological or functional changes in F. Ginseng and its wild counterparts. In the present study, we used proteomic technologies (2DE and iTRAQ) coupled to mass spectrometry to compare W. Ginseng and F. Ginseng at various growth stages. Hierarchical cluster analysis based on protein abundance revealed that the protein expression profile of 25-year-old F. Ginseng was more like W. Ginseng than less 20-year-old F. Ginseng. We identified 192 differentially expressed protein spots in F. Ginseng. These protein spots increased with increase in growth years of F. Ginseng and were associated with proteins involved in energy metabolism, ginsenosides biosynthesis, and stress response. The mRNA, physiological, and metabolic analysis showed that the external morphology, protein expression profile, and ginsenoside synthesis ability of the F. Ginseng increased just like that of W. Ginseng with the increase in age. Our study represents the first characterization of the proteome of F. Ginseng during development and provides new insights into the metabolism and accumulation of ginsenosides.
Affiliation(s)
- Rui Ma
  - Jilin Technology Innovation Center for Chinese Medicine Biotechnology, College of Chemistry and Biology, Beihua University, Jilin, China
  - Ginseng Research Center, Changchun University of Chinese Medicine, Changchun, China
- Liwei Sun
  - Jilin Technology Innovation Center for Chinese Medicine Biotechnology, College of Chemistry and Biology, Beihua University, Jilin, China
  - *Correspondence: Liwei Sun
- Xuenan Chen
  - Ginseng Research Center, Changchun University of Chinese Medicine, Changchun, China
  - The first affiliated hospital to Changchun University of Chinese Medicine, Changchun, China
- Bing Mei
  - Ginseng Research Center, Changchun University of Chinese Medicine, Changchun, China
- Guijuan Chang
  - Ginseng Research Center, Changchun University of Chinese Medicine, Changchun, China
- Manying Wang
  - Jilin Technology Innovation Center for Chinese Medicine Biotechnology, College of Chemistry and Biology, Beihua University, Jilin, China
- Daqing Zhao
  - Ginseng Research Center, Changchun University of Chinese Medicine, Changchun, China
4
Faure G, Koonin EV. Universal distribution of mutational effects on protein stability, uncoupling of protein robustness from sequence evolution and distinct evolutionary modes of prokaryotic and eukaryotic proteins. Phys Biol 2015; 12:035001. [PMID: 25927823 DOI: 10.1088/1478-3975/12/3/035001] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 01/04/2023]
Abstract
Robustness to destabilizing effects of mutations is thought of as a key factor of protein evolution. The connections between two measures of robustness, the relative core size and the computationally estimated effect of mutations on protein stability (ΔΔG), protein abundance and the selection pressure on protein-coding genes (dN/dS) were analyzed for the organisms with a large number of available protein structures including four eukaryotes, two bacteria and one archaeon. The distribution of the effects of mutations in the core on protein stability is universal and indistinguishable in eukaryotes and bacteria, centered at slightly destabilizing amino acid replacements, and with a heavy tail of more strongly destabilizing replacements. The distribution of mutational effects in the hyperthermophilic archaeon Thermococcus gammatolerans is significantly shifted toward strongly destabilizing replacements which is indicative of stronger constraints that are imposed on proteins in hyperthermophiles. The median effect of mutations is strongly, positively correlated with the relative core size, in evidence of the congruence between the two measures of protein robustness. However, both measures show only limited correlations to the expression level and selection pressure on protein-coding genes. Thus, the degree of robustness reflected in the universal distribution of mutational effects appears to be a fundamental, ancient feature of globular protein folds whereas the observed variations are largely neutral and uncoupled from short term protein evolution. A weak anticorrelation between protein core size and selection pressure is observed only for surface residues in prokaryotes but a stronger anticorrelation is observed for all residues in eukaryotic proteins. This substantial difference between proteins of prokaryotes and eukaryotes is likely to stem from the demonstrable higher compactness of prokaryotic proteins.
Affiliation(s)
- Guilhem Faure
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
5
Schüler A, Ghanbarian AT, Hurst LD. Purifying selection on splice-related motifs, not expression level nor RNA folding, explains nearly all constraint on human lincRNAs. Mol Biol Evol 2014; 31:3164-83. [PMID: 25158797 PMCID: PMC4245815 DOI: 10.1093/molbev/msu249] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 12/28/2022]
Abstract
There are two strong and equally important predictors of rates of human protein evolution: The amount the gene is expressed and the proportion of exonic sequence devoted to control splicing, mediated largely by selection on exonic splice enhancer (ESE) motifs. Is the same true for noncoding RNAs, known to be under very weak purifying selection? Prior evidence suggests that selection at splice sites in long intergenic noncoding RNAs (lincRNAs) is important. We now report multiple lines of evidence indicating that the great majority of purifying selection operating on lincRNAs in humans is splice related. Splice-related parameters explain much of the between-gene variation in evolutionary rate in humans. Expression rate is not a relevant predictor, although expression breadth is weakly so. In contrast to protein-coding RNAs, we observe no relationship between evolutionary rate and lincRNA stability. As in protein-coding genes, ESEs are especially abundant near splice junctions and evolve slower than non-ESE sequence equidistant from boundaries. Nearly all constraint in lincRNAs is at exon ends (N.B. the same is not witnessed in Drosophila). Although we cannot definitely answer the question as to why splice-related selection is so important, we find no evidence that splicing might enable the nonsense-mediated decay pathway to capture transcripts incorrectly processed by ribosomes. We find evidence consistent with the notion that splicing modifies the underlying chromatin through recruitment of splice-coupled chromatin modifiers, such as CHD1, which in turn might modulate neighbor gene activity. We conclude that most selection on human lincRNAs is splice mediated and suggest that the possibility of splice-chromatin coupling is worthy of further scrutiny.
Affiliation(s)
- Andreas Schüler
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Avazeh T Ghanbarian
  - Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
6
Haag KL, James TY, Pombert JF, Larsson R, Schaer TMM, Refardt D, Ebert D. Evolution of a morphological novelty occurred before genome compaction in a lineage of extreme parasites. Proc Natl Acad Sci U S A 2014; 111:15480-5. [PMID: 25313038 PMCID: PMC4217409 DOI: 10.1073/pnas.1410442111] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.7] [Indexed: 11/18/2022]
Abstract
Intracellular parasitism results in extreme adaptations, whose evolutionary history is difficult to understand, because the parasites and their known free-living relatives are so divergent from one another. Microsporidia are intracellular parasites of humans and other animals, which evolved highly specialized morphological structures, but also extreme physiologic and genomic simplification. They are suggested to be an early-diverging branch on the fungal tree, but comparisons to other species are difficult because their rates of molecular evolution are exceptionally high. Mitochondria in microsporidia have degenerated into organelles called mitosomes, which have lost a genome and the ability to produce ATP. Here we describe a gut parasite of the crustacean Daphnia that despite having remarkable morphological similarity to the microsporidia, has retained genomic features of its fungal ancestors. This parasite, which we name Mitosporidium daphniae gen. et sp. nov., possesses a mitochondrial genome including genes for oxidative phosphorylation, yet a spore stage with a highly specialized infection apparatus--the polar tube--uniquely known only from microsporidia. Phylogenomics places M. daphniae at the root of the microsporidia. A comparative genomic analysis suggests that the reduction in energy metabolism, a prominent feature of microsporidian evolution, was preceded by a reduction in the machinery controlling cell cycle, DNA recombination, repair, and gene expression. These data show that the morphological features unique to M. daphniae and other microsporidia were already present before the lineage evolved the extreme host metabolic dependence and loss of mitochondrial respiration for which microsporidia are well known.
Affiliation(s)
- Karen L Haag
  - Zoological Institute, Basel University, 4051 Basel, Switzerland
  - Department of Genetics, Federal University of Rio Grande do Sul, Porto Alegre, 91501-970 RS, Brazil
- Timothy Y James
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109
- Jean-François Pombert
  - Department of Biological and Chemical Sciences, Illinois Institute of Technology, Chicago, IL 60616
- Ronny Larsson
  - Department of Biology, University of Lund, SE-223 62 Lund, Sweden
- Dominik Refardt
  - Zoological Institute, Basel University, 4051 Basel, Switzerland
  - Institute of Natural Resource Sciences, Zurich University of Applied Sciences, 8820 Wädenswil, Switzerland
- Dieter Ebert
  - Zoological Institute, Basel University, 4051 Basel, Switzerland
7
Control of catalytic efficiency by a coevolving network of catalytic and noncatalytic residues. Proc Natl Acad Sci U S A 2014; 111:E2376-83. [PMID: 24912189 DOI: 10.1073/pnas.1322352111] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 12/14/2022]
Abstract
The active sites of enzymes consist of residues necessary for catalysis and structurally important noncatalytic residues that together maintain the architecture and function of the active site. Examples of evolutionary interactions between catalytic and noncatalytic residues have been difficult to define and experimentally validate due to a general intolerance of these residues to substitution. Here, using computational methods to predict coevolving residues, we identify a network of positions consisting of two catalytic metal-binding residues and two adjacent noncatalytic residues in LAGLIDADG homing endonucleases (LHEs). Distinct combinations of the four residues in the network map to distinct LHE subfamilies, with a striking distribution of the metal-binding Asp (D) and Glu (E) residues. Mutation of these four positions in three LHEs--I-LtrI, I-OnuI, and I-HjeMI--indicate that the combinations of residues tolerated are specific to each enzyme. Kinetic analyses under single-turnover conditions revealed that I-LtrI activity could be modulated over an ∼100-fold range by mutation of residues in the coevolving network. I-LtrI catalytic site variants with low activity could be rescued by compensatory mutations at adjacent noncatalytic sites that restore an optimal coevolving network and vice versa. Our results demonstrate that LHE activity is constrained by an evolutionary barrier of residues with strong context-dependent effects. Creation of optimal coevolving active-site networks is therefore an important consideration in engineering of LHEs and other enzymes.
8
Nielsen MM, Tehler D, Vang S, Sudzina F, Hedegaard J, Nordentoft I, Ørntoft TF, Lund AH, Pedersen JS. Identification of expressed and conserved human noncoding RNAs. RNA (New York, N.Y.) 2014; 20:236-251. [PMID: 24344320 PMCID: PMC3895275 DOI: 10.1261/rna.038927.113] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 03/02/2013] [Accepted: 11/07/2013] [Indexed: 06/03/2023]
Abstract
The past decade has shown mammalian genomes to be pervasively transcribed and identified thousands of noncoding (nc) transcripts. It is currently unclear to what extent these transcripts are of functional importance, as experimental functional evidence exists for only a small fraction. Here, we characterize the expression and evolutionary conservation properties of 12,115 known and novel nc transcripts, including structural RNAs, long nc RNAs (lncRNAs), antisense RNAs, EvoFold predictions, ultraconserved elements, and expressed nc regions. Expression levels are evaluated across 12 human tissues using a custom-designed microarray, supplemented with RNAseq. Conservation levels are evaluated at both the base level and at the syntenic level. We combine these measures with epigenetic mark annotations to identify subsets of novel nc transcripts that show characteristics similar to known functional ncRNAs. Few novel nc transcripts show both high expression and conservation levels. However, overall, we observe a positive correlation between expression and both conservation and epigenetic annotations, suggesting that a subset of the expressed transcripts are under purifying selection and likely functional. The identified subsets of expressed and conserved novel nc transcripts may form the basis for further functional characterization.
Affiliation(s)
- Morten Muhlig Nielsen
  - Department of Molecular Medicine (MOMA), Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
- Disa Tehler
  - Biotech Research and Innovation Centre, University of Copenhagen, DK-2200 Copenhagen, Denmark
- Søren Vang
  - Department of Molecular Medicine (MOMA), Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
- Frantisek Sudzina
  - Department of Molecular Medicine (MOMA), Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
- Jakob Hedegaard
  - Department of Molecular Medicine (MOMA), Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
- Iver Nordentoft
  - Department of Molecular Medicine (MOMA), Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
- Torben Falck Ørntoft
  - Department of Molecular Medicine (MOMA), Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
- Anders H. Lund
  - Biotech Research and Innovation Centre, University of Copenhagen, DK-2200 Copenhagen, Denmark
- Jakob Skou Pedersen
  - Department of Molecular Medicine (MOMA), Aarhus University Hospital, Skejby, DK-8200 Aarhus N, Denmark
9
Abstract
Levels of selective constraint vary among proteins. Although strong constraint on a protein is often attributed to its functional importance, evolutionary rate may also be limited if a protein is fragile, such that a large proportion of amino acid replacements reduce its fitness. To determine the relative contributions of essentiality and fragility to selective constraint, we compared relationships of selection against nonsense mutations (snon) and selection against missense mutations (smis) to protein sequence conservation (Ka). As expected, snon is greater than smis; however, the correlation between smis and Ka is nearly three times stronger than the correlation between snon and Ka. Moreover, examination of relationships to gene expression level, tissue specificity, and number of protein-protein interactions shows that smis is more strongly correlated than snon to all three measures of biological function. Thus, our analysis reveals that slowly evolving proteins are under strong selective constraint primarily because they are fragile, and that this association likely exists because allowing a protein to function improperly, rather than removing it from a biological network, can negatively affect the functions of other molecules it interacts with and their downstream products.
Affiliation(s)
- Raquel Assis
  - Department of Biology, Pennsylvania State University
10
Wu GCT, Chen FC. Determinants of exon-level evolutionary rates in Arabidopsis species. Evol Bioinform Online 2012; 8:389-415. [PMID: 22844194 PMCID: PMC3399485 DOI: 10.4137/ebo.s9743] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]
Abstract
What causes the variations in evolutionary rates is fundamental to molecular evolution. However, in plants, the causes of within-gene evolutionary rate variations remain underexplored. Here we use the principal component regression to examine the contributions of eleven exon features to the within-gene variations in nonsynonymous substitution rate (d(N)), synonymous substitution rate (d(S)), and the d(N)/d(S) ratio in Arabidopsis species. We demonstrate that exon features related to protein structural-functional constraints and mRNA splicing account for the largest proportions of within-gene variations in d(N)/d(S) and d(N). Meanwhile, for d(S), a combination of expression level, exon length, and structural-functional features explains the largest proportion of within-gene variances. Our results suggest that the determinants of within-gene variations differ from those of between-gene variations in evolutionary rates. Furthermore, the relative importance of different exon features also differs between plants and animals. Our study thus may shed a new light on the evolution of plant genes.
Affiliation(s)
- Gideon C-T Wu
  - Graduate Institute of Life Sciences, National Defense Medical Center, 114 Taiwan
11
Ortutay C, Vihinen M. Conserved and quickly evolving immunome genes have different evolutionary paths. Hum Mutat 2012; 33:1456-63. [PMID: 22623381 DOI: 10.1002/humu.22125] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/25/2012] [Accepted: 05/15/2012] [Indexed: 12/11/2022]
Abstract
Genetic, transcript, and protein level variations have important functional and evolutionary consequences. We performed systematic data collection and analysis of copy-number variations, single-nucleotide polymorphisms, disease-causing variations, messenger RNA splicing variants, and protein posttranslational modifications for the genes and proteins essential for human immune system. Information about polymorphic and evolutionarily fixed genetic variations was used to group immunome genes to the most conserved and the most quickly changing ones under directed selection during the recent immunome evolution. Gene Ontology terms related to adaptive immunity are associated with gene groups subject to recent directing selection. In addition, several other characteristics of the immunome genes and proteins in these two categories have statistically significant differences. The presented findings question the usability of directed mouse genes as models for human diseases and conditions and shed light on the fine tuning of human immunity and its diverse functions.
Affiliation(s)
- Csaba Ortutay
  - Institute of Biomedical Technology, University of Tampere, Tampere, Finland
12
Juritz E, Palopoli N, Fornasari MS, Fernandez-Alberti S, Parisi G. Protein Conformational Diversity Modulates Sequence Divergence. Mol Biol Evol 2012; 30:79-87. [DOI: 10.1093/molbev/mss080] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 02/06/2023]
13
Evolutionary systems biology: historical and philosophical perspectives on an emerging synthesis. Advances in Experimental Medicine and Biology 2012; 751:1-28. [PMID: 22821451 DOI: 10.1007/978-1-4614-3567-9_1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Systems biology (SB) is at least a decade old now and maturing rapidly. A more recent field, evolutionary systems biology (ESB), is in the process of further developing system-level approaches through the expansion of their explanatory and potentially predictive scope. This chapter will outline the varieties of ESB existing today by tracing the diverse roots and fusions that make up this integrative project. My approach is philosophical and historical. As well as examining the recent origins of ESB, I will reflect on its central features and the different clusters of research it comprises. In its broadest interpretation, ESB consists of five overlapping approaches: comparative and correlational ESB; network architecture ESB; network property ESB; population genetics ESB; and finally, standard evolutionary questions answered with SB methods. After outlining each approach with examples, I will examine some strong general claims about ESB, particularly that it can be viewed as the next step toward a fuller modern synthesis of evolutionary biology (EB), and that it is also the way forward for evolutionary and systems medicine. I will conclude with a discussion of whether the emerging field of ESB has the capacity to combine an even broader scope of research aims and efforts than it presently does.
14
Lobkovsky AE, Wolf YI, Koonin EV. Predictability of evolutionary trajectories in fitness landscapes. PLoS Comput Biol 2011; 7:e1002302. [PMID: 22194675 PMCID: PMC3240586 DOI: 10.1371/journal.pcbi.1002302] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Received: 06/03/2011] [Accepted: 10/29/2011] [Indexed: 11/19/2022]
Abstract
Experimental studies on enzyme evolution show that only a small fraction of all possible mutation trajectories are accessible to evolution. However, these experiments deal with individual enzymes and explore a tiny part of the fitness landscape. We report an exhaustive analysis of fitness landscapes constructed with an off-lattice model of protein folding where fitness is equated with robustness to misfolding. This model mimics the essential features of the interactions between amino acids, is consistent with the key paradigms of protein folding and reproduces the universal distribution of evolutionary rates among orthologous proteins. We introduce mean path divergence as a quantitative measure of the degree to which the starting and ending points determine the path of evolution in fitness landscapes. Global measures of landscape roughness are good predictors of path divergence in all studied landscapes: the mean path divergence is greater in smooth landscapes than in rough ones. The model-derived and experimental landscapes are significantly smoother than random landscapes and resemble additive landscapes perturbed with moderate amounts of noise; thus, these landscapes are substantially robust to mutation. The model landscapes show a deficit of suboptimal peaks even compared with noisy additive landscapes with similar overall roughness. We suggest that smoothness and the substantial deficit of peaks in the fitness landscapes of protein evolution are fundamental consequences of the physics of protein folding.
Is evolution deterministic, hence predictable, or stochastic, that is unpredictable? What would happen if one could “replay the tape of evolution”: will the outcomes of evolution be completely different or is evolution so constrained that history will be repeated? Arguably, these questions are among the most intriguing and most difficult in evolutionary biology.
In other words, the predictability of evolution depends on the fraction of the trajectories on fitness landscapes that are accessible for evolutionary exploration. Because direct experimental investigation of fitness landscapes is technically challenging, the available studies only explore a minuscule portion of the landscape for individual enzymes. We therefore sought to investigate the topography of fitness landscapes within the framework of a previously developed model of protein folding and evolution where fitness is equated with robustness to misfolding. We show that model-derived and experimental landscapes are significantly smoother than random landscapes and resemble moderately perturbed additive landscapes; thus, these landscapes are substantially robust to mutation. The model landscapes show a deficit of suboptimal peaks even compared with noisy additive landscapes with similar overall roughness. Thus, the smoothness and substantial deficit of peaks in fitness landscapes of protein evolution could be fundamental consequences of the physics of protein folding.
Affiliation(s)
- Alexander E. Lobkovsky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Yuri I. Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
|
15
|
Luisi P, Alvarez-Ponce D, Dall'Olio GM, Sikora M, Bertranpetit J, Laayouni H. Network-Level and Population Genetics Analysis of the Insulin/TOR Signal Transduction Pathway Across Human Populations. Mol Biol Evol 2011; 29:1379-92. [DOI: 10.1093/molbev/msr298] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 01/24/2023] Open
|
16
|
Managadze D, Rogozin IB, Chernikova D, Shabalina SA, Koonin EV. Negative correlation between expression level and evolutionary rate of long intergenic noncoding RNAs. Genome Biol Evol 2011; 3:1390-404. [PMID: 22071789 PMCID: PMC3242500 DOI: 10.1093/gbe/evr116] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Indexed: 12/11/2022] Open
Abstract
Mammalian genomes contain numerous genes for long noncoding RNAs (lncRNAs). The functions of the lncRNAs remain largely unknown but their evolution appears to be constrained by purifying selection, albeit relatively weakly. To gain insights into the mode of evolution and the functional range of the lncRNAs, they can be compared with much better characterized protein-coding genes. The evolutionary rate of the protein-coding genes shows a universal negative correlation with expression: highly expressed genes are on average more conserved during evolution than the genes with lower expression levels. This correlation was conceptualized in the misfolding-driven protein evolution hypothesis according to which misfolding is the principal cost incurred by protein expression. We sought to determine whether long intergenic ncRNAs (lincRNAs) follow the same evolutionary trend and indeed detected a moderate but statistically significant negative correlation between the evolutionary rate and expression level of human and mouse lincRNA genes. The magnitude of the correlation for the lincRNAs is similar to that for equal-sized sets of protein-coding genes with similar levels of sequence conservation. Additionally, the expression level of the lincRNAs is significantly and positively correlated with the predicted extent of lincRNA molecule folding (base-pairing); however, the contributions of evolutionary rates and folding to the expression level are independent. Thus, the anticorrelation between evolutionary rate and expression level appears to be a general feature of gene evolution that might be caused by similar deleterious effects of protein and RNA misfolding and/or other factors, for example, the number of interacting partners of the gene product.
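A negative correlation between expression level and evolutionary rate of the kind reported above is commonly quantified with a rank correlation. The following is a minimal sketch, assuming Spearman's coefficient as the measure (the abstract does not state which statistic was used); the expression and rate values are invented toy numbers, not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: expression level and evolutionary rate per gene.
expression = [120.0, 45.0, 8.0, 300.0, 15.0, 60.0]
rate = [0.10, 0.30, 0.55, 0.05, 0.35, 0.40]
rho = spearman(expression, rate)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based statistic is a natural choice here because expression levels span orders of magnitude and the relationship need not be linear.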
Affiliation(s)
- David Managadze
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
|
17
|
Peralta H, Guerrero G, Aguilar A, Mora J. Sequence variability of Rhizobiales orthologs and relationship with physico-chemical characteristics of proteins. Biol Direct 2011; 6:48. [PMID: 21970442 PMCID: PMC3198989 DOI: 10.1186/1745-6150-6-48] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/11/2011] [Accepted: 10/04/2011] [Indexed: 12/03/2022] Open
Abstract
Background: Chromosomal orthologs can reveal the shared ancestral gene set and their evolutionary trends. Additionally, physico-chemical properties of encoded proteins could provide information about functional adaptation and ecological niche requirements.
Results: We analyzed 7080 genes (five groups of 1416 orthologs each) from Rhizobiales species (S. meliloti, R. etli, and M. loti, plant symbionts; A. tumefaciens, a plant pathogen; and B. melitensis, an animal pathogen). We evaluated their phylogenetic relationships and observed three main topologies: the first showed a closer association of R. etli with A. tumefaciens; the second placed R. etli closer to S. meliloti; and the third had A. tumefaciens and S. meliloti as the closest pair. This was not unusual, given the close relatedness of these three species. We calculated the synonymous (dS) and nonsynonymous (dN) substitution rates of these orthologs, and found that informational and metabolic functions showed relatively low dN rates; in contrast, genes from hypothetical functions and cellular processes showed high dN rates. An alternative measure of sequence variability, percentage of changes by species, was used to evaluate the most specific proportion of amino acid residues from alignments. When dN was compared with that measure, a high correlation was obtained, revealing that much of the evolutionary information was captured by the percentage of changes by species at the amino acid level. By analyzing the sequence variability of orthologs with a set of five properties (polarity, electrostatic charge, formation of secondary structures, molecular volume, and amino acid composition), we found that physico-chemical characteristics of proteins correlated with specific functional roles, and association of species did not follow their typical phylogeny, probably reflecting more adaptation to their life styles and niche preferences. In addition, orthologs with low dN rates had residues with more positive values of polarity, volume and electrostatic charge.
Conclusions: These findings revealed that even when orthologs perform the same function in each genomic background, their sequences reveal important evolutionary tendencies and differences related to adaptation.
This article was reviewed by: Dr. Purificación López-García, Prof. Jeffrey Townsend (nominated by Dr. J. Peter Gogarten), and Ms. Olga Kamneva.
Affiliation(s)
- Humberto Peralta
- Programa de Genómica Funcional de Procariotes, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Apdo, postal 565-A, Cuernavaca, Morelos, México
|
18
|
Koonin EV, Wolf YI. Constraints and plasticity in genome and molecular-phenome evolution. Nat Rev Genet 2011; 11:487-98. [PMID: 20548290 DOI: 10.1038/nrg2810] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.6] [Indexed: 02/01/2023]
Abstract
Multiple constraints variously affect different parts of the genomes of diverse life forms. The selective pressures that shape the evolution of viral, archaeal, bacterial and eukaryotic genomes differ markedly, even among relatively closely related animal and bacterial lineages; by contrast, constraints affecting protein evolution seem to be more universal. The constraints that shape the evolution of genomes and phenomes are complemented by the plasticity and robustness of genome architecture, expression and regulation. Taken together, these findings are starting to reveal complex networks of evolutionary processes that must be integrated to attain a new synthesis of evolutionary biology.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
19
|
Liberles DA, Tisdell MDM, Grahnen JA. Binding constraints on the evolution of enzymes and signalling proteins: the important role of negative pleiotropy. Proc Biol Sci 2011; 278:1930-5. [PMID: 21490020 DOI: 10.1098/rspb.2010.2637] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 12/30/2022] Open
Abstract
A number of biophysical and population-genetic processes influence amino acid substitution rates. It is commonly recognized that proteins must fold into a native structure with preference over an unfolded state, and must bind to functional interacting partners favourably to function properly. What is less clear is how important folding and binding specificity are to amino acid substitution rates. A hypothesis of the importance of binding specificity in constraining sequence and functional evolution is presented. Examples include an evolutionary simulation of a population of SH2 sequences evolved by threading through the structure and binding to a native ligand, as well as SH3 domain signalling in yeast and selection for specificity in enzymatic reactions. An example in vampire bats where negative pleiotropy appears to have been adaptive is presented. Finally, considerations of compartmentalization and macromolecular crowding on negative pleiotropy are discussed.
Affiliation(s)
- David A Liberles
- Department of Molecular Biology, University of Wyoming, Laramie, WY 82071, USA.
|
20
|
Theis FJ, Latif N, Wong P, Frishman D. Complex principal component and correlation structure of 16 yeast genomic variables. Mol Biol Evol 2011; 28:2501-12. [PMID: 21444651 DOI: 10.1093/molbev/msr077] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022] Open
Abstract
A quickly growing number of characteristics reflecting various aspects of gene function and evolution can be either measured experimentally or computed from DNA and protein sequences. The study of pairwise correlations between such quantitative genomic variables as well as collective analysis of their interrelations by multidimensional methods have delivered crucial insights into the processes of molecular evolution. Here, we present a principal component analysis (PCA) of 16 genomic variables from Saccharomyces cerevisiae, the largest data set analyzed so far. Because many missing values and potential outliers hinder the direct calculation of principal components, we introduce the application of Bayesian PCA. We confirm some of the previously established correlations, such as evolutionary rate versus protein expression, and reveal new correlations such as those between translational efficiency, phosphorylation density, and protein age. Although the first principal component primarily contrasts genomic change and protein expression, the second component separates variables related to gene existence and expressed protein functions. Enrichment analysis on genes affecting variable correlations unveils classes of influential genes. For example, although ribosomal and nuclear transport genes make important contributions to the correlation between protein isoelectric point and molecular weight, protein synthesis and amino acid metabolism genes help cause the lack of significant correlation between propensity for gene loss and protein age. We present the novel Quagmire database (Quantitative Genomics Resource) which allows exploring relationships between more genomic variables in three model organisms: Escherichia coli, S. cerevisiae, and Homo sapiens (http://webclu.bio.wzw.tum.de:18080/quagmire).
Affiliation(s)
- Fabian J Theis
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Bioinformatics and Systems Biology, Ingolstädter Landstraße 1, Neuherberg, Germany
|
21
|
Yang JR, Zhuang SM, Zhang J. Impact of translational error-induced and error-free misfolding on the rate of protein evolution. Mol Syst Biol 2011; 6:421. [PMID: 20959819 PMCID: PMC2990641 DOI: 10.1038/msb.2010.78] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Received: 05/11/2010] [Accepted: 08/31/2010] [Indexed: 11/26/2022] Open
Abstract
Theoretical calculations suggest that, in addition to translational error-induced protein misfolding, a non-negligible fraction of misfolded proteins are error free. We propose that the anticorrelation between the expression level of a protein and its rate of sequence evolution be explained by an overarching protein-misfolding-avoidance hypothesis that includes selection against both error-induced and error-free protein misfolding, and verify this model by a molecular-level evolutionary simulation. We provide strong empirical evidence for the protein-misfolding-avoidance hypothesis, including a positive correlation between protein expression level and stability, enrichment of misfolding-minimizing codons and amino acids in highly expressed genes, and stronger evolutionary conservation of residues in which nonsynonymous changes are more likely to increase protein misfolding.
The rate of protein sequence evolution has long been of central interest to molecular evolutionists. Different proteins of the same species evolve at vastly different rates, which is commonly explained by a variation in functional constraint among different proteins (Kimura and Ohta, 1974). However, it is unclear how to quantify the functional constraint of a protein from the knowledge of its function. In the past decade, various types of genomic data from model organisms have been examined to look for the determinants of the rate of protein sequence evolution. The most unexpected discovery was a very strong anticorrelation between the expression level and evolutionary rate of a protein (E–R anticorrelation) (Pal et al, 2001). The prevailing explanation of the E–R anticorrelation is the translational robustness hypothesis (Drummond et al, 2005). This hypothesis posits that mistranslation induces protein misfolding, which is toxic to cells (Figure 1). Consequently, highly expressed proteins are under stronger pressures to be translationally robust and thus are more constrained in sequence evolution. However, the impact of the other source of misfolded proteins, translational error-free proteins (Figure 1), has not been evaluated. By theoretical calculation, computer simulation, and empirical data analysis, we examined the role of selection against both error-induced and error-free protein misfolding in creating the E–R correlation. Our theoretical calculations suggested that a non-negligible fraction of misfolded proteins are error free. We estimated that when a protein is not very stable, on average ∼20% of misfolded molecules are error free. However, when a protein is very stable, this fraction reduces to ∼5%, which is probably a result of natural selection against protein misfolding. 
We conducted a molecular-level evolutionary simulation (Figure 2A) using three different schemes: error-induced misfolding only, error-free misfolding only, and both types of misfolding. As expected, results from the first simulation are similar to those from a previous study that considers only error-induced misfolding (Drummond and Wilke, 2008). Interestingly, the second and third simulations can also generate the same patterns, including a positive correlation between the protein expression level and the unfolding energy (ΔG) of the error-free protein (Figure 2B), a negative correlation between the expression level and the fraction of protein molecules that misfold after being mistranslated (Figure 2C), a negative correlation between ΔG and the evolutionary rate (Figure 2D), and a negative correlation between the expression level and the evolutionary rate (i.e., the E–R anticorrelation) (Figure 2E). Furthermore, we found that selection against protein misfolding is more effective in reducing error-free misfolding than error-induced misfolding. Based on these results, we propose that an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the prevailing translational robustness hypothesis, which considers only error-induced misfolding. We tested three key predictions of the protein-misfolding-avoidance hypothesis using yeast data. First, we showed that, consistent with our prediction, a positive correlation exists between the protein expression level and stability, which is measured by the unfolding energy or melting temperature. In addition, protein expression level is negatively correlated with protein aggregation propensity. Second, we found that codons minimizing protein misfolding are used more frequently in highly expressed proteins than in lowly expressed ones.
Third, we showed that, within the same protein, amino acid residues in which random nonsynonymous mutations are more likely to increase protein misfolding are evolutionarily more conserved. Together, these results provide unambiguous evidence that avoidance of both error-induced and error-free protein misfolding is a major source of the E–R anticorrelation and that protein stability and mistranslation have important roles in protein evolution.
What determines the rate of protein evolution is a fundamental question in biology. Recent genomic studies revealed a surprisingly strong anticorrelation between the expression level of a protein and its rate of sequence evolution. This observation is currently explained by the translational robustness hypothesis in which the toxicity of translational error-induced protein misfolding selects for higher translational robustness of more abundant proteins, which constrains sequence evolution. However, the impact of error-free protein misfolding has not been evaluated. We estimate that a non-negligible fraction of misfolded proteins are error free and demonstrate by a molecular-level evolutionary simulation that selection against protein misfolding results in a greater reduction of error-free misfolding than error-induced misfolding. Thus, an overarching protein-misfolding-avoidance hypothesis that includes both sources of misfolding is superior to the translational robustness hypothesis. We show that misfolding-minimizing amino acids are preferentially used in highly abundant yeast proteins and that these residues are evolutionarily more conserved than other residues of the same proteins. These findings provide unambiguous support to the role of protein-misfolding-avoidance in determining the rate of protein sequence evolution.
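The link between unfolding energy (ΔG), expression level and the burden of non-native molecules can be made concrete with a standard two-state equilibrium. This is a hedged sketch, not the authors' calculation: it assumes a simple folded/unfolded equilibrium, treats the unfolded fraction as a proxy for error-free misfolding, and uses illustrative function names and toy copy numbers.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 0.0019872


def fraction_unfolded(delta_g, temp_k=298.15):
    """Equilibrium unfolded fraction for a two-state protein.

    delta_g is the unfolding free energy in kcal/mol (positive = stable fold).
    """
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temp_k)))


def unfolded_copies(expression, delta_g, temp_k=298.15):
    """Expected number of unfolded molecules at a given copy number."""
    return expression * fraction_unfolded(delta_g, temp_k)


# Hypothetical comparison: a marginally stable vs. a more stable protein,
# each at low and high copy number per cell.
for delta_g in (2.0, 5.0):
    for copies in (1_000, 100_000):
        print(delta_g, copies, unfolded_copies(copies, delta_g))
```

Under this assumption the number of unfolded molecules grows linearly with copy number but falls exponentially with ΔG, which is one way to see why abundant proteins would be under stronger selection for stability.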
Affiliation(s)
- Jian-Rong Yang
- Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China
|
22
|
Abstract
There is great variation in the rates of sequence evolution among proteins encoded by the same genome. The strongest correlate of evolutionary rate is expression level: highly expressed proteins tend to evolve slowly. This observation has led to the proposal that a major determinant of protein evolutionary rate involves the toxic effects of protein that misfolds due to transcriptional and translational errors (the mistranslation-induced misfolding [MIM] hypothesis). Here, I present a model that explains the correlation of evolutionary rate and expression level by selection for function. The basis of this model is that selection keeps expression levels near optima that reflect a trade-off between beneficial effects of the protein's function and some nonspecific cost of expression (e.g., the biochemical cost of synthesizing protein). Simulations confirm the predictions of the model. Like the MIM hypothesis, this model predicts several other relationships that are observed empirically. Although the model is based on selection for protein function, it is consistent with findings that a protein's rate of evolution is at most weakly correlated with its importance for fitness as measured by gene knockout experiments.
Affiliation(s)
- Joshua L Cherry
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
|
23
|
Abstract
Molecular chaperones are highly conserved and ubiquitous proteins that help other proteins in the cell to fold. Pioneering work by Rutherford and Lindquist suggested that the chaperone Hsp90 could buffer (i.e., suppress) phenotypic variation in its client proteins and that alternate periods of buffering and expression of these variants might be important in adaptive evolution. More recently, Tokuriki and Tawfik presented an explicit mechanism for chaperone-dependent evolution, in which the Escherichia coli chaperonin GroEL facilitated the folding of clients that had accumulated structurally destabilizing but neofunctionalizing mutations in the protein core. But how important an evolutionary force is chaperonin-mediated buffering in nature? Here, we address this question by modeling the per-residue evolutionary rate of the crystallized E. coli proteome, evaluating the relative contributions of chaperonin buffering, functional importance, and structural features such as residue contact density. Previous findings suggest an interaction between codon bias and GroEL in limiting the effects of misfolding errors. Our results suggest that the buffering of deleterious mutations by GroEL increases the evolutionary rate of client proteins. We then examine the evolutionary fate of GroEL clients in the Mycoplasmas, a group of bacteria containing the only known organisms that lack chaperonins. We show that GroEL was lost once in the common ancestor of a monophyletic subgroup of Mycoplasmas, and we evaluate the effect of this loss on the subsequent evolution of client proteins, providing evidence that client homologs in 11 Mycoplasma species have lost their obligate dependency on GroEL for folding. Our analyses indicate that individual molecules such as chaperonins can have significant effects on proteome evolution through their modulation of protein folding.
Affiliation(s)
- Tom A Williams
- Department of Genetics, University of Dublin, Trinity College, Dublin, Ireland
|