1
Hamamsy T, Morton JT, Blackwell R, Berenberg D, Carriero N, Gligorijevic V, Strauss CEM, Leman JK, Cho K, Bonneau R. Protein remote homology detection and structural alignment using deep learning. Nat Biotechnol 2023:10.1038/s41587-023-01917-2. [PMID: 37679542 DOI: 10.1038/s41587-023-01917-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 09/07/2022] [Accepted: 07/26/2023] [Indexed: 09/09/2023]
Abstract
Exploiting sequence-structure-function relationships in biotechnology requires improved methods for aligning proteins that have low sequence similarity to previously annotated proteins. We develop two deep learning methods to address this gap, TM-Vec and DeepBLAST. TM-Vec allows searching for structure-structure similarities in large sequence databases. It is trained to accurately predict TM-scores as a metric of structural similarity directly from sequence pairs without the need for intermediate computation or solution of structures. Once structurally similar proteins have been identified, DeepBLAST can structurally align proteins using only sequence information by identifying structurally homologous regions between proteins. It outperforms traditional sequence alignment methods and performs similarly to structure-based alignment methods. We show the merits of TM-Vec and DeepBLAST on a variety of datasets, including better identification of remotely homologous proteins compared with state-of-the-art sequence alignment and structure prediction methods.
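For orientation, the TM-score that TM-Vec is trained to predict can be written out for one fixed superposition. The sketch below is not TM-Vec or DeepBLAST code. It uses the standard TM-score definition, with the length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8, and it omits the search over superpositions that structure aligners perform. The function names and the 0.5 Å floor on d0 for very short chains are illustrative assumptions.

```python
def d0(target_length: int) -> float:
    """Length-dependent distance scale of the TM-score, in angstroms."""
    # Assumption: floor the scale at 0.5 so that very short chains do not
    # produce a zero or negative value.
    cube_root = max(target_length - 15, 0) ** (1.0 / 3.0)
    return max(0.5, 1.24 * cube_root - 1.8)


def tm_score(distances, target_length: int) -> float:
    """TM-score for one superposition.

    distances: distance (angstroms) between each aligned residue pair.
    target_length: length of the protein used for normalization.
    """
    scale = d0(target_length)
    total = sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Hypothetical distances for a 100-residue target with 80 aligned pairs.
    example = [1.0] * 60 + [4.0] * 20
    print(round(tm_score(example, 100), 3))
```

Scores lie between 0 and 1, and a perfect superposition of every residue gives 1. Values above roughly 0.5 are commonly read as indicating the same fold.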
Affiliation(s)
- Tymor Hamamsy
  - Center for Data Science, New York University, New York, NY, USA
- James T Morton
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
  - Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
- Robert Blackwell
  - Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
- Daniel Berenberg
  - Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
  - Prescient Design, New York, NY, USA
- Nicholas Carriero
  - Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
- Julia Koehler Leman
  - Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Kyunghyun Cho
  - Center for Data Science, New York University, New York, NY, USA
  - Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
  - Prescient Design, New York, NY, USA
  - CIFAR, Toronto, Ontario, Canada
- Richard Bonneau
  - Center for Data Science, New York University, New York, NY, USA
  - Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
  - Prescient Design, New York, NY, USA
  - Department of Biology, New York University, New York, NY, USA
2
Revisiting Papillomavirus Taxonomy: A Proposal for Updating the Current Classification in Line with Evolutionary Evidence. Viruses 2022; 14:v14102308. [PMID: 36298863 PMCID: PMC9612317 DOI: 10.3390/v14102308] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/03/2022] [Revised: 10/11/2022] [Accepted: 10/13/2022] [Indexed: 11/06/2022] Open
Abstract
Papillomaviruses infect a wide array of animal hosts and are responsible for roughly 5% of all human cancers. Comparative genomics between different virus types belonging to specific taxonomic groupings (e.g., species, and genera) has the potential to illuminate physiological differences between viruses with different biological outcomes. Likewise, extrapolation of features between related viruses can be very powerful but requires a solid foundation supporting the evolutionary relationships between viruses. The current papillomavirus classification system is based on pairwise sequence identity. However, with the advent of metagenomics as facilitated by high-throughput sequencing and molecular tools of enriching circular DNA molecules using rolling circle amplification, there has been a dramatic increase in the described diversity of this viral family. Not surprisingly, this resulted in a dramatic increase in absolute number of viral types (i.e., sequences sharing <90% L1 gene pairwise identity). Many of these novel viruses are the sole member of a novel species within a novel genus (i.e., singletons), highlighting that we have only scratched the surface of papillomavirus diversity. I will discuss how this increase in observed sequence diversity complicates papillomavirus classification. I will propose a potential solution to these issues by explicitly basing the species and genera classification on the evolutionary history of these viruses based on the core viral proteins (E1, E2, and L1) of papillomaviruses. This strategy means that it is possible that a virus identified as the closest neighbor based on the E1, E2, L1 phylogenetic tree, is not the closest neighbor based on L1 nucleotide identity. In this case, I propose that a virus would be considered a novel type if it shares less than 90% identity with its closest neighbors in the E1, E2, L1 phylogenetic tree.
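The proposed type rule (less than 90% identity with the closest neighbors in the E1, E2, L1 tree) can be sketched as a simple check. This is an illustration only: the function names are hypothetical, the neighbor sequences are assumed to have been picked from the phylogenetic tree beforehand, and identity is computed over ungapped alignment columns, which is one of several possible conventions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def is_novel_type(query: str, tree_neighbors, threshold: float = 90.0) -> bool:
    """True if the query is below the threshold against every tree neighbor."""
    return all(percent_identity(query, n) < threshold for n in tree_neighbors)


if __name__ == "__main__":
    # Toy aligned fragments, not real papillomavirus sequences.
    query = "ACGTACGTAC"
    print(is_novel_type(query, ["ACGTACGTAA"]))  # 90% identical to its neighbor
    print(is_novel_type(query, ["ACGTACGTTT"]))  # 80% identical to its neighbor
```

The design point is that the neighbor set comes from the tree rather than from a database-wide identity search, so the nearest sequence by identity and the nearest by phylogeny can differ.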
3
Zhang H, Sigeman H, Hansson B. Assessment of phylogenetic approaches to study the timing of recombination cessation on sex chromosomes. J Evol Biol 2022; 35:1721-1733. [PMID: 35895083 PMCID: PMC10086819 DOI: 10.1111/jeb.14068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Revised: 06/18/2022] [Accepted: 06/26/2022] [Indexed: 12/01/2022]
Abstract
The evolution of sex chromosomes is hypothesized to be punctuated by consecutive recombination cessation events, forming "evolutionary strata" that ceased to recombine at different time points. The demarcation of evolutionary strata is often assessed by estimates of the timing of recombination cessation (tRC) along the sex chromosomes, commonly inferred from the level of synonymous divergence or with species phylogenies at gametologous (X-Y or Z-W) sequence data. However, drift and selection affect sequences unpredictably and introduce uncertainty when inferring tRC. Here, we assess two alternative phylogenetic approaches to estimate tRC: (i) the expected likelihood weight (ELW) approach that finds the most likely topology among a set of hypothetical topologies and (ii) the BEAST approach that estimates tRC with specified calibration priors on a reference species topology. By using Z and W gametologs of an old and a young evolutionary stratum on the neo-sex chromosome of Sylvioidea songbirds, we show that the ELW and BEAST approaches yield similar tRC estimates, and that both outperform two frequently applied approaches utilizing synonymous substitution rates (dS) and maximum likelihood (ML) trees, respectively. Moreover, we demonstrate that both ELW and BEAST provide more precise tRC estimates when sequences of multiple species are included in the analyses.
Affiliation(s)
- Hongkai Zhang
  - Department of Biology, Lund University, Lund, Sweden
- Hanna Sigeman
  - Department of Biology, Lund University, Lund, Sweden
- Bengt Hansson
  - Department of Biology, Lund University, Lund, Sweden
4
Yu Y, Li Y, Dong Y, Wang X, Li C, Jiang W. Natural selection on synonymous mutations in SARS-CoV-2 and the impact on estimating divergence time. Future Virol 2021. [PMCID: PMC8132620 DOI: 10.2217/fvl-2021-0078] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 01/07/2023]
Abstract
To adapt to the human host environment, synonymous mutations in SARS-CoV-2 are shaped by tRNA selection, energy cost and RNA structure.
Affiliation(s)
- Yuanyuan Yu
  - Department of Anesthesiology, Qingdao Haici Hospital, Qingdao, Shandong, China
- Yan Li
  - Department of Cardiology, Qingdao Center Hospital, Qingdao, Shandong, China
- Yu Dong
  - Department of Intervention, Qingdao Center Hospital, Qingdao, Shandong, China
- Xuekun Wang
  - Department of Cardiology, Qingdao Center Hospital, Qingdao, Shandong, China
- Chunxiao Li
  - Department of Cardiology, Qingdao Center Hospital, Qingdao, Shandong, China
- Wenqing Jiang
  - Department of Respiratory Diseases, Qingdao Haici Hospital, Qingdao, Shandong, China
5
Li Q, Li J, Yu CP, Chang S, Xie LL, Wang S. Synonymous mutations that regulate translation speed might play a non-negligible role in liver cancer development. BMC Cancer 2021; 21:388. [PMID: 33836673 PMCID: PMC8033552 DOI: 10.1186/s12885-021-08131-w] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 10/14/2020] [Accepted: 03/30/2021] [Indexed: 01/11/2023] Open
Abstract
Background: Synonymous mutations do not change protein sequences. They have therefore been regarded as neutral events and ignored in mutation-based cancer studies. However, synonymous mutations change codon optimality, resulting in altered translational velocity. Methods: We used the transcriptome and translatome of liver cancer and normal tissue from ten patients. We profiled the mutation spectrum and examined the effect of synonymous mutations on translational velocity. Results: Synonymous mutations that increase codon optimality significantly enhanced translational velocity and were enriched in oncogenes. Meanwhile, synonymous mutations that decrease codon optimality slowed translation and were enriched in tumor suppressor genes. These synonymous mutations significantly contributed to the translational changes in tumor samples compared to normal samples. Conclusions: Synonymous mutations might play a role in liver cancer development by altering codon optimality and translational velocity. Synonymous mutations should no longer be ignored in genome-wide studies.
Affiliation(s)
- Qun Li
  - Department of Interventional Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Jian Li
  - Department of Interventional Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Chun-Peng Yu
  - Department of Interventional Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Shuai Chang
  - Department of Interventional Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Ling-Ling Xie
  - Department of Interventional Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Song Wang
  - Department of Interventional Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
6
Tyagi K, Chakraborty R, Cameron SL, Sweet AD, Chandra K, Kumar V. Rearrangement and evolution of mitochondrial genomes in Thysanoptera (Insecta). Sci Rep 2020; 10:695. [PMID: 31959910 PMCID: PMC6971079 DOI: 10.1038/s41598-020-57705-4] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 09/15/2019] [Accepted: 12/21/2019] [Indexed: 11/11/2022] Open
Abstract
Prior to this study, complete mitochondrial genomes from Order Thysanoptera were restricted to a single family, the Thripidae, resulting in a biased view of their evolution. Here we present the sequences for the mitochondrial genomes of four additional thrips species, adding three extra families and an additional subfamily, thus greatly improving taxonomic coverage. Thrips mitochondrial genomes are marked by high rates of gene rearrangement, duplications of the control region and tRNA mutations. Derived features of mitochondrial tRNAs in thrips include gene duplications, anticodon mutations, loss of secondary structures and high gene translocation rates. Duplicated control regions are found in the Aeolothripidae and the 'core' Thripinae clade but do not appear to promote gene rearrangement as previously proposed. Phylogenetic analysis of thrips mitochondrial sequence data supports the monophyly of two suborders, a sister-group relationship between Stenurothripidae and Thripidae, and suggests a novel set of relationships between thripid genera. Ancestral state reconstructions indicate that genome rearrangements are common, with just eight gene blocks conserved between any thrips species and the ancestral insect mitochondrial genome. Conversely, 71 derived rearrangements are shared between at least two species, and 24 of these are unambiguous synapomorphies for clades identified by phylogenetic analysis. While the reconstructed sequence of genome rearrangements among the protein-coding and ribosomal RNA genes could be inferred across the phylogeny, direct inference of phylogeny from rearrangement data in MLGO resulted in a highly discordant set of relationships inconsistent with both sequence-based phylogenies and previous morphological analysis. Given the demonstrated rates of genomic evolution within thrips, extensive sampling is needed to fully understand these phenomena across the order.
Affiliation(s)
- Kaomud Tyagi
  - Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, Kolkata, 750053, India
- Rajasree Chakraborty
  - Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, Kolkata, 750053, India
- Stephen L Cameron
  - Department of Entomology, Purdue University, West Lafayette, IN, 47907, USA
- Andrew D Sweet
  - Department of Entomology, Purdue University, West Lafayette, IN, 47907, USA
- Kailash Chandra
  - Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, Kolkata, 750053, India
- Vikas Kumar
  - Centre for DNA Taxonomy, Molecular Systematics Division, Zoological Survey of India, Kolkata, 750053, India
7
Turissini DA, McGirr JA, Patel SS, David JR, Matute DR. The Rate of Evolution of Postmating-Prezygotic Reproductive Isolation in Drosophila. Mol Biol Evol 2018; 35:312-334. [PMID: 29048573 PMCID: PMC5850467 DOI: 10.1093/molbev/msx271] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Indexed: 12/30/2022] Open
Abstract
Reproductive isolation is an intrinsic aspect of species formation. For that reason, the identification of the precise isolating traits, and the rates at which they evolve, is crucial to understanding how species originate and persist. Previous work has measured the rates of evolution of prezygotic and postzygotic barriers to gene flow, yet no systematic analysis has studied the rates of evolution of postmating-prezygotic (PMPZ) barriers. We measured the magnitude of two barriers to gene flow that act after mating occurs but before fertilization. We also measured the magnitude of a premating barrier (female mating rate in nonchoice experiments) and two postzygotic barriers (hybrid inviability and hybrid sterility) for all pairwise crosses of all nine known extant species within the melanogaster subgroup. Our results indicate that PMPZ isolation evolves faster than hybrid inviability but slower than premating isolation. Next, we partition postzygotic isolation into different components and find that, as expected, hybrid sterility evolves faster than hybrid inviability. These results lend support for the hypothesis that, in Drosophila, reproductive isolation mechanisms (RIMs) that act early in reproduction (or in development) tend to evolve faster than those that act later in the reproductive cycle. Finally, we tested whether there was evidence for reinforcing selection at any RIM. We found no evidence for generalized evolution of reproductive isolation via reinforcement which indicates that there is no pervasive evidence of this evolutionary process. Our results indicate that PMPZ RIMs might have important evolutionary consequences in initiating speciation and in the persistence of new species.
Affiliation(s)
- David A Turissini
  - Department of Biology, University of North Carolina, Chapel Hill, NC
- Joseph A McGirr
  - Department of Biology, University of North Carolina, Chapel Hill, NC
- Sonali S Patel
  - Department of Biology, University of North Carolina, Chapel Hill, NC
- Jean R David
  - Laboratoire Evolution, Génomes, Comportement, Ecologie (EGCE) CNRS, IRD, Univ. Paris-sud, Université Paris-Saclay, 91198 Gif sur Yvette, France
  - Institut de Systématique, Evolution, Biodiversité, UMR 7205, CNRS, MNHN, UPMC, EPHE, Muséum National d’Histoire Naturelle, Sorbonne Universités, rue Buffon, 75005, Paris, France
- Daniel R Matute
  - Department of Biology, University of North Carolina, Chapel Hill, NC
8
Tong KJ, Duchêne S, Lo N, Ho SYW. The impacts of drift and selection on genomic evolution in insects. PeerJ 2017; 5:e3241. [PMID: 28462044 PMCID: PMC5410144 DOI: 10.7717/peerj.3241] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/05/2016] [Accepted: 03/28/2017] [Indexed: 11/20/2022] Open
Abstract
Genomes evolve through a combination of mutation, drift, and selection, all of which act heterogeneously across genes and lineages. This leads to differences in branch-length patterns among gene trees. Genes that yield trees with the same branch-length patterns can be grouped together into clusters. Here, we propose a novel phylogenetic approach to explain the factors that influence the number and distribution of these gene-tree clusters. We apply our method to a genomic dataset from insects, an ancient and diverse group of organisms. We find some evidence that when drift is the dominant evolutionary process, each cluster tends to contain a large number of fast-evolving genes. In contrast, strong negative selection leads to many distinct clusters, each of which contains only a few slow-evolving genes. Our work, although preliminary in nature, illustrates the use of phylogenetic methods to shed light on the factors driving rate variation in genomic evolution.
Affiliation(s)
- K Jun Tong
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- Sebastián Duchêne
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
  - Centre for Systems Genomics, University of Melbourne, Melbourne, Victoria, Australia
- Nathan Lo
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
- Simon Y W Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
9
Abstract
The horseshoe effect is often considered an artifact of dimensionality reduction. We show that this is not true in the case of microbiome data and that, in fact, horseshoes can help analysts discover microbial niches across environments. The horseshoe effect is a phenomenon that has long intrigued ecologists. The effect was commonly thought to be an artifact of dimensionality reduction, and multiple techniques were developed to unravel this phenomenon and simplify interpretation. Here, we provide evidence that horseshoes arise as a consequence of distance metrics that saturate, a familiar concept in other fields but new to microbial ecology. This saturation property loses information about community dissimilarity, simply because it cannot discriminate between samples that do not share any common features. The phenomenon illuminates niche differentiation in microbial communities and indicates species turnover along environmental gradients. Here we propose a rationale for the observed horseshoe effect from multiple dimensionality reduction techniques applied to simulations, soil samples, and samples from postmortem mice. An in-depth understanding of this phenomenon allows targeting of niche differentiation patterns from high-level ordination plots, which can guide conventional statistical tools to pinpoint microbial niches along environmental gradients.
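The saturation argument can be illustrated with a toy gradient. In this sketch, which is not the authors' simulation, each sample contains a short band of taxa that shifts by one taxon per step along the gradient. Bray-Curtis dissimilarity is an assumed choice of metric, and the helper names are hypothetical. Once two samples share no taxa, the dissimilarity stays at 1 however far apart they sit on the gradient.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    shared = sum(min(a, b) for a, b in zip(u, v))
    total = sum(u) + sum(v)
    return 1.0 - 2.0 * shared / total


def band_table(n_samples: int, width: int):
    """Toy gradient: sample i contains taxa i .. i + width - 1, each at abundance 1."""
    n_taxa = n_samples + width - 1
    return [
        [1 if i <= taxon < i + width else 0 for taxon in range(n_taxa)]
        for i in range(n_samples)
    ]


if __name__ == "__main__":
    table = band_table(n_samples=8, width=3)
    # Dissimilarity of every sample to the first sample on the gradient.
    from_first = [bray_curtis(table[0], row) for row in table]
    print([round(d, 3) for d in from_first])
```

The dissimilarity rises over the first few steps and then plateaus at its maximum. Samples beyond the plateau are all equally far from the first sample, so an ordination of these distances has to bend the gradient, which gives the horseshoe shape.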
10
Pauli T, Vedder L, Dowling D, Petersen M, Meusemann K, Donath A, Peters RS, Podsiadlowski L, Mayer C, Liu S, Zhou X, Heger P, Wiehe T, Hering L, Mayer G, Misof B, Niehuis O. Transcriptomic data from panarthropods shed new light on the evolution of insulator binding proteins in insects: Insect insulator proteins. BMC Genomics 2016; 17:861. [PMID: 27809783 PMCID: PMC5094011 DOI: 10.1186/s12864-016-3205-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/07/2016] [Accepted: 10/25/2016] [Indexed: 01/19/2023] Open
Abstract
Background: Body plan development in multi-cellular organisms is largely determined by homeotic genes. Expression of homeotic genes, in turn, is partially regulated by insulator binding proteins (IBPs). While only a few enhancer blocking IBPs have been identified in vertebrates, the common fruit fly Drosophila melanogaster harbors at least twelve different enhancer blocking IBPs. We screened recently compiled insect transcriptomes from the 1KITE project and genomic and transcriptomic data from public databases, aiming to trace the origin of IBPs in insects and other arthropods. Results: Our study shows that the last common ancestor of insects (Hexapoda) already possessed a substantial number of IBPs. Specifically, of the known twelve insect IBPs, at least three (i.e., CP190, Su(Hw), and CTCF) already existed prior to the evolution of insects. Furthermore, we found GAF orthologs in early branching insect orders, including Zygentoma (silverfish and firebrats) and Diplura (two-pronged bristletails). Mod(mdg4) is most likely a derived feature of Neoptera, while Pita is likely an evolutionary novelty of holometabolous insects. Zw5 appears to be restricted to schizophoran flies, whereas BEAF-32, ZIPIC and the Elba complex are probably unique to the genus Drosophila. Selection models indicate that insect IBPs evolved under neutral or purifying selection. Conclusions: Our results suggest that a substantial number of IBPs either pre-date the evolution of insects or evolved early during insect evolution. This suggests an evolutionary history of insulator binding proteins in insects different from that previously thought. Moreover, our study demonstrates the versatility of the 1KITE transcriptomic data for comparative analyses in insects and other arthropods.
Affiliation(s)
- Thomas Pauli
  - Center of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Adenauerallee 160, 51113, Bonn, Germany
- Lucia Vedder
  - University of Tübingen, Geschwister-Scholl-Platz, 72074, Tübingen, Germany
- Daniel Dowling
  - Johannes Gutenberg University Mainz, Institute of Molecular Biology (IMB), Ackermannweg 4, 55128, Mainz, Germany
- Malte Petersen
  - Center of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Adenauerallee 160, 51113, Bonn, Germany
- Karen Meusemann
  - Center of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Adenauerallee 160, 51113, Bonn, Germany
  - Department for Evolutionary Biology and Ecology (Institut for Biology I, Zoology), University of Freiburg, Hauptstr. 1, 79104, Freiburg, Germany
  - Australian National Insect Collection, CSIRO National Research Collections Australia, Clunies Ross Street, Acton, ACT, 2601, Australia
- Alexander Donath
  - Center of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Adenauerallee 160, 51113, Bonn, Germany
- Ralph S Peters
  - Zoological Research Museum Alexander Koenig, Arthropod Department, Adenauerallee 160, 53113, Bonn, Germany
- Lars Podsiadlowski
  - University of Bonn, Institute of Evolutionary Biology and Ecology, An der Immenburg 1, 53121, Bonn, Germany
- Christoph Mayer
  - Center of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Adenauerallee 160, 51113, Bonn, Germany
- Shanlin Liu
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, Guangdong Province, 518083, China
  - Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350, Copenhagen, Denmark
- Xin Zhou
  - Beijing Advanced Innovation Center for Food Nutrition and Human Health, China Agricultural University, Beijing, 100193, China
  - College of Food Science and Nutritional Engineering, China Agricultural University, Beijing, 100083, China
- Peter Heger
  - University of Cologne, Cologne Biocenter, Institute for Genetics, Zülpicher Straße 47a, 50674, Köln, Germany
- Thomas Wiehe
  - University of Cologne, Cologne Biocenter, Institute for Genetics, Zülpicher Straße 47a, 50674, Köln, Germany
- Lars Hering
  - Department of Zoology, University of Kassel, Heinrich-Plett-Str. 40, 34132, Kassel, Germany
- Georg Mayer
  - Department of Zoology, University of Kassel, Heinrich-Plett-Str. 40, 34132, Kassel, Germany
- Bernhard Misof
  - Center of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Adenauerallee 160, 51113, Bonn, Germany
- Oliver Niehuis
  - Center of Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig, Adenauerallee 160, 51113, Bonn, Germany
11
Yang W, Qi Y, Fu J. Genetic signals of high-altitude adaptation in amphibians: a comparative transcriptome analysis. BMC Genet 2016; 17:134. [PMID: 27716028 PMCID: PMC5048413 DOI: 10.1186/s12863-016-0440-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 03/08/2016] [Accepted: 09/20/2016] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: High-altitude adaptation provides an excellent system for studying how organisms cope with multiple environmental stressors and interacting genetic modifications. To explore the genetic basis of high-altitude adaptation in poikilothermic animals, we acquired transcriptome sequences from a high-altitude population and a low-altitude population of the Asiatic toad (Bufo gargarizans). Transcriptome data from another high-altitude amphibian, Rana kukunoris and its low-altitude relative R. chensiensis, which are from a previous study, were also incorporated into our comparative analysis. RESULTS: More than 40,000 transcripts were obtained from each transcriptome, and 5107 one-to-one orthologs were identified among the four taxa for comparative analysis. A total of 29 (Bufo) and 33 (Rana) putative positively selected genes were identified for the two high-altitude species, which were mainly concentrated in nutrient metabolism related functions. Using SNP-tagging and FST outlier analysis, we further tested 89 other nutrient metabolism related genes for signatures of natural selection, and found that two genes, CAPN2 and ITPR1, were likely under balancing selection. We did not detect any positively selected genes associated with response to hypoxia. CONCLUSIONS: Amphibians clearly employ different genetic mechanisms for high-altitude adaptation compared to endotherms. Modifications of genes associated with nutrient metabolism feature prominently while genes related to hypoxia tolerance appear to be insignificant. Poikilotherms represent the majority of animal diversity, and we hope that our results will provide useful directions for future studies of amphibians as well as other poikilotherms.
Affiliation(s)
- Weizhao Yang
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, 610041, China
  - Present address: Department of Biology, Lund University, 223 62, Lund, Sweden
- Yin Qi
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, 610041, China
- Jinzhong Fu
  - Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, 610041, China
  - Department of Integrative Biology, University of Guelph, Guelph, N1G 2W1, ON, Canada
12
Luisi P, Alvarez-Ponce D, Pybus M, Fares MA, Bertranpetit J, Laayouni H. Recent positive selection has acted on genes encoding proteins with more interactions within the whole human interactome. Genome Biol Evol 2015; 7:1141-54. [PMID: 25840415 PMCID: PMC4419801 DOI: 10.1093/gbe/evv055] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Indexed: 12/12/2022] Open
Abstract
Genes vary in their likelihood to undergo adaptive evolution. The genomic factors that determine adaptability, however, remain poorly understood. Genes function in the context of molecular networks, with some occupying more important positions than others and thus being likely to be under stronger selective pressures. However, how positive selection distributes across the different parts of molecular networks is still not fully understood. Here, we inferred positive selection using comparative genomics and population genetics approaches through the comparison of 10 mammalian and 270 human genomes, respectively. In agreement with previous results, we found that genes with lower network centralities are more likely to evolve under positive selection (as inferred from divergence data). Surprisingly, polymorphism data yield results in the opposite direction than divergence data: Genes with higher centralities are more likely to have been targeted by recent positive selection during recent human evolution. Our results indicate that the relationship between centrality and the impact of adaptive evolution highly depends on the mode of positive selection and/or the evolutionary time-scale.
Collapse
Affiliation(s)
- Pierre Luisi
- Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain
- David Alvarez-Ponce
- Integrative Systems Biology Group, Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC)-Universidad Politécnica de Valencia (UPV), Spain; Biology Department, University of Nevada, Reno; Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain
- Marc Pybus
- Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain
- Mario A Fares
- Integrative Systems Biology Group, Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC)-Universidad Politécnica de Valencia (UPV), Spain; Smurfit Institute of Genetics, University of Dublin, Trinity College, Ireland
- Jaume Bertranpetit
- Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain
- Hafid Laayouni
- Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain; Departament de Genètica i de Microbiologia, Grup de Biologia Evolutiva (GBE), Universitat Autonòma de Barcelona, Bellaterra, Spain
13
Shirai K, Inomata N, Mizoiri S, Aibara M, Terai Y, Okada N, Tachida H. High prevalence of non-synonymous substitutions in mtDNA of cichlid fishes from Lake Victoria. Gene 2014; 552:239-45. [PMID: 25241383 DOI: 10.1016/j.gene.2014.09.039] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/14/2014] [Revised: 08/28/2014] [Accepted: 09/17/2014] [Indexed: 10/24/2022]
Abstract
When a population size is reduced, genetic drift may fix slightly deleterious mutations, and an increase in nonsynonymous substitution is expected. It has been suggested that past aridity has seriously affected and decreased the populations of cichlid fishes in Lake Victoria, while geographical studies have shown that the water levels in Lake Tanganyika and Lake Malawi have remained fairly constant. The comparably stable environments in the latter two lakes might have kept the populations of cichlid fishes large enough to remove slightly deleterious mutations. The difference in the stability of cichlid fish population sizes between Lake Victoria and the Lakes Tanganyika and Malawi is expected to have caused differences in the nonsynonymous/synonymous ratio, ω (=dN/dS), of the evolutionary rate. Here, we estimated ω and compared it between the cichlids of the three lakes for 13 mitochondrial protein-coding genes using maximum likelihood methods. We found that the lineages of the cichlids in Lake Victoria had a significantly higher ω for several mitochondrial loci. Moreover, positive selection was indicated for several codons in the mtDNA of the Lake Victoria cichlid lineage. Our results indicate that both adaptive and slightly deleterious molecular evolution has taken place in the Lake Victoria cichlids' mtDNA genes, whose nonsynonymous sites are generally conserved.
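As a rough illustration of the ω (=dN/dS) ratio discussed in this abstract, the following Python sketch uses a simple counting approach with a Jukes-Cantor correction (in the spirit of the Nei-Gojobori method) rather than the maximum likelihood methods used in the study. The function names and all counts are hypothetical and chosen only for illustration.

```python
import math


def jc_correct(p):
    """Jukes-Cantor correction of a proportion of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def omega(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Estimate omega = dN/dS from counts of differences and sites."""
    d_n = jc_correct(nonsyn_diffs / nonsyn_sites)
    d_s = jc_correct(syn_diffs / syn_sites)
    if d_s == 0.0:
        raise ValueError("dS is zero; omega is undefined")
    return d_n / d_s


# Hypothetical counts for one gene in two lineages (not data from the study).
stable_lineage = omega(nonsyn_diffs=10, nonsyn_sites=300,
                       syn_diffs=20, syn_sites=100)
bottlenecked_lineage = omega(nonsyn_diffs=25, nonsyn_sites=300,
                             syn_diffs=20, syn_sites=100)
print(f"omega (stable lineage):       {stable_lineage:.3f}")
print(f"omega (bottlenecked lineage): {bottlenecked_lineage:.3f}")
```

A value of ω below 1 is usually read as purifying selection, a value near 1 as neutral evolution, and a value above 1 as positive selection. A higher ω in one lineage can therefore reflect either relaxed constraint or adaptive change, which is why the abstract reports both.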
Affiliation(s)
- Kazumasa Shirai
- Graduate School of Systems Life Sciences, Kyushu University, Fukuoka, Japan
- Nobuyuki Inomata
- International College of Arts and Sciences, Fukuoka Women's University, Fukuoka, Japan
- Mitsuto Aibara
- Foundation for Advancement of International Science, Tsukuba, Japan
- Yohey Terai
- The Graduate University for Advanced Studies, Kanagawa, Japan
- Norihiro Okada
- Foundation for Advancement of International Science, Tsukuba, Japan; Department of Life Sciences, National Cheng Kung University, Tainan 701, Taiwan
- Hidenori Tachida
- Department of Biology, Faculty of Sciences, Kyushu University, Fukuoka, Japan.
14
van Mierlo JT, Overheul GJ, Obadia B, van Cleef KWR, Webster CL, Saleh MC, Obbard DJ, van Rij RP. Novel Drosophila viruses encode host-specific suppressors of RNAi. PLoS Pathog 2014; 10:e1004256. [PMID: 25032815 PMCID: PMC4102588 DOI: 10.1371/journal.ppat.1004256] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Received: 02/07/2014] [Accepted: 06/03/2014] [Indexed: 12/24/2022] Open
Abstract
The ongoing conflict between viruses and their hosts can drive the co-evolution between host immune genes and viral suppressors of immunity. It has been suggested that an evolutionary ‘arms race’ may occur between rapidly evolving components of the antiviral RNAi pathway of Drosophila and viral genes that antagonize it. We have recently shown that viral protein 1 (VP1) of Drosophila melanogaster Nora virus (DmelNV) suppresses Argonaute-2 (AGO2)-mediated target RNA cleavage (slicer activity) to antagonize antiviral RNAi. Here we show that viral AGO2 antagonists of divergent Nora-like viruses can have host specific activities. We have identified novel Nora-like viruses in wild-caught populations of D. immigrans (DimmNV) and D. subobscura (DsubNV) that are 36% and 26% divergent from DmelNV at the amino acid level. We show that DimmNV and DsubNV VP1 are unable to suppress RNAi in D. melanogaster S2 cells, whereas DmelNV VP1 potently suppresses RNAi in this host species. Moreover, we show that the RNAi suppressor activity of DimmNV VP1 is restricted to its natural host species, D. immigrans. Specifically, we find that DimmNV VP1 interacts with D. immigrans AGO2, but not with D. melanogaster AGO2, and that it suppresses slicer activity in embryo lysates from D. immigrans, but not in lysates from D. melanogaster. This species-specific interaction is reflected in the ability of DimmNV VP1 to enhance RNA production by a recombinant Sindbis virus in a host-specific manner. Our results emphasize the importance of analyzing viral RNAi suppressor activity in the relevant host species. We suggest that rapid co-evolution between RNA viruses and their hosts may result in host species-specific activities of RNAi suppressor proteins, and therefore that viral RNAi suppressors could be host-specificity factors.
Viruses and their hosts can engage in an evolutionary arms race. Viruses may select for hosts with more effective immune responses, whereas the immune response of the host may select for viruses that evade the immune system. These viral counter-defenses may in turn drive adaptations in host immune genes. A potential outcome of this perpetual cycle is that the interaction between virus and host becomes more specific. In insects, the host antiviral RNAi machinery exerts strong evolutionary pressure that has led to the evolution of viral proteins that can antagonize the RNAi response. We have identified novel viruses that infect different fruit fly species and we show that the RNAi suppressor proteins of these viruses can be specific to their host. Furthermore, we show that these proteins can enhance virus replication in a host-specific manner. These results are in line with the hypothesis that virus-host co-evolution shapes the genomes of both virus and host. Moreover, our results suggest that RNAi suppressor proteins have the potential to determine host specificity of viruses.
Affiliation(s)
- Joël T. van Mierlo
- Department of Medical Microbiology, Radboud University Nijmegen Medical Centre, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands
- Gijs J. Overheul
- Department of Medical Microbiology, Radboud University Nijmegen Medical Centre, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands
- Benjamin Obadia
- Institut Pasteur, Viruses and RNA interference Unit and Centre National de la Recherche Scientifique, UMR 3569, Paris, France
- Koen W. R. van Cleef
- Department of Medical Microbiology, Radboud University Nijmegen Medical Centre, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands
- Claire L. Webster
- Institute of Evolutionary Biology and Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh, United Kingdom
- Maria-Carla Saleh
- Institut Pasteur, Viruses and RNA interference Unit and Centre National de la Recherche Scientifique, UMR 3569, Paris, France
- Darren J. Obbard
- Institute of Evolutionary Biology and Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh, United Kingdom
- Ronald P. van Rij
- Department of Medical Microbiology, Radboud University Nijmegen Medical Centre, Radboud Institute for Molecular Life Sciences, Nijmegen, The Netherlands
15
Kashtan N, Roggensack SE, Rodrigue S, Thompson JW, Biller SJ, Coe A, Ding H, Marttinen P, Malmstrom RR, Stocker R, Follows MJ, Stepanauskas R, Chisholm SW. Single-cell genomics reveals hundreds of coexisting subpopulations in wild Prochlorococcus. Science 2014; 344:416-20. [PMID: 24763590 DOI: 10.1126/science.1248575] [Citation(s) in RCA: 304] [Impact Index Per Article: 30.4] [Indexed: 01/05/2023]
Abstract
Extensive genomic diversity within coexisting members of a microbial species has been revealed through selected cultured isolates and metagenomic assemblies. Yet, the cell-by-cell genomic composition of wild uncultured populations of co-occurring cells is largely unknown. In this work, we applied large-scale single-cell genomics to study populations of the globally abundant marine cyanobacterium Prochlorococcus. We show that they are composed of hundreds of subpopulations with distinct "genomic backbones," each backbone consisting of a different set of core gene alleles linked to a small distinctive set of flexible genes. These subpopulations are estimated to have diverged at least a few million years ago, suggesting ancient, stable niche partitioning. Such a large set of coexisting subpopulations may be a general feature of free-living bacterial species with huge populations in highly mixed habitats.
Affiliation(s)
- Nadav Kashtan
- Department of Civil and Environmental Engineering, Massachusetts Institute of Technology (MIT), 77 Massachusetts Avenue, Cambridge, MA 02139, USA
16
Lemos de Matos A, McFadden G, Esteves PJ. Positive evolutionary selection on the RIG-I-like receptor genes in mammals. PLoS One 2013; 8:e81864. [PMID: 24312370 PMCID: PMC3842351 DOI: 10.1371/journal.pone.0081864] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 06/14/2013] [Accepted: 10/17/2013] [Indexed: 12/25/2022] Open
Abstract
The mammalian RIG-I-like receptors, RIG-I, MDA5 and LGP2, are a family of DExD/H box RNA helicases responsible for the cytoplasmic detection of viral RNA. These receptors detect a variety of RNA viruses, or DNA viruses that express unusual RNA species, many of which are responsible for a great number of severe and lethal diseases. Host innate sentinel proteins involved in pathogen recognition must rapidly evolve in a dynamic arms race with pathogens, and thus are subjected to long-term positive selection pressures to avoid potential infections. Using six codon-based Maximum Likelihood methods, we were able to identify specific codons under positive selection in each of these three genes. The highest number of positively selected codons was detected in MDA5, but a great percentage of these codons were located outside of the currently defined protein domains for MDA5, which likely reflects the imposition of both functional and structural constraints. Additionally, our results support LGP2 as being the least prone to evolutionary change, since the lowest number of codons under selection was observed for this gene. On the other hand, the preponderance of positively selected codons for RIG-I were detected in known protein functional domains, suggesting that pressure has been imposed by the vast number of viruses that are recognized by this RNA helicase. Furthermore, the RIG-I repressor domain, the region responsible for recognizing and binding to its RNA substrates, exhibited the strongest evidence of selective pressures. Branch-site analyses were performed and several species branches on the three receptor gene trees showed evidence of episodic positive selection. 
In conclusion, by looking for evidence of positive evolutionary selection on mammalian RIG-I-like receptor genes, we propose that a multitude of viruses have crafted the receptors' biological function in host defense, specifically for the RIG-I gene, contributing to the innate species-specific resistance/susceptibility to diverse viral pathogens.
Affiliation(s)
- Ana Lemos de Matos
- CIBIO - Centro de Investigação em Biodiversidade e Recursos Genéticos/InBio Laboratório Associado, Universidade do Porto, Vairão, Portugal ; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal ; Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, Florida, United States of America
17
Sun YB, Zhou WP, Liu HQ, Irwin DM, Shen YY, Zhang YP. Genome-wide scans for candidate genes involved in the aquatic adaptation of dolphins. Genome Biol Evol 2013; 5:130-9. [PMID: 23246795 PMCID: PMC3595024 DOI: 10.1093/gbe/evs123] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 01/07/2023] Open
Abstract
Since their divergence from the terrestrial artiodactyls, cetaceans have fully adapted to an aquatic lifestyle, which represents one of the most dramatic transformations in mammalian evolutionary history. Numerous morphological and physiological characters of cetaceans have been acquired in response to this drastic habitat transition, such as thickened blubber, echolocation, and the ability to hold their breath for a long period of time. However, knowledge about the molecular basis underlying these adaptations is still limited. The sequence of the genome of Tursiops truncatus provides an opportunity for comparative genomic analyses to examine the molecular adaptation of this species. Here, we constructed 11,838 high-quality orthologous gene alignments culled from the dolphin and four other terrestrial mammalian genomes and screened for positive selection occurring in the dolphin lineage. In total, 368 (3.1%) of the genes were identified as having undergone positive selection by the branch-site model. Functional characterization of these genes showed that they are significantly enriched in the categories of lipid transport and localization, ATPase activity, sense perception of sound, and muscle contraction, areas that are potentially related to cetacean adaptations. In contrast, we did not find a similar pattern in the cow, a closely related species. We resequenced some of the positively selected sites (PSSs), within the positively selected genes, and showed that most of our identified PSSs (50/52) could be replicated. The results from this study should have important implications for our understanding of cetacean evolution and their adaptations to the aquatic environment.
Affiliation(s)
- Yan-Bo Sun
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
18
Heneberg P. Phylogenetic data suggest the reclassification of Fasciola jacksoni (Digenea: Fasciolidae) as Fascioloides jacksoni comb. nov. Parasitol Res 2013; 112:1679-89. [DOI: 10.1007/s00436-013-3326-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 08/12/2012] [Accepted: 01/25/2013] [Indexed: 11/29/2022]
19
Vanneste K, Van de Peer Y, Maere S. Inference of genome duplications from age distributions revisited. Mol Biol Evol 2012; 30:177-90. [PMID: 22936721 DOI: 10.1093/molbev/mss214] [Citation(s) in RCA: 112] [Impact Index Per Article: 9.3] [Indexed: 12/20/2022] Open
Abstract
Whole-genome duplications (WGDs), thought to facilitate evolutionary innovations and adaptations, have been uncovered in many phylogenetic lineages. WGDs are frequently inferred from duplicate age distributions, where they manifest themselves as peaks against a small-scale duplication background. However, the interpretation of duplicate age distributions is complicated by the use of K(S), the number of synonymous substitutions per synonymous site, as a proxy for the age of paralogs. Two particular concerns are the stochastic nature of synonymous substitutions leading to increasing uncertainty in K(S) with increasing age since duplication and K(S) saturation caused by the inability of evolutionary models to fully correct for the occurrence of multiple substitutions at the same site. K(S) stochasticity is expected to erode the signal of older WGDs, whereas K(S) saturation may lead to artificial peaks in the distribution. Here, we investigate the consequences of these effects on K(S)-based age distributions and WGD inference by simulating the evolution of duplicated sequences according to predefined real age distributions and re-estimating the corresponding K(S) distributions. We show that, although K(S) estimates can be used for WGD inference far beyond the commonly accepted K(S) threshold of 1, K(S) saturation effects can cause artificial peaks at higher ages. Moreover, K(S) stochasticity and saturation may lead to confounded peaks encompassing multiple WGD events and/or saturation artifacts. We argue that K(S) effects need to be properly accounted for when inferring WGDs from age distributions and that the failure to do so could lead to false inferences.
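The saturation problem described in this abstract can be illustrated with the much simpler Jukes-Cantor (JC69) nucleotide model, used here as a stand-in for the codon models normally used to estimate K(S). This substitution is an assumption made for brevity and is not the simulation procedure of the study. The sketch shows that as the true distance grows, the expected proportion of observed differences approaches its ceiling of 0.75, so small errors in the observed proportion translate into large errors in the corrected distance.

```python
import math


def jc69_expected_p(d):
    """Expected proportion of differing sites for a true distance d
    (substitutions per site) under the JC69 model."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_distance(p):
    """JC69-corrected distance from an observed proportion of differences."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75); at 0.75 sites are saturated")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Each step of 1.0 in true distance changes the observable signal less and less.
previous_p = 0.0
for true_d in (1.0, 2.0, 3.0, 4.0):
    p = jc69_expected_p(true_d)
    print(f"true d = {true_d:.1f}  expected p = {p:.4f}  "
          f"gain in p = {p - previous_p:.4f}")
    previous_p = p

# An error of 0.02 in p has a small effect at low divergence
# and a large effect close to saturation.
for p in (0.10, 0.70):
    shift = jc69_distance(p + 0.02) - jc69_distance(p)
    print(f"p = {p:.2f}: +0.02 in p shifts the estimate by {shift:.3f}")
```

The shrinking gain in p per unit of true distance is the reason older duplication events become hard to resolve, and the growing sensitivity of the corrected estimate near the ceiling is one way spurious structure can appear at high ages.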
Affiliation(s)
- Kevin Vanneste
- Department of Plant Systems Biology, VIB, Ghent, Belgium
20
Kamneva OK, Liberles DA, Ward NL. Genome-wide influence of indel substitutions on evolution of bacteria of the PVC superphylum, revealed using a novel computational method. Genome Biol Evol 2010; 2:870-86. [PMID: 21048002 PMCID: PMC3000692 DOI: 10.1093/gbe/evq071] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 12/03/2022] Open
Abstract
Whole-genome scans for positive Darwinian selection are widely used to detect evolution of genome novelty. Most approaches are based on evaluation of nonsynonymous to synonymous substitution rate ratio across evolutionary lineages. These methods are sensitive to saturation of synonymous sites and thus cannot be used to study evolution of distantly related organisms. In contrast, indels occur less frequently than amino acid replacements, accumulate more slowly, and can be employed to characterize evolution of diverged organisms. As indels are also subject to the forces of natural selection, they can generate functional changes through positive selection. Here, we present a new computational approach to detect selective constraints on indel substitutions at the whole-genome level for distantly related organisms. Our method is based on ancestral sequence reconstruction, takes into account the varying susceptibility of different types of secondary structure to indels, and according to simulation studies is conservative. We applied this newly developed framework to characterize the evolution of organisms of the Planctomycetes, Verrucomicrobia, Chlamydiae (PVC) bacterial superphylum. The superphylum contains organisms with unique cell biology, physiology, and diverse lifestyles. It includes bacteria with simple cell organization and more complex eukaryote-like compartmentalization. Lifestyles range from free-living organisms to obligate pathogens. In this study, we conduct a whole-genome level analysis of indel substitutions specific to evolutionary lineages of the PVC superphylum and found that indels evolved under positive selection on up to 12% of gene tree branches. We also analyzed possible functional consequences for several case studies of predicted indel events.
Affiliation(s)
- Naomi L. Ward
- Department of Molecular Biology, University of Wyoming
- Department of Botany, University of Wyoming
- Program in Ecology, University of Wyoming
21
Ekblom R, French L, Slate J, Burke T. Evolutionary analysis and expression profiling of zebra finch immune genes. Genome Biol Evol 2010; 2:781-90. [PMID: 20884724 PMCID: PMC2975445 DOI: 10.1093/gbe/evq061] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 01/11/2023] Open
Abstract
Genes of the immune system are generally considered to evolve rapidly due to host-parasite coevolution. They are therefore of great interest in evolutionary biology and molecular ecology. In this study, we manually annotated 144 avian immune genes from the zebra finch (Taeniopygia guttata) genome and conducted evolutionary analyses of these by comparing them with their orthologs in the chicken (Gallus gallus). Genes classified as immune receptors showed elevated d(N)/d(S) ratios compared with other classes of immune genes. Immune genes in general also appear to be evolving more rapidly than other genes, as inferred from a higher d(N)/d(S) ratio compared with the rest of the genome. Furthermore, ten genes (of 27) for which sequence data were available from at least three bird species showed evidence of positive selection acting on specific codons. From transcriptome data of eight different tissues, we found evidence for expression of 106 of the studied immune genes, with primary expression of most of these in bursa, blood, and spleen. These immune-related genes showed a more tissue-specific expression pattern than other genes in the zebra finch genome. Several of the avian immune genes investigated here provide strong candidates for in-depth studies of molecular adaptation in birds.
Affiliation(s)
- Robert Ekblom
- University of Sheffield, Department of Animal and Plant Sciences, Sheffield, UK.
22
Wang Z, Dong X, Ding G, Li Y. Comparing the retention mechanisms of tandem duplicates and retrogenes in human and mouse genomes. Genet Sel Evol 2010; 42:24. [PMID: 20584267 PMCID: PMC2902415 DOI: 10.1186/1297-9686-42-24] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 03/06/2010] [Accepted: 06/28/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Multiple models have been proposed to interpret the retention of duplicated genes. In this study, we attempted to compare whether the duplicates arising from tandem duplications and retropositions are retained by the same mechanisms in human and mouse genomes. RESULTS: Both sequence and expression similarity analyses revealed that tandem duplicates tend to be more conserved, whereas retrogenes tend to be more divergent. The duplicability of tandem duplicates is also higher than that of retrogenes. However, positive selection seems to play significant roles in the retention of both types of duplicates. CONCLUSIONS: We propose that dosage effect is more prevalent in the retention of tandem duplicates, while 'escape from adaptive conflict' (EAC) effect is more prevalent in the retention of retrogenes.
Affiliation(s)
- Zhen Wang
- Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai, PR China
23
Proost S, Van Bel M, Sterck L, Billiau K, Van Parys T, Van de Peer Y, Vandepoele K. PLAZA: a comparative genomics resource to study gene and genome evolution in plants. Plant Cell 2009; 21:3718-31. [PMID: 20040540 PMCID: PMC2814516 DOI: 10.1105/tpc.109.071506] [Citation(s) in RCA: 193] [Impact Index Per Article: 12.9] [Received: 09/22/2009] [Revised: 12/04/2009] [Accepted: 12/10/2009] [Indexed: 05/17/2023]
Abstract
The number of sequenced genomes of representatives within the green lineage is rapidly increasing. Consequently, comparative sequence analysis has significantly altered our view on the complexity of genome organization, gene function, and regulatory pathways. To explore all this genome information, a centralized infrastructure is required where all data generated by different sequencing initiatives is integrated and combined with advanced methods for data mining. Here, we describe PLAZA, an online platform for plant comparative genomics (http://bioinformatics.psb.ugent.be/plaza/). This resource integrates structural and functional annotation of published plant genomes together with a large set of interactive tools to study gene function and gene and genome evolution. Precomputed data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, intraspecies whole-genome dot plots, and genomic colinearity between species. Through the integration of high confidence Gene Ontology annotations and tree-based orthology between related species, thousands of genes lacking any functional description are functionally annotated. Advanced query systems, as well as multiple interactive visualization tools, are available through a user-friendly and intuitive Web interface. In addition, detailed documentation and tutorials introduce the different tools, while the workbench provides an efficient means to analyze user-defined gene sets through PLAZA's interface. In conclusion, PLAZA provides a comprehensible and up-to-date research environment to aid researchers in the exploration of genome information within the green plant lineage.
Affiliation(s)
- Sebastian Proost
- Department of Plant Systems Biology, Flanders Institute for Biotechnology, B-9052 Ghent, Belgium.
24
Wurdack KJ, Davis CC. Malpighiales phylogenetics: Gaining ground on one of the most recalcitrant clades in the angiosperm tree of life. Am J Bot 2009; 96:1551-1570. [PMID: 21628300 DOI: 10.3732/ajb.0800207] [Citation(s) in RCA: 99] [Impact Index Per Article: 6.6] [Indexed: 05/30/2023]
Abstract
The eudicot order Malpighiales contains ∼16000 species and is the most poorly resolved large rosid clade. To clarify phylogenetic relationships in the order, we used maximum likelihood, Bayesian, and parsimony analyses of DNA sequence data from 13 gene regions, totaling 15604 bp, and representing all three genomic compartments (i.e., plastid: atpB, matK, ndhF, and rbcL; mitochondrial: ccmB, cob, matR, nad1B-C, nad6, and rps3; and nuclear: 18S rDNA, PHYC, and newly developed low-copy EMB2765). Our sampling of 190 taxa includes representatives from all families of Malpighiales. These data provide greatly increased support for the recent additions of Aneulophus, Bhesa, Centroplacus, Ploiarium, and Rafflesiaceae to Malpighiales; sister relations of Phyllanthaceae + Picrodendraceae, monophyly of Hypericaceae, and polyphyly of Clusiaceae. Oxalidales + Huaceae, followed by Celastrales are successive sisters to Malpighiales. Parasitic Rafflesiaceae, which produce the world's largest flowers, are confirmed as embedded within a paraphyletic Euphorbiaceae. Novel findings show a well-supported placement of Ctenolophonaceae with Erythroxylaceae + Rhizophoraceae, sister-group relationships of Bhesa + Centroplacus, and the exclusion of Medusandra from Malpighiales. New taxonomic circumscriptions include the addition of Bhesa to Centroplacaceae, Medusandra to Peridiscaceae (Saxifragales), Calophyllaceae applied to Clusiaceae subfamily Kielmeyeroideae, Peraceae applied to Euphorbiaceae subfamily Peroideae, and Huaceae included in Oxalidales.
Affiliation(s)
- Kenneth J Wurdack
- Department of Botany, Smithsonian Institution, P.O. Box 37012 NMNH MRC-166, Washington, District of Columbia 20013-7012 USA
25
Anisimova M, Kosiol C. Investigating protein-coding sequence evolution with probabilistic codon substitution models. Mol Biol Evol 2008; 26:255-71. [PMID: 18922761 DOI: 10.1093/molbev/msn232] [Citation(s) in RCA: 127] [Impact Index Per Article: 7.9] [Indexed: 11/12/2022] Open
Abstract
This review is motivated by the true explosion in the number of recent studies both developing and ameliorating probabilistic models of codon evolution. Traditionally parametric, the first codon models focused on estimating the effects of selective pressure on the protein via an explicit parameter in the maximum likelihood framework. Likelihood ratio tests of nested codon models armed the biologists with powerful tools, which provided unambiguous evidence for positive selection in real data. This, in turn, triggered a new wave of methodological developments. The new generation of models views the codon evolution process in a more sophisticated way, relaxing several mathematical assumptions. These models make a greater use of physicochemical amino acid properties, genetic code machinery, and the large amounts of data from the public domain. The overview of the most recent advances on modeling codon evolution is presented here, and a wide range of their applications to real data is discussed. On the downside, availability of a large variety of models, each accounting for various biological factors, increases the margin for misinterpretation; the biological meaning of certain parameters may vary among models, and model selection procedures also deserve greater attention. Solid understanding of the modeling assumptions and their applicability is essential for successful statistical data analysis.
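A likelihood ratio test between two nested codon models, as mentioned in this abstract, can be sketched as follows. The log-likelihood values are hypothetical placeholders (in practice they would come from a program that fits codon models), and the chi-square approximation with the stated degrees of freedom is the standard textbook choice, assumed here rather than taken from the review.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution.

    Closed forms are used so that only the standard library is needed;
    this sketch therefore supports df = 1 and df = 2 only.
    """
    if x <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports df = 1 or df = 2 only")


# Hypothetical log-likelihoods: the null model disallows sites with omega > 1,
# the alternative adds such a site class with two extra free parameters.
lnl_null = -2345.67
lnl_alt = -2339.12

statistic = lrt_statistic(lnl_null, lnl_alt)
p_value = chi2_sf(statistic, df=2)
print(f"2*delta(lnL) = {statistic:.2f}, p = {p_value:.4g}")
```

A small p-value only says that the richer model fits significantly better. As the abstract cautions, what the extra parameters mean biologically depends on the particular models compared, so the test result should be read together with the modeling assumptions.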
Affiliation(s)
- Maria Anisimova
- Institute of Computational Science, Swiss Federal Institute of Technology, Zurich, Switzerland.
26
Seo TK, Kishino H. Synonymous substitutions substantially improve evolutionary inference from highly diverged proteins. Syst Biol 2008; 57:367-77. [PMID: 18570032 DOI: 10.1080/10635150802158670] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Indexed: 10/21/2022] Open
Abstract
Codon- and amino acid-substitution models are widely used for the evolutionary analysis of protein-coding DNA sequences. Using codon models, the amounts of both nonsynonymous and synonymous DNA substitutions can be estimated. The ratio of these amounts represents the strength of selective pressure. Using amino acid models, the amount of nonsynonymous substitutions is estimated, but that of synonymous substitutions is ignored. Although amino acid models lose any information regarding synonymous substitutions, they explicitly incorporate the information for amino acid replacement, which is empirically derived from databases. It is often presumed that when the protein-coding sequences are highly divergent, synonymous substitutions might be saturated and the evolutionary analysis may be hampered by synonymous noise. However, there exists no quantitative procedure to verify whether synonymous substitutions can be ignored; therefore, amino acid models have been arbitrarily selected. In this study, we investigate the issue of a statistical comparison between codon- and amino acid-substitution models. For this purpose, we propose a new procedure to transform a 20-dimensional amino acid model to a 61-dimensional codon model. This transformation reveals that amino acid models belong to a subset of the codon models and enables us to test whether synonymous substitutions can be ignored by using the likelihood ratio. Our theoretical results and analyses of real data indicate that synonymous substitutions are very informative and substantially improve evolutionary inference, even when the sequences are highly divergent. Therefore, we note that amino acid models should be adopted only after carefully investigating and discarding the possibility that synonymous substitutions can reveal important evolutionary information.
Affiliation(s)
- Tae-Kun Seo
- Professional Programme for Agricultural Bioinformatics, Graduate School of Agricultural and Life Sciences, University of Tokyo, Tokyo, Japan.
|
27
|
Rokas A, Carroll SB. Frequent and widespread parallel evolution of protein sequences. Mol Biol Evol 2008; 25:1943-53. [PMID: 18583353 DOI: 10.1093/molbev/msn143] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.3] [Indexed: 12/25/2022] Open
Abstract
Understanding the patterns and causes of protein sequence evolution is a major challenge in evolutionary biology. One of the critical unresolved issues is the relative contribution of selection and genetic drift to the fixation of amino acid sequence differences between species. Molecular homoplasy, the independent evolution of the same amino acids at orthologous sites in different taxa, is one potential signature of selection; however, relatively little is known about its prevalence in eukaryotic proteomes. To quantify the extent and type of homoplasy among evolving proteins, we used phylogenetic methodology to analyze 8 genome-scale data matrices from clades of different evolutionary depths that span the eukaryotic tree of life. We found that the frequency of homoplastic amino acid substitutions in eukaryotic proteins was more than 2-fold higher than expected under neutral models of protein evolution. The overwhelming majority of homoplastic substitutions were parallelisms that involved the most frequently exchanged amino acids with similar physicochemical properties and that could be reached by a single-mutational step. We conclude that the role of homoplasy in shaping the protein record is much larger than generally assumed, and we suggest that its high frequency can be explained by both weak positive selection for certain substitutions and purifying selection that constrains substitutions to a small number of functionally equivalent amino acids.
Affiliation(s)
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, USA
|
28
|
Ewing GB, Ebersberger I, Schmidt HA, von Haeseler A. Rooted triple consensus and anomalous gene trees. BMC Evol Biol 2008; 8:118. [PMID: 18439266 PMCID: PMC2409437 DOI: 10.1186/1471-2148-8-118] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Received: 11/28/2007] [Accepted: 04/25/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Anomalous gene trees (AGTs) are gene trees with a topology different from a species tree that are more probable to observe than congruent gene trees. In this paper we propose a rooted triple approach to finding the correct species tree in the presence of AGTs. RESULTS Based on simulated data we show that our method outperforms the extended majority rule consensus strategy, while still resolving the species tree. Applying both methods to a metazoan data set of 216 genes, we tested whether AGTs substantially interfere with the reconstruction of the metazoan phylogeny. CONCLUSION Evidence of AGTs was not found in this data set, suggesting that erroneously reconstructed gene trees are the most significant challenge in the reconstruction of phylogenetic relationships among species with current data. The new method does however rule out the erroneous reconstruction of deep or poorly resolved splits in the presence of lineage sorting.
Affiliation(s)
- Gregory B Ewing
- Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, Dr. Bohr-Gasse 9, A-1030 Vienna, Austria.
|
29
|
Anisimova M, Liberles DA. The quest for natural selection in the age of comparative genomics. Heredity (Edinb) 2007; 99:567-79. [PMID: 17848974 DOI: 10.1038/sj.hdy.6801052] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.0] [Indexed: 11/08/2022] Open
Abstract
Continued genome sequencing has fueled progress in statistical methods for understanding the action of natural selection at the molecular level. This article reviews various statistical techniques (and their applicability) for detecting adaptation events and the functional divergence of proteins. As large-scale automated studies become more frequent, they provide a useful resource for generating biological null hypotheses for further experimental and statistical testing. Furthermore, they shed light on typical patterns of lineage-specific evolution of organisms, on the functional and structural evolution of protein families and on the interplay between the two. More complex models are being developed to better reflect the underlying biological and chemical processes and to complement simpler statistical models. Linking molecular processes to their statistical signatures in genomes can be demanding, and the proper application of statistical models is discussed.
Affiliation(s)
- M Anisimova
- Department of Biology, University College London, London, UK
|
30
|
Hanada K, Shiu SH, Li WH. The Nonsynonymous/Synonymous Substitution Rate Ratio versus the Radical/Conservative Replacement Rate Ratio in the Evolution of Mammalian Genes. Mol Biol Evol 2007; 24:2235-41. [PMID: 17652332 DOI: 10.1093/molbev/msm152] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.0] [Indexed: 12/29/2022] Open
Abstract
There are 2 ways to infer selection pressures in the evolution of protein-coding genes, the nonsynonymous and synonymous substitution rate ratio (K(A)/K(S)) and the radical and conservative amino acid replacement rate ratio (K(R)/K(C)). Because the K(R)/K(C) ratio depends on the definition of radical and conservative changes in the classification of amino acids, we develop an amino acid classification that maximizes the correlation between K(A)/K(S) and K(R)/K(C). An analysis of 3,375 orthologous gene groups among 5 mammalian species shows that our classification gives a significantly higher correlation coefficient between the 2 ratios than those of existing classifications. However, there are many orthologous gene groups with a low K(A)/K(S) but a high K(R)/K(C) ratio. Examining the functions of these genes, we found an overrepresentation of functional categories related to development. To determine if the overrepresentation is stage specific, we examined the expression patterns of these genes at different developmental stages of the mouse. Interestingly, these genes are highly expressed in the early middle stage of development (blastocyst to amnion). It is commonly thought that developmental genes tend to be conservative in evolution, but some molecular changes in developmental stages should have contributed to morphological divergence in adult mammals. Therefore, we propose that the relaxed pressures indicated by the K(R)/K(C) ratio but not by K(A)/K(S) in the early middle stage of development may be important for the morphological divergence of mammals at the adult stage, whereas purifying selection detected by K(A)/K(S) occurs in the early middle developmental stage.
Affiliation(s)
- Kousuke Hanada
- Department of Ecology and Evolution, University of Chicago, USA
|
31
|
Scapoli C, De Lorenzi S, Salvatorelli G, Barrai I. Amino acid and codon use: in two influenza viruses and three hosts. Med Mal Infect 2007; 37:337-42. [PMID: 17336013 DOI: 10.1016/j.medmal.2006.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2006] [Accepted: 12/08/2006] [Indexed: 11/21/2022]
Abstract
OBJECTIVE The aim of this study was to compare the use of amino acids and codons in influenza viruses A and B and in their common hosts, to highlight any relevant difference. METHODS The frequency of the 20 amino acids and of the 61 codons was studied in influenza viruses A, B, and in man, pig, and chicken. The correlation in amino acid and codon use among these hosts was calculated. RESULTS The correlation between the frequency of the 20 amino acids and the molecular weight was also calculated and it was very similar in all studied hosts, ranging from 0.506 to 0.595. The correlation of codon frequency among these organisms was highest between man and chicken (r=0.974), and lowest between pig and virus B (r=0.147). CONCLUSIONS The important correlation in codon use among the three hosts and the two viruses suggests there was a remote lateral gene transfer among the three hosts and the two viruses. The higher use of alanine, leucine, and proline in man versus virus A is significant.
Affiliation(s)
- C Scapoli
- Department of Biology, University of Ferrara, 44100 Ferrara, Italy
|
32
|
Dishaw LJ, Herrera ML, Bigger CH. Characterization and phylogenetic analysis of a cnidarian LMP X-like cDNA. Immunogenetics 2006; 58:454-64. [PMID: 16552514 DOI: 10.1007/s00251-006-0105-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2005] [Accepted: 02/15/2006] [Indexed: 12/01/2022]
Abstract
Proteasomes are multisubunit protease complexes which are partly responsible for metabolism of intracellular, ubiquitinylated proteins. Vertebrates have adapted a second and specialized structure that is responsible for the generation of peptides presented to the adaptive immune system and is thus commonly referred to as the immunoproteasome. This complex is assembled from paralogous copies of subunits belonging to the constitutive, housekeeping proteasome. The immunoproteasome is more efficient in the generation of peptides for display on major histocompatibility complex (MHC) molecules. Important components of this complex are the paralogous members, LMP X and 7, where the latter replaces the former in the assembly of the immunoproteasome of vertebrates. In this report, we describe an LMP X-like cDNA from an endosymbiont-free gorgonian coral, Swiftia exserta. Cnidarians predate the phylogenetic divergence of protostomes and deuterostomes (P-D split), and are becoming an essential model for our comprehension of immune system evolution. Phylogenetic analyses of available sequences indicate that invertebrate LMP X-like sequences are outgroups to vertebrate LMP X and LMP 7, in agreement with previous observations that the duplication event giving rise to the two rapidly diverging lineages of proteasomal subunits occurred before jawed fish divergence.
Affiliation(s)
- Larry J Dishaw
- Department of Biological Sciences, Florida International University, Miami, FL, 33199, USA
|
33
|
Abhiman S, Daub CO, Sonnhammer ELL. Prediction of function divergence in protein families using the substitution rate variation parameter alpha. Mol Biol Evol 2006; 23:1406-13. [PMID: 16672285 DOI: 10.1093/molbev/msl002] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
Protein families typically embody a range of related functions and may thus be decomposed into subfamilies with, for example, distinct substrate specificities. Detection of functionally divergent subfamilies is possible by methods for recognizing branches of adaptive evolution in a gene tree. As the number of genome sequences is growing rapidly, it is highly desirable to automatically detect subfamily function divergence. To this end, we here introduce a method for large-scale prediction of function divergence within protein families. It is called the alpha shift measure (ASM) as it is based on detecting a shift in the shape parameter (alpha [α]) of the substitution rate gamma distribution. Four different methods for estimating alpha were investigated. We benchmarked the accuracy of ASM using function annotation from Enzyme Commission numbers within Pfam protein families divided into subfamilies by the automatic tree-based method BETE. In a test using 563 subfamily pairs in 162 families, ASM outperformed functional site-based methods using rate or conservation shifting (rate shift measure [RSM] and conservation shift measure [CSM]). The best results were obtained using the "GZ-Gamma" method for estimating alpha. By combining ASM with RSM and CSM using linear discriminant analysis, the prediction accuracy was further improved.
Affiliation(s)
- Saraswathi Abhiman
- Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden.
|
34
|
Rocha EPC, Smith JM, Hurst LD, Holden MTG, Cooper JE, Smith NH, Feil EJ. Comparisons of dN/dS are time dependent for closely related bacterial genomes. J Theor Biol 2005; 239:226-35. [PMID: 16239014 DOI: 10.1016/j.jtbi.2005.08.037] [Citation(s) in RCA: 302] [Impact Index Per Article: 15.9] [Received: 02/14/2005] [Revised: 05/07/2005] [Accepted: 05/15/2005] [Indexed: 12/22/2022]
Abstract
The ratio of non-synonymous (dN) to synonymous (dS) changes between taxa is frequently computed to assay the strength and direction of selection. Here we note that for comparisons between closely related strains and/or species a second parameter needs to be considered, namely the time since divergence of the two sequences under scrutiny. We demonstrate that a simple time lag model provides a general, parsimonious explanation of the extensive variation in the dN/dS ratio seen when comparing closely related bacterial genomes. We explore this model through simulation and comparative genomics, and suggest a role for hitch-hiking in the accumulation of non-synonymous mutations. We also note taxon-specific differences in the change of dN/dS over time, which may indicate variation in selection, or in population genetics parameters such as population size or the rate of recombination. The effect of comparing intra-species polymorphism and inter-species substitution, and the problems associated with these concepts for asexual prokaryotes, are also discussed. We conclude that, because of the critical effect of time since divergence, inter-taxa comparisons are only possible by comparing trajectories of dN/dS over time and it is not valid to compare taxa on the basis of single time points.
Affiliation(s)
- Eduardo P C Rocha
- Atelier de BioInformatique, Université Paris VI, 75005 Paris, France
|
35
|
Abhiman S, Sonnhammer ELL. Large-scale prediction of function shift in protein families with a focus on enzymatic function. Proteins 2005; 60:758-68. [PMID: 16001403 DOI: 10.1002/prot.20550] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 11/09/2022]
Abstract
Protein function shift can be predicted from sequence comparisons, either using positive selection signals or evolutionary rate estimation. None of the methods have been validated on large datasets, however. Here we investigate existing and novel methods for protein function shift prediction, and benchmark the accuracy against a large dataset of proteins with known enzymatic functions. Function change was predicted between subfamilies by identifying two kinds of sites in a multiple sequence alignment: Conservation-Shifting Sites (CSS), which are conserved in two subfamilies using two different amino acid types, and Rate-Shifting Sites (RSS), which have different evolutionary rates in two subfamilies. CSS were predicted by a new entropy-based method, and RSS using the Rate-Shift program. In principle, the more CSS and RSS between two subfamilies, the more likely a function shift between them. A test dataset was built by extracting subfamilies from Pfam with different EC numbers that belong to the same domain family. Subfamilies were generated automatically using a phylogenetic tree-based program, BETE. The dataset comprised 997 subfamily pairs with four or more members per subfamily. We observed a significant increase in CSS and RSS for subfamily comparisons with different EC numbers compared to cases with same EC numbers. The discrimination was better using RSS than CSS, and was more pronounced for larger families. Combining RSS and CSS by discriminant analysis improved classification accuracy to 71%. The method was applied to the Pfam database and the results are available at http://FunShift.cgb.ki.se. A closer examination of some superfamily comparisons showed that single EC numbers sometimes embody distinct functional classes. Hence, the measured accuracy of function shift is underestimated.
Affiliation(s)
- Saraswathi Abhiman
- Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden
|
36
|
Spratt BG. John Maynard Smith (1920-2004). Infect Genet Evol 2004; 4:297-300. [PMID: 15503422 DOI: 10.1016/j.meegid.2004.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
Affiliation(s)
- Brian G Spratt
- Department of Infectious Disease Epidemiology, Imperial College London W2 1PG, UK.
|
37
|
Abstract
We present models describing the acquisition and deletion of novel sequences in populations of microorganisms. We infer that most novel sequences are neutral. Thus, sequence duplications and gene transfer between organisms sharing the same environment are rarely expected to generate adaptive functions. Two classes of models are considered: (1) a homogeneous population with constant size, and (2) an island model in which the population is subdivided into patches that are in contact through slow migration. Distributions of gene frequencies are derived in a Moran model with overlapping generations. We find that novel, neutral or near-neutral coding sequences in microorganisms will not be fixed globally because they offer large target sizes for mutations and because the populations are so large. At most, such genes may have a transient presence in only a small fraction of the population. Consequently, a microbial population is expected to have a very large diversity of transient neutral gene content. Only sequences that are under strong selection, globally or in individual patches, can be expected to persist. We suggest that genome size is maintained in microorganisms by a quasi-steady state mechanism in which random fluctuations in the effective acquisition and deletion rates result in genome sizes that vary from patch to patch. We assign the genomic identity of a global population to those genes that are required for the participation of patches in the genetic sweeps that maintain the genomic coherence of the population. In contrast, we stress the influence of sequence loss on the isolation and the divergence (speciation) of novel patches from a global population.
Affiliation(s)
- Otto G Berg
- Department of Molecular Evolution, Uppsala University EBC, Norbyvägen 18C, SE-75236 Uppsala, Sweden.
|
38
|
Liberles DA. Evaluation of methods for determination of a reconstructed history of gene sequence evolution. Mol Biol Evol 2001; 18:2040-7. [PMID: 11606700 DOI: 10.1093/oxfordjournals.molbev.a003745] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022] Open
Abstract
With whole-genome sequences being completed at an increasing rate, it is important to develop and assess tools to analyze them. Following annotation of the protein content of a genome, one can compare sequences with previously characterized homologous genes to detect novel functions within specific proteins in the evolution of the newly sequenced genome. One common statistical method to detect such changes is to compare the ratios of nonsynonymous (K(a)) to synonymous (K(s)) nucleotide substitution rates. Here, the effects of several parameters that can influence this calculation (sequence reconstruction method, phylogenetic tree branch length weighting, GC content, and codon bias) are examined. Also, two new alternative measures of adaptive evolution, the point accepted mutations (PAM)/neutral evolutionary distance (NED) ratio and the sequence space assessment (SSA) statistic are presented. All of these methods are compared using two sequence families: the recent divergence of leptin orthologs in primates, and the more ancient divergence of the deoxyribonucleoside kinase family. The examination of these and other measures to detect changes of gene function along branches of a phylogenetic tree will become increasingly important in the postgenomic era.
Affiliation(s)
- D A Liberles
- Department of Biochemistry and Biophysics and Stockholm Bioinformatics Center, Stockholm University, Stockholm, Sweden.
|
39
|
Abstract
Attempts to calibrate bacterial evolution have relied on the assumption that rates of molecular sequence divergence in bacteria are similar to those of higher eukaryotes, or to those of the few bacterial taxa for which ancestors can be reliably dated from ecological or geological evidence. Despite similarities in the substitution rates estimated for some lineages, comparisons of the relative rates of evolution at different classes of nucleotide sites indicate no basis for their universal application to all bacteria. However, there is evidence that bacteria have a constant genome-wide mutation rate on an evolutionary time scale but that this rate differs dramatically from the rate estimated by experimental methods.
Affiliation(s)
- H Ochman
- Department of Ecology, University of Arizona, Tucson, AZ 85721, USA.
|
40
|
Abstract
Phylogenetic trees constructed using human mitochondrial sequences contain a large number of homoplasies. These are due either to repeated mutation or to recombination between mitochondrial lineages. We show that a tree constructed using synonymous variation in the protein coding sequences of 29 largely complete human mitochondrial molecules contains 22 homoplasies at 32 phylogenetically informative sites. This level of homoplasy is very unlikely if inheritance is clonal, even if we take into account base composition bias. There must either be 'hypervariable' sites or recombination between mitochondria. We present evidence which suggests that hypervariable sites do not exist in our data. It therefore seems likely that recombination has occurred between mitochondrial lineages in humans.
Affiliation(s)
- A Eyre-Walker
- Centre for the Study of Evolution, University of Sussex, Brighton, UK.
|