1
Babarinde IA, Saitou N. The Dynamics, Causes, and Impacts of Mammalian Evolutionary Rates Revealed by the Analyses of Capybara Draft Genome Sequences. Genome Biol Evol 2020; 12:1444-1458. [PMID: 32835375 DOI: 10.1093/gbe/evaa157] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Accepted: 07/23/2020] [Indexed: 12/23/2022] Open
Abstract
Capybara (Hydrochoerus hydrochaeris) is the largest species among the extant rodents. The draft genome of capybara was sequenced, with an estimated genome size of 2.6 Gb. Although capybara is about 60 times larger than guinea pig, comparative analyses revealed that the neutral evolutionary rates of the two species were not substantially different. However, analyses of 39 mammalian genomes revealed very heterogeneous evolutionary rates. The highest evolutionary rate, 8.5 times higher than the human rate, was found in the Cricetidae-Muridae common ancestor after the divergence of Spalacidae. Muridae, the family with the highest number of species among mammals, emerged after the rate acceleration. Factors responsible for the evolutionary rate heterogeneity were investigated through correlations between the evolutionary rate and longevity, gestation length, litter frequency, litter size, body weight, generation interval, age at maturity, and taxonomic order. The regression analysis of these factors showed that the model with three factors (taxonomic order, generation interval, and litter size) had the highest predictive power (R² = 0.74). These three factors determine the number of meioses per unit time. We also conducted transcriptome analysis and found that the evolutionary rate dynamics affects the evolution of gene expression patterns.
Affiliation(s)
- Isaac Adeyemi Babarinde
- Department of Biological Sciences, Southern University of Science and Technology, Shenzhen, China; Population Genetics Laboratory, National Institute of Genetics, Mishima, Japan
- Naruya Saitou
- Population Genetics Laboratory, National Institute of Genetics, Mishima, Japan; School of Medicine, University of the Ryukyus, Okinawa, Japan; Department of Genetics, School of Life Science, Graduate University for Advanced Studies, Mishima, Japan; Department of Biological Sciences, Graduate School of Science, University of Tokyo, Japan
2
Abstract
BACKGROUND Gene order changes, under rearrangements, insertions, deletions and duplications, have been used as a new type of data source for phylogenetic reconstruction. Because these changes are rare compared to sequence mutations, they allow the inference of phylogeny further back in evolutionary time. There exist many computational methods for the reconstruction of gene-order phylogenies, including widely used maximum parsimony methods and maximum likelihood methods. However, both kinds of method face challenges in handling large genomes with many duplicated genes, especially in the presence of whole genome duplication. METHODS In this paper, we present three simple yet powerful methods based on maximum-likelihood (ML) approaches that encode multiplicities of both gene adjacency and gene content information for phylogenetic reconstruction. RESULTS Extensive experiments on simulated data sets show that our new methods achieve the most accurate phylogenies compared to existing approaches. We also evaluate our methods on real whole-genome data from eleven mammals. The package is publicly accessible at http://www.geneorder.org. CONCLUSIONS Our new encoding schemes successfully incorporate the multiplicity information of gene adjacencies and gene content into an ML framework, and show promising results in reconstructing phylogenies for whole-genome data in the presence of massive duplications.
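The abstract does not spell out the three encoding schemes, so the following is only a minimal sketch of the general idea of recording adjacency and gene-content multiplicities as integer characters. It assumes linear chromosomes written as lists of signed integers; the function names (`extremities`, `adjacency_counts`, `content_counts`, `encode`) and the toy genomes are hypothetical and are not taken from the authors' package.

```python
from collections import Counter


def extremities(gene):
    """Return the (left, right) extremities of a signed gene in reading order."""
    tail, head = (abs(gene), "t"), (abs(gene), "h")
    return (tail, head) if gene > 0 else (head, tail)


def adjacency_counts(genome):
    """Count each adjacency (pair of touching gene extremities) in linear chromosomes."""
    counts = Counter()
    for chromosome in genome:
        for left_gene, right_gene in zip(chromosome, chromosome[1:]):
            right_end = extremities(left_gene)[1]
            left_end = extremities(right_gene)[0]
            # Sort the pair so the key does not depend on reading direction.
            counts[tuple(sorted((right_end, left_end)))] += 1
    return counts


def content_counts(genome):
    """Count how many copies of each gene family the genome carries."""
    return Counter(abs(g) for chromosome in genome for g in chromosome)


def encode(genomes):
    """Map each genome to one integer vector: adjacency multiplicities, then copy numbers."""
    adj = {name: adjacency_counts(g) for name, g in genomes.items()}
    con = {name: content_counts(g) for name, g in genomes.items()}
    adj_keys = sorted(set().union(*adj.values()))
    gene_keys = sorted(set().union(*con.values()))
    return {
        name: [adj[name][k] for k in adj_keys] + [con[name][k] for k in gene_keys]
        for name in genomes
    }


# Hypothetical toy genomes: B carries a tandem duplication of the block (1, 2).
genomes = {"A": [[1, 2, 3]], "B": [[1, 2, 1, 2, 3]]}
vectors = encode(genomes)
```

In this toy case the duplicated adjacency between genes 1 and 2 appears with multiplicity 2 in genome B, which a presence/absence encoding would flatten to 1. How such counts are turned into character states for an ML program is a design choice the abstract leaves open.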
Affiliation(s)
- Lingxi Zhou
- Department of Computer Science and Engineering, University of South Carolina, Columbia, 29208 South Carolina USA
- Yu Lin
- Research School of Computer Science, Australian National University, Canberra, 2601 ACT Australia
- Bing Feng
- Department of Computer Science and Engineering, University of South Carolina, Columbia, 29208 South Carolina USA
- Jieyi Zhao
- University of Texas School of Biomedical Informatics at Houston, Houston, 77030 Texas USA
- Jijun Tang
- School of Computer Science and Engineering, Tianjin University, Tianjin, 300072 China
- Department of Computer Science and Engineering, University of South Carolina, Columbia, 29208 South Carolina USA
3
Ezran C, Karanewsky CJ, Pendleton JL, Sholtz A, Krasnow MR, Willick J, Razafindrakoto A, Zohdy S, Albertelli MA, Krasnow MA. The Mouse Lemur, a Genetic Model Organism for Primate Biology, Behavior, and Health. Genetics 2017; 206:651-664. [PMID: 28592502 PMCID: PMC5499178 DOI: 10.1534/genetics.116.199448] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 12/26/2016] [Accepted: 04/08/2017] [Indexed: 01/24/2023] Open
Abstract
Systematic genetic studies of a handful of diverse organisms over the past 50 years have transformed our understanding of biology. However, many aspects of primate biology, behavior, and disease are absent or poorly modeled in any of the current genetic model organisms including mice. We surveyed the animal kingdom to find other animals with advantages similar to mice that might better exemplify primate biology, and identified mouse lemurs (Microcebus spp.) as the outstanding candidate. Mouse lemurs are prosimian primates, roughly half the genetic distance between mice and humans. They are the smallest, fastest developing, and among the most prolific and abundant primates in the world, distributed throughout the island of Madagascar, many in separate breeding populations due to habitat destruction. Their physiology, behavior, and phylogeny have been studied for decades in laboratory colonies in Europe and in field studies in Malagasy rainforests, and a high quality reference genome sequence has recently been completed. To initiate a classical genetic approach, we developed a deep phenotyping protocol and have screened hundreds of laboratory and wild mouse lemurs for interesting phenotypes and begun mapping the underlying mutations, in collaboration with leading mouse lemur biologists. We also seek to establish a mouse lemur gene "knockout" library by sequencing the genomes of thousands of mouse lemurs to identify null alleles in most genes from the large pool of natural genetic variants. As part of this effort, we have begun a citizen science project in which students across Madagascar explore the remarkable biology around their schools, including longitudinal studies of the local mouse lemurs. We hope this work spawns a new model organism and cultivates a deep genetic understanding of primate biology and health. 
We also hope it establishes a new and ethical method of genetics that bridges biological, behavioral, medical, and conservation disciplines, while providing an example of how hands-on science education can help transform developing countries.
Affiliation(s)
- Camille Ezran
- Department of Biochemistry
- Howard Hughes Medical Institute, and
- Alex Sholtz
- Department of Biochemistry
- Howard Hughes Medical Institute, and
- Maya R Krasnow
- Department of Biochemistry
- Howard Hughes Medical Institute, and
- Jason Willick
- Department of Biochemistry
- Howard Hughes Medical Institute, and
- Andriamahery Razafindrakoto
- Department of Animal Biology, Faculty of Science, University of Antananarivo, Antananarivo 101, BP 566, Madagascar, and
- Sarah Zohdy
- School of Forestry and Wildlife Sciences and College of Veterinary Medicine, Auburn University, Alabama 36849
- Megan A Albertelli
- Department of Comparative Medicine, Stanford University School of Medicine, California 94305
- Mark A Krasnow
- Department of Biochemistry
- Howard Hughes Medical Institute, and
4
Oey H, Isbel L, Hickey P, Ebaid B, Whitelaw E. Genetic and epigenetic variation among inbred mouse littermates: identification of inter-individual differentially methylated regions. Epigenetics Chromatin 2015; 8:54. [PMID: 26692901 PMCID: PMC4676890 DOI: 10.1186/s13072-015-0047-z] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 10/08/2015] [Accepted: 11/23/2015] [Indexed: 05/06/2023] Open
Abstract
Background Phenotypic variability among inbred littermates reared in controlled environments remains poorly understood. Metastable epialleles refer to loci that intrinsically behave in this way and a few examples have been described. They display differential methylation in association with differential expression. For example, inbred mice carrying the agouti viable yellow (Avy) allele show a range of coat colours associated with different DNA methylation states at the locus. The availability of next-generation sequencing, in particular whole genome sequencing of bisulphite converted DNA, allows us, for the first time, to search for metastable epialleles at base pair resolution. Results Using whole genome bisulphite sequencing of DNA from the livers of five mice from the Avy colony, we searched for sites at which DNA methylation differed among the mice. A small number of loci, 356, were detected and we call these inter-individual Differentially Methylated Regions, iiDMRs, 55 of which overlap with endogenous retroviral elements (ERVs). Whole genome resequencing of two mice from the colony identified very few differences and these did not occur at or near the iiDMRs. Further work suggested that the majority of ERV iiDMRs are metastable epialleles; the level of methylation was maintained in tissue from other germ layers and the level of mRNA from the neighbouring gene inversely correlated with methylation state. Most iiDMRs that were not overlapping ERV insertions occurred at tissue-specific DMRs and it cannot be ruled out that these are driven by changes in the ratio of cell types in the tissues analysed. Conclusions Using the most thorough genome-wide profiling technologies for differentially methylated regions, we find very few intrinsically epigenetically variable regions that we term iiDMRs. The most robust of these are at retroviral elements and appear to be metastable epialleles. 
The non-ERV iiDMRs cannot be described as metastable epialleles at this stage but provide a novel class of variably methylated elements for further study. Electronic supplementary material The online version of this article (doi:10.1186/s13072-015-0047-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Harald Oey
- Department of Genetics, La Trobe Institute for Molecular Science, La Trobe University, Bundoora, Melbourne, VIC 3086 Australia; University of Queensland Diamantina Institute, Translational Research Institute, Princess Alexandra Hospital, Brisbane, QLD 4102 Australia
- Luke Isbel
- Department of Genetics, La Trobe Institute for Molecular Science, La Trobe University, Bundoora, Melbourne, VIC 3086 Australia
- Peter Hickey
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC 3052 Australia
- Basant Ebaid
- Department of Genetics, La Trobe Institute for Molecular Science, La Trobe University, Bundoora, Melbourne, VIC 3086 Australia
- Emma Whitelaw
- Department of Genetics, La Trobe Institute for Molecular Science, La Trobe University, Bundoora, Melbourne, VIC 3086 Australia
5
Abstract
Species survival depends on the faithful replication of genetic information, which is continually monitored and maintained by DNA repair pathways that correct replication errors and the thousands of lesions that arise daily from the inherent chemical lability of DNA and the effects of genotoxic agents. Nonetheless, neutrally evolving DNA (not under purifying selection) accumulates base substitutions with time (the neutral mutation rate). Thus, repair processes are not 100% efficient. The neutral mutation rate varies both between and within chromosomes. For example it is 10-50 fold higher at CpGs than at non-CpG positions. Interestingly, the neutral mutation rate at non-CpG sites is positively correlated with CpG content. Although the basis of this correlation was not immediately apparent, some bioinformatic results were consistent with the induction of non-CpG mutations by DNA repair at flanking CpG sites. Recent studies with a model system showed that in vivo repair of preformed lesions (mismatches, abasic sites, single stranded nicks) can in fact induce mutations in flanking DNA. Mismatch repair (MMR) is an essential component for repair-induced mutations, which can occur as distant as 5 kb from the introduced lesions. Most, but not all, mutations involved the C of TpCpN (G of NpGpA) which is the target sequence of the C-preferring single-stranded DNA specific APOBEC deaminases. APOBEC-mediated mutations are not limited to our model system: Recent studies by others showed that some tumors harbor mutations with the same signature, as can intermediates in RNA-guided endonuclease-mediated genome editing. APOBEC deaminases participate in normal physiological functions such as generating mutations that inactivate viruses or endogenous retrotransposons, or that enhance immunoglobulin diversity in B cells. The recruitment of normally physiological error-prone processes during DNA repair would have important implications for disease, aging and evolution. 
This perspective briefly reviews both the bioinformatic and biochemical literature relevant to repair-induced mutagenesis and discusses future directions required to understand the mechanistic basis of this process.
Affiliation(s)
- Jia Chen
- School of Life Science and Technology, ShanghaiTech University, Building 8, 319 Yueyang Road, Shanghai 200031, China
- Anthony V Furano
- Section on Genomic Structure and Function, Laboratory of Cell and Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Building 8, Room 203, 8 Center Drive, MSC 0830, Bethesda, MD 20892-0830, USA.
6
Kaehler BD, Yap VB, Zhang R, Huttley GA. Genetic distance for a general non-stationary markov substitution process. Syst Biol 2015; 64:281-93. [PMID: 25503772 PMCID: PMC4380038 DOI: 10.1093/sysbio/syu106] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 09/12/2013] [Accepted: 12/01/2014] [Indexed: 11/18/2022] Open
Abstract
The genetic distance between biological sequences is a fundamental quantity in molecular evolution. It pertains to questions of rates of evolution, existence of a molecular clock, and phylogenetic inference. Under the class of continuous-time substitution models, the distance is commonly defined as the expected number of substitutions at any site in the sequence. We eschew the almost ubiquitous assumptions of evolution under stationarity and time-reversible conditions and extend the concept of the expected number of substitutions to nonstationary Markov models where the only remaining constraint is of time homogeneity between nodes in the tree. Our measure of genetic distance reduces to the standard formulation if the data in question are consistent with the stationarity assumption. We apply this general model to samples from across the tree of life to compare distances so obtained with those from the general time-reversible model, with and without rate heterogeneity across sites, and the paralinear distance, an empirical pairwise method explicitly designed to address nonstationarity. We discover that estimates from both variants of the general time-reversible model and the paralinear distance systematically overestimate genetic distance and departure from the molecular clock. The magnitude of the distance bias is proportional to departure from stationarity, which we demonstrate to be associated with longer edge lengths. The marked improvement in consistency between the general nonstationary Markov model and sequence alignments leads us to conclude that analyses of evolutionary rates and phylogenies will be substantively improved by application of this model.
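To make the "expected number of substitutions" concrete, the sketch below assumes the usual continuous-time formulation: with rate matrix Q and state distribution pi(s) satisfying pi'(s) = pi(s)Q, the distance over time t is taken to be minus the integral from 0 to t of the sum over i of pi_i(s) Q[i][i]. Under stationarity pi(s) is constant and this reduces to -t times the sum of pi_i Q[i][i]. The function name, the numerical scheme (Runge-Kutta for pi, trapezoid rule for the integral), and the example matrices are illustrative assumptions, not the authors' implementation.

```python
def expected_substitutions(pi0, Q, t, steps=1000):
    """Approximate -integral_0^t sum_i pi_i(s) * Q[i][i] ds, where pi'(s) = pi(s) Q."""
    n = len(pi0)

    def deriv(pi):
        # Row-vector convention: d(pi_j)/ds = sum_i pi_i * Q[i][j].
        return [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]

    def flux(pi):
        # Instantaneous substitution rate given the current state distribution.
        return -sum(pi[i] * Q[i][i] for i in range(n))

    h = t / steps
    pi = list(pi0)
    total = 0.0
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step for the state distribution.
        k1 = deriv(pi)
        k2 = deriv([p + 0.5 * h * k for p, k in zip(pi, k1)])
        k3 = deriv([p + 0.5 * h * k for p, k in zip(pi, k2)])
        k4 = deriv([p + h * k for p, k in zip(pi, k3)])
        new_pi = [
            p + h * (a + 2 * b + 2 * c + d) / 6
            for p, a, b, c, d in zip(pi, k1, k2, k3, k4)
        ]
        # Trapezoid rule for the accumulated number of substitutions.
        total += 0.5 * h * (flux(pi) + flux(new_pi))
        pi = new_pi
    return total


# Hypothetical inputs: a Jukes-Cantor-like matrix started at its stationary
# distribution, and a two-state chain started away from equilibrium.
jc = [[-1.0 if i == j else 1.0 / 3.0 for j in range(4)] for i in range(4)]
d_stationary = expected_substitutions([0.25] * 4, jc, 0.5)
d_nonstationary = expected_substitutions([1.0, 0.0], [[-1.0, 1.0], [2.0, -2.0]], 1.0)
```

For the two-state example the stationary formula would give 4/3, whereas starting from the non-equilibrium distribution (1, 0) gives a smaller value, which illustrates how a stationarity assumption can bias the distance when the data depart from it.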
Affiliation(s)
- Benjamin D Kaehler
- John Curtin School of Medical Research, Australian National University, Canberra, ACT, 2600, Australia; and
- Von Bing Yap
- Department of Statistics and Applied Probability, National University of Singapore, Singapore, 117546, Singapore
- Rongli Zhang
- Department of Statistics and Applied Probability, National University of Singapore, Singapore, 117546, Singapore
- Gavin A Huttley
- John Curtin School of Medical Research, Australian National University, Canberra, ACT, 2600, Australia; and
7
Elhaik E, Graur D. A comparative study and a phylogenetic exploration of the compositional architectures of mammalian nuclear genomes. PLoS Comput Biol 2014; 10:e1003925. [PMID: 25375262 PMCID: PMC4222635 DOI: 10.1371/journal.pcbi.1003925] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/12/2014] [Accepted: 09/18/2014] [Indexed: 11/18/2022] Open
Abstract
For the past four decades the compositional organization of the mammalian genome posed a formidable challenge to molecular evolutionists attempting to explain it from an evolutionary perspective. Unfortunately, most of the explanations adhered to the "isochore theory," which has long been rebutted. Recently, an alternative compositional domain model was proposed depicting the human and cow genomes as composed mostly of short compositionally homogeneous and nonhomogeneous domains and a few long ones. We test the validity of this model through a rigorous sequence-based analysis of eleven completely sequenced mammalian and avian genomes. Seven attributes of compositional domains are used in the analyses: (1) the number of compositional domains, (2) compositional domain-length distribution, (3) density of compositional domains, (4) genome coverage by the different domain types, (5) degree of fit to a power-law distribution, (6) compositional domain GC content, and (7) the joint distribution of GC content and length of the different domain types. We discuss the evolution of these attributes in light of two competing phylogenetic hypotheses that differ from each other in the validity of clade Euarchontoglires. If valid, the murid genome compositional organization would be a derived state and exhibit a high similarity to that of other mammals. If invalid, the murid genome compositional organization would be closer to an ancestral state. We demonstrate that the compositional organization of the murid genome differs from those of primates and laurasiatherians, a phenomenon previously termed the "murid shift," and in many ways resembles the genome of opossum. We find no support for the "isochore theory." Instead, our findings depict the mammalian genome as a tapestry of mostly short homogeneous and nonhomogeneous domains and a few long ones, thus providing strong evidence in favor of the compositional domain model, and seem to invalidate clade Euarchontoglires.
Affiliation(s)
- Eran Elhaik
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
- Dan Graur
- Department of Biology & Biochemistry, University of Houston, Houston, Texas, United States of America
8
Genetic architecture of parallel pelvic reduction in ninespine sticklebacks. G3-GENES GENOMES GENETICS 2013; 3:1833-42. [PMID: 23979937 PMCID: PMC3789808 DOI: 10.1534/g3.113.007237] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 11/24/2022]
Abstract
Teleost fish genomes are known to be evolving faster than those of other vertebrate taxa. Thus, fish are suited to address the extent to which the same vs. different genes are responsible for similar phenotypic changes in rapidly evolving genomes of evolutionarily independent lineages. To gain insights into the genetic basis and evolutionary processes behind parallel phenotypic changes within and between species, we identified the genomic regions involved in pelvic reduction in Northern European ninespine sticklebacks (Pungitius pungitius) and compared them to those of North American ninespine and threespine sticklebacks (Gasterosteus aculeatus). To this end, we conducted quantitative trait locus (QTL) mapping using 283 F2 progeny from an interpopulation cross. Phenotypic analyses indicated that pelvic reduction is a recessive trait and is inherited in a simple Mendelian fashion. Significant QTL for pelvic spine and girdle lengths were identified in the region of the Pituitary homeobox transcription factor 1 (Pitx1) gene, also responsible for pelvic reduction in threespine sticklebacks. The fact that no QTL was observed in the region identified in the mapping study of North American ninespine sticklebacks suggests that an alternative QTL for pelvic reduction has emerged in this species within the past 1.6 million years after the split between Northern European and North American populations. In general, our study provides empirical support for the view that alternative genetic mechanisms that lead to similar phenotypes can evolve over short evolutionary time scales.
9
Vinogradov AE. Density peaks of paralog pairs in human and mouse genomes. Gene 2013; 527:55-61. [PMID: 23751307 DOI: 10.1016/j.gene.2013.05.039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/29/2013] [Revised: 05/10/2013] [Accepted: 05/12/2013] [Indexed: 11/30/2022]
Abstract
Paralog gene trees, which reflect the increase of genomic complexity during evolution, can be complicated and ambiguous. A simpler complementary approach is analysis of the density distribution of paralog pairs. It can reveal general features of genome evolution, which may be hidden in the forest of gene trees. It is known that the distribution of human paralog pairs along the axis of protein divergence between pair members forms two main peaks. Here I show that there are three main peaks in the mouse genome. Thus, the multimodality of paralog pair distribution seems to be a fundamental feature of mammalian genomes. Despite the great diversity of domains presented in small amounts or in multidomain architectures with a few predominant domains, both in human and mouse the first peak consists mostly of gene pairs with zinc finger domains or olfactory receptor domains. In the mouse, olfactory receptors predominate, which accounts for the three-peak distribution (since in the olfactory receptors the second peak is closer to the first peak than in other genes). The mammalian-wide zinc finger orthologs are biased towards the second peak. Thus, the marsupial orthologs are nearly absent in the first peak of human and mouse. The gene pairs in the first peak show a lower ratio of nonsynonymous to synonymous substitutions, which suggests that their evolution is more constrained. The plausible explanation is that they are in a subfunctionalization state (partition of the initial function of the ancestral gene), whereas the second peak contains gene pairs that are already in a neofunctionalization state (acquisition of novel functions). These data suggest that the adaptive radiation of mammals was accompanied by a burst of duplication of zinc finger genes, which are located in the first (most recent) peak of paralog pairs.
10
Luo H, Arndt W, Zhang Y, Shi G, Alekseyev M, Tang J, Hughes AL, Friedman R. Phylogenetic analysis of genome rearrangements among five mammalian orders. Mol Phylogenet Evol 2012; 65:871-82. [PMID: 22929217 PMCID: PMC4425404 DOI: 10.1016/j.ympev.2012.08.008] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 03/30/2012] [Revised: 08/11/2012] [Accepted: 08/13/2012] [Indexed: 01/16/2023]
Abstract
Evolutionary relationships among placental mammalian orders have been controversial. Whole genome sequencing and new computational methods offer opportunities to resolve the relationships among 10 genomes belonging to the mammalian orders Primates, Rodentia, Carnivora, Perissodactyla and Artiodactyla. By application of the double cut and join distance metric, where gene order is the phylogenetic character, we computed genomic distances among the sampled mammalian genomes. With a marsupial outgroup, the gene order tree supported a topology in which Rodentia fell outside the cluster of Primates, Carnivora, Perissodactyla, and Artiodactyla. Results of breakpoint reuse rate and synteny block length analyses were consistent with the prediction of random breakage model, which provided a diagnostic test to support use of gene order as an appropriate phylogenetic character in this study. We discussed the influence of rate differences among lineages and other factors that may contribute to different resolutions of mammalian ordinal relationships by different methods of phylogenetic reconstruction.
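For readers unfamiliar with the double cut and join (DCJ) metric, the distance between two genomes with the same N genes is commonly written as N - (C + I/2), where C is the number of cycles and I the number of odd-length paths in the adjacency graph. The sketch below is a simplified illustration that assumes circular chromosomes only (so I = 0) and duplicate-free gene content; the mammalian genomes in the study have linear chromosomes, so this is not the authors' pipeline. The function names are hypothetical.

```python
def partners(genome):
    """Map each gene extremity to its neighbour, treating every chromosome as circular."""
    partner = {}
    for chromosome in genome:
        for i, gene in enumerate(chromosome):
            nxt = chromosome[(i + 1) % len(chromosome)]
            # Right end of the current gene touches the left end of the next gene.
            right_end = (abs(gene), "h" if gene > 0 else "t")
            left_end = (abs(nxt), "t" if nxt > 0 else "h")
            partner[right_end] = left_end
            partner[left_end] = right_end
    return partner


def dcj_distance(genome_a, genome_b):
    """DCJ distance N - C for circular genomes with identical, duplicate-free gene content."""
    pa, pb = partners(genome_a), partners(genome_b)
    if set(pa) != set(pb):
        raise ValueError("genomes must share the same gene content")
    n_genes = len(pa) // 2
    seen, cycles = set(), 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Follow adjacencies of A and B alternately until the cycle closes.
        while x not in seen:
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
    return n_genes - cycles
```

As a usage example, `dcj_distance([[1, 2, 3]], [[1, -2, 3]])` counts a single inversion of gene 2, and `dcj_distance([[1, 2, 3]], [[1, 2], [3]])` counts a single fission of one circular chromosome into two.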
Affiliation(s)
- Haiwei Luo
- Department of Biological Sciences, University of South Carolina, Columbia 29208, USA
- William Arndt
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, USA
- Yiwei Zhang
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, USA
- Guanqun Shi
- Department of Computer Science, University of California, Riverside, 92521, USA
- Max Alekseyev
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, USA
- Jijun Tang
- Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, USA
- Austin L. Hughes
- Department of Biological Sciences, University of South Carolina, Columbia 29208, USA
- Robert Friedman
- Department of Biological Sciences, University of South Carolina, Columbia 29208, USA
11
Lin Y, Rajan V, Moret BME. Bootstrapping phylogenies inferred from rearrangement data. Algorithms Mol Biol 2012; 7:21. [PMID: 22931958 PMCID: PMC3487984 DOI: 10.1186/1748-7188-7-21] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 12/22/2011] [Accepted: 07/26/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. RESULTS We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. CONCLUSIONS Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. 
Its support values follow a similar scale and its receiver-operating characteristics are nearly identical, indicating that it provides similar levels of sensitivity and specificity. Thus our assessment method makes it possible to conduct phylogenetic analyses on whole genomes with the same degree of confidence as for analyses on aligned sequences. Extensions to search-based inference methods such as maximum parsimony and maximum likelihood are possible, but remain to be thoroughly tested.
Affiliation(s)
- Yu Lin
- Laboratory for Computational Biology and Bioinformatics, EPFL, EPFL-IC-LCBB INJ230, Station 14, CH-1015 Lausanne, Switzerland
- Vaibhav Rajan
- Laboratory for Computational Biology and Bioinformatics, EPFL, EPFL-IC-LCBB INJ230, Station 14, CH-1015 Lausanne, Switzerland
- Bernard ME Moret
- Laboratory for Computational Biology and Bioinformatics, EPFL, EPFL-IC-LCBB INJ230, Station 14, CH-1015 Lausanne, Switzerland
12
Morris Goodman's hominoid rate slowdown: the importance of being neutral. Mol Phylogenet Evol 2012; 66:569-74. [PMID: 22902941 DOI: 10.1016/j.ympev.2012.07.031] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 03/05/2012] [Revised: 07/09/2012] [Accepted: 07/26/2012] [Indexed: 12/30/2022]
Abstract
Half a century ago, when the field of molecular evolution did not even exist, Morris Goodman analyzed profiles of immunological interactions between species and reached the following two remarkable conclusions: first, protein evolution slowed down in the human lineage compared to other primate lineages; second, this slowdown was more pronounced for proteins whose functions were likely to be neutral. It took several decades of research to fully grasp these ideas and document the pattern of hominoid rate slowdown. Along the way, studies of hominoid rate slowdown led to major progress in understanding the determinants of neutral molecular evolution, which in turn is used to calibrate rates of adaptive evolution. Furthermore, the growing knowledge of the origin of mutations provides a basis for understanding differential evolutionary rates between sex chromosomes and autosomes, which has deep implications for inferring human evolutionary histories and other aspects of molecular evolution. Primate genomics in particular stands to provide critical information in these pursuits, due to the abundance of genomic data, relatively rich documentation of life history traits, and several model systems, including our own species.
|
13
|
Pharo EA, De Leo AA, Renfree MB, Thomson PC, Lefèvre CM, Nicholas KR. The mammary gland-specific marsupial ELP and eutherian CTI share a common ancestral gene. BMC Evol Biol 2012; 12:80. [PMID: 22681678 PMCID: PMC3426482 DOI: 10.1186/1471-2148-12-80] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2011] [Accepted: 06/08/2012] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND The marsupial early lactation protein (ELP) gene is expressed in the mammary gland and the protein is secreted into milk during early lactation (Phase 2A). Mature ELP shares approximately 55.4% similarity with the colostrum-specific bovine colostrum trypsin inhibitor (CTI) protein. Although ELP and CTI both have a single bovine pancreatic trypsin inhibitor (BPTI)-Kunitz domain and are secreted only during the early lactation phases, their evolutionary history is yet to be investigated. RESULTS Tammar ELP was isolated from a genomic library and the fat-tailed dunnart and Southern koala ELP genes cloned from genomic DNA. The tammar ELP gene was expressed only in the mammary gland during late pregnancy (Phase 1) and early lactation (Phase 2A). The opossum and fat-tailed dunnart ELP and cow CTI transcripts were cloned from RNA isolated from the mammary gland and dog CTI from cells in colostrum. The putative mature ELP and CTI peptides shared 44.6%-62.2% similarity. In silico analyses identified the ELP and CTI genes in the other species examined and provided compelling evidence that they evolved from a common ancestral gene. In addition, whilst the eutherian CTI gene was conserved in the Laurasiatherian orders Carnivora and Cetartiodactyla, it had become a pseudogene in others. These data suggest that bovine CTI may be the ancestral gene of the Artiodactyla-specific, rapidly evolving chromosome 13 pancreatic trypsin inhibitor (PTI), spleen trypsin inhibitor (STI) and the five placenta-specific trophoblast Kunitz domain protein (TKDP1-5) genes. CONCLUSIONS Marsupial ELP and eutherian CTI evolved from an ancestral therian mammal gene before the divergence of marsupials and eutherians between 130 and 160 million years ago. The retention of the ELP gene in marsupials suggests that this early lactation-specific milk protein may have an important role in the immunologically naïve young of these species.
Affiliation(s)
- Elizabeth A Pharo
- Department of Zoology, The University of Melbourne, Melbourne, Victoria, 3010, Australia.
|
14
|
Lin Y, Rajan V, Moret BME. Fast and accurate phylogenetic reconstruction from high-resolution whole-genome data and a novel robustness estimator. J Comput Biol 2012; 18:1131-9. [PMID: 21899420 DOI: 10.1089/cmb.2011.0114] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
The rapid accumulation of whole-genome data has renewed interest in the study of genomic rearrangements. Comparative genomics, evolutionary biology, and cancer research all require models and algorithms to elucidate the mechanisms, history, and consequences of these rearrangements. However, even simple models lead to NP-hard problems, particularly in the area of phylogenetic analysis. Current approaches are limited to small collections of genomes and low-resolution data (typically a few hundred syntenic blocks). Moreover, whereas phylogenetic analyses from sequence data are deemed incomplete unless bootstrapping scores (a measure of confidence) are given for each tree edge, no equivalent to bootstrapping exists for rearrangement-based phylogenetic analysis. We describe a fast and accurate algorithm for rearrangement analysis that scales up, in both time and accuracy, to modern high-resolution genomic data. We also describe a novel approach to estimate the robustness of results, an equivalent to the bootstrapping analysis used in sequence-based phylogenetic reconstruction. We present the results of extensive testing on both simulated and real data showing that our algorithm returns very accurate results, while scaling linearly with the size of the genomes and cubically with their number. We also present extensive experimental results showing that our approach to robustness testing provides excellent estimates of confidence, which, moreover, can be tuned to trade off thresholds between false positives and false negatives. Together, these two novel approaches enable us to attack heretofore intractable problems, such as phylogenetic inference for high-resolution vertebrate genomes, as we demonstrate on a set of six vertebrate genomes with 8,380 syntenic blocks. A copy of the software is available on demand.
Affiliation(s)
- Y Lin
- Laboratory for Computational Biology and Bioinformatics, EPFL, Lausanne, Switzerland
|
15
|
Campbell V, Lapointe FJ. Retrieving a mitogenomic mammal tree using composite taxa. Mol Phylogenet Evol 2011; 58:149-56. [DOI: 10.1016/j.ympev.2010.11.017] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2010] [Revised: 10/17/2010] [Accepted: 11/19/2010] [Indexed: 10/18/2022]
|
16
|
Simmons MP, Müller KF, Norton AP. Alignment of, and phylogenetic inference from, random sequences: the susceptibility of alternative alignment methods to creating artifactual resolution and support. Mol Phylogenet Evol 2010; 57:1004-16. [PMID: 20849963 DOI: 10.1016/j.ympev.2010.09.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2009] [Revised: 04/05/2010] [Accepted: 09/06/2010] [Indexed: 10/19/2022]
Abstract
We used random sequences to determine which alignment methods are most susceptible to aligning sequences so as to create artifactual resolution and branch support in phylogenetic trees derived from those alignments. We compared four alignment methods (progressive pairwise alignment, simultaneous multiple alignment of sequence fragments, local pairwise alignment, and direct optimization) to determine which methods are most susceptible to creating false positives in phylogenetic trees. Implied alignments created using direct optimization provided more artifactual support than progressive pairwise alignment methods, which in turn generally provided more artifactual support than simultaneous and local alignment methods. Artifactual support derived from base pairs was generally reinforced by the incorporation of gap characters for progressive pairwise alignment, local pairwise alignment, and implied alignments. The amount of artifactual resolution and support was generally greater for simulated nucleotide sequences than for simulated amino acid sequences. In the context of direct optimization, the differences between static and dynamic approaches to calculating support were extreme, ranging from maximal to nearly minimal support. When applied to highly divergent sequences, it is important that dynamic, rather than static, characters be used whenever calculating branch support using direct optimization. In contrast to the tree-based approaches to alignment, simultaneous alignment of sequences using the similarity criterion generally does not create alignments that are biased in favor of any particular tree topology.
Affiliation(s)
- Mark P Simmons
- Department of Biology, Colorado State University, Fort Collins, CO 80523-1878, USA.
|
17
|
Kostka D, Hahn MW, Pollard KS. Noncoding sequences near duplicated genes evolve rapidly. Genome Biol Evol 2010; 2:518-33. [PMID: 20660939 PMCID: PMC2942038 DOI: 10.1093/gbe/evq037] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/25/2010] [Indexed: 11/17/2022] Open
Abstract
Gene expression divergence and chromosomal rearrangements have been put forward as major contributors to phenotypic differences between closely related species. It has also been established that duplicated genes show enhanced rates of positive selection in their amino acid sequences. If functional divergence is largely due to changes in gene expression, it follows that regulatory sequences in duplicated loci should also evolve rapidly. To investigate this hypothesis, we performed likelihood ratio tests (LRTs) on all noncoding loci within 5 kb of every transcript in the human genome and identified sequences with increased substitution rates in the human lineage since divergence from Old World Monkeys. The fraction of rapidly evolving loci is significantly higher near genes that duplicated in the common ancestor of humans and chimps compared with nonduplicated genes. We also conducted a genome-wide scan for nucleotide substitutions predicted to affect transcription factor binding. Rates of binding site divergence are elevated in noncoding sequences of duplicated loci with accelerated substitution rates. Many of the genes associated with these fast-evolving genomic elements belong to functional categories identified in previous studies of positive selection on amino acid sequences. In addition, we find enrichment for accelerated evolution near genes involved in establishment and maintenance of pregnancy, processes that differ significantly between humans and monkeys. Our findings support the hypothesis that adaptive evolution of the regulation of duplicated genes has played a significant role in human evolution.
Affiliation(s)
- Dennis Kostka
- Gladstone Institute for Cardiovascular Disease, Gladstone Institutes, University of California-San Francisco, San Francisco, CA, USA.
|
18
|
Kuhn A, Dehnert M, Helm WE, Hütt MT. Statistical evidence for ancestral correlation patterns. Biosystems 2010; 100:215-24. [DOI: 10.1016/j.biosystems.2010.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2009] [Revised: 12/15/2009] [Accepted: 03/16/2010] [Indexed: 10/19/2022]
|
19
|
Campbell V, Lapointe FJ. An application of supertree methods to Mammalian mitogenomic sequences. Evol Bioinform Online 2010; 6:57-71. [PMID: 20535231 PMCID: PMC2880846 DOI: 10.4137/ebo.s4527] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022] Open
Abstract
Two different approaches can be used in phylogenomics: combined or separate analysis. In the first approach, different datasets are combined in a concatenated supermatrix. In the second, datasets are analyzed separately and the phylogenetic trees are then combined in a supertree. The supertree method is an interesting alternative to avoid missing data, since datasets that are analyzed separately do not need to represent identical taxa. However, the supertree approach and the corresponding consensus methods have been highly criticized for not providing valid phylogenetic hypotheses. In this study, the congruence of trees estimated by consensus and supertree approaches was compared to model trees obtained from a combined analysis of complete mitochondrial sequences of 102 species representing 93 mammal families. The consensus methods produced poorly resolved consensus trees and did not perform well, except for the majority rule consensus with compatible groupings. The weighted supertree and matrix representation with parsimony methods performed equally well and were highly congruent with the model trees. The most similar supertree method was the least congruent with the model trees. We conclude that some of the methods tested are worth considering in a phylogenomic context.
Affiliation(s)
- Véronique Campbell
- Université de Montréal, Département de Sciences Biologiques, C.P. 6128, Succ. Centre-ville, Montréal, Québec, H3C 3J7, Canada
- François-Joseph Lapointe
- Université de Montréal, Département de Sciences Biologiques, C.P. 6128, Succ. Centre-ville, Montréal, Québec, H3C 3J7, Canada
|
20
|
Asher RJ, Bennett N, Lehmann T. The new framework for understanding placental mammal evolution. Bioessays 2010; 31:853-64. [PMID: 19582725 DOI: 10.1002/bies.200900053] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
An unprecedented level of confidence has recently crystallized around a new hypothesis of how living placental mammals share a pattern of common descent. The major groups are afrotheres (e.g., aardvarks, elephants), xenarthrans (e.g., anteaters, sloths), laurasiatheres (e.g., horses, shrews), and euarchontoglires (e.g., humans, rodents). Compared with previous hypotheses this tree is remarkably stable; however, some uncertainty persists about the location of the placental root, and (for example) the position of bats within laurasiatheres, of sea cows and aardvarks within afrotheres, and of dermopterans within euarchontoglires. A variety of names for sub-clades within the new placental mammal tree have been proposed, not all of which follow conventions regarding priority and stability. More importantly, the new phylogenetic framework enables the formulation of new hypotheses and testing thereof, for example regarding the possible developmental dichotomy that seems to distinguish members of the newly identified southern and northern radiations of living placental mammals.
Affiliation(s)
- Robert J Asher
- Department of Zoology, University of Cambridge, Downing St., Cambridge CB23EJ, UK.
|
21
|
Fast and Accurate Phylogenetic Reconstruction from High-Resolution Whole-Genome Data and a Novel Robustness Estimator. 2010. [DOI: 10.1007/978-3-642-16181-0_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/19/2023]
|
22
|
Abstract
Genomewide analyses of distances between orthologous gene pairs from the ascidian species Ciona intestinalis and Ciona savignyi were compared with those of vertebrates. Combining these data with a detailed and careful use of vertebrate fossil records, we estimated the time of divergence between the two ascidians at nearly 180 My. This estimate was obtained after correcting for the different substitution rates found when comparing several groups of chordates; indeed, we determine here that on average Ciona species evolve 50% faster than vertebrates.
|
23
|
Suzuki Y, Gojobori T, Kumar S. Methods for incorporating the hypermutability of CpG dinucleotides in detecting natural selection operating at the amino acid sequence level. Mol Biol Evol 2009; 26:2275-84. [PMID: 19581348 DOI: 10.1093/molbev/msp133] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023] Open
Abstract
In detecting natural selection operating at the amino acid sequence level by comparing the rates of synonymous (r(S)) and nonsynonymous (r(N)) substitutions, the rates of synonymous and nonsynonymous mutations are assumed to be approximately the same. In reality, however, these rates may not be the same if different proportions of synonymous and nonsynonymous sites overlap with CpG dinucleotides, which are known to be hypermutable in some organisms. Here, we develop the evolutionary pathway methods for comparing r(S) and r(N) at multiple codon sites (all-sites analysis) and at single codon sites (single-site analysis) that take into account the hypermutability at CpG dinucleotides in estimating the number of synonymous substitutions per synonymous site (d(S)) and nonsynonymous substitutions per nonsynonymous site (d(N)). Computer simulations show that the direction and magnitude of the bias in the estimation of d(N)/d(S) caused by the hypermutability of CpGs are determined by both the number of CpGs and the relative proportions of synonymous and nonsynonymous sites overlapping with CpGs. This bias is greatly reduced when using the methods we propose to account for the hypermutability of CpG dinucleotides. In an all-sites analysis of protamine 1 genes from primates, d(N)/d(S) > 1 was observed for many pairs if the hypermutability was ignored. However, d(N)/d(S) becomes ≤ 1 for most of these pairs when the CpG sites are assumed to be hypermutable. Therefore, statistical indications of positive selection in some sequences or individual codons may be caused by mutation rate differences in synonymous and nonsynonymous sites.
Affiliation(s)
- Yoshiyuki Suzuki
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Mishima, Shizuoka, Japan.
|
24
|
White CR, Blackburn TM, Seymour RS. Phylogenetically informed analysis of the allometry of Mammalian Basal metabolic rate supports neither geometric nor quarter-power scaling. Evolution 2009; 63:2658-67. [PMID: 19519636 DOI: 10.1111/j.1558-5646.2009.00747.x] [Citation(s) in RCA: 126] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
The form of the relationship between the basal metabolic rate (BMR) and body mass (M) of mammals has been at issue for almost seven decades, with debate focusing on the value of the scaling exponent (b, where BMR is proportional to M^b) and the relative merits of b = 0.67 (geometric scaling) and b = 0.75 (quarter-power scaling). However, most analyses are not phylogenetically informed (PI) and therefore fail to account for the shared evolutionary history of the species they consider. Here, we reanalyze the most rigorously selected and comprehensive mammalian BMR dataset presently available, and investigate the effects of data selection and phylogenetic method (phylogenetic generalized least squares and independent contrasts) on estimation of the scaling exponent relating mammalian BMR to M. Contrary to the results of a non-PI analysis of these data, which found an exponent of 0.67-0.69, we find that most of the PI scaling exponents are significantly different from both 0.67 and 0.75. Similarly, the scaling exponents differ between lineages, and these exponents are also often different from 0.67 or 0.75. Thus, we conclude that no single value of b adequately characterizes the allometric relationship between body mass and BMR.
Affiliation(s)
- Craig R White
- School of Biosciences, The University of Birmingham, Edgbaston, Birmingham, United Kingdom.
|
25
|
Alekseyev MA, Pevzner PA. Breakpoint graphs and ancestral genome reconstructions. Genome Res 2009; 19:943-57. [PMID: 19218533 PMCID: PMC2675983 DOI: 10.1101/gr.082784.108] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2008] [Accepted: 01/22/2009] [Indexed: 11/24/2022]
Abstract
Recently completed whole-genome sequencing projects marked the transition from gene-based phylogenetic studies to phylogenomics analysis of entire genomes. We developed an algorithm MGRA for reconstructing ancestral genomes and used it to study the rearrangement history of seven mammalian genomes: human, chimpanzee, macaque, mouse, rat, dog, and opossum. MGRA relies on the notion of the multiple breakpoint graphs to overcome some limitations of the existing approaches to ancestral genome reconstructions. MGRA also generates the rearrangement-based characters guiding the phylogenetic tree reconstruction when the phylogeny is unknown.
Affiliation(s)
- Max A. Alekseyev
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla, California 92093-0404, USA
- Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla, California 92093-0404, USA
|
26
|
Schneider A, Cannarozzi GM. Support patterns from different outgroups provide a strong phylogenetic signal. Mol Biol Evol 2009; 26:1259-72. [PMID: 19240194 DOI: 10.1093/molbev/msp034] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open
Abstract
It is known that the accuracy of phylogenetic reconstruction decreases when more distant outgroups are used. We quantify this phenomenon with a novel scoring method, the outgroup score pOG. This score expresses whether the support for a particular branch of a tree decreases with increasingly distant outgroups. Large-scale simulations confirmed that the outgroup support follows this expectation and that the pOG score captures this pattern. The score often identifies the correct topology even when the primary reconstruction methods fail, particularly in the presence of model violations. In simulations of problematic phylogenetic scenarios such as rate variation among lineages (which can lead to long-branch attraction artifacts) and quartet-based reconstruction, the pOG analysis outperformed the primary reconstruction methods. Because the pOG method does not make any assumptions about the evolutionary model (besides the decreasing support from increasingly distant outgroups), it can detect cases of violations not treated by a specific model or too strong to be fully corrected. When used as an optimization criterion in the construction of a tree of 23 mammals, the outgroup signal confirmed many well-accepted mammalian orders and superorders. It supports Atlantogenata, a clade of Afrotheria and Xenarthra, and suggests an Artiodactyla-Chiroptera clade.
Affiliation(s)
- Adrian Schneider
- ETH Zurich, Department of Computer Science, Zurich, Switzerland.
|
27
|
Huttley G. Do genomic datasets resolve the correct relationship among the placental, marsupial and monotreme lineages? Aust J Zool 2009. [DOI: 10.1071/zo09049] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
Abstract
Did the mammal radiation arise through initial divergence of prototherians from a common ancestor of metatherians and eutherians, the Theria hypothesis, or of eutherians from a common ancestor of metatherians and prototherians, the Marsupionta hypothesis? Molecular phylogenetic analyses of point substitutions applied to this problem have been contradictory – mtDNA-encoded sequences supported Marsupionta, nuclear-encoded sequences and RY (purine–pyrimidine)-recoded mtDNA supported Theria. The consistency property of maximum likelihood guarantees convergence on the true tree only with longer alignments. Results from analyses of genome datasets should therefore be impervious to choice of outgroup. We assessed whether important hypotheses concerning mammal evolution, including Theria/Marsupionta and the branching order of rodents, carnivorans and primates, are resolved by phylogenetic analyses using ~2.3 megabases of protein-coding sequence from genome projects. In each case, only two tree topologies were being compared and thus inconsistency in resolved topologies can only derive from flawed models of sequence divergence. The results from all substitution models strongly supported Theria. For the eutherian lineages, all models were sensitive to the outgroup. We argue that phylogenetic inference from point substitutions will remain unreliable until substitution models that better match biological mechanisms of sequence divergence have been developed.
|
28
|
Kofler R, Schlötterer C, Luschützky E, Lelley T. Survey of microsatellite clustering in eight fully sequenced species sheds light on the origin of compound microsatellites. BMC Genomics 2008; 9:612. [PMID: 19091106 PMCID: PMC2644718 DOI: 10.1186/1471-2164-9-612] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2008] [Accepted: 12/17/2008] [Indexed: 01/24/2023] Open
Abstract
Background Compound microsatellites are a special variation of microsatellites in which two or more individual microsatellites are found directly adjacent to each other. Until now, such composite microsatellites have not been investigated in a comprehensive manner. Results Our in silico survey of microsatellite clustering in genomes of Homo sapiens, Macaca mulatta, Mus musculus, Rattus norvegicus, Ornithorhynchus anatinus, Gallus gallus, Danio rerio and Drosophila melanogaster revealed an unexpectedly high abundance of compound microsatellites. About 4 – 25% of all microsatellites could be categorized as compound microsatellites. Compound microsatellites are approximately 15 times more frequent than expected under the assumption of a random distribution of microsatellites. Interestingly, microsatellites not only tend to cluster, but the adjacent repeat types of compound microsatellites have very similar motifs: in most cases (>90%) these motifs differ only by a single mutation (base substitution or indel). We propose that the majority of the compound microsatellites originate by duplication of imperfections in a microsatellite tract. This process occurs mostly at the end of a microsatellite, leading to a new repeat type and a potential microsatellite repeat track. Conclusion Our findings suggest a more dynamic picture of microsatellite evolution than previously believed. Imperfections within microsatellites might not only cause the "death" of microsatellites; they might also result in their "birth".
Affiliation(s)
- Robert Kofler
- University of Natural Resources and Applied Life Sciences, Department for Agrobiotechnology IFA-Tulln, Institute of Biotechnology in Plant Production, Tulln, Austria.
|
29
|
Asher RJ, Geisler JH, Sánchez-Villagra MR. Morphology, paleontology, and placental mammal phylogeny. Syst Biol 2008; 57:311-7. [PMID: 18432551 DOI: 10.1080/10635150802033022] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022] Open
Affiliation(s)
- Robert J Asher
- Department of Zoology, University of Cambridge, Downing Street, UK
|
30
|
Okoruwa OE, Weston MD, Sanjeevi DC, Millemon AR, Fritzsch B, Hallworth R, Beisel KW. Evolutionary insights into the unique electromotility motor of mammalian outer hair cells. Evol Dev 2008; 10:300-15. [PMID: 18460092 DOI: 10.1111/j.1525-142x.2008.00239.x] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
Prestin (SLC26A5) is the molecular motor responsible for cochlear amplification by mammalian cochlea outer hair cells and has the unique combined properties of energy-independent motility, voltage sensitivity, and speed of cellular shape change. The ion transporter capability, typical of SLC26A members, was exchanged for electromotility function and is a newly derived feature of the therian cochlea. A putative minimal essential motif for the electromotility motor (meEM) was identified through the amalgamation of comparative genomic, evolution, and structural diversification approaches. Comparisons were done among nonmammalian vertebrates, eutherian mammalian species, and the opossum and platypus. The opossum and platypus SLC26A5 proteins were comparable to the eutherian consensus sequence. As suggested by the point-accepted mutation analysis, the meEM motif spans all the transmembrane segments and comprises residues 66-503. Within the eutherian clade, the meEM was highly conserved with a substitution frequency of only 39/7497 (0.5%) residues, compared with 5.7% in SLC26A4 and 12.8% in SLC26A6 genes. Clade-specific substitutions were not observed and there was no sequence correlation with low or high hearing frequency specialists. We were able to identify that within the highly conserved meEM motif two regions, which are unique to all therian species, appear to be the most derived features in the SLC26A5 peptide.
Affiliation(s)
- Oseremen E Okoruwa
- Department of Biomedical Sciences, Creighton University School of Medicine, Omaha, NE 68178, USA
|
31
|
Rytkönen KT, Ryynänen HJ, Nikinmaa M, Primmer CR. Variable patterns in the molecular evolution of the hypoxia-inducible factor-1 alpha (HIF-1α) gene in teleost fishes and mammals. Gene 2008; 420:1-10. [DOI: 10.1016/j.gene.2008.04.018] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2008] [Accepted: 04/28/2008] [Indexed: 10/22/2022]
|
32
|
Springer MS, Meredith RW, Eizirik E, Teeling E, Murphy WJ. Morphology and Placental Mammal Phylogeny. Syst Biol 2008; 57:499-503. [DOI: 10.1080/10635150802164504] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022] Open
Affiliation(s)
- Mark S. Springer
- Department of Biology, University of California, Riverside, CA 92521, USA
- Robert W. Meredith
- Department of Biology, University of California, Riverside, CA 92521, USA
- Eduardo Eizirik
- Faculdade de Biociencias, PUCRS, Porto Alegre, RS 90619-900, Brazil
- Emma Teeling
- School of Biological and Environmental Sciences, University College Dublin, Belfield, Dublin 4, Ireland
- William J. Murphy
- Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843-4458, USA
|
33
|
Warren WC, Hillier LW, Marshall Graves JA, Birney E, Ponting CP, Grützner F, Belov K, Miller W, Clarke L, Chinwalla AT, Yang SP, Heger A, Locke DP, Miethke P, Waters PD, Veyrunes F, Fulton L, Fulton B, Graves T, Wallis J, Puente XS, López-Otín C, Ordóñez GR, Eichler EE, Chen L, Cheng Z, Deakin JE, Alsop A, Thompson K, Kirby P, Papenfuss AT, Wakefield MJ, Olender T, Lancet D, Huttley GA, Smit AFA, Pask A, Temple-Smith P, Batzer MA, Walker JA, Konkel MK, Harris RS, Whittington CM, Wong ESW, Gemmell NJ, Buschiazzo E, Vargas Jentzsch IM, Merkel A, Schmitz J, Zemann A, Churakov G, Kriegs JO, Brosius J, Murchison EP, Sachidanandam R, Smith C, Hannon GJ, Tsend-Ayush E, McMillan D, Attenborough R, Rens W, Ferguson-Smith M, Lefèvre CM, Sharp JA, Nicholas KR, Ray DA, Kube M, Reinhardt R, Pringle TH, Taylor J, Jones RC, Nixon B, Dacheux JL, Niwa H, Sekita Y, Huang X, Stark A, Kheradpour P, Kellis M, Flicek P, Chen Y, Webber C, Hardison R, Nelson J, Hallsworth-Pepin K, Delehaunty K, Markovic C, Minx P, Feng Y, Kremitzki C, Mitreva M, Glasscock J, Wylie T, Wohldmann P, Thiru P, Nhan MN, Pohl CS, Smith SM, Hou S, Nefedov M, de Jong PJ, Renfree MB, Mardis ER, Wilson RK. Genome analysis of the platypus reveals unique signatures of evolution. Nature 2008; 453:175-83. [PMID: 18464734 PMCID: PMC2803040 DOI: 10.1038/nature06936] [Citation(s) in RCA: 480] [Impact Index Per Article: 28.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2007] [Accepted: 03/25/2008] [Indexed: 12/18/2022]
Abstract
We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.
Affiliation(s)
- Wesley C Warren
- Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St Louis, Missouri 63108, USA.
|
34
|
Ranwez V, Delsuc F, Ranwez S, Belkhir K, Tilak MK, Douzery EJ. OrthoMaM: a database of orthologous genomic markers for placental mammal phylogenetics. BMC Evol Biol 2007; 7:241. [PMID: 18053139 PMCID: PMC2249597 DOI: 10.1186/1471-2148-7-241] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.6] [Received: 07/13/2007] [Accepted: 11/30/2007] [Indexed: 11/23/2022] Open
Abstract
Background: Molecular sequence data have become the standard in modern day phylogenetics. In particular, several long-standing questions of mammalian evolutionary history have been recently resolved thanks to the use of molecular characters. Yet, most studies have focused on only a handful of standard markers. The availability of an ever increasing number of whole genome sequences is a golden mine for modern systematics. Genomic data now provide the opportunity to select new markers that are potentially relevant for further resolving branches of the mammalian phylogenetic tree at various taxonomic levels. Description: The EnsEMBL database was used to determine a set of orthologous genes from 12 available complete mammalian genomes. As targets for possible amplification and sequencing in additional taxa, more than 3,000 exons of length > 400 bp have been selected, among which 118, 368, 608, and 674 are respectively retrieved for 12, 11, 10, and 9 species. A bioinformatic pipeline has been developed to provide evolutionary descriptors for these candidate markers in order to assess their potential phylogenetic utility. The resulting OrthoMaM (Orthologous Mammalian Markers) database can be queried and alignments can be downloaded through a dedicated web interface. Conclusion: The importance of marker choice in phylogenetic studies has long been stressed. Our database centered on complete genome information now makes possible to select promising markers to a given phylogenetic question or a systematic framework by querying a number of evolutionary descriptors. The usefulness of the database is illustrated with two biological examples. First, two potentially useful markers were identified for rodent systematics based on relevant evolutionary parameters and sequenced in additional species. Second, a complete, gapless 94 kb supermatrix of 118 orthologous exons was assembled for 12 mammals. Phylogenetic analyses using probabilistic methods unambiguously supported the new placental phylogeny by retrieving the monophyly of Glires, Euarchontoglires, Laurasiatheria, and Boreoeutheria. Muroid rodents thus do not represent a basal placental lineage as it was mistakenly reasserted in some recent phylogenomic analyses based on fewer taxa. We expect the OrthoMaM database to be useful for further resolving the phylogenetic tree of placental mammals and for better understanding the evolutionary dynamics of their genomes, i.e., the forces that shaped coding sequences in terms of selective constraints.
Affiliation(s)
- Vincent Ranwez
- Université Montpellier 2, CC064, Place Eugène Bataillon, 34 095 Montpellier Cedex 05, France.
|
35
|
Matsuya A, Sakate R, Kawahara Y, Koyanagi KO, Sato Y, Fujii Y, Yamasaki C, Habara T, Nakaoka H, Todokoro F, Yamaguchi K, Endo T, Oota S, Makalowski W, Ikeo K, Suzuki Y, Hanada K, Hashimoto K, Hirai M, Iwama H, Saitou N, Hiraki AT, Jin L, Kaneko Y, Kanno M, Murakami K, Noda AO, Saichi N, Sanbonmatsu R, Suzuki M, Takeda JI, Tanaka M, Gojobori T, Imanishi T, Itoh T. Evola: Ortholog database of all human genes in H-InvDB with manual curation of phylogenetic trees. Nucleic Acids Res 2007; 36:D787-92. [PMID: 17982176 PMCID: PMC2238928 DOI: 10.1093/nar/gkm878] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Indexed: 11/15/2022] Open
Abstract
Orthologs are genes in different species that evolved from a common ancestral gene by speciation. Currently, with the rapid growth of transcriptome data of various species, more reliable orthology information is prerequisite for further studies. However, detection of orthologs could be erroneous if pairwise distance-based methods, such as reciprocal BLAST searches, are utilized. Thus, as a sub-database of H-InvDB, an integrated database of annotated human genes (http://h-invitational.jp/), we constructed a fully curated database of evolutionary features of human genes, called ‘Evola’. In the process of the ortholog detection, computational analysis based on conserved genome synteny and transcript sequence similarity was followed by manual curation by researchers examining phylogenetic trees. In total, 18 968 human genes have orthologs among 11 vertebrates (chimpanzee, mouse, cow, chicken, zebrafish, etc.), either computationally detected or manually curated orthologs. Evola provides amino acid sequence alignments and phylogenetic trees of orthologs and homologs. In ‘dN/dS view’, natural selection on genes can be analyzed between human and other species. In ‘Locus maps’, all transcript variants and their exon/intron structures can be compared among orthologous gene loci. We expect the Evola to serve as a comprehensive and reliable database to be utilized in comparative analyses for obtaining new knowledge about human genes. Evola is available at http://www.h-invitational.jp/evola/.
Affiliation(s)
- Akihiro Matsuya
- Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
|
36
|
|
37
|
Shoval Y, Pietrokovski S, Kimchi A. ZIPK: a unique case of murine-specific divergence of a conserved vertebrate gene. PLoS Genet 2007; 3:1884-93. [PMID: 17953487 PMCID: PMC2041995 DOI: 10.1371/journal.pgen.0030180] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 05/10/2007] [Accepted: 09/06/2007] [Indexed: 12/27/2022] Open
Abstract
Zipper interacting protein kinase (ZIPK, also known as death-associated protein kinase 3 [DAPK3]) is a Ser/Thr kinase that functions in programmed cell death. Since its identification eight years ago, contradictory findings regarding its intracellular localization and molecular mode of action have been reported, which may be attributed to unpredicted differences among the human and rodent orthologs. By aligning the sequences of all available ZIPK orthologs, from fish to human, we discovered that rat and mouse sequences are more diverged from the human ortholog relative to other, more distant, vertebrates. To test experimentally the outcome of this sequence divergence, we compared rat ZIPK to human ZIPK in the same cellular settings. We found that while ectopically expressed human ZIPK localized to the cytoplasm and induced membrane blebbing, rat ZIPK localized exclusively within nuclei, mainly to promyelocytic leukemia oncogenic bodies, and induced significantly lower levels of membrane blebbing. Among the unique murine (rat and mouse) sequence features, we found that a highly conserved phosphorylation site, previously shown to have an effect on the cellular localization of human ZIPK, is absent in murines but not in earlier diverging organisms. Recreating this phosphorylation site in rat ZIPK led to a significant reduction in its promyelocytic leukemia oncogenic body localization, yet did not confer full cytoplasmic localization. Additionally, we found that while rat ZIPK interacts with PAR-4 (also known as PAWR) very efficiently, human ZIPK fails to do so. This interaction has clear functional implications, as coexpression of PAR-4 with rat ZIPK caused nuclear to cytoplasm translocation and induced strong membrane blebbing, thus providing the murine protein a possible adaptive mechanism to compensate for its sequence divergence. 
We have also cloned zebrafish ZIPK and found that, like the human and unlike the murine orthologs, it localizes to the cytoplasm, and fails to bind the highly conserved PAR-4 protein. This further supports the hypothesis that murine ZIPK underwent specific divergence from a conserved consensus. In conclusion, we present a case of species-specific divergence occurring in a specific branch of the evolutionary tree, accompanied by the acquisition of a unique protein-protein interaction that enables conservation of cellular function.
Affiliation(s)
- Yishay Shoval
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Shmuel Pietrokovski
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Adi Kimchi
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- * To whom correspondence should be addressed. E-mail:
|
38
|
Wildman DE, Uddin M, Opazo JC, Liu G, Lefort V, Guindon S, Gascuel O, Grossman LI, Romero R, Goodman M. Genomics, biogeography, and the diversification of placental mammals. Proc Natl Acad Sci U S A 2007; 104:14395-400. [PMID: 17728403 PMCID: PMC1958817 DOI: 10.1073/pnas.0704342104] [Citation(s) in RCA: 143] [Impact Index Per Article: 7.9] [Received: 05/04/2007] [Indexed: 11/18/2022] Open
Abstract
Previous molecular analyses of mammalian evolutionary relationships involving a wide range of placental mammalian taxa have been restricted in size from one to two dozen gene loci and have not decisively resolved the basal branching order within Placentalia. Here, on extracting from thousands of gene loci both their coding nucleotide sequences and translated amino acid sequences, we attempt to resolve key uncertainties about the ancient branching pattern of crown placental mammals. Focusing on approximately 1,700 conserved gene loci, those that have the more slowly evolving coding sequences, and using maximum-likelihood, Bayesian inference, maximum parsimony, and neighbor-joining (NJ) phylogenetic tree reconstruction methods, we find from almost all results that a clade (the southern Atlantogenata) composed of Afrotheria and Xenarthra is the sister group of all other (the northern Boreoeutheria) crown placental mammals, among boreoeutherians Rodentia groups with Lagomorpha, and the resultant Glires is close to Primates. Only the NJ tree for nucleotide sequences separates Rodentia (murids) first and then Lagomorpha (rabbit) from the other placental mammals. However, this nucleotide NJ tree still depicts Atlantogenata and Boreoeutheria but minus Rodentia and Lagomorpha. Moreover, the NJ tree for amino acid sequences does depict the basal separation to be between Atlantogenata and a Boreoeutheria that includes Rodentia and Lagomorpha. Crown placental mammalian diversification appears to be largely the result of ancient plate tectonic events that allowed time for convergent phenotypes to evolve in the descendant clades.
Affiliation(s)
- Derek E. Wildman
- Perinatology Research Branch, National Institute of Child Health and Human Development/National Institutes of Health, Department of Health and Human Services, Bethesda, MD 20892
- Center For Molecular Medicine and Genetics, and
- Departments of Obstetrics and Gynecology and
- Juan C. Opazo
- Center For Molecular Medicine and Genetics, and
- School of Biological Sciences, University of Nebraska, Lincoln, NE 68588; and
- Guozhen Liu
- Center For Molecular Medicine and Genetics, and
- Vincent Lefort
- Laboratory of Computer Science, Robotics, and Microelectronics, Centre National de la Recherche Scientifique, Université Montpellier II, 161 Rue Ada, 34392 Montpellier, France
- Stephane Guindon
- Laboratory of Computer Science, Robotics, and Microelectronics, Centre National de la Recherche Scientifique, Université Montpellier II, 161 Rue Ada, 34392 Montpellier, France
- Olivier Gascuel
- Laboratory of Computer Science, Robotics, and Microelectronics, Centre National de la Recherche Scientifique, Université Montpellier II, 161 Rue Ada, 34392 Montpellier, France
- Roberto Romero
- Perinatology Research Branch, National Institute of Child Health and Human Development/National Institutes of Health, Department of Health and Human Services, Bethesda, MD 20892
- Morris Goodman
- Center For Molecular Medicine and Genetics, and
- Anatomy and Cell Biology, Wayne State University, Detroit, MI 48201
|
39
|
Kullberg M, Hallström B, Arnason U, Janke A. Expressed sequence tags as a tool for phylogenetic analysis of placental mammal evolution. PLoS One 2007; 2:e775. [PMID: 17712423 PMCID: PMC1942079 DOI: 10.1371/journal.pone.0000775] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 01/23/2007] [Accepted: 07/24/2007] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND: We investigate the usefulness of expressed sequence tags, ESTs, for establishing divergences within the tree of placental mammals. This is done on the example of the established relationships among primates (human), lagomorphs (rabbit), rodents (rat and mouse), artiodactyls (cow), carnivorans (dog) and proboscideans (elephant). METHODOLOGY/PRINCIPAL FINDINGS: We have produced 2000 ESTs (1.2 mega bases) from a marsupial mouse and characterized the data for their use in phylogenetic analysis. The sequences were used to identify putative orthologous sequences from whole genome projects. Although most ESTs stem from single sequence reads, the frequency of potential sequencing errors was found to be lower than allelic variation. Most of the sequences represented slowly evolving housekeeping-type genes, with an average amino acid distance of 6.6% between human and mouse. Positive Darwinian selection was identified at only a few single sites. Phylogenetic analyses of the EST data yielded trees that were consistent with those established from whole genome projects. CONCLUSIONS: The general quality of EST sequences and the general absence of positive selection in these sequences make ESTs an attractive tool for phylogenetic analysis. The EST approach allows, at reasonable costs, a fast extension of data sampling from species outside the genome projects.
Affiliation(s)
- Morgan Kullberg
- Department of Cell and Organism Biology, Division of Evolutionary Molecular Systematics, University of Lund, Lund, Sweden.
|