1
Zou Y, Zhang Z, Zeng Y, Hu H, Hao Y, Huang S, Li B. Common Methods for Phylogenetic Tree Construction and Their Implementation in R. Bioengineering (Basel) 2024; 11:480. [PMID: 38790347 PMCID: PMC11117635 DOI: 10.3390/bioengineering11050480] [Received: 04/04/2024] [Revised: 05/04/2024] [Accepted: 05/07/2024] [Indexed: 05/26/2024] [Open access]
Abstract
Phylogenetic trees reflect the evolutionary relationships between species or gene families and play a critical role in modern biological research. In this review, we summarize common methods for constructing phylogenetic trees, including distance methods, maximum parsimony, maximum likelihood, Bayesian inference, and tree-integration methods (supermatrix and supertree). We discuss the advantages, shortcomings, and applications of each method and provide code for constructing phylogenetic trees from molecular data using packages and algorithms in R. This review aims to provide comprehensive guidance and reference for researchers seeking to construct phylogenetic trees while also promoting further development and innovation in this field. By offering a clear and concise overview of the available methods, we hope to enable researchers to select the most appropriate approach for their specific research questions and datasets.
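To make the "distance methods" mentioned in this abstract concrete, here is a minimal Python sketch of one such method: pairwise p-distances followed by UPGMA clustering. It is not taken from the review, which provides its own code in R; the function names, the toy sequences, and the choice of UPGMA rather than, say, neighbor joining are all assumptions made for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(names, seqs):
    """Cluster aligned sequences with UPGMA; return a topology-only Newick string."""
    # Each cluster is identified by its Newick label and tracks its leaf count.
    sizes = {name: 1 for name in names}
    dist = {
        frozenset((a, b)): p_distance(sa, sb)
        for (a, sa), (b, sb) in combinations(zip(names, seqs), 2)
    }
    while len(sizes) > 1:
        # Merge the two closest clusters.
        a, b = sorted(min(dist, key=dist.get))
        merged = f"({a},{b})"
        # Distance to the new cluster is the size-weighted average of the
        # distances to its two children (the defining UPGMA update).
        for other in sizes:
            if other in (a, b):
                continue
            d_a = dist[frozenset((a, other))]
            d_b = dist[frozenset((b, other))]
            total = sizes[a] + sizes[b]
            dist[frozenset((merged, other))] = (d_a * sizes[a] + d_b * sizes[b]) / total
        # Drop every entry that still refers to one of the merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)) + ";"


# Hypothetical toy alignment: A and B are close, C and D are close.
toy_names = ["A", "B", "C", "D"]
toy_seqs = ["AAAAAAAAAA", "AAAAAAAAAT", "TTTTTAAAAA", "TTTTTTTAAA"]
toy_tree = upgma(toy_names, toy_seqs)
```

UPGMA assumes a roughly constant evolutionary rate across lineages, so this sketch is best read as an illustration of how a distance matrix turns into a tree, not as a recommendation of the method.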
Affiliation(s)
- Yue Zou
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Zixuan Zhang
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Yujie Zeng
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Hanyue Hu
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Youjin Hao
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
- Sheng Huang
  - Animal Nutrition Institute, Chongqing Academy of Animal Science, Chongqing 402460, China
- Bo Li
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, China
2
Church SH, Mah JL, Dunn CW. Integrating phylogenies into single-cell RNA sequencing analysis allows comparisons across species, genes, and cells. PLoS Biol 2024; 22:e3002633. [PMID: 38787797 PMCID: PMC11125556 DOI: 10.1371/journal.pbio.3002633] [Indexed: 05/26/2024] [Open access]
Abstract
Comparisons of single-cell RNA sequencing (scRNA-seq) data across species can reveal links between cellular gene expression and the evolution of cell functions, features, and phenotypes. These comparisons evoke evolutionary histories, as depicted by phylogenetic trees, that define relationships between species, genes, and cells. This Essay considers each of these in turn, laying out challenges and solutions derived from a phylogenetic comparative approach and relating these solutions to previously proposed methods for the pairwise alignment of cellular dimensional maps. This Essay contends that species trees, gene trees, cell phylogenies, and cell lineages can all be reconciled as descriptions of the same concept: the tree of cellular life. By integrating phylogenetic approaches into scRNA-seq analyses, challenges for building informed comparisons across species can be overcome, and hypotheses about gene and cell evolution can be robustly tested.
Affiliation(s)
- Samuel H. Church
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Jasmine L. Mah
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Casey W. Dunn
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
3
Harrison MC, Ubbelohde EJ, LaBella AL, Opulente DA, Wolters JF, Zhou X, Shen XX, Groenewald M, Hittinger CT, Rokas A. Machine learning enables identification of an alternative yeast galactose utilization pathway. Proc Natl Acad Sci U S A 2024; 121:e2315314121. [PMID: 38669185 PMCID: PMC11067038 DOI: 10.1073/pnas.2315314121] [Received: 09/05/2023] [Accepted: 02/27/2024] [Indexed: 04/28/2024] [Open access]
Abstract
How genomic differences contribute to phenotypic differences is a major question in biology. The recently characterized genomes, isolation environments, and qualitative patterns of growth on 122 sources and conditions of 1,154 strains from 1,049 fungal species (nearly all known) in the yeast subphylum Saccharomycotina provide a powerful, yet complex, dataset for addressing this question. We used a random forest algorithm trained on these genomic, metabolic, and environmental data to predict growth on several carbon sources with high accuracy. Known structural genes involved in assimilation of these sources and presence/absence patterns of growth in other sources were important features contributing to prediction accuracy. By further examining growth on galactose, we found that it can be predicted with high accuracy from either genomic (92.2%) or growth data (82.6%) but not from isolation environment data (65.6%). Prediction accuracy was even higher (93.3%) when we combined genomic and growth data. After the GALactose utilization genes, the most important feature for predicting growth on galactose was growth on galactitol, raising the hypothesis that several species in two orders, Serinales and Pichiales (containing the emerging pathogen Candida auris and the genus Ogataea, respectively), have an alternative galactose utilization pathway because they lack the GAL genes. Growth and biochemical assays confirmed that several of these species utilize galactose through an alternative oxidoreductive D-galactose pathway, rather than the canonical GAL pathway. Machine learning approaches are powerful for investigating the evolution of the yeast genotype-phenotype map, and their application will uncover novel biology, even in well-studied traits.
Affiliation(s)
- Marie-Claire Harrison
  - Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235
- Emily J Ubbelohde
  - Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726
- Abigail L LaBella
  - Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28262
- Dana A Opulente
  - Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726
  - Department of Biology, Villanova University, Villanova, PA 19085
- John F Wolters
  - Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726
- Xiaofan Zhou
  - Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
- Xing-Xing Shen
  - Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China
- Chris Todd Hittinger
  - Laboratory of Genetics, Department of Energy (DOE) Great Lakes Bioenergy Research Center, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, Wisconsin Energy Institute, University of Wisconsin-Madison, Madison, WI 53726
- Antonis Rokas
  - Department of Biological Sciences and Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235
4
Mohanta TK, Mohanta YK, Kaushik P, Kumar J. Physiology, genomics, and evolutionary aspects of desert plants. J Adv Res 2024; 58:63-78. [PMID: 37160225 PMCID: PMC10982872 DOI: 10.1016/j.jare.2023.04.019] [Received: 11/17/2022] [Revised: 04/28/2023] [Accepted: 04/29/2023] [Indexed: 05/11/2023] [Open access]
Abstract
BACKGROUND: Although exposure to arid environmental conditions across the globe ultimately hampers the sustainability of living organisms, a few plant species are equipped with unique genotypic, biochemical, and physiological features to counter such harsh conditions. Physiologically, they have evolved reduced leaf size, spines, waxy cuticles, thick leaves, succulent hydrenchyma, sclerophyll, chloroembryo, and photosynthesis in nonfoliar and other parts. At the biochemical level, they have evolved to perform efficient photosynthesis through Crassulacean acid metabolism (CAM) and C4 pathways with the formation of oxaloacetic acid (Hatch-Slack pathway) instead of the C3 pathway. Additionally, comparative genomics with existing data provides ample evidence of positive selection in xerophytic plants for adaptation to the arid environment. However, further high-throughput sequencing of xerophytic plant species is required for comparative genomic studies aimed at discovering traits related to survival. Learning from the mechanisms of survival in harsh conditions could pave the way to engineering crops for future sustainable agriculture.
AIM OF THE REVIEW: The distinct physiology of desert plants allows them to survive in harsh environments. However, their genomic composition also contributes significantly to this and requires great attention. This review emphasizes the physiological and genomic adaptation of desert plants. Other important parameters, such as desert biodiversity and photosynthetic strategy, are also discussed with recent progress in the field. Overall, this review discusses the different features of desert plants that prepare them for harsh conditions, with the intention of translating this knowledge into engineering plant species for sustainable agriculture.
KEY SCIENTIFIC CONCEPTS OF REVIEW: This review comprehensively presents the physiology, molecular mechanisms, and genomics of desert plants, aimed towards engineering a sustainable crop.
Affiliation(s)
- Tapan Kumar Mohanta
  - Natural and Medical Sciences Research Center, University of Nizwa, Nizwa 611, Oman
- Yugal Kishore Mohanta
  - Dept. of Applied Biology, University of Science and Technology Meghalaya, Baridua, Meghalaya 793101, India
- Prashant Kaushik
  - Chaudhary Charan Singh Haryana Agricultural University, Hisar, Haryana, 125004, India
- Jitesh Kumar
  - Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, MN 55108, United States
5
Gemmell P, Sackton TB, Edwards SV, Liu JS. A phylogenetic method linking nucleotide substitution rates to rates of continuous trait evolution. PLoS Comput Biol 2024; 20:e1011995. [PMID: 38656999 PMCID: PMC11078400 DOI: 10.1371/journal.pcbi.1011995] [Received: 11/21/2023] [Revised: 05/08/2024] [Accepted: 03/13/2024] [Indexed: 04/26/2024] [Open access]
Abstract
Genomes contain conserved non-coding sequences that perform important biological functions, such as gene regulation. We present a phylogenetic method, PhyloAcc-C, that associates nucleotide substitution rates with changes in a continuous trait of interest. The method takes as input a multiple sequence alignment of conserved elements, continuous trait data observed in extant species, and a background phylogeny and substitution process. Gibbs sampling is used to assign rate categories (background, conserved, accelerated) to lineages and explore whether the assigned rate categories are associated with increases or decreases in the rate of trait evolution. We test our method using simulations and then illustrate its application using mammalian body size and lifespan data previously analyzed with respect to protein coding genes. Like other studies, we find processes such as tumor suppression, telomere maintenance, and p53 regulation to be related to changes in longevity and body size. In addition, we find that skeletal genes and developmental processes, such as sprouting angiogenesis, are relevant.
Affiliation(s)
- Patrick Gemmell
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
  - Department of Statistics, Harvard University, Cambridge, Massachusetts, United States of America
- Timothy B. Sackton
  - FAS Informatics Group, Harvard University, Cambridge, Massachusetts, United States of America
- Scott V. Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Jun S. Liu
  - Department of Statistics, Harvard University, Cambridge, Massachusetts, United States of America
6
Schraiber JG, Edge MD, Pennell M. Unifying approaches from statistical genetics and phylogenetics for mapping phenotypes in structured populations. bioRxiv 2024:2024.02.10.579721. [PMID: 38496530 PMCID: PMC10942266 DOI: 10.1101/2024.02.10.579721] [Indexed: 03/19/2024]
Abstract
In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these two fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred. To help bridge this divide, we derive a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., Genome-Wide Association Studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur using analytical theory and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique, including both the genetic relatedness matrix (GRM) and its leading eigenvectors (which correspond to the principal components of the genotype matrix) in a regression model, can mitigate spurious correlations in phylogenetic analyses. As a case study, we re-examine an analysis testing for co-evolution of expression levels between genes across a fungal phylogeny, and show that including covariance matrix eigenvectors as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.
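The idea of adding covariance-matrix eigenvectors as covariates can be illustrated with a small, self-contained Python sketch. It covers only part of what the abstract describes: it uses ordinary least squares with the leading eigenvector as a fixed covariate, and does not fit the relatedness matrix itself as a random effect. The double-centering step, the power-iteration eigenvector routine, the toy covariance matrix, and all function names are assumptions made for illustration and are not taken from the preprint.

```python
import math


def double_center(matrix):
    """Subtract row and column means (and add back the grand mean)."""
    n = len(matrix)
    row_means = [sum(row) / n for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_means) / n
    return [[matrix[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n)] for i in range(n)]


def leading_eigenvector(matrix, iterations=200):
    """Approximate the leading eigenvector of a symmetric matrix by power iteration."""
    n = len(matrix)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def ols_coefficients(predictors, y):
    """Least-squares fit with an intercept; returns [intercept, slope_1, ...]."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    p = len(rows[0])
    # Normal equations (X^T X) beta = X^T y, solved by Gaussian elimination.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, p):
            factor = a[r][col] / a[col][col]
            for c in range(col, p):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    beta = [0.0] * p
    for r in reversed(range(p)):
        beta[r] = (b[r] - sum(a[r][c] * beta[c] for c in range(r + 1, p))) / a[r][r]
    return beta


# Hypothetical covariance for six species in two clades of three:
# high covariance within a clade, low covariance between clades.
cov = [[1.0 if i == j else (0.8 if (i < 3) == (j < 3) else 0.1)
        for j in range(6)] for i in range(6)]
# Toy traits that both track clade membership but are unrelated within clades.
x = [-0.1, 0.0, 0.1, 1.9, 2.0, 2.1]
y = [0.1, -0.2, 0.1, 2.1, 1.8, 2.1]

pc1 = leading_eigenvector(double_center(cov))
naive_slope = ols_coefficients([x], y)[1]
adjusted_slope = ols_coefficients([x, pc1], y)[1]
```

In this toy setup the naive slope is close to 1 because both traits differ between the two clades, while the slope adjusted for the leading eigenvector is essentially zero, since the eigenvector absorbs the between-clade contrast.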
7
Ribeiro TM, Espíndola A. Integrated phylogenomic approaches in insect systematics. Curr Opin Insect Sci 2024; 61:101150. [PMID: 38061460 DOI: 10.1016/j.cois.2023.101150] [Received: 01/09/2023] [Revised: 11/16/2023] [Accepted: 11/25/2023] [Indexed: 12/29/2023]
Abstract
The increased accessibility of genomic and imaging methods and the improved access to ecological, spatial, and other natural history-related data are allowing insect systematics to grow and find answers to central evolutionary and taxonomic questions. Today, integrated studies in insect phylogenomics and systematics are combining natural history, behavior, developmental biology, morphology, fossils, geographic range data, and ecological interactions. This integration is contributing to the clarification of evolutionary relationships and the recognition of the role played by these factors in the evolution of insects. Future work should continue to build on these advances, seeking to further increase open-access databasing and support for natural history research, as well as expand its analytical palettes.
Affiliation(s)
- Taís Ma Ribeiro
  - Department of Entomology, University of Maryland, 4112 Plant Sciences Building, 4291 Fieldhouse Dr., College Park, MD 20742-4454, USA
- Anahí Espíndola
  - Department of Entomology, University of Maryland, 4112 Plant Sciences Building, 4291 Fieldhouse Dr., College Park, MD 20742-4454, USA
8
Merrill RM, Arenas-Castro H, Feller AF, Harenčár J, Rossi M, Streisfeld MA, Kay KM. Genetics and the Evolution of Prezygotic Isolation. Cold Spring Harb Perspect Biol 2024; 16:a041439. [PMID: 37848246 PMCID: PMC10835618 DOI: 10.1101/cshperspect.a041439] [Indexed: 10/19/2023]
Abstract
The significance of prezygotic isolation for speciation has been recognized at least since the Modern Synthesis. However, fundamental questions remain. For example, how are genetic associations between traits that contribute to prezygotic isolation maintained? What is the source of genetic variation underlying the evolution of these traits? And how do prezygotic barriers affect patterns of gene flow? We address these questions by reviewing genetic features shared across plants and animals that influence prezygotic isolation. Emerging technologies increasingly enable the identification and functional characterization of the genes involved, allowing us to test established theoretical expectations. Embedding these genes in their developmental context will allow further predictions about what constrains the evolution of prezygotic isolation. Ongoing improvements in statistical and computational tools will reveal how pre- and postzygotic isolation may differ in how they influence gene flow across the genome. Finally, we highlight opportunities for progress by combining theory with appropriate data.
Affiliation(s)
- Richard M Merrill
  - Faculty of Biology, Division of Evolutionary Biology, LMU Munich, 82152 Planegg-Martinsried, Germany
- Henry Arenas-Castro
  - School of Biological Sciences, University of Queensland, St. Lucia, Queensland 4072, Australia
- Anna F Feller
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138, USA
  - Arnold Arboretum of Harvard University, Boston, Massachusetts 02131, USA
- Julia Harenčár
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, California 95060, USA
- Matteo Rossi
  - Faculty of Biology, Division of Evolutionary Biology, LMU Munich, 82152 Planegg-Martinsried, Germany
- Matthew A Streisfeld
  - Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403-5289, USA
- Kathleen M Kay
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, California 95060, USA
9
Christensen KE, Duarte A, Ma Z, Edwards JL, Brem RB. Dissecting an ancient stress resistance trait syndrome in the compost yeast Kluyveromyces marxianus. bioRxiv 2023:2023.12.21.572915. [PMID: 38187519 PMCID: PMC10769334 DOI: 10.1101/2023.12.21.572915] [Indexed: 01/09/2024]
Abstract
In the search to understand how evolution builds new traits, ancient events are often the hardest to dissect. Species-unique traits pose a particular challenge for geneticists: these are cases in which a character arose long ago and, in the modern day, is conserved within a species, distinguishing it from reproductively isolated relatives. In this work, we have developed the budding yeast genus Kluyveromyces as a model for mechanistic dissection of trait variation across species boundaries. Phenotypic profiling revealed robust heat and chemical-stress tolerance phenotypes that distinguished the compost yeast K. marxianus from the rest of the clade. We used culture-based, transcriptomic, and genetic approaches to characterize the metabolic requirements of the K. marxianus trait syndrome. We then generated a population-genomic resource for K. marxianus and harnessed it in molecular-evolution analyses, which found hundreds of housekeeping genes with evidence for adaptive protein variation unique to this species. Our data support a model in which, in the distant past, K. marxianus underwent a vastly complex remodeling of its proteome to achieve stress resistance. Such a polygenic architecture, involving nucleotide-level allelic variation on a massive scale, is consistent with theoretical models of the mechanisms of long-term adaptation, and suggests principles of broad relevance for interspecies trait genetics.
Affiliation(s)
- Kaylee E. Christensen
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA, 94720
- Abel Duarte
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA, 94720
- Zhenzhen Ma
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA, 94720
  - Current address: Department of Biology, Stanford University, Stanford, CA, 94305
- Judith L. Edwards
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA, 94720
- Rachel B. Brem
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA, 94720
10
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. Genome Biol Evol 2023; 15:evad211. [PMID: 38000902 PMCID: PMC10709115 DOI: 10.1093/gbe/evad211] [Received: 01/08/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 11/26/2023] [Open access]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data. Importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data are well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred models for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best-fit model of the set performed well; the rest performed poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
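For readers unfamiliar with the Ornstein-Uhlenbeck (OU) model named in this abstract, the following Python sketch shows how such a process pulls a trait value toward an optimum along a single lineage. It uses the standard closed-form mean and variance of an OU process; the parameter names (`theta` for the optimum, `alpha` for the strength of the pull, `sigma` for the noise scale) and the example values are assumptions for illustration and do not come from the paper.

```python
import math
import random


def ou_mean(x0, theta, alpha, t):
    """Expected trait value after time t: decays from x0 toward the optimum theta."""
    return theta + (x0 - theta) * math.exp(-alpha * t)


def ou_variance(sigma, alpha, t):
    """Variance after time t; approaches sigma^2 / (2 * alpha) for large t."""
    return sigma ** 2 / (2.0 * alpha) * (1.0 - math.exp(-2.0 * alpha * t))


def simulate_ou(x0, theta, alpha, sigma, t, steps, rng):
    """Simulate one lineage by drawing each step from the exact OU transition."""
    dt = t / steps
    x = x0
    for _ in range(steps):
        x = rng.gauss(ou_mean(x, theta, alpha, dt),
                      math.sqrt(ou_variance(sigma, alpha, dt)))
    return x


# Hypothetical example: expression starts at 0 and is drawn toward an optimum of 1.
rng = random.Random(0)
final_value = simulate_ou(x0=0.0, theta=1.0, alpha=2.0, sigma=0.5, t=5.0, steps=100, rng=rng)
```

Unlike Brownian motion, whose variance grows without bound over time, the OU variance levels off, which is what "constrained around an optimum" amounts to in this model.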
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
11
Guerreiro R, Bonthala VS, Schlüter U, Hoang NV, Triesch S, Schranz ME, Weber APM, Stich B. A genomic panel for studying C3-C4 intermediate photosynthesis in the Brassiceae tribe. Plant Cell Environ 2023; 46:3611-3627. [PMID: 37431820 DOI: 10.1111/pce.14662] [Received: 03/02/2023] [Revised: 05/18/2023] [Accepted: 06/23/2023] [Indexed: 07/12/2023]
Abstract
Research on C4 and C3-C4 photosynthesis has attracted significant attention because understanding the genetic underpinnings of these traits will support the introduction of their characteristics into commercially relevant crop species. We used a panel of 19 taxa of 18 Brassiceae species with different photosynthesis characteristics (C3 and C3-C4) with the following objectives: (i) create draft genome assemblies and annotations, (ii) quantify orthology levels using synteny maps between all pairs of taxa, (iii) describe the phylogenetic relatedness across all the species, and (iv) track the evolution of C3-C4 intermediate photosynthesis in the Brassiceae tribe. Our results indicate that the draft de novo genome assemblies are of high quality and cover at least 90% of the gene space. We thereby more than doubled the sampling depth of genomes of the Brassiceae tribe, which comprises commercially important as well as biologically interesting species. The gene annotation generated high-quality gene models, and for most genes extensive upstream sequences are available for all taxa, yielding potential to explore variants in regulatory sequences. The genome-based phylogenetic tree of the Brassiceae contained two main clades and indicated that C3-C4 intermediate photosynthesis has evolved five times independently. Furthermore, our study provides the first genomic support for the hypothesis that Diplotaxis muralis is a natural hybrid of D. tenuifolia and D. viminea. Altogether, the de novo genome assemblies and the annotations reported in this study are a valuable resource for research on the evolution of C3-C4 intermediate photosynthesis.
Affiliation(s)
- Ricardo Guerreiro
  - Institute of Quantitative Genetics and Genomics of Plants, Faculty of Mathematics and Natural Sciences, Heinrich Heine University, Düsseldorf, Germany
- Venkata Suresh Bonthala
  - Institute of Quantitative Genetics and Genomics of Plants, Faculty of Mathematics and Natural Sciences, Heinrich Heine University, Düsseldorf, Germany
- Urte Schlüter
  - Institute of Plant Biochemistry, Faculty of Mathematics and Natural Sciences, Heinrich Heine University, Düsseldorf, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Düsseldorf, Germany
- Nam V Hoang
  - Biosystematics Group, Department of Plant Sciences, Wageningen University, Wageningen, The Netherlands
- Sebastian Triesch
  - Institute of Plant Biochemistry, Faculty of Mathematics and Natural Sciences, Heinrich Heine University, Düsseldorf, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Düsseldorf, Germany
- M Eric Schranz
  - Biosystematics Group, Department of Plant Sciences, Wageningen University, Wageningen, The Netherlands
- Andreas P M Weber
  - Institute of Plant Biochemistry, Faculty of Mathematics and Natural Sciences, Heinrich Heine University, Düsseldorf, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Düsseldorf, Germany
- Benjamin Stich
  - Institute of Quantitative Genetics and Genomics of Plants, Faculty of Mathematics and Natural Sciences, Heinrich Heine University, Düsseldorf, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Düsseldorf, Germany
  - Max Planck Institute for Plant Breeding Research, Köln, Germany
12
Yusuf LH, Saldívar Lemus Y, Thorpe P, Macías Garcia C, Ritchie MG. Genomic Signatures Associated with Transitions to Viviparity in Cyprinodontiformes. Mol Biol Evol 2023; 40:msad208. [PMID: 37789509 PMCID: PMC10568250 DOI: 10.1093/molbev/msad208] [Received: 02/23/2023] [Revised: 08/23/2023] [Accepted: 09/19/2023] [Indexed: 10/05/2023] [Open access]
Abstract
The transition from oviparity to viviparity has occurred independently over 150 times across vertebrates, presenting one of the most compelling cases of phenotypic convergence. However, whether the repeated, independent evolution of viviparity is driven by redeployment of similar genetic mechanisms and whether these leave a common signature in genomic divergence remains largely unknown. Although recent investigations into the evolution of viviparity have demonstrated striking similarity among the genes and molecular pathways involved across disparate vertebrate groups, quantitative tests for genome-wide convergence have provided ambivalent answers. Here, we investigate the potential role of molecular convergence during independent transitions to viviparity across an order of ray-finned freshwater fish (Cyprinodontiformes). We assembled de novo genomes and utilized publicly available genomes of viviparous and oviparous species to test for molecular convergence across both coding and noncoding regions. We found no evidence for an excess of molecular convergence in amino acid substitutions and in rates of sequence divergence, implying independent genetic changes are associated with these transitions. However, both statistical power and biological confounds could constrain our ability to detect significant correlated evolution. We therefore identified candidate genes with potential signatures of molecular convergence in viviparous Cyprinodontiformes lineages. Motif enrichment and gene ontology analyses suggest transcriptional changes associated with early morphogenesis, brain development, and immunity occurred alongside the evolution of viviparity. Overall, however, our findings indicate that independent transitions to viviparity in these fish are not strongly associated with an excess of molecular convergence, but a few genes show convincing evidence of convergent evolution.
Affiliation(s)
- Leeban H Yusuf
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews, UK
- Yolitzi Saldívar Lemus
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews, UK
  - Department of Biology, Texas State University, San Marcos, TX, USA
- Peter Thorpe
  - The Data Analysis Group, School of Life Sciences, University of Dundee, Dundee, UK
  - School of Medicine, University of North Haugh, St Andrews KY16 9TF, UK
- Constantino Macías Garcia
  - Instituto de Ecologia, Universidad Nacional Autónoma de México, Ciudad Universitaria, Mexico City CdMx, Mexico
- Michael G Ritchie
  - Centre for Biological Diversity, School of Biology, University of St Andrews, St Andrews, UK
13
Parker MT, Fica SM, Barton GJ, Simpson GG. Inter-species association mapping links splice site evolution to METTL16 and SNRNP27K. eLife 2023; 12:e91997. [PMID: 37787376 PMCID: PMC10581693 DOI: 10.7554/elife.91997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Accepted: 09/18/2023] [Indexed: 10/04/2023] Open
Abstract
Eukaryotic genes are interrupted by introns that are removed from transcribed RNAs by splicing. Patterns of splicing complexity differ between species, but it is unclear how these differences arise. We used inter-species association mapping with Saccharomycotina species to correlate splicing signal phenotypes with the presence or absence of splicing factors. Here, we show that variation in 5' splice site sequence preferences correlates with the presence of the U6 snRNA N6-methyladenosine methyltransferase METTL16 and the splicing factor SNRNP27K. The greatest variation in 5' splice site sequence occurred at the +4 position and involved a preference switch between adenosine and uridine. Loss of METTL16 and SNRNP27K orthologs, or a single SNRNP27K methionine residue, was associated with a preference for +4 U. These findings are consistent with splicing analyses of mutants defective in either METTL16 or SNRNP27K orthologs and models derived from spliceosome structures, demonstrating that inter-species association mapping is a powerful orthogonal approach to molecular studies. We identified variation between species in the occurrence of two major classes of 5' splice sites, defined by distinct interaction potentials with U5 and U6 snRNAs, that correlates with intron number. We conclude that variation in concerted processes of 5' splice site selection by U6 snRNA is associated with evolutionary changes in splicing signal phenotypes.
Affiliation(s)
- Matthew T Parker
  - School of Life Sciences, University of Dundee, Dundee, United Kingdom
- Sebastian M Fica
  - Department of Biochemistry, University of Oxford, Oxford, United Kingdom
- Gordon G Simpson
  - School of Life Sciences, University of Dundee, Dundee, United Kingdom
  - Cell & Molecular Sciences, James Hutton Institute, Invergowrie, United Kingdom
14
Fernández-Pascual E, Carta A, Rosbakh S, Guja L, Phartyal SS, Silveira FAO, Chen SC, Larson JE, Jiménez-Alfaro B. SeedArc, a global archive of primary seed germination data. New Phytol 2023; 240:466-470. [PMID: 37533134 DOI: 10.1111/nph.19143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 06/27/2023] [Indexed: 08/04/2023]
Affiliation(s)
- Eduardo Fernández-Pascual
  - IMIB Biodiversity Research Institute (University of Oviedo - CSIC - Principality of Asturias), University of Oviedo, E-33600, Mieres, Spain
- Angelino Carta
  - Department of Biology, Botany Unit, University of Pisa, 56122, Pisa, Italy
  - CIRSEC - Centre for Climate Change Impact, University of Pisa, 56122, Pisa, Italy
- Sergey Rosbakh
  - Department of Plant and Environmental Sciences, University of Copenhagen, DK-1871, Frederiksberg C, Denmark
- Lydia Guja
  - National Seed Bank, Australian National Botanic Gardens, Parks Australia, 2601, Acton, ACT, Australia
  - Centre for Australian National Biodiversity Research (A Joint Venture Between Parks Australia and CSIRO), CSIRO, 2601, Acton, ACT, Australia
- Shyam S Phartyal
  - School of Ecology and Environment Studies, Nalanda University, 803116, Rajgir, India
- Fernando A O Silveira
  - Department of Genetics, Ecology and Evolution, Federal University of Minas Gerais, 31320290, Belo Horizonte, Brazil
- Si-Chong Chen
  - Wuhan Botanical Garden, Chinese Academy of Sciences, 430074, Wuhan, China
  - Millennium Seed Bank, Royal Botanic Gardens Kew, RH176TN, Wakehurst, UK
- Julie E Larson
  - USDA Agricultural Research Service, Eastern Oregon Agricultural Research Center, Burns, OR, 97720, USA
- Borja Jiménez-Alfaro
  - IMIB Biodiversity Research Institute (University of Oviedo - CSIC - Principality of Asturias), University of Oviedo, E-33600, Mieres, Spain
15
Yan H, Hu Z, Thomas GWC, Edwards SV, Sackton TB, Liu JS. PhyloAcc-GT: A Bayesian Method for Inferring Patterns of Substitution Rate Shifts on Targeted Lineages Accounting for Gene Tree Discordance. Mol Biol Evol 2023; 40:msad195. [PMID: 37665177 PMCID: PMC10540510 DOI: 10.1093/molbev/msad195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 08/15/2023] [Accepted: 09/01/2023] [Indexed: 09/05/2023] Open
Abstract
An important goal of evolutionary genomics is to identify genomic regions whose substitution rates differ among lineages. For example, genomic regions experiencing accelerated molecular evolution in some lineages may provide insight into links between genotype and phenotype. Several comparative genomics methods have been developed to identify genomic accelerations between species, including a Bayesian method called PhyloAcc, which models shifts in substitution rate in multiple target lineages on a phylogeny. However, few methods consider the possibility of discordance between the trees of individual loci and the species tree due to incomplete lineage sorting, which might cause false positives. Here, we present PhyloAcc-GT, which extends PhyloAcc by modeling gene tree heterogeneity. Given a species tree, we adopt the multispecies coalescent model as the prior distribution of gene trees, use Markov chain Monte Carlo (MCMC) for inference, and design novel MCMC moves to sample gene trees efficiently. Through extensive simulations, we show that PhyloAcc-GT outperforms PhyloAcc and other methods in identifying target lineage-specific accelerations and detecting complex patterns of rate shifts, and is robust to specification of population size parameters. PhyloAcc-GT is usually more conservative than PhyloAcc in calling convergent rate shifts because it identifies more accelerations on ancestral than on terminal branches. We apply PhyloAcc-GT to two examples of convergent evolution: flightlessness in ratites and marine mammal adaptations, and show that PhyloAcc-GT is a robust tool to identify shifts in substitution rate associated with specific target lineages while accounting for incomplete lineage sorting.
Affiliation(s)
- Han Yan
  - Department of Statistics, Harvard University, Cambridge, MA, USA
- Zhirui Hu
  - Department of Statistics, Harvard University, Cambridge, MA, USA
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Scott V Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Jun S Liu
  - Department of Statistics, Harvard University, Cambridge, MA, USA
16
Chen HI, Turakhia Y, Bejerano G, Kingsley DM. Whole-genome Comparisons Identify Repeated Regulatory Changes Underlying Convergent Appendage Evolution in Diverse Fish Lineages. Mol Biol Evol 2023; 40:msad188. [PMID: 37739926 PMCID: PMC10516590 DOI: 10.1093/molbev/msad188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/24/2023] Open
Abstract
Fins are major functional appendages of fish that have been repeatedly modified in different lineages. To search for genomic changes underlying natural fin diversity, we compared the genomes of 36 percomorph fish species that span over 100 million years of evolution and either have complete or reduced pelvic and caudal fins. We identify 1,614 genomic regions that are well-conserved in fin-complete species but missing from multiple fin-reduced lineages. Recurrent deletions of conserved sequences in wild fin-reduced species are enriched for functions related to appendage development, suggesting that convergent fin reduction at the organismal level is associated with repeated genomic deletions near fin-appendage development genes. We used sequencing and functional enhancer assays to confirm that PelA, a Pitx1 enhancer previously linked to recurrent pelvic loss in sticklebacks, has also been independently deleted and may have contributed to the fin morphology in distantly related pelvic-reduced species. We also identify a novel enhancer that is conserved in the majority of percomorphs, drives caudal fin expression in transgenic stickleback, is missing in tetraodontiform, syngnathid, and synbranchid species with caudal fin reduction, and alters caudal fin development when targeted by genome editing. Our study illustrates a broadly applicable strategy for mapping phenotypes to genotypes across a tree of vertebrate species and highlights notable new examples of regulatory genomic hotspots that have been used to evolve recurrent phenotypes across 100 million years of fish evolution.
Affiliation(s)
- Heidi I Chen
  - Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
- Yatish Turakhia
  - Department of Electrical and Computer Engineering, University of California, San Diego, CA, USA
- Gill Bejerano
  - Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Computer Science, Stanford University School of Engineering, Stanford, CA, USA
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA
- David M Kingsley
  - Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
  - Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
17
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. bioRxiv 2023:2023.02.09.527893. [PMID: 37645857 PMCID: PMC10461906 DOI: 10.1101/2023.02.09.527893] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/31/2023]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data. Importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data are well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred model for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best-fit model of the set performed well; the rest performed poorly on at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this typically appears to be a consequence of failing to fully account for heterogeneity in the rate of evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
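To make concrete what "constrained around an optimum" means for an Ornstein-Uhlenbeck (OU) model, the following is a minimal sketch that simulates a single OU trajectory along one lineage with an Euler-Maruyama discretisation. It is an illustration only, not the authors' model-fitting procedure: the function name `simulate_ou` and the parameter names `theta` (optimum), `alpha` (strength of the pull toward the optimum), and `sigma` (noise intensity) are assumptions chosen for this example.

```python
import random


def simulate_ou(x0, theta, alpha, sigma, t_total, n_steps, seed=0):
    """Simulate dX = alpha * (theta - X) dt + sigma dW by Euler-Maruyama.

    All names are illustrative assumptions, not taken from the paper.
    Returns the list of n_steps + 1 trait values, starting at x0.
    """
    rng = random.Random(seed)
    dt = t_total / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Deterministic pull toward the optimum plus a Gaussian increment
        # with standard deviation sqrt(dt).
        x += alpha * (theta - x) * dt + sigma * rng.gauss(0.0, dt ** 0.5)
        path.append(x)
    return path


if __name__ == "__main__":
    # With alpha > 0 the trait is drawn toward theta; with alpha = 0 the
    # same recursion reduces to Brownian motion (no constraint).
    constrained = simulate_ou(0.0, 5.0, 1.0, 0.5, 10.0, 1000)
    unconstrained = simulate_ou(0.0, 5.0, 0.0, 0.5, 10.0, 1000)
    print(constrained[-1], unconstrained[-1])
```

Setting `alpha` to zero recovers unconstrained Brownian motion, which is the usual simpler alternative against which OU models are compared.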
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Canada
  - Department of Genetics, Washington University School of Medicine, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, USA
  - Department of Biological Sciences, University of Southern California, USA
18
Abstract
Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced. Identifying genes in these sequences is essential to understand the biology of the species. This is challenging due to the transcriptional complexity of eukaryotic genomes, which encode hundreds of thousands of transcripts of multiple types. Among these, a small set of protein-coding mRNAs play a disproportionately large role in defining phenotypes. Due to their sequence conservation, orthology can be established, making it possible to define the universal catalog of eukaryotic protein-coding genes. This catalog should substantially contribute to uncovering the genomic events underlying the emergence of eukaryotic phenotypes. This piece briefly reviews the basics of protein-coding gene prediction, discusses challenges in finalizing annotation of the human genome, and proposes strategies for producing annotations across the eukaryotic Tree of Life. This lays the groundwork for obtaining the catalog of all genes: the Earth's code of life.
Affiliation(s)
- Roderic Guigó
  - Bioinformatics and Genomics, Center for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Dr. Aiguader 88, 08003 Barcelona, Catalonia
  - Universitat Pompeu Fabra (UPF), Barcelona, Catalonia
19
Gable SM, Mendez JM, Bushroe NA, Wilson A, Byars MI, Tollis M. The State of Squamate Genomics: Past, Present, and Future of Genome Research in the Most Speciose Terrestrial Vertebrate Order. Genes (Basel) 2023; 14:1387. [PMID: 37510292 PMCID: PMC10379679 DOI: 10.3390/genes14071387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 06/28/2023] [Accepted: 06/29/2023] [Indexed: 07/30/2023] Open
Abstract
Squamates include more than 11,000 extant species of lizards, snakes, and amphisbaenians, and display a dazzling diversity of phenotypes across their over 200-million-year evolutionary history on Earth. Here, we introduce and define squamates (Order Squamata) and review the history and promise of genomic investigations into the patterns and processes governing squamate evolution, given recent technological advances in DNA sequencing, genome assembly, and evolutionary analysis. We survey the most recently available whole genome assemblies for squamates, including the taxonomic distribution of available squamate genomes, and assess their quality metrics and usefulness for research. We then focus on disagreements in squamate phylogenetic inference, how methods of high-throughput phylogenomics affect these inferences, and demonstrate the promise of whole genomes to settle or sustain persistent phylogenetic arguments for squamates. We review the role transposable elements play in vertebrate evolution, methods of transposable element annotation and analysis, and further demonstrate that through the understanding of the diversity, abundance, and activity of transposable elements in squamate genomes, squamates can be an ideal model for the evolution of genome size and structure in vertebrates. We discuss how squamate genomes can contribute to other areas of biological research such as venom systems, studies of phenotypic evolution, and sex determination. Because they represent more than 30% of the living species of amniote, squamates deserve a genome consortium on par with recent efforts for other amniotes (i.e., mammals and birds) that aim to sequence most of the extant families in a clade.
Affiliation(s)
- Simone M Gable
  - School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ 86011, USA
- Jasmine M Mendez
  - School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ 86011, USA
- Nicholas A Bushroe
  - School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ 86011, USA
- Adam Wilson
  - School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ 86011, USA
- Michael I Byars
  - School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ 86011, USA
- Marc Tollis
  - School of Informatics, Computing, and Cyber Systems, Northern Arizona University, Flagstaff, AZ 86011, USA
20
Fleming JF, Valero‐Gracia A, Struck TH. Identifying and addressing methodological incongruence in phylogenomics: A review. Evol Appl 2023; 16:1087-1104. [PMID: 37360032 PMCID: PMC10286231 DOI: 10.1111/eva.13565] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/05/2022] [Revised: 04/07/2023] [Accepted: 05/17/2023] [Indexed: 06/28/2023] Open
Abstract
The availability of phylogenetic data has greatly expanded in recent years. As a result, a new era in phylogenetic analysis is dawning, one in which the methods we use to analyse and assess our data are the bottleneck to producing valuable phylogenetic hypotheses, rather than the need to acquire more data. This makes the ability to accurately appraise and evaluate new methods of phylogenetic analysis and phylogenetic artefact identification more important than ever. Incongruence in phylogenetic reconstructions based on different datasets may be due to two major sources: biological and methodological. Biological sources comprise processes like horizontal gene transfer, hybridization and incomplete lineage sorting, while methodological ones comprise falsely assigned data or violations of the assumptions of the underlying model. While the former provides interesting insights into the evolutionary history of the investigated groups, the latter should be avoided or minimized as far as possible. However, errors introduced by methodology must first be excluded or minimized to be able to conclude that biological sources are the cause. Fortunately, a variety of useful tools exist to help detect such misassignments and model violations and to apply ameliorating measures. Still, the number of methods and their theoretical underpinning can be overwhelming and opaque. Here, we present a practical and comprehensive review of recent developments in techniques to detect artefacts arising from model violations and poorly assigned data. The advantages and disadvantages of the different methods to detect such misleading signals in phylogenetic reconstructions are also discussed. As there is no one-size-fits-all solution, this review can serve as a guide in choosing the most appropriate detection methods depending on both the actual dataset and the computational power available to the researcher. Ultimately, this informed selection will have a positive impact on the broader field, allowing us to better understand the evolutionary history of the group of interest.
21
Brownstein CD. Syngnathoid Evolutionary History and the Conundrum of Fossil Misplacement. Integr Org Biol 2023; 5:obad011. [PMID: 37251781 PMCID: PMC10210065 DOI: 10.1093/iob/obad011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Revised: 03/21/2023] [Indexed: 05/31/2023] Open
Abstract
Seahorses, pipefishes, trumpetfishes, shrimpfishes, and allies are a speciose, globally distributed clade of fishes that have evolved a large number of unusual body plans. The clade that includes all these forms, Syngnathoidei, has become a model for the study of life history evolution, population biology, and biogeography. Yet, the timeline of syngnathoid evolution has remained highly contentious. This debate is largely attributable to the nature of the syngnathoid fossil record, which is both poorly described and patchy for several major lineages. Although fossil syngnathoids have been used to calibrate molecular phylogenies, the interrelationships of extinct species and their affinities to major living syngnathoid clades have scarcely been quantitatively tested. Here, I use an expanded morphological dataset to reconstruct the evolutionary relationships and clade ages of fossil and extant syngnathoids. Phylogenies generated using different analytical methodologies are largely congruent with molecular phylogenetic trees of Syngnathoidei but consistently find novel placements for several key taxa used as fossil calibrators in phylogenomic studies. Tip-dating of the syngnathoid phylogeny finds a timeline for their evolution that differs slightly from the one inferred using molecular trees but is generally congruent with a post-Cretaceous diversification event. These results emphasize the importance of quantitatively testing the relationships of fossil species, particularly when they are critical to assessing divergence times.
22
Fleming JF, Struck TH. nRCFV: a new, dataset-size-independent metric to quantify compositional heterogeneity in nucleotide and amino acid datasets. BMC Bioinformatics 2023; 24:145. [PMID: 37046225 PMCID: PMC10099917 DOI: 10.1186/s12859-023-05270-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/07/2022] [Accepted: 04/04/2023] [Indexed: 04/14/2023] Open
Abstract
MOTIVATION: Compositional heterogeneity, in which the proportions of nucleotides or amino acids are not broadly similar across the dataset, is a cause of a great number of phylogenetic artefacts. Whilst a variety of methods can identify it post hoc, few metrics exist to quantify compositional heterogeneity prior to the computationally intensive task of phylogenetic tree reconstruction. Here we assess the efficacy of one such existing, widely used metric, Relative Composition Frequency Variability (RCFV), using both real and simulated data. RESULTS: Our results show that RCFV can be biased by sequence length, the number of taxa, and the number of possible character states within the dataset. However, we also find that missing data does not appear to have an appreciable effect on RCFV. We discuss the theory behind this and its consequences for future use of the RCFV value, and propose a new metric, nRCFV, which accounts for these biases. Alongside this, we present new software that calculates both RCFV and nRCFV, called nRCFV_Reader. AVAILABILITY AND IMPLEMENTATION: nRCFV has been implemented in RCFV_Reader, available at: https://github.com/JFFleming/RCFV_Reader. Both our simulation and real data are available at Datadryad: https://doi.org/10.5061/dryad.wpzgmsbpn.
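As a rough illustration of the quantity being assessed, the sketch below computes RCFV under the commonly cited definition: the absolute deviations of each taxon's character-state frequencies from the across-taxon mean, summed over states and taxa and divided by the number of taxa. That definition is an assumption here, since the abstract does not spell it out. The sketch is not the authors' RCFV_Reader implementation and does not attempt the nRCFV normalisation; the function name `rcfv` and its arguments are hypothetical.

```python
from collections import Counter


def rcfv(sequences, alphabet="ACGT"):
    """Relative Composition Frequency Variability (assumed definition).

    sequences: dict mapping taxon name -> aligned sequence string.
    Characters outside `alphabet` (gaps, ambiguity codes) are ignored.
    """
    n_taxa = len(sequences)
    if n_taxa == 0:
        raise ValueError("at least one taxon is required")
    freqs = []
    for taxon, seq in sequences.items():
        counts = Counter(c for c in seq.upper() if c in alphabet)
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"taxon {taxon!r} has no scorable characters")
        # Relative frequency of each character state in this taxon.
        freqs.append({c: counts[c] / total for c in alphabet})
    # Mean frequency of each character state across taxa.
    mean = {c: sum(f[c] for f in freqs) / n_taxa for c in alphabet}
    # Summed absolute deviations from the mean, divided by number of taxa.
    return sum(abs(f[c] - mean[c]) for f in freqs for c in alphabet) / n_taxa


if __name__ == "__main__":
    homogeneous = {"t1": "ACGTACGT", "t2": "TGCATGCA"}
    skewed = {"t1": "AAAA", "t2": "CCCC"}
    print(rcfv(homogeneous), rcfv(skewed))
```

Identical compositions give a value of zero, and the value grows as taxa diverge in composition; the biases the abstract describes (sequence length, number of taxa, number of character states) are exactly what this raw form does not correct for.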
Affiliation(s)
- James F Fleming
  - University of Oslo Natural History Museum, Sars' Gata 1, Oslo, Norway
- Torsten H Struck
  - University of Oslo Natural History Museum, Sars' Gata 1, Oslo, Norway
23
Kliesmete Z, Wange LE, Vieth B, Esgleas M, Radmer J, Hülsmann M, Geuder J, Richter D, Ohnuki M, Götz M, Hellmann I, Enard W. Regulatory and coding sequences of TRNP1 co-evolve with brain size and cortical folding in mammals. eLife 2023; 12:83593. [PMID: 36947129 PMCID: PMC10032658 DOI: 10.7554/elife.83593] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/20/2022] [Accepted: 03/01/2023] [Indexed: 03/23/2023] Open
Abstract
Brain size and cortical folding have increased and decreased recurrently during mammalian evolution. Identifying genetic elements whose sequence or functional properties co-evolve with these traits can provide unique information on evolutionary and developmental mechanisms. A good candidate for such a comparative approach is TRNP1, as it controls proliferation of neural progenitors in mice and ferrets. Here, we investigate the contribution of both regulatory and coding sequences of TRNP1 to brain size and cortical folding in over 30 mammals. We find that the rate of TRNP1 protein evolution (ω) significantly correlates with brain size, slightly less with cortical folding and much less with body size. This brain correlation is stronger than for >95% of random control proteins. This co-evolution is likely affecting TRNP1 activity, as we find that TRNP1 from species with larger brains and more cortical folding induces higher proliferation rates in neural stem cells. Furthermore, we compare the activity of putative cis-regulatory elements (CREs) of TRNP1 in a massively parallel reporter assay and identify one CRE that likely co-evolves with cortical folding in Old World monkeys and apes. Our analyses indicate that coding and regulatory changes that increased TRNP1 activity were positively selected either as a cause or a consequence of increases in brain size and cortical folding. They also provide an example of how phylogenetic approaches can inform biological mechanisms, especially when combined with molecular phenotypes across several species.
Affiliation(s)
- Zane Kliesmete
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Lucas Esteban Wange
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Beate Vieth
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Miriam Esgleas
  - Physiological Genomics, BioMedical Center - BMC, Ludwig-Maximilians-Universität, Munich, Germany
  - Institute for Stem Cell Research, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich, Germany
- Jessica Radmer
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Matthias Hülsmann
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
  - Department of Environmental Microbiology, Eawag, Dübendorf, Switzerland
  - Department of Environmental Systems Science, ETH Zurich, Zurich, Switzerland
- Johanna Geuder
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Daniel Richter
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Mari Ohnuki
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Magdelena Götz
  - Physiological Genomics, BioMedical Center - BMC, Ludwig-Maximilians-Universität, Munich, Germany
  - Institute for Stem Cell Research, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich, Germany
  - SYNERGY, Excellence Cluster of Systems Neurology, BioMedical Center (BMC), Ludwig-Maximilians-Universität München, Munich, Germany
- Ines Hellmann
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
- Wolfgang Enard
  - Anthropology and Human Genomics, Faculty of Biology, Ludwig-Maximilians-Universität, Munich, Germany
24
Magpali L, Bielawski JP. Why are whales big? Genes behind ocean giants. Trends Genet 2023; 39:436-438. [PMID: 36997429 DOI: 10.1016/j.tig.2023.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 03/15/2023] [Indexed: 03/30/2023]
Abstract
Gigantism is prevalent in animals, but it has never reached more extreme levels than in aquatic mammals such as whales, dolphins, and porpoises. A new study by Silva et al. has uncovered five genes underlying this gigantism, a phenotype with important connections to aging and cancer suppression in long-lived animals.
25
Genome Evolution and the Future of Phylogenomics of Non-Avian Reptiles. Animals (Basel) 2023; 13:ani13030471. [PMID: 36766360 PMCID: PMC9913427 DOI: 10.3390/ani13030471] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/16/2022] [Revised: 01/13/2023] [Accepted: 01/15/2023] [Indexed: 02/01/2023] Open
Abstract
Non-avian reptiles comprise a large proportion of amniote vertebrate diversity, with squamate reptiles (lizards and snakes) recently overtaking birds as the most species-rich tetrapod radiation. Despite displaying an extraordinary diversity of phenotypic and genomic traits, genomic resources in non-avian reptiles have accumulated more slowly than they have in mammals and birds, the remaining amniotes. Here we review the remarkable natural history of non-avian reptiles, with a focus on the physical traits, genomic characteristics, and sequence compositional patterns that comprise key axes of variation across amniotes. We argue that the high evolutionary diversity of non-avian reptiles can fuel a new generation of whole-genome phylogenomic analyses. A survey of phylogenetic investigations in non-avian reptiles shows that sequence capture-based approaches are the most commonly used, with studies of markers known as ultraconserved elements (UCEs) especially well represented. However, many other types of markers exist and are increasingly being mined from genome assemblies in silico, including some with greater information potential than UCEs for certain investigations. We discuss the importance of high-quality genomic resources and methods for bioinformatically extracting a range of marker sets from genome assemblies. Finally, we encourage herpetologists working in genomics, genetics, evolutionary biology, and other fields to work collectively towards building genomic resources for non-avian reptiles, especially squamates, that rival those already in place for mammals and birds. Overall, the development of this cross-amniote phylogenomic tree of life will help illuminate interesting dimensions of biodiversity across non-avian reptiles and amniotes more broadly.
|
26
|
Pennell M, Rodriguez OL, Watson CT, Greiff V. The evolutionary and functional significance of germline immunoglobulin gene variation. Trends Immunol 2023; 44:7-21. [PMID: 36470826 DOI: 10.1016/j.it.2022.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 11/07/2022] [Indexed: 12/04/2022]
Abstract
The recombination between immunoglobulin (IG) gene segments determines an individual's naïve antibody repertoire and, consequently, (auto)antigen recognition. Emerging evidence suggests that mammalian IG germline variation impacts humoral immune responses associated with vaccination, infection, and autoimmunity, from the molecular level of epitope specificity up to profound changes in the architecture of antibody repertoires. These links between IG germline variants and immunophenotype raise the question of the evolutionary causes and consequences of diversity within IG loci. We discuss why the extreme diversity in IG loci remains a mystery, why resolving this is important for the design of more effective vaccines and therapeutics, and how recent evidence from multiple lines of inquiry may help us do so.
Affiliation(s)
- Matt Pennell
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA; Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA.
| | - Oscar L Rodriguez
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
| | - Corey T Watson
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, USA
| | - Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.
|
27
|
Church SH, Munro C, Dunn CW, Extavour CG. The evolution of ovary-biased gene expression in Hawaiian Drosophila. PLoS Genet 2023; 19:e1010607. [PMID: 36689550 PMCID: PMC9894553 DOI: 10.1371/journal.pgen.1010607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2022] [Revised: 02/02/2023] [Accepted: 01/09/2023] [Indexed: 01/24/2023] Open
Abstract
With detailed data on gene expression accessible from an increasingly broad array of species, we can test the extent to which our developmental genetic knowledge from model organisms predicts expression patterns and variation across species. But to know when differences in gene expression across species are significant, we first need to know how much evolutionary variation in gene expression we expect to observe. Here we provide an answer by analyzing RNAseq data across twelve species of Hawaiian Drosophilidae flies, focusing on gene expression differences between the ovary and other tissues. We show that over evolutionary time, there exists a cohort of ovary specific genes that is stable and that largely corresponds to described expression patterns from laboratory model Drosophila species. Our results also provide a demonstration of the prediction that, as phylogenetic distance increases, variation between species overwhelms variation between tissue types. Using ancestral state reconstruction of expression, we describe the distribution of evolutionary changes in tissue-biased expression, and use this to identify gains and losses of ovary-biased expression across these twelve species. We then use this distribution to calculate the evolutionary correlation in expression changes between genes, and demonstrate that genes with known interactions in D. melanogaster are significantly more correlated in their evolution than genes with no or unknown interactions. Finally, we use this correlation matrix to infer new networks of genes that share evolutionary trajectories, and we present these results as a dataset of new testable hypotheses about genetic roles and interactions in the function and evolution of the Drosophila ovary.
Affiliation(s)
- Samuel H Church
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Current address: Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
| | - Catriona Munro
- Collège de France, PSL Research University, CNRS, Inserm, Center for Interdisciplinary Research in Biology, Paris, France
| | - Casey W Dunn
- Current address: Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
| | - Cassandra G Extavour
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Chevy Chase, Maryland, United States of America
|
28
|
Detecting macroevolutionary genotype-phenotype associations using error-corrected rates of protein convergence. Nat Ecol Evol 2023; 7:155-170. [PMID: 36604553 PMCID: PMC9834058 DOI: 10.1038/s41559-022-01932-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/31/2022] [Accepted: 10/12/2022] [Indexed: 01/07/2023]
Abstract
On macroevolutionary timescales, extensive mutations and phylogenetic uncertainty mask the signals of genotype-phenotype associations underlying convergent evolution. To overcome this problem, we extended the widely used framework of non-synonymous to synonymous substitution rate ratios and developed the novel metric ωC, which measures the error-corrected convergence rate of protein evolution. While ωC distinguishes natural selection from genetic noise and phylogenetic errors in simulation and real examples, its accuracy allows an exploratory genome-wide search of adaptive molecular convergence without phenotypic hypothesis or candidate genes. Using gene expression data, we explored over 20 million branch combinations in vertebrate genes and identified the joint convergence of expression patterns and protein sequences with amino acid substitutions in functionally important sites, providing hypotheses on undiscovered phenotypes. We further extended our method with a heuristic algorithm to detect highly repetitive convergence among computationally non-trivial higher-order phylogenetic combinations. Our approach allows bidirectional searches for genotype-phenotype associations, even in lineages that diverged for hundreds of millions of years.
|
29
|
African mitochondrial haplogroup L7: a 100,000-year-old maternal human lineage discovered through reassessment and new sequencing. Sci Rep 2022; 12:10747. [PMID: 35750688 PMCID: PMC9232647 DOI: 10.1038/s41598-022-13856-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Accepted: 05/30/2022] [Indexed: 11/17/2022] Open
Abstract
Archaeological and genomic evidence suggest that modern Homo sapiens have roamed the planet for some 300–500 thousand years. In contrast, global human mitochondrial (mtDNA) diversity coalesces to one African female ancestor (“Mitochondrial Eve”) some 145 thousand years ago, owing to the ¼ gene pool size of our matrilineally inherited haploid genome. Therefore, most of human prehistory was spent in Africa where early ancestors of Southern African Khoisan and Central African rainforest hunter-gatherers (RFHGs) segregated into smaller groups. Their subdivisions followed climatic oscillations, new modes of subsistence, local adaptations, and cultural-linguistic differences, all prior to their exodus out of Africa. Seven African mtDNA haplogroups (L0–L6) traditionally captured this ancient structure; these L haplogroups have formed the backbone of the mtDNA tree for nearly two decades. Here we describe L7, an eighth haplogroup that we estimate to be ~100 thousand years old and which has been previously misclassified in the literature. In addition, L7 has a phylogenetic sublineage L7a*, the oldest singleton branch in the human mtDNA tree (~80 thousand years). We found that L7 and its sister group L5 are both low-frequency relics centered around East Africa, but in different populations (L7: Sandawe; L5: Mbuti). Although three small subclades of African foragers hint at the population origins of L5'7, the majority of subclades are divided into Afro-Asiatic and eastern Bantu groups, indicative of more recent admixture. A regular re-estimation of the entire mtDNA haplotype tree is needed to ensure correct cladistic placement of new samples in the future.
|
30
|
Rokas A. Evolution of the human pathogenic lifestyle in fungi. Nat Microbiol 2022; 7:607-619. [PMID: 35508719 PMCID: PMC9097544 DOI: 10.1038/s41564-022-01112-0] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Received: 08/13/2021] [Accepted: 03/25/2022] [Indexed: 02/07/2023]
Abstract
Fungal pathogens cause more than a billion human infections every year, resulting in more than 1.6 million deaths annually. Understanding the natural history and evolutionary ecology of fungi is helping us understand how disease-relevant traits have repeatedly evolved. Different types and mechanisms of genetic variation have contributed to the evolution of fungal pathogenicity and specific genetic differences distinguish pathogens from non-pathogens. Insights into the traits, genetic elements, and genetic and ecological mechanisms that contribute to the evolution of fungal pathogenicity are crucial for developing strategies to both predict emergence of fungal pathogens and develop drugs to combat them.
Affiliation(s)
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA.
- Vanderbilt Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN, USA.
|
31
|
Gupta PK. Earth Biogenome Project: present status and future plans. Trends Genet 2022; 38:811-820. [DOI: 10.1016/j.tig.2022.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 04/11/2022] [Accepted: 04/22/2022] [Indexed: 10/18/2022]
|
32
|
Mason AJ, Holding ML, Rautsaw RM, Rokyta DR, Parkinson CL, Gibbs HL. Venom gene sequence diversity and expression jointly shape diet adaptation in pitvipers. Mol Biol Evol 2022; 39:6567549. [PMID: 35413123 PMCID: PMC9040050 DOI: 10.1093/molbev/msac082] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/13/2022] Open
Abstract
Understanding the joint roles of protein sequence variation and differential expression during adaptive evolution is a fundamental, yet largely unrealized goal of evolutionary biology. Here, we use phylogenetic path analysis to analyze a comprehensive venom-gland transcriptome dataset spanning three genera of pitvipers to identify the functional genetic basis of a key adaptation (venom complexity) linked to diet breadth (DB). The analysis of gene-family-specific patterns reveals that, for genes encoding two of the most important venom proteins (snake venom metalloproteases and snake venom serine proteases), there are direct, positive relationships between sequence diversity (SD), expression diversity (ED), and increased DB. Further analysis of gene-family diversification for these proteins showed no constraint on how individual lineages achieved toxin gene SD in terms of the patterns of paralog diversification. In contrast, another major venom protein family (PLA2s) showed no relationship between venom molecular diversity and DB. Additional analyses suggest that other molecular mechanisms, such as higher absolute levels of expression, are responsible for diet adaptation involving these venom proteins. Broadly, our findings argue that functional diversity generated through sequence and expression variations jointly determines adaptation in the key components of pitviper venoms, which mediate complex molecular interactions between the snakes and their prey.
Affiliation(s)
- Andrew J Mason
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
| | | | - Rhett M Rautsaw
- Department of Biological Sciences, Clemson University, Clemson, SC, USA
| | - Darin R Rokyta
- Department of Biological Science, Florida State University, Tallahassee, FL, USA
| | - Christopher L Parkinson
- Department of Biological Sciences, Clemson University, Clemson, SC, USA; Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC, USA
| | - H Lisle Gibbs
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, USA
|
33
|
Immunity and lifespan: answering long-standing questions with comparative genomics. Trends Genet 2022; 38:650-661. [DOI: 10.1016/j.tig.2022.02.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2021] [Revised: 01/14/2022] [Accepted: 02/28/2022] [Indexed: 10/18/2022]
|
34
|
Mongiardino Koch N, Thompson JR, Hiley AS, McCowin MF, Armstrong AF, Coppard SE, Aguilera F, Bronstein O, Kroh A, Mooi R, Rouse GW. Phylogenomic analyses of echinoid diversification prompt a re-evaluation of their fossil record. eLife 2022; 11:72460. [PMID: 35315317 PMCID: PMC8940180 DOI: 10.7554/elife.72460] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/24/2021] [Accepted: 03/03/2022] [Indexed: 12/25/2022] Open
Abstract
Echinoids are key components of modern marine ecosystems. Despite a remarkable fossil record, the emergence of their crown group is documented by few specimens of unclear affinities, rendering their early history uncertain. The origin of sand dollars, one of its most distinctive clades, is also unclear due to an unstable phylogenetic context. We employ 18 novel genomes and transcriptomes to build a phylogenomic dataset with a near-complete sampling of major lineages. With it, we revise the phylogeny and divergence times of echinoids, and place their history within the broader context of echinoderm evolution. We also introduce the concept of a chronospace - a multidimensional representation of node ages - and use it to explore methodological decisions involved in time calibrating phylogenies. We find the choice of clock model to have the strongest impact on divergence times, while the use of site-heterogeneous models and alternative node prior distributions show minimal effects. The choice of loci has an intermediate impact, affecting mostly deep Paleozoic nodes, for which clock-like genes recover dates more congruent with fossil evidence. Our results reveal that crown group echinoids originated in the Permian and diversified rapidly in the Triassic, despite the relative lack of fossil evidence for this early diversification. We also clarify the relationships between sand dollars and their close relatives and confidently date their origins to the Cretaceous, implying ghost ranges spanning approximately 50 million years, a remarkable discrepancy with their rich fossil record.
Affiliation(s)
- Nicolás Mongiardino Koch
- Department of Earth & Planetary Sciences, Yale University, New Haven, United States; Scripps Institution of Oceanography, University of California San Diego, La Jolla, United States
| | - Jeffrey R Thompson
- Department of Earth Sciences, Natural History Museum, London, United Kingdom; University College London Center for Life's Origins and Evolution, London, United Kingdom
| | - Avery S Hiley
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, United States
| | - Marina F McCowin
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, United States
| | - A Frances Armstrong
- Department of Invertebrate Zoology and Geology, California Academy of Sciences, San Francisco, United States
| | - Simon E Coppard
- Bader International Study Centre, Queen's University, Herstmonceux Castle, East Sussex, United Kingdom
| | - Felipe Aguilera
- Departamento de Bioquímica y Biología Molecular, Facultad de Ciencias Biológicas, Universidad de Concepción, Concepción, Chile
| | - Omri Bronstein
- School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel; Steinhardt Museum of Natural History, Tel-Aviv, Israel
| | - Andreas Kroh
- Department of Geology and Palaeontology, Natural History Museum Vienna, Vienna, Austria
| | - Rich Mooi
- Department of Invertebrate Zoology and Geology, California Academy of Sciences, San Francisco, United States
| | - Greg W Rouse
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, United States
|
35
|
Fanter C, Madelaire C, Genereux DP, van Breukelen F, Levesque D, Hindle A. Epigenomics as a paradigm to understand the nuances of phenotypes. J Exp Biol 2022; 225:274619. [PMID: 35258621 DOI: 10.1242/jeb.243411] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/20/2022]
Abstract
Quantifying the relative importance of genomic and epigenomic modulators of phenotype is a focal challenge in comparative physiology, but progress is constrained by availability of data and analytic methods. Previous studies have linked physiological features to coding DNA sequence, regulatory DNA sequence, and epigenetic state, but few have disentangled their relative contributions or unambiguously distinguished causative effects ('drivers') from correlations. Progress has been limited by several factors, including the classical approach of treating continuous and fluid phenotypes as discrete and static across time and environment, and difficulty in considering the full diversity of mechanisms that can modulate phenotype, such as gene accessibility, transcription, mRNA processing and translation. We argue that attention to phenotype nuance, progressing to association with epigenetic marks and then causal analyses of the epigenetic mechanism, will enable clearer evaluation of the evolutionary path. This would underlie an essential paradigm shift, and power the search for links between genomic and epigenomic features and physiology. Here, we review the growing knowledge base of gene-regulatory mechanisms and describe their links to phenotype, proposing strategies to address widely recognized challenges.
Affiliation(s)
- Cornelia Fanter
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Carla Madelaire
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Diane P Genereux
- Vertebrate Genome Biology, Broad Institute, Cambridge, MA 02142, USA
| | - Frank van Breukelen
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
| | - Danielle Levesque
- School of Biology and Ecology, University of Maine, Orono, ME 04469, USA
| | - Allyson Hindle
- School of Life Sciences, University of Nevada Las Vegas, Las Vegas, NV 89154, USA
|
36
|
PRDM9 losses in vertebrates are coupled to those of paralogs ZCWPW1 and ZCWPW2. Proc Natl Acad Sci U S A 2022; 119:2114401119. [PMID: 35217607 PMCID: PMC8892340 DOI: 10.1073/pnas.2114401119] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Accepted: 01/16/2022] [Indexed: 01/12/2023] Open
Abstract
We take a phylogenetic approach to search for molecular partners of PRDM9, a key meiotic recombination gene, by leveraging the fact that the complete PRDM9 gene has been lost at least 13 times independently in vertebrates. We identify two genes, ZCWPW1 and its paralog ZCWPW2, whose presence or absence across vertebrates is coupled to that of PRDM9. ZCWPW1 was recently shown to be recruited to sites of PRDM9 binding and to aid in the repair of double strand breaks. ZCWPW2 is likely recruited to sites of PRDM9 binding as well; its tight coevolution with PRDM9 across vertebrates suggests that it too plays an important role in mammals and beyond, either in double strand break formation or repair.
In most mammals and likely throughout vertebrates, the gene PRDM9 specifies the locations of meiotic double strand breaks; in mice and humans at least, it also aids in their repair. For both roles, many of the molecular partners remain unknown. Here, we take a phylogenetic approach to identify genes that may be interacting with PRDM9 by leveraging the fact that PRDM9 arose before the origin of vertebrates but was lost many times, either partially or entirely, and with it, its role in recombination. As a first step, we characterize PRDM9 domain composition across 446 vertebrate species, inferring at least 13 independent losses. We then use the interdigitation of PRDM9 orthologs across vertebrates to test whether it coevolved with any of 241 candidate genes coexpressed with PRDM9 in mice or associated with recombination phenotypes in mammals. Accounting for the phylogenetic relationship among a subsample of 189 species, we find two genes whose presence and absence is unexpectedly coincident with that of PRDM9: ZCWPW1, which was recently shown to facilitate double strand break repair, and its paralog ZCWPW2, as well as, more tentatively, TEX15 and FBXO47. ZCWPW2 is expected to be recruited to sites of PRDM9 binding; its tight coevolution with PRDM9 across vertebrates suggests that it is a key interactor within mammals and beyond, with a role either in recruiting the recombination machinery or in double strand break repair.
|
37
|
Blaxter M, Archibald JM, Childers AK, Coddington JA, Crandall KA, Di Palma F, Durbin R, Edwards SV, Graves JAM, Hackett KJ, Hall N, Jarvis ED, Johnson RN, Karlsson EK, Kress WJ, Kuraku S, Lawniczak MKN, Lindblad-Toh K, Lopez JV, Moran NA, Robinson GE, Ryder OA, Shapiro B, Soltis PS, Warnow T, Zhang G, Lewin HA. Why sequence all eukaryotes? Proc Natl Acad Sci U S A 2022; 119:e2115636118. [PMID: 35042801 PMCID: PMC8795522 DOI: 10.1073/pnas.2115636118] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Indexed: 12/11/2022] Open
Abstract
Life on Earth has evolved from initial simplicity to the astounding complexity we experience today. Bacteria and archaea have largely excelled in metabolic diversification, but eukaryotes additionally display abundant morphological innovation. How have these innovations come about and what constraints are there on the origins of novelty and the continuing maintenance of biodiversity on Earth? The history of life and the code for the working parts of cells and systems are written in the genome. The Earth BioGenome Project has proposed that the genomes of all extant, named eukaryotes (about 2 million species) should be sequenced to high quality to produce a digital library of life on Earth, beginning with strategic phylogenetic, ecological, and high-impact priorities. Here we discuss why we should sequence all eukaryotic species, not just a representative few scattered across the many branches of the tree of life. We suggest that many questions of evolutionary and ecological significance will only be addressable when whole-genome data representing divergences at all of the branchings in the tree of life or all species in natural ecosystems are available. We envisage that a genomic tree of life will foster understanding of the ongoing processes of speciation, adaptation, and organismal dependencies within entire ecosystems. These explorations will resolve long-standing problems in phylogenetics, evolution, ecology, conservation, agriculture, bioindustry, and medicine.
Affiliation(s)
- Mark Blaxter
- Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - John M Archibald
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS B3H 4H7, Canada
| | - Anna K Childers
- Bee Research Laboratory, Agricultural Research Service, US Department of Agriculture (USDA), Beltsville, MD 20705
| | - Jonathan A Coddington
- Global Genome Initiative, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560
| | - Keith A Crandall
- Computational Biology Institute, Department of Biostatistics and Bioinformatics, George Washington University, Washington, DC 20052
- Department of Invertebrate Zoology, Smithsonian Institution, Washington, DC 20013
| | - Federica Di Palma
- School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, United Kingdom
| | - Richard Durbin
- Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Scott V Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138
| | - Jennifer A M Graves
- School of Life Sciences, La Trobe University, Bundoora, VIC 751 23, Australia
- University of Canberra, Bruce, ACT 2617, Australia
| | - Kevin J Hackett
- Crop Production and Protection, Office of National Programs, Agricultural Research Service, USDA, Beltsville, MD 20705
| | - Neil Hall
- Earlham Institute, Norwich, Norfolk NR4 7UZ, United Kingdom
| | - Erich D Jarvis
- Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY 10065
- Howard Hughes Medical Institute, Chevy Chase, MD 20815
| | - Rebecca N Johnson
- National Museum of Natural History, Smithsonian Institution, Washington, DC 20560
| | - Elinor K Karlsson
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
| | - W John Kress
- Botany, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012
| | - Shigehiro Kuraku
- Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo 650-0047, Japan
| | | | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala 751 23, Sweden
| | - Jose V Lopez
- Department of Biological Sciences, Halmos College of Arts and Sciences, Nova Southeastern University, Dania Beach, FL 33004
- Guy Harvey Oceanographic Center, Dania Beach, FL 33004
| | - Nancy A Moran
- Integrative Biology, University of Texas at Austin, Austin, TX 78712
| | - Gene E Robinson
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
| | - Oliver A Ryder
- Conservation Genetics, Division of Biology, San Diego Zoo Wildlife Alliance, Escondido, CA 92027
- Department of Evolution, Behavior and Ecology, University of California, San Diego, La Jolla, CA 92039
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA 95064
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
| | - Tandy Warnow
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61301
| | - Guojie Zhang
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark
- China National Genebank, Beijing Genomics Institute-Shenzhen, Shenzhen 518083, China
| | - Harris A Lewin
- Department of Evolution and Ecology, College of Biological Sciences, University of California, Davis, CA 95616
- Department of Population Health and Reproduction, University of California, Davis, CA 95616
|
38
|
Field JT, Abrams AJ, Cartee JC, McTavish EJ. Rapid alignment updating with Extensiphy. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Jasper Toscani Field
- Quantitative and Systems Biology Program School of Natural Sciences University of California Merced CA USA
| | - A. Jeanine Abrams
- Division of STD Prevention National Centers for HIV/AIDS Viral Hepatitis, STD, and TB Prevention Atlanta GA USA
| | - John C. Cartee
- Division of STD Prevention National Centers for HIV/AIDS Viral Hepatitis, STD, and TB Prevention Atlanta GA USA
| | - Emily Jane McTavish
- Life and Environmental Sciences Department School of Natural Sciences University of California Merced CA USA
|
39
|
Zaharias P, Grosshauser M, Warnow T. Re-evaluating Deep Neural Networks for Phylogeny Estimation: The Issue of Taxon Sampling. J Comput Biol 2022; 29:74-89. [PMID: 34986031 DOI: 10.1089/cmb.2021.0383] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022] Open
Abstract
Deep neural networks (DNNs) have been recently proposed for quartet tree phylogeny estimation. Here, we present a study evaluating recently trained DNNs in comparison to a collection of standard phylogeny estimation methods on a heterogeneous collection of datasets simulated under the same models that were used to train the DNNs, and also under similar conditions but with higher rates of evolution. Our study shows that using DNNs with quartet amalgamation is less accurate than several standard phylogeny estimation methods we explore (e.g., maximum likelihood and maximum parsimony). We further find that simple standard phylogeny estimation methods match or improve on DNNs for quartet accuracy, especially, but not exclusively, when used in a global manner (i.e., the tree on the full dataset is computed and then the induced quartet trees are extracted from the full tree). Thus, our study provides evidence that a major challenge impacting the utility of current DNNs for phylogeny estimation is their restriction to estimating quartet trees that must subsequently be combined into a tree on the full dataset. In contrast, global methods (i.e., those that estimate trees from the full set of sequences) are able to benefit from taxon sampling, and hence have higher accuracy on large datasets.
Affiliation(s)
- Paul Zaharias
- Department of Computer Science, University of Illinois, Urbana, Illinois, USA
| | | | - Tandy Warnow
- Department of Computer Science, University of Illinois, Urbana, Illinois, USA
| |
Collapse
|
40
|
Bravo GA, Schmitt CJ, Edwards SV. What Have We Learned from the First 500 Avian Genomes? Annual Review of Ecology, Evolution, and Systematics 2021. [DOI: 10.1146/annurev-ecolsys-012121-085928] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/14/2022]
Abstract
The increased capacity of DNA sequencing has significantly advanced our understanding of the phylogeny of birds and the proximate and ultimate mechanisms molding their genomic diversity. In less than a decade, the number of available avian reference genomes has increased to over 500—approximately 5% of bird diversity—placing birds in a privileged position to advance the fields of phylogenomics and comparative, functional, and population genomics. Whole-genome sequence data, as well as indels and rare genomic changes, are further resolving the avian tree of life. The accumulation of bird genomes, increasingly with long-read sequence data, greatly improves the resolution of genomic features such as germline-restricted chromosomes and the W chromosome, and is facilitating the comparative integration of genotypes and phenotypes. Community-based initiatives such as the Bird 10,000 Genomes Project and Vertebrate Genome Project are playing a fundamental role in amplifying and coalescing a vibrant international program in avian comparative genomics.
Affiliation(s)
- Gustavo A. Bravo
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
- C. Jonathan Schmitt
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
- Scott V. Edwards
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138, USA
|
41
|
Gendreau KL, Hornsby AD, Hague MTJ, McGlothlin JW. Gene Conversion Facilitates the Adaptive Evolution of Self-Resistance in Highly Toxic Newts. Mol Biol Evol 2021; 38:4077-4094. [PMID: 34129031 PMCID: PMC8476164 DOI: 10.1093/molbev/msab182] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/15/2022] Open
Abstract
Reconstructing the histories of complex adaptations and identifying the evolutionary mechanisms underlying their origins are two of the primary goals of evolutionary biology. Taricha newts, which contain high concentrations of the deadly toxin tetrodotoxin (TTX) as an antipredator defense, have evolved resistance to self-intoxication, which is a complex adaptation requiring changes in six paralogs of the voltage-gated sodium channel (Nav) gene family, the physiological target of TTX. Here, we reconstruct the origins of TTX self-resistance by sequencing the entire Nav gene family in newts and related salamanders. We show that moderate TTX resistance evolved early in the salamander lineage in three of the six Nav paralogs, preceding the proposed appearance of tetrodotoxic newts by ∼100 My. TTX-bearing newts possess additional unique substitutions across the entire Nav gene family that provide physiological TTX resistance. These substitutions coincide with signatures of positive selection and relaxed purifying selection, as well as gene conversion events, that together likely facilitated their evolution. We also identify a novel exon duplication within Nav1.4 encoding an expressed TTX-binding site. Two resistance-conferring changes within newts appear to have spread via nonallelic gene conversion: in one case, one codon was copied between paralogs, and in the second, multiple substitutions were homogenized between the duplicate exons of Nav1.4. Our results demonstrate that gene conversion can accelerate the coordinated evolution of gene families in response to a common selection pressure.
Affiliation(s)
- Kerry L Gendreau
- Department of Biological Sciences, Virginia Tech, Blacksburg, United States
- Angela D Hornsby
- Department of Biological Sciences, Virginia Tech, Blacksburg, United States; Philip L. Wright Zoological Museum, Division of Biological Sciences, University of Montana, Missoula, United States
- Michael T J Hague
- Division of Biological Sciences, University of Montana, Missoula, MT, United States
- Joel W McGlothlin
- Department of Biological Sciences, Virginia Tech, Blacksburg, United States
|
42
|
Rocha JL, Godinho R, Brito JC, Nielsen R. Life in Deserts: The Genetic Basis of Mammalian Desert Adaptation. Trends Ecol Evol 2021; 36:637-650. [PMID: 33863602 DOI: 10.1016/j.tree.2021.03.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/14/2020] [Revised: 03/17/2021] [Accepted: 03/18/2021] [Indexed: 12/13/2022]
Abstract
Deserts are among the harshest environments on Earth. The multiple ages of different deserts and their global distribution provide a unique opportunity to study repeated adaptation at different timescales. Here, we summarize recent genomic research on the genetic mechanisms underlying desert adaptations in mammals. Several studies on different desert mammals show large overlap in functional classes of genes and pathways, consistent with the complexity and variety of phenotypes associated with desert adaptation to water and food scarcity and extreme temperatures. However, studies of desert adaptation are also challenged by a lack of accurate genotype-phenotype-environment maps. We encourage development of systems that facilitate functional analyses, but also acknowledge the need for more studies on a wider variety of desert mammals.
Affiliation(s)
- Joana L Rocha
- CIBIO/InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus de Vairão, 4485-661 Vairão, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4169-007 Porto, Portugal
- Raquel Godinho
- CIBIO/InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus de Vairão, 4485-661 Vairão, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4169-007 Porto, Portugal; Department of Zoology, University of Johannesburg, PO Box 534, Auckland Park 2006, South Africa
- José C Brito
- CIBIO/InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus de Vairão, 4485-661 Vairão, Portugal; Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, 4169-007 Porto, Portugal
- Rasmus Nielsen
- Department of Integrative Biology and Department of Statistics, University of California Berkeley, Berkeley, CA 94820, USA; Globe Institute, University of Copenhagen, DK-1165 Copenhagen, Denmark
|
43
|
Uyeda JC, Bone N, McHugh S, Rolland J, Pennell MW. How should functional relationships be evaluated using phylogenetic comparative methods? A case study using metabolic rate and body temperature. Evolution 2021; 75:1097-1105. [PMID: 33788258 DOI: 10.1111/evo.14213] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/28/2020] [Accepted: 01/20/2021] [Indexed: 12/12/2022]
Abstract
Phylogenetic comparative methods are often used to test functional relationships between traits. However, million-year macroevolutionary observational datasets cannot definitively prove causal links between traits: correlation does not equal causation, and experimental manipulation over such timescales is impossible. Although this caveat is widely understood, it is less appreciated that different phylogenetic approaches imply different causal assumptions about the functional relationships of traits. To make meaningful inferences, it is critical that our statistical methods make biologically reasonable assumptions. Here we illustrate the importance of causal reasoning in comparative biology by examining a recent study by Avaria-Llautureo et al. (2019) that tested for the evolutionary coupling of metabolic rate and body temperature across endotherms and found that these traits were unlinked through evolutionary time and that body temperatures were, on average, higher in the early Cenozoic than they are today. We argue that the causal assumptions embedded into their models made it impossible for them to test the relevant functional and evolutionary hypotheses. We reanalyze their data using more biologically appropriate models and find support for the exact opposite conclusions, corroborating previous evidence from physiology and paleontology. We highlight the vital need for causal thinking, even when experiments are impossible.
Affiliation(s)
- Josef C Uyeda
- Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, 24061
- Nicholas Bone
- Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, 24061
- Sean McHugh
- Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, 24061
- Jonathan Rolland
- Department of Computational Biology, University of Lausanne, Quartier Sorge, Lausanne, 1015, Switzerland; Biodiversity Research Centre, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada; Department of Zoology, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Matthew W Pennell
- Biodiversity Research Centre, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada; Department of Zoology, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
|
44
|
Wolfe JM, Luque J, Bracken-Grissom HD. How to become a crab: Phenotypic constraints on a recurring body plan. Bioessays 2021; 43:e2100020. [PMID: 33751651 DOI: 10.1002/bies.202100020] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/19/2021] [Revised: 02/11/2021] [Accepted: 02/16/2021] [Indexed: 12/12/2022]
Abstract
A fundamental question in biology is whether phenotypes can be predicted by ecological or genomic rules. At least five cases of convergent evolution of the crab-like body plan (with a wide and flattened shape, and a bent abdomen) are known in decapod crustaceans, and have, for over 140 years, been known as "carcinization." The repeated loss of this body plan has been identified as "decarcinization." In reviewing the field, we offer phylogenetic strategies to include poorly known groups, and direct evidence from fossils, that will resolve the history of crab evolution and the degree of phenotypic variation within crabs. Proposed ecological advantages of the crab body are summarized into a hypothesis of phenotypic integration suggesting correlated evolution of the carapace shape and abdomen. Our premise provides fertile ground for future studies of the genomic and developmental basis, and the predictability, of the crab-like body form.
Affiliation(s)
- Joanna M Wolfe
- Museum of Comparative Zoology and Department of Organismic & Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
- Javier Luque
- Museum of Comparative Zoology and Department of Organismic & Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA; Smithsonian Tropical Research Institute, Balboa-Ancon, Panama; Department of Earth and Planetary Sciences, Yale University, New Haven, Connecticut, USA
- Heather D Bracken-Grissom
- Institute of Environment and Department of Biological Sciences, Florida International University, North Miami, Florida, USA
|
45
|
Expanding evolutionary neuroscience: insights from comparing variation in behavior. Neuron 2021; 109:1084-1099. [PMID: 33609484 DOI: 10.1016/j.neuron.2021.02.002] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 11/25/2020] [Revised: 01/25/2021] [Accepted: 01/28/2021] [Indexed: 01/01/2023]
Abstract
Neuroscientists have long studied species with convenient biological features to discover how behavior emerges from conserved molecular, neural, and circuit level processes. With the advent of new tools, from viral vectors and gene editing to automated behavioral analyses, there has been a recent wave of interest in developing new, "nontraditional" model species. Here, we advocate for a complementary approach to model species development, that is, model clade development, as a way to integrate an evolutionary comparative approach with neurobiological and behavioral experiments. Capitalizing on natural behavioral variation in and investing in experimental tools for model clades will be a valuable strategy for the next generation of neuroscience discovery.
|
46
|
Pigmentation Genes Show Evidence of Repeated Divergence and Multiple Bouts of Introgression in Setophaga Warblers. Curr Biol 2021; 31:643-649.e3. [DOI: 10.1016/j.cub.2020.10.094] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/18/2020] [Revised: 10/27/2020] [Accepted: 10/29/2020] [Indexed: 01/15/2023]
|
47
|
Padial JM, De la Riva I. A paradigm shift in our view of species drives current trends in biological classification. Biol Rev Camb Philos Soc 2020; 96:731-751. [PMID: 33368983 DOI: 10.1111/brv.12676] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/25/2020] [Revised: 11/21/2020] [Accepted: 11/25/2020] [Indexed: 12/22/2022]
Abstract
Discontent about changes in species classifications has grown in recent years. Many of these changes are seen as arbitrary, stemming from unjustified conceptual and methodological grounds, or leading to species that are less distinct than those recognised in the past. We argue that current trends in species classification are the result of a paradigm shift toward which systematics and population genetics have converged and that regards species as the phylogenetic lineages that form the branches of the Tree of Life. Species delimitation now consists of determining which populations belong to which individual phylogenetic lineage. This requires inferences on the process of lineage splitting and divergence, a process to which we have only partial access through incidental evidence and assumptions that are themselves subject to refutation. This approach is not free of problems, as horizontal gene transfer, introgression, hybridisation, incorrect assumptions, sampling and methodological biases can mislead inferences of phylogenetic lineages. Increasing precision is demanded through the identification of both sister relationships and processes blurring or mimicking phylogeny, which has triggered, on the one hand, the development of methods that explicitly address such processes and, on the other hand, an increase in geographical and character data sampling necessary to infer/test such processes. Although our resolving power has increased, our knowledge of sister relationships - what we designate as species resolution - remains poor for many taxa and areas, which biases species limits and perceptions about how divergent species are or ought to be. 
We attribute to this conceptual shift the demise of trinominal nomenclature we are witnessing with the rise of subspecies to species or their rejection altogether; subspecies are raised to species if they are found to correspond to phylogenetic lineages, while they are rejected as fabricated taxa if they reflect arbitrary partitions of continuous or non-hereditary variation. Conservation strategies, if based on taxa, should emphasise species and reduce the use of subspecies to avoid preserving arbitrary partitions of continuous variation; local variation is best preserved by focusing on biological processes generating ecosystem resilience and diversity rather than by formally naming diagnosable units of any kind. Since many binomials still designate complexes of species rather than individual species, many species have been discovered but not named, geographical sampling is sparse, gene lineages have been mistaken for species, plenty of species limits remain untested, and many groups and areas lack adequate species resolution, we cannot avoid frequent changes to classifications as we address these problems. Changes will not only affect neglected taxa or areas, but also popular ones and regions where taxonomic research remained dormant for decades and old classifications were taken for granted.
Affiliation(s)
- José M Padial
- Department of Herpetology, American Museum of Natural History, Central Park West & 79th St., New York, NY, 10024, U.S.A.; Department of Biology, Bronx Community College, City University of New York, 2155 University Avenue, Bronx, NY, 10453, U.S.A.
- Ignacio De la Riva
- Museo Nacional de Ciencias Naturales-CSIC, José Gutiérrez Abascal 2, Madrid, 28006, Spain
|
48
|
Halabi K, Karin EL, Guéguen L, Mayrose I. A Codon Model for Associating Phenotypic Traits with Altered Selective Patterns of Sequence Evolution. Syst Biol 2020; 70:608-622. [PMID: 33252676 DOI: 10.1093/sysbio/syaa087] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/01/2020] [Revised: 11/12/2020] [Accepted: 11/13/2020] [Indexed: 01/10/2023] Open
Abstract
Detecting the signature of selection in coding sequences and associating it with shifts in phenotypic states can unveil genes underlying complex traits. Of the various signatures of selection exhibited at the molecular level, changes in the pattern of selection at protein-coding genes have been of main interest. To this end, phylogenetic branch-site codon models are routinely applied to detect changes in selective patterns along specific branches of the phylogeny. Many of these methods rely on a prespecified partition of the phylogeny to branch categories, thus treating the course of trait evolution as fully resolved and assuming that phenotypic transitions have occurred only at speciation events. Here, we present TraitRELAX, a new phylogenetic model that alleviates these strong assumptions by explicitly accounting for the uncertainty in the evolution of both trait and coding sequences. This joint statistical framework enables the detection of changes in selection intensity upon repeated trait transitions. We evaluated the performance of TraitRELAX using simulations and then applied it to two case studies. Using TraitRELAX, we found an intensification of selection in the primate SEMG2 gene in polygynandrous species compared to species of other mating forms, as well as changes in the intensity of purifying selection operating on sixteen bacterial genes upon transitioning from a free-living to an endosymbiotic lifestyle. [Evolutionary selection; intensification; γ-proteobacteria; genotype-phenotype; relaxation; SEMG2.]
Affiliation(s)
- Keren Halabi
- School of Plant Sciences and Food Security, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Eli Levy Karin
- Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
- Laurent Guéguen
- Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, F-69622 Villeurbanne, France; Swedish Collegium for Advanced Study, Thunbergsvägen 2, 752 38 Uppsala, Sweden
- Itay Mayrose
- School of Plant Sciences and Food Security, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
|
49
|
Zardoya R. Quest for the Best Evolutionary Model. J Mol Evol 2020; 89:146-150. [PMID: 33201312 DOI: 10.1007/s00239-020-09971-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Accepted: 11/04/2020] [Indexed: 11/28/2022]
Abstract
In the early 1980s, DNA sequencing became routine, and increasing computing power opened the door to reconstructing molecular phylogenies using probabilistic approaches. DNA sequence alignments provided a large number of positions containing phylogenetic information, which could be extracted using explicit statistical models that described the mutation process using appropriate parameters. Consequently, an active quest started for building increasingly improved (more realistic) statistical models of nucleotide substitution. The simplest model assumed that nucleotide frequencies were in equilibrium and that there was a single category of substitutions. Subsequent models allowed either unequal nucleotide frequencies or separate rates for transitions and transversions. The HKY85 model (Hasegawa et al. in J Mol Evol 22:160, 1985) elegantly combined both options into a single model, which became one of the most useful ones and has been the choice in many molecular phylogenetic studies ever since. The use of improved substitution models such as HKY85 allows reconstructing more accurate and reliable phylogenies, which in turn provide robust frameworks for understanding how biological diversity evolved and for performing a wealth of comparative studies in different disciplines such as ecology, biogeography, developmental biology, biochemistry, genomics, epidemiology, and biomedicine.
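To make concrete how HKY85 combines the two options named in this abstract (unequal nucleotide frequencies and a separate rate for transitions), here is a minimal Python sketch of its instantaneous rate matrix. It uses the standard textbook parameterisation rather than anything taken from the cited article; the function name `hky85_rate_matrix`, the base order in `BASES`, and the example frequencies are assumptions chosen for illustration.

```python
# Nucleotide order used throughout this sketch (an arbitrary choice).
BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(x, y):
    """True when x -> y stays within purines (A, G) or within pyrimidines (C, T)."""
    return (x in PURINES) == (y in PURINES)


def hky85_rate_matrix(pi, kappa):
    """Build a 4x4 HKY85 rate matrix, scaled to one expected substitution per unit time.

    pi    -- dict of equilibrium base frequencies, summing to 1
    kappa -- transition/transversion rate ratio
    """
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                # The rate into base y is proportional to its frequency,
                # multiplied by kappa when the change is a transition.
                q[i][j] = pi[y] * (kappa if is_transition(x, y) else 1.0)
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    # Normalise so branch lengths are in expected substitutions per site.
    mean_rate = -sum(pi[x] * q[i][i] for i, x in enumerate(BASES))
    return [[v / mean_rate for v in row] for row in q]


# Hypothetical AT-rich frequencies with transitions twice as fast as transversions.
Q = hky85_rate_matrix({"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}, kappa=2.0)
for base, row in zip(BASES, Q):
    print(base, [round(v, 3) for v in row])
```

Setting `kappa=1.0` with all four frequencies equal to 0.25 collapses the matrix to the simplest one-parameter case, in which every off-diagonal rate is the same.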
Affiliation(s)
- Rafael Zardoya
- Departamento de Biodiversidad y Biología Evolutiva, Museo Nacional de Ciencias Naturales (MNCN-CSIC), José Gutiérrez Abascal, 2, 28006, Madrid, Spain
|
50
|
Dellinger AS. Pollination syndromes in the 21st century: where do we stand and where may we go? The New Phytologist 2020; 228:1193-1213. [PMID: 33460152 DOI: 10.1111/nph.16793] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 04/07/2020] [Accepted: 05/31/2020] [Indexed: 06/12/2023]
Abstract
Pollination syndromes, recurring suites of floral traits appearing in connection with specific functional pollinator groups, have served for decades to organise floral diversity under a functional-ecological perspective. Some potential caveats, such as over-simplification of complex plant-animal interactions or lack of empirical observations, have been identified and discussed in recent years. Which of these caveats do indeed cause problems, which have been solved and where do future possibilities lie? I address these questions in a review of the pollination-syndrome literature of 2010 to 2019. I show that the majority of studies were based on detailed empirical pollinator observations and could reliably predict pollinators based on a few floral traits such as colour, shape or reward. Some traits (e.g. colour) were less reliable in predicting pollinators than others (e.g. reward, corolla width), however. I stress that future studies should consider floral traits beyond those traditionally recorded to expand our understanding of mechanisms of floral evolution. I discuss statistical methods suitable for objectively analysing the interplay of system-specific evolutionary constraints, pollinator-mediated selection and adaptive trade-offs at microecological and macroecological scales. I exemplify my arguments on an empirical dataset of floral traits of a neotropical plant radiation in the family Melastomataceae.
|