1. Dutheil JY. On the estimation of genome-average recombination rates. Genetics 2024:iyae051. [PMID: 38565705 DOI: 10.1093/genetics/iyae051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 03/13/2024] [Accepted: 03/20/2024] [Indexed: 04/04/2024] Open
Abstract
The rate at which recombination events occur in a population is an indicator of its effective population size and the organism's reproduction mode. It determines the extent of linkage disequilibrium along the genome and, thereby, the efficacy of both purifying and positive selection. The population recombination rate can be inferred using models of genome evolution in populations. Classic methods based on the patterns of linkage-disequilibrium provide the most accurate estimates, providing large sample sizes are used and the demography of the population is properly accounted for. Here, the capacity of approaches based on the sequentially Markov coalescent (SMC) to infer the genome-average recombination rate from as little as a single diploid genome is examined. SMC approaches provide highly accurate estimates even in the presence of changing population sizes, providing that (1) within genome heterogeneity is accounted for and (2) classic maximum-likelihood optimization algorithms are employed to fit the model. SMC-based estimates proved sensitive to gene conversion, leading to an overestimation of the recombination rate if conversion events are frequent. Conversely, methods based on the correlation of heterozygosity succeed in disentangling the rate of crossing over from that of gene conversion events, but only when the population size is constant and the recombination landscape homogeneous. These results call for a convergence of these two methods to obtain accurate and comparable estimates of recombination rates between populations.
Affiliation(s)
- Julien Y Dutheil
  - Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, 24306 Plön, Germany
2. Langebrake C, Manthey G, Frederiksen A, Lugo Ramos JS, Dutheil JY, Chetverikova R, Solov'yov IA, Mouritsen H, Liedvogel M. Adaptive evolution and loss of a putative magnetoreceptor in passerines. Proc Biol Sci 2024; 291:20232308. [PMID: 38320616 PMCID: PMC10846946 DOI: 10.1098/rspb.2023.2308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 01/08/2024] [Indexed: 02/08/2024] Open
Abstract
Migratory birds possess remarkable accuracy in orientation and navigation, which involves various compass systems including the magnetic compass. Identifying the primary magnetosensor remains a fundamental open question. Cryptochromes (Cry) have been shown to be magnetically sensitive, and Cry4a from a migratory songbird seems to show enhanced magnetic sensitivity in vitro compared to Cry4a from resident species. We investigate Cry and their potential involvement in magnetoreception in a phylogenetic framework, integrating molecular evolutionary analyses with protein dynamics modelling. Our analysis is based on 363 bird genomes and identifies different selection regimes in passerines. We show that Cry4a is characterized by strong positive selection and high variability, typical characteristics of sensor proteins. We identify key sites that are likely to have facilitated the evolution of an optimized sensory protein for night-time orientation in songbirds. Additionally, we show that Cry4 was lost in hummingbirds, parrots and Tyranni (Suboscines), and thus identified a gene deletion, which might facilitate testing the function of Cry4a in birds. In contrast, the other avian Cry (Cry1 and Cry2) were highly conserved across all species, indicating basal, non-sensory functions. Our results support a specialization or functional differentiation of Cry4 in songbirds which could be magnetosensation.
Affiliation(s)
- Corinna Langebrake
  - Institute of Avian Research ‘Vogelwarte Helgoland’, 26386 Wilhelmshaven, Germany
  - MPRG Behavioural Genomics, MPI Evolutionary Biology, 24306 Plön, Germany
- Georg Manthey
  - Institute of Avian Research ‘Vogelwarte Helgoland’, 26386 Wilhelmshaven, Germany
  - Department of Physics, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
- Anders Frederiksen
  - Department of Physics, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
- Juan S. Lugo Ramos
  - MPRG Behavioural Genomics, MPI Evolutionary Biology, 24306 Plön, Germany
  - The Francis Crick Institute, London NW1 1AT, UK
- Julien Y. Dutheil
  - Research Group Molecular Systems Evolution, MPI Evolutionary Biology, 24306 Plön, Germany
- Raisa Chetverikova
  - Biology and Environmental Sciences Department, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
- Ilia A. Solov'yov
  - Department of Physics, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
  - Research Centre for Neurosensory Sciences, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
  - Center for Nanoscale Dynamics (CENAD), Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
- Henrik Mouritsen
  - Biology and Environmental Sciences Department, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
  - Research Centre for Neurosensory Sciences, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
- Miriam Liedvogel
  - Institute of Avian Research ‘Vogelwarte Helgoland’, 26386 Wilhelmshaven, Germany
  - MPRG Behavioural Genomics, MPI Evolutionary Biology, 24306 Plön, Germany
  - Biology and Environmental Sciences Department, Carl von Ossietzky Universität Oldenburg, 26129 Oldenburg
3. Dutheil JY, Hamidi D, Pajot B. The Site/Group Extended Data Format and Tools. Genome Biol Evol 2024; 16:evae011. [PMID: 38252924 PMCID: PMC10849175 DOI: 10.1093/gbe/evae011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 01/12/2024] [Indexed: 01/24/2024] Open
Abstract
Comparative sequence analysis permits unraveling the molecular processes underlying gene evolution. Many statistical methods generate candidate positions within genes, such as fast or slowly evolving sites, coevolving groups of residues, sites undergoing positive selection, or changes in evolutionary rates. Understanding the functional causes of these evolutionary patterns requires combining the results of these analyses and mapping them onto molecular structures, a complex task involving distinct coordinate referential systems. To ease this task, we introduce the site/group extended data format, a simple text format to store (groups of) site annotations. We developed a toolset, the SgedTools, which permits site/group extended data file manipulation, creating them from various software outputs and translating coordinates between individual sequences, alignments, and three-dimensional structures. The package also includes a Monte-Carlo procedure to generate random site samples, possibly conditioning on site-specific features. This eases the statistical testing of evolutionary hypotheses, accounting for the structural properties of the encoded molecules.
Affiliation(s)
- Julien Y Dutheil
  - Research Group “Molecular Systems Evolution,” Department of Theoretical Biology, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Diyar Hamidi
  - Research Group “Molecular Systems Evolution,” Department of Theoretical Biology, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Basile Pajot
  - Research Group “Molecular Systems Evolution,” Department of Theoretical Biology, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
4. Bascón-Cardozo K, Bours A, Manthey G, Durieux G, Dutheil JY, Pruisscher P, Odenthal-Hesse L, Liedvogel M. Fine-Scale Map Reveals Highly Variable Recombination Rates Associated with Genomic Features in the Eurasian Blackcap. Genome Biol Evol 2024; 16:evad233. [PMID: 38198800 PMCID: PMC10781513 DOI: 10.1093/gbe/evad233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/12/2023] [Indexed: 01/12/2024] Open
Abstract
Recombination is responsible for breaking up haplotypes, influencing genetic variability, and the efficacy of selection. Bird genomes lack the protein PR domain-containing protein 9, a key determinant of recombination dynamics in most metazoans. Historical recombination maps in birds show an apparent stasis in positioning recombination events. This highly conserved recombination pattern over long timescales may constrain the evolution of recombination in birds. At the same time, extensive variation in recombination rate is observed across the genome and between different species of birds. Here, we characterize the fine-scale historical recombination map of an iconic migratory songbird, the Eurasian blackcap (Sylvia atricapilla), using a linkage disequilibrium-based approach that accounts for population demography. Our results reveal variable recombination rates among and within chromosomes, which associate positively with nucleotide diversity and GC content and negatively with chromosome size. Recombination rates increased significantly at regulatory regions but not necessarily at gene bodies. CpG islands are associated strongly with recombination rates, though their specific position and local DNA methylation patterns likely influence this relationship. The association with retrotransposons varied according to specific family and location. Our results also provide evidence of heterogeneous intrachromosomal conservation of recombination maps between the blackcap and its closest sister taxon, the garden warbler. These findings highlight the considerable variability of recombination rates at different scales and the role of specific genomic features in shaping this variation. This study opens the possibility of further investigating the impact of recombination on specific population-genomic features.
Affiliation(s)
- Karen Bascón-Cardozo
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Andrea Bours
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Georg Manthey
  - Institute of Avian Research “Vogelwarte Helgoland”, Wilhelmshaven 26386, Germany
- Gillian Durieux
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Julien Y Dutheil
  - Department for Theoretical Biology, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Peter Pruisscher
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
  - Department of Zoology, Stockholm University, Stockholm SE-106 91, Sweden
- Linda Odenthal-Hesse
  - Department Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Miriam Liedvogel
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
  - Institute of Avian Research “Vogelwarte Helgoland”, Wilhelmshaven 26386, Germany
  - Department of Biology and Environmental Sciences, Carl von Ossietzky University of Oldenburg, Oldenburg 26129, Germany
5. Rivas-González I, Rousselle M, Li F, Zhou L, Dutheil JY, Munch K, Shao Y, Wu D, Schierup MH, Zhang G. Pervasive incomplete lineage sorting illuminates speciation and selection in primates. Science 2023; 380:eabn4409. [PMID: 37262154 DOI: 10.1126/science.abn4409] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/28/2021] [Accepted: 01/19/2023] [Indexed: 06/03/2023]
Abstract
Incomplete lineage sorting (ILS) causes the phylogeny of some parts of the genome to differ from the species tree. In this work, we investigate the frequencies and determinants of ILS in 29 major ancestral nodes across the entire primate phylogeny. We find up to 64% of the genome affected by ILS at individual nodes. We exploit ILS to reconstruct speciation times and ancestral population sizes. Estimated speciation times are much more recent than genomic divergence times and are in good agreement with the fossil record. We show extensive variation of ILS along the genome, mainly driven by recombination but also by the distance to genes, highlighting a major impact of selection on variation along the genome. In many nodes, ILS is reduced more on the X chromosome compared with autosomes than expected under neutrality, which suggests higher impacts of natural selection on the X chromosome. Finally, we show an excess of ILS in genes with immune functions and a deficit of ILS in housekeeping genes. The extensive ILS in primates discovered in this study provides insights into the speciation times, ancestral population sizes, and patterns of natural selection that shape primate evolution.
Affiliation(s)
- Iker Rivas-González
  - Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Fang Li
  - BGI-Research, BGI-Wuhan, Wuhan 430074, China
  - Institute of Animal Sex and Development, Zhejiang Wanli University, Ningbo 315104, China
  - BGI-Research, BGI-Shenzhen, Shenzhen 518083, China
- Long Zhou
  - Evolutionary & Organismal Biology Research Center, Zhejiang University School of Medicine, Hangzhou 310058, China
  - Women's Hospital, School of Medicine, Zhejiang University, Shangcheng District, Hangzhou 310006, China
- Julien Y Dutheil
  - Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Institute of Evolution Sciences of Montpellier (ISEM), CNRS, University of Montpellier, IRD, EPHE, 34095 Montpellier, France
- Kasper Munch
  - Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Yong Shao
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Dongdong Wu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
  - National Resource Center for Non-Human Primates, Kunming Primate Research Center, and National Research Facility for Phenotypic and Genetic Analysis of Model Animals (Primate Facility), Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650107, China
  - Kunming Natural History Museum of Zoology, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Mikkel H Schierup
  - Bioinformatics Research Centre, Aarhus University, DK-8000 Aarhus C, Denmark
- Guojie Zhang
  - Evolutionary & Organismal Biology Research Center, Zhejiang University School of Medicine, Hangzhou 310058, China
  - Women's Hospital, School of Medicine, Zhejiang University, Shangcheng District, Hangzhou 310006, China
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
  - Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou 311121, China
  - Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
6. Raas MWD, Dutheil JY. The rate of adaptive molecular evolution in wild and domesticated Saccharomyces cerevisiae populations. Mol Ecol 2023. [PMID: 37157166 DOI: 10.1111/mec.16980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Revised: 04/22/2023] [Accepted: 04/26/2023] [Indexed: 05/10/2023]
Abstract
Through its fermentative capacities, Saccharomyces cerevisiae was central in the development of civilisation during the Neolithic period, and the yeast remains of importance in industry and biotechnology, giving rise to bona fide domesticated populations. Here, we conduct a population genomic study of domesticated and wild populations of S. cerevisiae. Using coalescent analyses, we report that the effective population size of yeast populations decreased since the divergence with S. paradoxus. We fitted models of distributions of fitness effects to infer the rate of adaptive ($\omega_a$) and non-adaptive ($\omega_{na}$) non-synonymous substitutions in protein-coding genes. We report an overall limited contribution of positive selection to S. cerevisiae protein evolution, albeit with higher rates of adaptive evolution in wild compared to domesticated populations. Our analyses revealed the signature of background selection and possibly Hill-Robertson interference, as recombination was found to be negatively correlated with $\omega_{na}$ and positively correlated with $\omega_a$. However, the effect of recombination on $\omega_a$ was found to be labile, as it is only apparent after removing the impact of codon usage bias on the synonymous site frequency spectrum and disappears if we control for the correlation with $\omega_{na}$, suggesting that it could be an artefact of the decreasing population size. Furthermore, the rate of adaptive non-synonymous substitutions is significantly correlated with the residue solvent exposure, a relation that cannot be explained by the population's demography. Together, our results provide a detailed characterisation of adaptive mutations in protein-coding genes across S. cerevisiae populations.
Affiliation(s)
- Maximilian W D Raas
  - Research Group Molecular Systems Evolution, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Julien Y Dutheil
  - Research Group Molecular Systems Evolution, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Unité Mixte de Recherche 5554 Institut des Sciences de l'Evolution, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
7. Puzović N, Madaan T, Dutheil JY. Being noisy in a crowd: Differential selective pressure on gene expression noise in model gene regulatory networks. PLoS Comput Biol 2023; 19:e1010982. [PMID: 37079488 PMCID: PMC10118199 DOI: 10.1371/journal.pcbi.1010982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Accepted: 02/27/2023] [Indexed: 04/21/2023] Open
Abstract
Expression noise, the variability of the amount of gene product among isogenic cells grown in identical conditions, originates from the inherent stochasticity of diffusion and binding of the molecular players involved in transcription and translation. It has been shown that expression noise is an evolvable trait and that central genes exhibit less noise than peripheral genes in gene networks. A possible explanation for this pattern is increased selective pressure on central genes since they propagate their noise to downstream targets, leading to noise amplification. To test this hypothesis, we developed a new gene regulatory network model with inheritable stochastic gene expression and simulated the evolution of gene-specific expression noise under constraint at the network level. Stabilizing selection was imposed on the expression level of all genes in the network and rounds of mutation, selection, replication and recombination were performed. We observed that local network features affect both the probability to respond to selection, and the strength of the selective pressure acting on individual genes. In particular, the reduction of gene-specific expression noise as a response to stabilizing selection on the gene expression level is higher in genes with higher centrality metrics. Furthermore, global topological structures such as network diameter, centralization and average degree affect the average expression variance and average selective pressure acting on constituent genes. Our results demonstrate that selection at the network level leads to differential selective pressure at the gene level, and local and global network characteristics are an essential component of gene-specific expression noise evolution.
Affiliation(s)
- Nataša Puzović
  - Molecular Systems Evolution Research Group, Max Planck Institute for Evolutionary Biology, Plön, Schleswig-Holstein, Germany
- Tanvi Madaan
  - Molecular Systems Evolution Research Group, Max Planck Institute for Evolutionary Biology, Plön, Schleswig-Holstein, Germany
- Julien Y Dutheil
  - Molecular Systems Evolution Research Group, Max Planck Institute for Evolutionary Biology, Plön, Schleswig-Holstein, Germany
  - Institut des sciences de l'évolution, Montpellier, Languedoc-Roussillon, France
8. Moutinho AF, Eyre-Walker A, Dutheil JY. Strong evidence for the adaptive walk model of gene evolution in Drosophila and Arabidopsis. PLoS Biol 2022; 20:e3001775. [PMID: 36099311 PMCID: PMC9470001 DOI: 10.1371/journal.pbio.3001775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Accepted: 08/01/2022] [Indexed: 11/19/2022] Open
Abstract
Understanding the dynamics of species adaptation to their environments has long been a central focus of the study of evolution. Theories of adaptation propose that populations evolve by “walking” in a fitness landscape. This “adaptive walk” is characterised by a pattern of diminishing returns, where populations further away from their fitness optimum take larger steps than those closer to their optimal conditions. Hence, we expect young genes to evolve faster and experience mutations with stronger fitness effects than older genes because they are further away from their fitness optimum. Testing this hypothesis, however, constitutes an arduous task. Young genes are small, encode proteins with a higher degree of intrinsic disorder, are expressed at lower levels, and are involved in species-specific adaptations. Since all these factors lead to increased protein evolutionary rates, they could be masking the effect of gene age. While controlling for these factors, we used population genomic data sets of Arabidopsis and Drosophila and estimated the rate of adaptive substitutions across genes from different phylostrata. We found that a gene’s evolutionary age significantly impacts the molecular rate of adaptation. Moreover, we observed that substitutions in young genes tend to have larger physicochemical effects. Our study, therefore, provides strong evidence that molecular evolution follows an adaptive walk model across a large evolutionary timescale. This study uses population genomic datasets from Arabidopsis and Drosophila to show that young genes adapt faster and are subject to mutations of larger fitness effects, providing strong evidence that molecular evolution follows an adaptive walk model across a large evolutionary timescale.
Affiliation(s)
- Ana Filipa Moutinho
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Adam Eyre-Walker
  - School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Julien Y. Dutheil
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Unité Mixte de Recherche 5554 Institut des Sciences de l’Evolution, CNRS, IRD, EPHE, Université de Montpellier, Montpellier, France
9. Meteyer CU, Dutheil JY, Keel MK, Boyles JG, Stukenbrock EH. Plant pathogens provide clues to the potential origin of bat white-nose syndrome Pseudogymnoascus destructans. Virulence 2022; 13:1020-1031. [PMID: 35635339 PMCID: PMC9176227 DOI: 10.1080/21505594.2022.2082139] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022] Open
Abstract
White-nose syndrome has killed millions of bats, yet both the origins and infection strategy of the causative fungus, Pseudogymnoascus destructans, remain elusive. We provide evidence for a novel hypothesis that P. destructans emerged from plant-associated fungi and retained invasion strategies affiliated with fungal pathogens of plants. We demonstrate that P. destructans invades bat skin in successive biotrophic and necrotrophic stages (hemibiotrophic infection), a mechanism previously only described in plant fungal pathogens. Further, the convergence of hyphae at hair follicles suggests nutrient tropism. Tropism, biotrophy, and necrotrophy are often associated with structures termed appressoria in plant fungal pathogens; the penetrating hyphae produced by P. destructans resemble appressoria. Finally, we conducted a phylogenomic analysis of a taxonomically diverse collection of fungi. Despite gaps in genetic sampling of prehistoric and contemporary fungal species, we estimate an 88% probability the ancestral state of the clade containing P. destructans was a plant-associated fungus.
Affiliation(s)
- Carol Uphoff Meteyer
  - U.S. Geological Survey, National Wildlife Health Center, Madison, Wisconsin 53711
- Julien Y. Dutheil
  - Molecular Systems Evolution, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- M. Kevin Keel
  - School of Veterinary Medicine, Dept of Pathology, Microbiology & Immunology, University of California, Davis, California 95616
- Justin G. Boyles
  - Cooperative Wildlife Research Laboratory and School of Biological Sciences, Southern Illinois University, Carbondale, Illinois 62901
- Eva H. Stukenbrock
  - Environmental Genomics Group, Botanical Institute, Christian-Albrechts University of Kiel, Kiel, Germany and Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
10. Chaurasia S, Dutheil JY. The Structural Determinants of Intra-Protein Compensatory Substitutions. Mol Biol Evol 2022; 39:6555661. [PMID: 35349721 PMCID: PMC9004419 DOI: 10.1093/molbev/msac063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Compensatory substitutions happen when one mutation is advantageously selected because it restores the loss of fitness induced by a previous deleterious mutation. How frequently such mutations occur in evolution and what structural and functional context permits their emergence remain open questions. We built an atlas of intra-protein compensatory substitutions using a phylogenetic approach and a dataset of 1,630 bacterial protein families for which high-quality sequence alignments and experimentally derived protein structures were available. We identified more than 51,000 positions coevolving by means of predicted compensatory mutations. Using the evolutionary and structural properties of the analyzed positions, we demonstrate that compensatory mutations are scarce (typically only a few in the protein history) but widespread (the majority of proteins experienced at least one). Typical coevolving residues are evolving slowly, are located in the protein core outside secondary structure motifs, and are more often in contact than expected by chance, even after accounting for their evolutionary rate and solvent exposure. An exception to this general scheme is residues coevolving for charge compensation, which are evolving faster than noncoevolving sites, in contradiction with predictions from simple coevolutionary models, but similar to stem pairs in RNA. While sites with a significant pattern of coevolution by compensatory mutations are rare, the comparative analysis of hundreds of structures ultimately permits a better understanding of the link between the three-dimensional structure of a protein and its fitness landscape.
Affiliation(s)
- Shilpi Chaurasia
  - RG Molecular Systems Evolution, Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Straße 2, 24306 Plön, Germany
  - Excelra Knowledge Solutions Pvt Ltd, Hyderabad, India
- Julien Y Dutheil
  - RG Molecular Systems Evolution, Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Straße 2, 24306 Plön, Germany
  - Institute of Evolution Sciences of Montpellier (ISEM), CNRS, University of Montpellier, IRD, EPHE, 34095 Montpellier, France
11. Locke DP, Hillier LW, Warren WC, Worley KC, Nazareth LV, Muzny DM, Yang SP, Wang Z, Chinwalla AT, Minx P, Mitreva M, Cook L, Delehaunty KD, Fronick C, Schmidt H, Fulton LA, Fulton RS, Nelson JO, Magrini V, Pohl C, Graves TA, Markovic C, Cree A, Dinh HH, Hume J, Kovar CL, Fowler GR, Lunter G, Meader S, Heger A, Ponting CP, Marques-Bonet T, Alkan C, Chen L, Cheng Z, Kidd JM, Eichler EE, White S, Searle S, Vilella AJ, Chen Y, Flicek P, Ma J, Raney B, Suh B, Burhans R, Herrero J, Haussler D, Faria R, Fernando O, Darré F, Farré D, Gazave E, Oliva M, Navarro A, Roberto R, Capozzi O, Archidiacono N, Della Valle G, Purgato S, Rocchi M, Konkel MK, Walker JA, Ullmer B, Batzer MA, Smit AFA, Hubley R, Casola C, Schrider DR, Hahn MW, Quesada V, Puente XS, Ordoñez GR, López-Otín C, Vinar T, Brejova B, Ratan A, Harris RS, Miller W, Kosiol C, Lawson HA, Taliwal V, Martins AL, Siepel A, RoyChoudhury A, Ma X, Degenhardt J, Bustamante CD, Gutenkunst RN, Mailund T, Dutheil JY, Hobolth A, Schierup MH, Ryder OA, Yoshinaga Y, de Jong PJ, Weinstock GM, Rogers J, Mardis ER, Gibbs RA, Wilson RK. Author Correction: Comparative and demographic analysis of orang-utan genomes. Nature 2022; 608:E36. [PMID: 35962045 PMCID: PMC9402433 DOI: 10.1038/s41586-022-04799-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Affiliation(s)
- Devin P. Locke
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- LaDeana W. Hillier
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Wesley C. Warren
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Kim C. Worley
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Lynne V. Nazareth
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Donna M. Muzny
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Shiaw-Pyng Yang
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Zhengyuan Wang
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Asif T. Chinwalla
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Pat Minx
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Makedonka Mitreva
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Lisa Cook
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Kim D. Delehaunty
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Catrina Fronick
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Heather Schmidt
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Lucinda A. Fulton
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Robert S. Fulton
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Joanne O. Nelson
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Vincent Magrini
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Craig Pohl
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Tina A. Graves
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Chris Markovic
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Andy Cree
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Huyen H. Dinh
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Jennifer Hume
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Christie L. Kovar
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Gerald R. Fowler
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Gerton Lunter
  - MRC Functional Genomics Unit and Department of Physiology, Anatomy and Genetics, University of Oxford, Le Gros Clark Building, Oxford, UK
  - Wellcome Trust Centre for Human Genetics, Oxford, UK
- Stephen Meader
  - MRC Functional Genomics Unit and Department of Physiology, Anatomy and Genetics, University of Oxford, Le Gros Clark Building, Oxford, UK
- Andreas Heger
  - MRC Functional Genomics Unit and Department of Physiology, Anatomy and Genetics, University of Oxford, Le Gros Clark Building, Oxford, UK
- Chris P. Ponting
  - MRC Functional Genomics Unit and Department of Physiology, Anatomy and Genetics, University of Oxford, Le Gros Clark Building, Oxford, UK
- Tomas Marques-Bonet
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
  - IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, Barcelona, Spain
- Can Alkan
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
- Lin Chen
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
- Ze Cheng
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
- Jeffrey M. Kidd
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
  - Howard Hughes Medical Institute, Seattle, Washington, USA
- Simon White
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Stephen Searle
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Albert J. Vilella
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Yuan Chen
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Paul Flicek
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Jian Ma
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California, USA
  - Present Address: Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA
- Brian Raney
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California, USA
- Bernard Suh
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California, USA
- Richard Burhans
  - Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania, USA
- Javier Herrero
  - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- David Haussler
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California, USA
- Rui Faria
  - IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, Barcelona, Spain
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão, Portugal
- Olga Fernando
  - IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, Barcelona, Spain
  - Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Oeiras, Portugal
- Fleur Darré
  - IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, Barcelona, Spain
- Domènec Farré
  - IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, Barcelona, Spain
- Elodie Gazave
  - IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, Barcelona, Spain
- Meritxell Oliva
  - IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, Barcelona, Spain
- Arcadi Navarro
  - IBE, Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, PRBB, Doctor Aiguader, 88, Barcelona, Spain
  - ICREA (Institució Catalana de Recerca i Estudis Avançats) and INB (Instituto Nacional de Bioinformática) PRBB, Doctor Aiguader, 88, Barcelona, Spain
- Roberta Roberto
  - Department of Biology, University of Bari, Bari, Italy
- Oronzo Capozzi
  - Department of Biology, University of Bari, Bari, Italy
- Giuliano Della Valle
  - Department of Biology, University of Bologna, Bologna, Italy
- Stefania Purgato
  - Department of Biology, University of Bologna, Bologna, Italy
- Mariano Rocchi
  - Department of Biology, University of Bari, Bari, Italy
- Miriam K. Konkel
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, USA
- Jerilyn A. Walker
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, USA
- Brygg Ullmer
  - Center for Computation and Technology, Department of Computer Sciences, Louisiana State University, Baton Rouge, Louisiana, USA
- Mark A. Batzer
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, USA
- Arian F. A. Smit
  - Institute for Systems Biology, Seattle, Washington, USA
- Robert Hubley
  - Institute for Systems Biology, Seattle, Washington, USA
- Claudio Casola
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
- Daniel R. Schrider
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
- Matthew W. Hahn
  - Department of Biology and School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
- Victor Quesada
  - Instituto Universitario de Oncologia, Departamento de Bioquimica y Biologia Molecular, Universidad de Oviedo, Oviedo, Spain
- Xose S. Puente
  - Instituto Universitario de Oncologia, Departamento de Bioquimica y Biologia Molecular, Universidad de Oviedo, Oviedo, Spain
- Gonzalo R. Ordoñez
  - Instituto Universitario de Oncologia, Departamento de Bioquimica y Biologia Molecular, Universidad de Oviedo, Oviedo, Spain
- Carlos López-Otín
  - Instituto Universitario de Oncologia, Departamento de Bioquimica y Biologia Molecular, Universidad de Oviedo, Oviedo, Spain
- Tomas Vinar
  - Faculty of Mathematics, Physics and Informatics, Comenius University, Mlynska Dolina, Bratislava, Slovakia
- Brona Brejova
  - Faculty of Mathematics, Physics and Informatics, Comenius University, Mlynska Dolina, Bratislava, Slovakia
- Aakrosh Ratan
  - Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania, USA
- Robert S. Harris
  - Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania, USA
- Webb Miller
  - Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania, USA
- Carolin Kosiol
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
- Heather A. Lawson
  - Department of Anatomy and Neurobiology, Washington University School of Medicine, Saint Louis, Missouri, USA
- Vikas Taliwal
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA
- André L. Martins
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA
- Adam Siepel
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA
- Arindam RoyChoudhury
  - Department of Biostatistics, Columbia University, New York, New York, USA
- Xin Ma
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA
- Jeremiah Degenhardt
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA
- Carlos D. Bustamante
  - Department of Genetics, Stanford University, Stanford, California, USA
- Ryan N. Gutenkunst
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona, USA
- Thomas Mailund
  - Bioinformatics Research Centre, Aarhus University, Aarhus C, Denmark
- Julien Y. Dutheil
  - Bioinformatics Research Centre, Aarhus University, Aarhus C, Denmark
- Asger Hobolth
  - Bioinformatics Research Centre, Aarhus University, Aarhus C, Denmark
- Mikkel H. Schierup
  - Bioinformatics Research Centre, Aarhus University, Aarhus C, Denmark
- Oliver A. Ryder
  - San Diego Zoo’s Institute for Conservation Research, Escondido, California, USA
- Yuko Yoshinaga
  - Children’s Hospital Oakland Research Institute, Oakland, California, USA
- Pieter J. de Jong
  - Children’s Hospital Oakland Research Institute, Oakland, California, USA
- George M. Weinstock
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Jeffrey Rogers
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Elaine R. Mardis
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
- Richard A. Gibbs
  - Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, USA
- Richard K. Wilson
  - The Genome Center at Washington University, Washington University School of Medicine, Saint Louis, Missouri, USA
|
12
|
Schweizer G, Haider MB, Barroso GV, Rössel N, Münch K, Kahmann R, Dutheil JY. Population Genomics of the Maize Pathogen Ustilago maydis: Demographic History and Role of Virulence Clusters in Adaptation. Genome Biol Evol 2021; 13:evab073. [PMID: 33837781 PMCID: PMC8120014 DOI: 10.1093/gbe/evab073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/06/2021] [Indexed: 11/14/2022]
Abstract
The tight interaction between pathogens and their hosts results in reciprocal selective forces that impact the genetic diversity of the interacting species. The footprints of this selection differ between pathosystems because of distinct life-history traits, demographic histories, or genome architectures. Here, we studied the genome-wide patterns of genetic diversity of 22 isolates of the causative agent of the corn smut disease, Ustilago maydis, originating from five locations in Mexico, the presumed center of origin of this species. In this species, many genes encoding secreted effector proteins reside in so-called virulence clusters in the genome, an arrangement that is so far not found in other filamentous plant pathogens. Using a combination of population genomic statistical analyses, we assessed the geographical, historical, and genome-wide variation of genetic diversity in this fungal pathogen. We report evidence of two partially admixed subpopulations that are only loosely associated with geographic origin. Using the multiple sequentially Markov coalescent model, we inferred the demographic history of the two pathogen subpopulations over the last 0.5 Myr. We show that both populations experienced a recent strong bottleneck starting around 10,000 years ago, coinciding with the assumed time of maize domestication. Although the genome average genetic diversity is low compared with other fungal pathogens, we estimated that the rate of nonsynonymous adaptive substitutions is three times higher in genes located within virulence clusters compared with nonclustered genes, including nonclustered effector genes. These results highlight the role that these singular genomic regions play in the evolution of this pathogen.
Affiliation(s)
- Gabriel Schweizer
  - Department of Organismic Interactions, Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
- Muhammad Bilal Haider
  - Max-Planck-Institute for Evolutionary Biology, Research Group Molecular Systems Evolution, Plön, Germany
- Gustavo V Barroso
  - Max-Planck-Institute for Evolutionary Biology, Research Group Molecular Systems Evolution, Plön, Germany
- Nicole Rössel
  - Department of Organismic Interactions, Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
- Karin Münch
  - Department of Organismic Interactions, Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
- Regine Kahmann
  - Department of Organismic Interactions, Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
- Julien Y Dutheil
  - Department of Organismic Interactions, Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
  - Max-Planck-Institute for Evolutionary Biology, Research Group Molecular Systems Evolution, Plön, Germany
  - Institute of Evolutionary Sciences of Montpellier, University of Montpellier 2, France
|
13
|
Dutheil JY, Münch K, Schotanus K, Stukenbrock EH, Kahmann R. The insertion of a mitochondrial selfish element into the nuclear genome and its consequences. Ecol Evol 2020; 10:11117-11132. [PMID: 33144953 PMCID: PMC7593156 DOI: 10.1002/ece3.6749] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/10/2020] [Accepted: 08/12/2020] [Indexed: 12/15/2022]
Abstract
Homing endonucleases (HE) are enzymes capable of cutting DNA at highly specific target sequences, the repair of the generated double-strand break resulting in the insertion of the HE-encoding gene ("homing" mechanism). HEs are present in all three domains of life and viruses; in eukaryotes, they are mostly found in the genomes of mitochondria and chloroplasts, as well as nuclear ribosomal RNAs. We here report the case of a HE that accidentally integrated into a telomeric region of the nuclear genome of the fungal maize pathogen Ustilago maydis. We show that the gene has a mitochondrial origin, but its original copy is absent from the U. maydis mitochondrial genome, suggesting a subsequent loss or a horizontal transfer from a different species. The telomeric HE underwent mutations in its active site and lost its original start codon. A potential other start codon was retained downstream, but we did not detect any significant transcription of the newly created open reading frame, suggesting that the inserted gene is not functional. Besides, the insertion site is located in a putative RecQ helicase gene, truncating the C-terminal domain of the protein. The truncated helicase is expressed during infection of the host, together with other homologous telomeric helicases. This unusual mutational event altered two genes: The integrated HE gene subsequently lost its homing activity, while its insertion created a truncated version of an existing gene, possibly altering its function. As the insertion is absent in other field isolates, suggesting that it is recent, the U. maydis 521 reference strain offers a snapshot of this singular mutational event.
Affiliation(s)
- Julien Y. Dutheil
  - Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Institute of Evolutionary Sciences, CNRS – University of Montpellier – IRD – EPHE, Montpellier, France
- Karin Münch
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Klaas Schotanus
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Christian Albrechts University of Kiel, Kiel, Germany
  - Present address: Department of Molecular Genetics and Microbiology (MGM), Duke University Medical Center, Durham, NC, USA
- Eva H. Stukenbrock
  - Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Christian Albrechts University of Kiel, Kiel, Germany
- Regine Kahmann
  - Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
|
14
|
Potgieter L, Feurtey A, Dutheil JY, Stukenbrock EH. On Variant Discovery in Genomes of Fungal Plant Pathogens. Front Microbiol 2020; 11:626. [PMID: 32373089 PMCID: PMC7176817 DOI: 10.3389/fmicb.2020.00626] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/01/2019] [Accepted: 03/19/2020] [Indexed: 11/13/2022]
Abstract
Comparative genome analyses of eukaryotic pathogens including fungi and oomycetes have revealed extensive variability in genome composition and structure. The genomes of individuals from the same population can exhibit different numbers of chromosomes and different organization of chromosomal segments, defining so-called accessory compartments that have been shown to be crucial to pathogenicity in plant-infecting fungi. This high level of structural variation confers a methodological challenge for population genomic analyses. Variant discovery from population sequencing data is typically achieved using established pipelines based on the mapping of short reads to a reference genome. These pipelines have been developed, and extensively used, for eukaryote genomes of both plants and animals, to retrieve single nucleotide polymorphisms and short insertions and deletions. However, they do not permit the inference of large-scale genomic structural variation, as this task typically requires the alignment of complete genome sequences. Here, we compare traditional variant discovery approaches to a pipeline based on de novo genome assembly of short read data followed by whole genome alignment, using simulated data sets with properties mimicking that of fungal pathogen genomes. We show that the latter approach exhibits levels of performance comparable to that of read-mapping based methodologies, when used on sequence data with sufficient coverage. We argue that this approach further allows additional types of genomic diversity to be explored, in particular as long-read third-generation sequencing technologies are becoming increasingly available to generate population genomic data.
Affiliation(s)
- Lizel Potgieter
  - Environmental Genomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Environmental Genomics, Christian-Albrechts University of Kiel, Kiel, Germany
- Alice Feurtey
  - Environmental Genomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Julien Y Dutheil
  - Molecular Systems Evolution, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Eva H Stukenbrock
  - Environmental Genomics, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Environmental Genomics, Christian-Albrechts University of Kiel, Kiel, Germany
|
15
|
Abstract
Population genomics is a growing field stemming from nearly 100 years of developments in population genetics. Here, we summarize the main concepts and terminology underlying both theoretical and empirical statistical population genomics studies. We provide the reader with pointers toward the original literature as well as methodological and historical reviews.
Affiliation(s)
- Gustavo V Barroso
  - Department of Evolutionary Genetics, Max Planck Institute of Evolutionary Biology, Plön, Germany
- Ana Filipa Moutinho
  - Department of Evolutionary Genetics, Max Planck Institute of Evolutionary Biology, Plön, Germany
- Julien Y Dutheil
  - Department of Evolutionary Genetics, Max Planck Institute of Evolutionary Biology, Plön, Germany
|
16
|
Abstract
Chapter 2, “Processing and Analyzing Multiple Genomes Alignments with MafFilter,” was previously published without including the Electronic Supplementary Material. This has now been included in the revised version of this book.
Affiliation(s)
- Julien Y Dutheil
  - Department of Evolutionary Genetics, Max Planck Institute of Evolutionary Biology, Plön, Germany
|
17
|
Abstract
The importance of adaptive mutations in molecular evolution is extensively debated. Recent developments in population genomics allow inferring rates of adaptive mutations by fitting a distribution of fitness effects to the observed patterns of polymorphism and divergence at sites under selection and sites assumed to evolve neutrally. Here, we summarize the current state-of-the-art of these methods and review the factors that affect the molecular rate of adaptation. Several studies have reported extensive cross-species variation in the proportion of adaptive amino-acid substitutions (α) and predicted that species with larger effective population sizes undergo less genetic drift and higher rates of adaptation. Disentangling the rates of positive and negative selection, however, revealed that mutations with deleterious effects are the main driver of this population size effect and that adaptive substitution rates vary comparatively little across species. Conversely, rates of adaptive substitution have been documented to vary substantially within genomes. On a genome-wide scale, gene density, recombination and mutation rate were observed to play a role in shaping molecular rates of adaptation, as predicted under models of linked selection. At the gene level, it has been reported that the gene functional category and the macromolecular structure substantially impact the rate of adaptive mutations. Here, we deliver a comprehensive review of methods used to infer the molecular adaptive rate, the potential drivers of adaptive evolution and how positive selection shapes molecular evolution within genes, across genes within species and between species.
|
18
|
Barroso GV, Puzović N, Dutheil JY. Inference of recombination maps from a single pair of genomes and its application to ancient samples. PLoS Genet 2019; 15:e1008449. [PMID: 31725722 PMCID: PMC6879166 DOI: 10.1371/journal.pgen.1008449] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 06/17/2019] [Revised: 11/26/2019] [Accepted: 09/30/2019] [Indexed: 12/11/2022]
Abstract
Understanding the causes and consequences of recombination landscape evolution is a fundamental goal in genetics that requires recombination maps from across the tree of life. Such maps can be obtained from population genomic datasets, but require large sample sizes. Alternative methods are therefore necessary to research organisms where such datasets cannot be generated easily, such as non-model or ancient species. Here we extend the sequentially Markovian coalescent model to jointly infer demography and the spatial variation in recombination rate. Using extensive simulations and sequence data from humans, fruit-flies and a fungal pathogen, we demonstrate that iSMC accurately infers recombination maps under a wide range of scenarios; remarkably, even from a single pair of unphased genomes. We exploit this possibility and reconstruct the recombination maps of ancient hominins. We report that the ancient and modern maps are correlated in a manner that reflects the established phylogeny of Neanderthals, Denisovans, and modern human populations.
Affiliation(s)
- Gustavo V. Barroso
  - Max Planck Institute for Evolutionary Biology, Department of Evolutionary Genetics, August-Thienemann-Straße, Plön, Germany
- Nataša Puzović
  - Max Planck Institute for Evolutionary Biology, Department of Evolutionary Genetics, August-Thienemann-Straße, Plön, Germany
- Julien Y. Dutheil
  - Max Planck Institute for Evolutionary Biology, Department of Evolutionary Genetics, August-Thienemann-Straße, Plön, Germany
|
19
|
Grandaubert J, Dutheil JY, Stukenbrock EH. The genomic determinants of adaptive evolution in a fungal pathogen. Evol Lett 2019; 3:299-312. [PMID: 31171985 PMCID: PMC6546377 DOI: 10.1002/evl3.117] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 01/21/2018] [Revised: 04/02/2019] [Accepted: 04/05/2019] [Indexed: 12/16/2022]
Abstract
Unravelling the strength, frequency, and distribution of selective variants along the genome as well as the underlying factors shaping this distribution are fundamental goals of evolutionary biology. Antagonistic host-pathogen coevolution is thought to be a major driver of genome evolution between interacting species. While rapid evolution of pathogens has been documented in several model organisms, the genetic mechanisms of their adaptation are still poorly understood and debated, particularly the role of sexual reproduction. Here, we apply a population genomic approach to infer genome-wide patterns of selection among 13 isolates of Zymoseptoria tritici, a fungal pathogen characterized by extremely high genetic diversity, gene density, and recombination rates. We report that the genome of Z. tritici undergoes a high rate of adaptive substitutions, with 44% of nonsynonymous substitutions being adaptive on average. This fraction reaches 68% in so-called effector genes encoding determinants of pathogenicity, and the distribution of fitness effects differs in this class of genes as they undergo adaptive mutations with stronger positive fitness effects, but also more slightly deleterious mutations. Besides the globally high rate of adaptive substitutions, we report a negative relationship between pN/pS and the fine-scale recombination rate and a strong positive correlation between the rate of adaptive nonsynonymous substitutions (ωa) and recombination rate. This result suggests a pervasive role of both background selection and Hill-Robertson interference even in a species with an exceptionally high recombination rate (60 cM/Mb on average). While transposable elements (TEs) have been suggested to contribute to adaptation by creating compartments of fast-evolving genomic regions, we do not find a significant effect of TEs on the rate of adaptive mutations. 
Overall, our study suggests that sexual recombination is a significant driver of genome evolution, even in rapidly evolving organisms subject to recurrent mutations with large positive effects.
Affiliation(s)
- Jonathan Grandaubert
  - Environmental Genomics Group, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, 24306 Plön, Germany
  - Christian-Albrechts University of Kiel, Am Botanischen Garten 1–9, 24118 Kiel, Germany
- Julien Y. Dutheil
  - Research group Molecular Systems Evolution, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, 24306 Plön, Germany
  - UMR 5554 Institut des Sciences de l'Evolution, CNRS, IRD, EPHE, Université de Montpellier, Place E. Bataillon, 34095 Montpellier, France
- Eva H. Stukenbrock
  - Environmental Genomics Group, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, 24306 Plön, Germany
  - Christian-Albrechts University of Kiel, Am Botanischen Garten 1–9, 24118 Kiel, Germany
|
20
|
Dutheil JY, Hobolth A. Ancestral Population Genomics. Methods Mol Biol 2019; 1910:555-589. [PMID: 31278677 DOI: 10.1007/978-1-4939-9074-0_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Borrowing from both population genetics and phylogenetics, the field of population genomics emerged as full genomes of several closely related species became available. Provided that we can properly model sequence evolution within populations undergoing speciation events, this resource enables us to estimate key population genetics parameters such as ancestral population sizes and split times. Furthermore, we can enhance our understanding of the recombination process and investigate various selective forces. With the advent of resequencing technologies, genome-wide patterns of diversity in extant populations have now come to complement this picture, offering increasing power to study more recent genetic history. We discuss the basic models of genomes in populations, including speciation models for closely related species. A major point in our discussion is that even a few complete genomes contain much information about the whole population. The reason is that recombination unlinks genomic regions, so a few genomes contain many segments with distinct histories. The challenge of population genomics is to decode this mosaic of histories in order to infer scenarios of demography and selection. We survey modeling strategies for understanding genetic variation in ancestral populations and species. The underlying models build on the coalescent with recombination process and introduce further assumptions to scale the analyses to genomic data sets.
Affiliation(s)
- Julien Y Dutheil
  - Department of Evolutionary Genetics, Max Planck Institute of Evolutionary Biology, Plön, Germany
- Asger Hobolth
  - Bioinformatics Research Center (BiRC), Aarhus University, Aarhus, Denmark
|
21
|
Stukenbrock EH, Dutheil JY. Fine-Scale Recombination Maps of Fungal Plant Pathogens Reveal Dynamic Recombination Landscapes and Intragenic Hotspots. Genetics 2018; 208:1209-1229. [PMID: 29263029 PMCID: PMC5844332 DOI: 10.1534/genetics.117.300502] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 11/12/2017] [Accepted: 12/15/2017] [Indexed: 11/18/2022] Open
Abstract
Meiotic recombination is an important driver of evolution. Variability in the intensity of recombination across chromosomes can affect sequence composition, nucleotide variation, and rates of adaptation. In many organisms, recombination events are concentrated within short segments termed recombination hotspots. The variation in recombination rate and the positions of recombination hotspots can be studied using population genomics data and statistical methods. In this study, we conducted population genomics analyses to address the evolution of recombination in two closely related fungal plant pathogens: the prominent wheat pathogen Zymoseptoria tritici and a sister species infecting wild grasses, Z. ardabiliae. We specifically addressed whether recombination landscapes, including hotspot positions, are conserved in the two recently diverged species and whether recombination contributes to rapid evolution of pathogenicity traits. We conducted a detailed simulation analysis to assess the performance of methods of recombination rate estimation based on patterns of linkage disequilibrium, in particular in the context of high nucleotide diversity. Our analyses reveal overall high recombination rates, a lack of suppressed recombination in centromeres, and significantly lower recombination rates on chromosomes that are known to be accessory. The comparison of the recombination landscapes of the two species reveals a strong correlation of recombination rate at the megabase scale, but little correlation at smaller scales. The recombination landscapes in both pathogen species are dominated by frequent recombination hotspots across the genome, including coding regions, suggesting a strong impact of recombination on gene evolution. A significant but small fraction of these hotspots colocalize between the two species, suggesting that hotspot dynamics contribute to the overall pattern of fast-evolving recombination in these species.
Affiliation(s)
- Eva H Stukenbrock
  - Environmental Genomics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
  - Environmental Genomics, Christian-Albrechts University of Kiel, 24118, Germany
- Julien Y Dutheil
  - Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
  - Institut des Sciences de L'Évolution de Montpellier, Centre National de la Recherche Scientifique, Université Montpellier 2, 34095, France
|
22
|
Schweizer G, Münch K, Mannhaupt G, Schirawski J, Kahmann R, Dutheil JY. Positively Selected Effector Genes and Their Contribution to Virulence in the Smut Fungus Sporisorium reilianum. Genome Biol Evol 2018; 10:629-645. [PMID: 29390140 PMCID: PMC5811872 DOI: 10.1093/gbe/evy023] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Accepted: 01/29/2018] [Indexed: 12/13/2022] Open
Abstract
Plants and fungi display a broad range of interactions in natural and agricultural ecosystems ranging from symbiosis to parasitism. These ecological interactions result in coevolution between genes belonging to different partners. A well-understood example is secreted fungal effector proteins and their host targets, which play an important role in pathogenic interactions. Biotrophic smut fungi (Basidiomycota) are well-suited to investigate the evolution of plant pathogens, because several reference genomes and genetic tools are available for these species. Here, we used the genomes of Sporisorium reilianum f. sp. zeae and S. reilianum f. sp. reilianum, two closely related formae speciales infecting maize and sorghum, respectively, together with the genomes of Ustilago hordei, Ustilago maydis, and Sporisorium scitamineum to identify and characterize genes displaying signatures of positive selection. We identified 154 gene families having undergone positive selection during species divergence in at least one lineage, among which 77% were identified in the two investigated formae speciales of S. reilianum. Remarkably, only 29% of positively selected genes encode predicted secreted proteins. We assessed the contribution to virulence of nine of these candidate effector genes in S. reilianum f. sp. zeae by deleting individual genes, including a homologue of the effector gene pit2 previously characterized in U. maydis. Only the pit2 deletion mutant was found to be strongly reduced in virulence. Additional experiments are required to understand the molecular mechanisms underlying the selection forces acting on the other candidate effector genes, as well as the large fraction of positively selected genes encoding predicted cytoplasmic proteins.
Affiliation(s)
- Gabriel Schweizer
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Department of Evolutionary Biology and Environmental Studies, University of Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland
- Karin Münch
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Gertrud Mannhaupt
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Institute for Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Jan Schirawski
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Microbial Genetics, Institute of Applied Microbiology, RWTH Aachen, Aachen, Germany
- Regine Kahmann
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Julien Y Dutheil
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - Institute of Evolutionary Sciences of Montpellier, “Genome” Department, CNRS, University of Montpellier 2, France
  - Research Group Molecular Systems Evolution, Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
|
23
|
Barroso GV, Puzovic N, Dutheil JY. The Evolution of Gene-Specific Transcriptional Noise Is Driven by Selection at the Pathway Level. Genetics 2018; 208:173-189. [PMID: 29097405 PMCID: PMC5753856 DOI: 10.1534/genetics.117.300467] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 05/24/2017] [Accepted: 10/13/2017] [Indexed: 11/18/2022] Open
Abstract
Biochemical reactions within individual cells result from the interactions of molecules, typically in small numbers. Consequently, the inherent stochasticity of binding and diffusion processes generates noise along the cascade that leads to the synthesis of a protein from its encoding gene. As a result, isogenic cell populations display phenotypic variability even in homogeneous environments. The extent and consequences of this stochastic gene expression have only recently been assessed on a genome-wide scale, owing, in particular, to the advent of single-cell transcriptomics. However, the evolutionary forces shaping this stochasticity have yet to be unraveled. Here, we take advantage of two recently published data sets for the single-cell transcriptome of the domestic mouse Mus musculus to characterize the effect of natural selection on gene-specific transcriptional stochasticity. We show that noise levels in the mRNA distributions (also known as transcriptional noise) significantly correlate with three-dimensional nuclear domain organization, evolutionary constraints on the encoded protein, and gene age. However, the position of the encoded protein in a biological pathway is the main factor that explains observed levels of transcriptional noise, in agreement with models of noise propagation within gene networks. Because transcriptional noise is under widespread selection, we argue that it constitutes an important component of the phenotype and that variance of expression is a potential target of adaptation. Stochastic gene expression should therefore be considered together with the mean expression level in functional and evolutionary studies of gene expression.
Affiliation(s)
- Gustavo Valadares Barroso
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Natasa Puzovic
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
- Julien Y Dutheil
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany
  - Unité mixte de recherche 5554, Institut des Sciences de l'Évolution, Université de Montpellier, 34095, France
|
24
|
Odenthal-Hesse L, Dutheil JY, Klötzl F, Haubold B. hotspot: software to support sperm-typing for investigating recombination hotspots. Bioinformatics 2016; 32:2554-5. [PMID: 27153632 PMCID: PMC4978934 DOI: 10.1093/bioinformatics/btw195] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/08/2016] [Revised: 03/24/2016] [Accepted: 04/07/2016] [Indexed: 11/14/2022] Open
Abstract
Motivation: In many organisms, including humans, recombination clusters within recombination hotspots. The standard method for de novo detection of recombinants at hotspots is sperm typing. This relies on allele-specific PCR at single nucleotide polymorphisms. Designing allele-specific primers by hand is time-consuming. We have therefore written a package to support hotspot detection and analysis.
Results: hotspot consists of four programs: asp looks up SNPs and designs allele-specific primers; aso constructs allele-specific oligos for mapping recombinants; xov implements a maximum-likelihood method for estimating the crossover rate; six, finally, simulates typing data.
Availability and implementation: hotspot is written in C. Sources are freely available under the GNU General Public License from http://github.com/evolbioinf/hotspot/
Contact: haubold@evolbio.mpg.de
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Linda Odenthal-Hesse
  - Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Biology, Plön, Germany
- Julien Y Dutheil
  - Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Biology, Plön, Germany
- Fabian Klötzl
  - Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Biology, Plön, Germany
- Bernhard Haubold
  - Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Biology, Plön, Germany
|
25
|
Tollot M, Assmann D, Becker C, Altmüller J, Dutheil JY, Wegner CE, Kahmann R. The WOPR Protein Ros1 Is a Master Regulator of Sporogenesis and Late Effector Gene Expression in the Maize Pathogen Ustilago maydis. PLoS Pathog 2016; 12:e1005697. [PMID: 27332891 PMCID: PMC4917244 DOI: 10.1371/journal.ppat.1005697] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 12/21/2015] [Accepted: 05/20/2016] [Indexed: 12/31/2022] Open
Abstract
The biotrophic basidiomycete fungus Ustilago maydis causes smut disease in maize. Hallmarks of the disease are large tumors that develop on all aerial parts of the host in which dark pigmented teliospores are formed. We have identified a member of the WOPR family of transcription factors, Ros1, as a major regulator of spore formation in U. maydis. ros1 expression is induced only late during infection and hence Ros1 is neither involved in plant colonization of dikaryotic fungal hyphae nor in plant tumor formation. However, during late stages of infection Ros1 is essential for fungal karyogamy, massive proliferation of diploid fungal cells and spore formation. Premature expression of ros1 revealed that Ros1 counteracts the b-dependent filamentation program and induces morphological alterations resembling the early steps of sporogenesis. Transcriptional profiling and ChIP-seq analyses uncovered that Ros1 remodels expression of about 30% of all U. maydis genes with 40% of these being direct targets. In total the expression of 80 transcription factor genes is controlled by Ros1. Four of the upregulated transcription factor genes were deleted and two of the mutants were affected in spore development. A large number of b-dependent genes were differentially regulated by Ros1, suggesting substantial changes in this regulatory cascade that controls filamentation and pathogenic development. Interestingly, 128 genes encoding secreted effectors involved in the establishment of biotrophic development were downregulated by Ros1 while a set of 70 “late effectors” was upregulated. These results indicate that Ros1 is a master regulator of late development in U. maydis and show that the biotrophic interaction during sporogenesis involves a drastic shift in expression of the fungal effectome including the downregulation of effectors that are essential during early stages of infection.
The fungus Ustilago maydis is a pathogen of maize which induces tumor formation in the infected tissue. In these tumors huge amounts of fungal spores develop. As a biotrophic pathogen, U. maydis establishes itself in the plant with the help of a large number of secreted effector proteins. Many effector proteins are important for virulence because they counteract plant defense reactions. In this manuscript we have identified and characterized Ros1, a master regulator for the late stages of U. maydis development. This transcription factor is expressed late during infection and controls nuclear fusion, hyphal aggregation and late proliferation. ros1 mutants are still able to induce tumor formation but these are a dead end because they do not contain any spores. We show that Ros1 interferes with the early regulatory cascade controlled by a complex of two homeodomain proteins. In addition, Ros1 triggers a major switch in the effector repertoire, suggesting that different sets of effectors are needed for different stages of fungal development inside the plant.
Affiliation(s)
- Marie Tollot
  - Max Planck Institute for Terrestrial Microbiology, Department of Organismic Interactions, Marburg, Germany
- Daniela Assmann
  - Max Planck Institute for Terrestrial Microbiology, Department of Organismic Interactions, Marburg, Germany
- Christian Becker
  - Cologne Center for Genomics (CCG), University of Cologne, Cologne, Germany
- Janine Altmüller
  - Cologne Center for Genomics (CCG), University of Cologne, Cologne, Germany
- Julien Y. Dutheil
  - Max Planck Institute for Terrestrial Microbiology, Department of Organismic Interactions, Marburg, Germany
- Carl-Eric Wegner
  - Max Planck Institute for Terrestrial Microbiology, Department of Biogeochemistry, Marburg, Germany
- Regine Kahmann
  - Max Planck Institute for Terrestrial Microbiology, Department of Organismic Interactions, Marburg, Germany
|
26
|
Dutheil JY, Mannhaupt G, Schweizer G, M K Sieber C, Münsterkötter M, Güldener U, Schirawski J, Kahmann R. A Tale of Genome Compartmentalization: The Evolution of Virulence Clusters in Smut Fungi. Genome Biol Evol 2016; 8:681-704. [PMID: 26872771 PMCID: PMC4824034 DOI: 10.1093/gbe/evw026] [Citation(s) in RCA: 94] [Impact Index Per Article: 11.8] [Indexed: 12/12/2022] Open
Abstract
Smut fungi are plant pathogens mostly parasitizing wild species of grasses as well as domesticated cereal crops. Genome analysis of several smut fungi including Ustilago maydis revealed a singular clustered organization of genes encoding secreted effectors. In U. maydis, many of these clusters have a role in virulence. Reconstructing the evolutionary history of clusters of effector genes is difficult because of their intrinsically fast evolution, which erodes the phylogenetic signal and homology relationships. Here, we describe the use of comparative evolutionary analyses of quality draft assemblies of genomes to study the mechanisms of this evolution. We report the genome sequence of a South African isolate of Sporisorium scitamineum, a smut fungus parasitizing sugar cane with a phylogenetic position intermediate to the two previously sequenced species U. maydis and Sporisorium reilianum. We show that the genome of S. scitamineum contains more and larger gene clusters encoding secreted effectors than any previously described species in this group. We trace back the origin of the clusters and find that their evolution is mainly driven by tandem gene duplication. In addition, transposable elements play a major role in the evolution of the clustered genes. Transposable elements are significantly associated with clusters of genes encoding fast evolving secreted effectors. This suggests that such clusters represent a case of genome compartmentalization that restrains the activity of transposable elements on genes under diversifying selection for which this activity is potentially beneficial, while protecting the rest of the genome from its deleterious effect.
Affiliation(s)
- Julien Y Dutheil
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Gertrud Mannhaupt
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
  - German Research Center for Environmental Health (GmbH), Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Gabriel Schweizer
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Christian M K Sieber
  - German Research Center for Environmental Health (GmbH), Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Martin Münsterkötter
  - German Research Center for Environmental Health (GmbH), Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Ulrich Güldener
  - German Research Center for Environmental Health (GmbH), Institute of Bioinformatics and Systems Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Jan Schirawski
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Regine Kahmann
  - Department of Organismic Interactions, Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
|
27
|
Dutheil JY, Munch K, Nam K, Mailund T, Schierup MH. Strong Selective Sweeps on the X Chromosome in the Human-Chimpanzee Ancestor Explain Its Low Divergence. PLoS Genet 2015; 11:e1005451. [PMID: 26274919 PMCID: PMC4537231 DOI: 10.1371/journal.pgen.1005451] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 05/11/2015] [Accepted: 07/20/2015] [Indexed: 11/18/2022] Open
Abstract
The human and chimpanzee X chromosomes are less divergent than expected based on autosomal divergence. We study incomplete lineage sorting patterns between humans, chimpanzees and gorillas to show that this low divergence can be entirely explained by megabase-sized regions comprising one-third of the X chromosome, where polymorphism in the human-chimpanzee ancestral species was severely reduced. We show that background selection can explain at most 10% of this reduction of diversity in the ancestor. Instead, we show that several strong selective sweeps in the ancestral species can explain it. We also report evidence of population specific sweeps in extant humans that overlap the regions of low diversity in the ancestral species. These regions further correspond to chromosomal sections shown to be devoid of Neanderthal introgression into modern humans. This suggests that the same X-linked regions that undergo selective sweeps are among the first to form reproductive barriers between diverging species. We hypothesize that meiotic drive is the underlying mechanism causing these two observations.
Affiliation(s)
- Julien Y. Dutheil
  - Institut des Sciences de l'Évolution–Montpellier (ISEM), UMR 5554, CNRS, Université Montpellier 2, Montpellier, France
- Kasper Munch
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Kiwoong Nam
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Thomas Mailund
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Mikkel H. Schierup
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
  - Department of Bioscience, Aarhus University, Aarhus, Denmark
|
28
|
Dutheil JY, Figuet E. Optimization of sequence alignments according to the number of sequences vs. number of sites trade-off. BMC Bioinformatics 2015; 16:190. [PMID: 26055961 PMCID: PMC4459672 DOI: 10.1186/s12859-015-0619-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/13/2014] [Accepted: 05/18/2015] [Indexed: 12/02/2022] Open
Abstract
Background: Comparative analysis of homologous sequences enables the understanding of evolutionary patterns at the molecular level, unraveling the functional constraints that shaped the underlying genes. Bioinformatic pipelines for comparative sequence analysis typically include procedures for (i) alignment quality assessment and (ii) control of sequence redundancy. An additional, underassessed step is the control of the amount and distribution of missing data in sequence alignments. While the number of sequences available for a given gene typically increases with time, the site-specific coverage of each alignment position remains highly variable because of differences in sequencing and annotation quality, or simply because of biological variation. For any given alignment-based analysis, the selection of sequences thus defines a trade-off between the species representation and the quantity of sites with sufficient coverage to be included in the subsequent analyses.
Results: We introduce an algorithm for the optimization of sequence alignments according to the number of sequences vs. number of sites trade-off. The algorithm uses a guide tree to compute scores for each bipartition of the alignment, allowing the recursive selection of sequence subsets with optimal combinations of sequence and site numbers. By applying our methods to two large data sets of several thousands of gene families, we show that significant site-specific coverage increases can be achieved while controlling for the species representation.
Conclusions: The algorithm introduced in this work allows the control of the distribution of missing data in any sequence alignment by removing sequences to increase the number of sites with a defined minimum coverage. We advocate that our missing data optimization procedure is an important step which should be considered in comparative analysis pipelines, together with alignment quality assessment and control of sampled diversity. An open source C++ implementation is available at http://bioweb.me/physamp.
Affiliation(s)
- Julien Y Dutheil
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, Plön, 24306, Germany
  - Institut des Sciences de l'Évolution - Montpellier, Place Eugène Bataillon - C.C. 065 -, Montpellier, 34095, France
- Emeric Figuet
  - Institut des Sciences de l'Évolution - Montpellier, Place Eugène Bataillon - C.C. 065 -, Montpellier, 34095, France
|
29
|
Figuet E, Romiguier J, Dutheil JY, Galtier N. Mitochondrial DNA as a tool for reconstructing past life-history traits in mammals. J Evol Biol 2014; 27:899-910. [PMID: 24720883 DOI: 10.1111/jeb.12361] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 02/10/2014] [Revised: 02/27/2014] [Accepted: 02/28/2014] [Indexed: 12/23/2022]
Abstract
Reconstructing the ancestral characteristics of species is a major goal in evolutionary and comparative biology. Unfortunately, fossils are not always available and sufficiently informative, and phylogenetic methods based on models of character evolution can be unsatisfactory. Genomic data offer a new opportunity to estimate ancestral character states, through (i) the correlation between DNA evolutionary processes and species life-history traits and (ii) available reliable methods for ancestral sequence inference. Here, we assess the relevance of mitochondrial DNA (the most popular molecular marker in animals) as a predictor of ancestral life-history traits in mammals, using the order of Cetartiodactyla as a benchmark. Using the complete set of 13 mitochondrial protein-coding genes, we show that the lineage-specific nonsynonymous over synonymous substitution rate ratio (dN/dS) is closely correlated with the species body mass, longevity and age of sexual maturity in Cetartiodactyla and can be used as a marker of ancestral traits provided that the noise introduced by short branches is appropriately dealt with. Based on ancestral dN/dS estimates, we predict that the first cetartiodactyls were relatively small animals (around 20 kg). This finding is in accordance with Cope's rule and the fossil record but could not be recovered via continuous character evolution methods.
Affiliation(s)
- E Figuet
  - UMR 5554, ISEM, CNRS, Université Montpellier 2, Montpellier, France
|
30
|
Dutheil JY, Gaillard S, Stukenbrock EH. MafFilter: a highly flexible and extensible multiple genome alignment files processor. BMC Genomics 2014; 15:53. [PMID: 24447531 PMCID: PMC3904536 DOI: 10.1186/1471-2164-15-53] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 08/08/2013] [Accepted: 01/16/2014] [Indexed: 11/10/2022] Open
Abstract
Background: Sequence alignments are the starting point for most evolutionary and comparative analyses. Full genome sequences can be compared to study patterns of within and between species variation. Genome sequence alignments are complex structures containing information such as coordinates, quality scores and synteny structure, which are stored in Multiple Alignment Format (MAF) files. Processing these alignments therefore involves parsing and manipulating typically large MAF files in an efficient way.
Results: MafFilter is a command-line driven program written in C++ that enables the processing of genome alignments stored in the Multiple Alignment Format in an efficient and extensible manner. It provides an extensive set of tools which can be parametrized and combined by the user via option files. We demonstrate the software's functionality and performance on several biological examples covering Primate genomics and fungal population genomics. Example analyses involve window-based alignment filtering, feature extractions and various statistics, phylogenetics and population genomics calculations.
Conclusions: MafFilter is a highly efficient and flexible tool to analyse multiple genome alignments. By allowing the user to combine a large set of available methods, as well as designing his/her own, it enables the design of custom data filtering and analysis pipelines for genomic studies. MafFilter is an open source software available at http://bioweb.me/maffilter.
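To make the kind of processing described in this abstract more concrete, the following Python sketch reads alignment blocks from MAF text and filters them by number of sequences and gap content. It does not use MafFilter, its option files, or its filter names; the class, the function names, the thresholds, and the example data are hypothetical and serve only to illustrate block-wise filtering of a MAF file. Only the `a` (block start) and `s` (sequence) line types of the format are handled.

```python
from dataclasses import dataclass


@dataclass
class MafSequence:
    src: str        # source name, e.g. "species.chromosome"
    start: int      # zero-based start on the source sequence
    size: int       # number of non-gap characters
    strand: str     # "+" or "-"
    src_size: int   # total length of the source sequence
    text: str       # aligned text, gaps written as "-"


def parse_maf(lines):
    """Yield each alignment block as a list of MafSequence records."""
    block = None
    for line in lines:
        line = line.rstrip("\n")
        if line == "a" or line.startswith("a "):
            if block:
                yield block
            block = []
        elif line.startswith("s ") and block is not None:
            _, src, start, size, strand, src_size, text = line.split()
            block.append(
                MafSequence(src, int(start), int(size), strand, int(src_size), text)
            )
        elif not line.strip():
            # A blank line terminates the current block.
            if block:
                yield block
            block = None
    if block:
        yield block


def gap_fraction(block):
    """Fraction of gap characters over all aligned characters in a block."""
    total = sum(len(seq.text) for seq in block)
    gaps = sum(seq.text.count("-") for seq in block)
    return gaps / total if total else 1.0


def filter_blocks(blocks, min_sequences=2, max_gap_fraction=0.2):
    """Keep blocks with enough sequences and few enough gaps."""
    for block in blocks:
        if len(block) >= min_sequences and gap_fraction(block) <= max_gap_fraction:
            yield block


# Hypothetical three-block MAF file used only for this illustration.
example = """##maf version=1
a score=10
s spA.chr1 0 8 + 100 ACGTACGT
s spB.chr1 5 8 + 90 ACGTACGA

a score=5
s spA.chr1 8 4 + 100 AC--GT--
s spB.chr1 13 2 + 90 A------T

a score=7
s spA.chr1 20 6 + 100 ACGTAC
"""

kept = list(filter_blocks(parse_maf(example.splitlines())))
for block in kept:
    print([seq.src for seq in block], gap_fraction(block))
```

Writing the parser as a generator means blocks are handled one at a time, which suits the large files the abstract mentions; a real pipeline would read from a file handle rather than an in-memory string.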
Affiliation(s)
- Julien Y Dutheil
- Max Planck Institute for Terrestrial Microbiology, Department of Organismic Interactions, Marburg, Germany.
|
31
|
Munch K, Mailund T, Dutheil JY, Schierup MH. A fine-scale recombination map of the human-chimpanzee ancestor reveals faster change in humans than in chimpanzees and a strong impact of GC-biased gene conversion. Genome Res 2013; 24:467-74. [PMID: 24190946 PMCID: PMC3941111 DOI: 10.1101/gr.158469.113] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 11/24/2022]
Abstract
Recombination is a major determinant of adaptive and nonadaptive evolution. Understanding how the recombination landscape has evolved in humans is thus key to the interpretation of human genomic evolution. Comparison of fine-scale recombination maps of human and chimpanzee has revealed large changes at fine genomic scales and conservation over large scales. Here we demonstrate how a fine-scale recombination map can be derived for the ancestor of human and chimpanzee, allowing us to study the changes that have occurred in human and chimpanzee since these species diverged. The map is produced from more than one million accurately determined recombination events. We find that this new recombination map is intermediate to the maps of human and chimpanzee but that the recombination landscape has evolved more rapidly in the human lineage than in the chimpanzee lineage. We use the map to show that recombination rate, through the effect of GC-biased gene conversion, is an even stronger determinant of base composition evolution than previously reported.
Affiliation(s)
- Kasper Munch
- Bioinformatics Research Centre, Aarhus University, 8000 Aarhus C, Denmark
|
32
|
Guéguen L, Gaillard S, Boussau B, Gouy M, Groussin M, Rochette NC, Bigot T, Fournier D, Pouyet F, Cahais V, Bernard A, Scornavacca C, Nabholz B, Haudry A, Dachary L, Galtier N, Belkhir K, Dutheil JY. Bio++: Efficient Extensible Libraries and Tools for Computational Molecular Evolution. Mol Biol Evol 2013; 30:1745-50. [DOI: 10.1093/molbev/mst097] [Citation(s) in RCA: 132] [Impact Index Per Article: 12.0] [Indexed: 01/07/2023] Open
|
33
|
Mailund T, Halager AE, Westergaard M, Dutheil JY, Munch K, Andersen LN, Lunter G, Prüfer K, Scally A, Hobolth A, Schierup MH. A new isolation with migration model along complete genomes infers very different divergence processes among closely related great ape species. PLoS Genet 2012; 8:e1003125. [PMID: 23284294 PMCID: PMC3527290 DOI: 10.1371/journal.pgen.1003125] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Received: 06/21/2012] [Accepted: 10/14/2012] [Indexed: 11/18/2022] Open
Abstract
We present a hidden Markov model (HMM) for inferring gradual isolation between two populations during speciation, modelled as a time interval with restricted gene flow. The HMM describes the history of adjacent nucleotides in two genomic sequences, such that the nucleotides can be separated by recombination, can migrate between populations, or can coalesce at variable time points, all dependent on the parameters of the model, which are the effective population sizes, splitting times, recombination rate, and migration rate. We show by extensive simulations that the HMM can accurately infer all parameters except the recombination rate, which is biased downwards. Inference is robust to variation in the mutation rate and the recombination rate over the sequence and also robust to unknown phase of genomes unless they are very closely related. We provide a test for whether divergence is gradual or instantaneous, and we apply the model to three key divergence processes in great apes: (a) the bonobo and common chimpanzee, (b) the eastern and western gorilla, and (c) the Sumatran and Bornean orang-utan. We find that the bonobo and chimpanzee appear to have undergone a clear split, whereas the divergence processes of the gorilla and orang-utan species occurred over several hundred thousand years with gene flow stopping quite recently. We also apply the model to the Homo/Pan speciation event and find that the most likely scenario involves an extended period of gene flow during speciation.
Affiliation(s)
- Thomas Mailund
- Bioinformatics Research Center, Aarhus University, Aarhus, Denmark.
|
34
|
Romiguier J, Figuet E, Galtier N, Douzery EJP, Boussau B, Dutheil JY, Ranwez V. Fast and robust characterization of time-heterogeneous sequence evolutionary processes using substitution mapping. PLoS One 2012; 7:e33852. [PMID: 22479459 PMCID: PMC3313935 DOI: 10.1371/journal.pone.0033852] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 12/16/2011] [Accepted: 02/22/2012] [Indexed: 12/22/2022] Open
Abstract
Genes and genomes do not evolve similarly in all branches of the tree of life. Detecting and characterizing the heterogeneity in time, and between lineages, of the nucleotide (or amino acid) substitution process is an important goal of current molecular evolutionary research. This task is typically achieved through the use of non-homogeneous models of sequence evolution, which, being highly parametrized and computationally demanding, are not appropriate for large-scale analyses. Here we investigate an alternative methodological option based on probabilistic substitution mapping. The idea is to first reconstruct the substitutional history of each site of an alignment under a homogeneous model of sequence evolution, then to characterize variations in the substitution process across lineages based on substitution counts. Using simulated and published datasets, we demonstrate that probabilistic substitution mapping is robust in that it typically provides accurate reconstruction of sequence ancestry even when the true process is heterogeneous but a homogeneous model is adopted. Consequently, we show that the new approach is essentially as efficient as, and far faster than (up to 25 000 times), existing methods, thus paving the way for a systematic survey of substitution process heterogeneity across genes and lineages.
Affiliation(s)
- Jonathan Romiguier
- Institut des Sciences de l'Evolution de Montpellier, CNRS-Université Montpellier 2, Montpellier, France.
|
35
|
Scally A, Dutheil JY, Hillier LW, Jordan GE, Goodhead I, Herrero J, Hobolth A, Lappalainen T, Mailund T, Marques-Bonet T, McCarthy S, Montgomery SH, Schwalie PC, Tang YA, Ward MC, Xue Y, Yngvadottir B, Alkan C, Andersen LN, Ayub Q, Ball EV, Beal K, Bradley BJ, Chen Y, Clee CM, Fitzgerald S, Graves TA, Gu Y, Heath P, Heger A, Karakoc E, Kolb-Kokocinski A, Laird GK, Lunter G, Meader S, Mort M, Mullikin JC, Munch K, O'Connor TD, Phillips AD, Prado-Martinez J, Rogers AS, Sajjadian S, Schmidt D, Shaw K, Simpson JT, Stenson PD, Turner DJ, Vigilant L, Vilella AJ, Whitener W, Zhu B, Cooper DN, de Jong P, Dermitzakis ET, Eichler EE, Flicek P, Goldman N, Mundy NI, Ning Z, Odom DT, Ponting CP, Quail MA, Ryder OA, Searle SM, Warren WC, Wilson RK, Schierup MH, Rogers J, Tyler-Smith C, Durbin R. Insights into hominid evolution from the gorilla genome sequence. Nature 2012; 483:169-75. [PMID: 22398555 PMCID: PMC3303130 DOI: 10.1038/nature10842] [Citation(s) in RCA: 457] [Impact Index Per Article: 38.1] [Received: 06/16/2011] [Accepted: 01/10/2012] [Indexed: 12/13/2022]
Abstract
Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago (Mya). In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.
Affiliation(s)
- Aylwyn Scally
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
|
36
|
Corbi J, Dutheil JY, Damerval C, Tenaillon MI, Manicacci D. Accelerated evolution and coevolution drove the evolutionary history of AGPase sub-units during angiosperm radiation. Ann Bot 2012; 109:693-708. [PMID: 22307567 PMCID: PMC3286274 DOI: 10.1093/aob/mcr303] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 07/15/2011] [Accepted: 11/07/2011] [Indexed: 05/10/2023]
Abstract
BACKGROUND AND AIMS: ADP-glucose pyrophosphorylase (AGPase) is a key enzyme of starch biosynthesis. In the green plant lineage, it is composed of two large (LSU) and two small (SSU) sub-units encoded by paralogous genes, as a consequence of several rounds of duplication. First, our aim was to detect specific patterns of molecular evolution following duplication events and the divergence between monocotyledons and dicotyledons. Secondly, we investigated coevolution between amino acids both within and between sub-units.
METHODS: A phylogeny of each AGPase sub-unit was built using all gymnosperm and angiosperm sequences available in databases. Accelerated evolution along specific branches was tested using the ratio of the non-synonymous to the synonymous substitution rate. Coevolution between amino acids was investigated taking into account compensatory changes between co-substitutions.
KEY RESULTS: We showed that SSU paralogues evolved under high functional constraints during angiosperm radiation, with a significant level of coevolution between amino acids that participate in SSU major functions. In contrast, in the LSU paralogues, we identified residues under positive selection (1) following the first LSU duplication that gave rise to two paralogues mainly expressed in angiosperm source and sink tissues, respectively; and (2) following the emergence of grass-specific paralogues expressed in the endosperm. Finally, we found coevolution between residues that belong to the interaction domains of both sub-units.
CONCLUSIONS: Our results support the view that coevolution among amino acid residues, especially those lying in the interaction domain of each sub-unit, played an important role in AGPase evolution. First, within SSU, coevolution allowed compensating mutations in a highly constrained context. Secondly, the LSU paralogues probably acquired tissue-specific expression and regulatory properties via the coevolution between sub-unit interacting domains. Finally, the pattern we observed during LSU evolution is consistent with repeated sub-functionalization under 'Escape from Adaptive Conflict', a model rarely illustrated in the literature.
Affiliation(s)
- Jonathan Corbi
- CNRS, UMR 0320/UMR 8120 Génétique Végétale, Ferme du Moulon, F-91190 Gif sur Yvette, France
- Julien Y. Dutheil
- BiRC-Bioinformatics Research Center, Aarhus University, C.F. Møllers Alle 8, Building 1110, DK-8000 Århus C, Denmark
- Catherine Damerval
- CNRS, UMR 0320/UMR 8120 Génétique Végétale, Ferme du Moulon, F-91190 Gif sur Yvette, France
- Maud I. Tenaillon
- CNRS, UMR 0320/UMR 8120 Génétique Végétale, Ferme du Moulon, F-91190 Gif sur Yvette, France
- Domenica Manicacci
- Université Paris-Sud, UMR 0320/UMR 8120 Génétique Végétale, Ferme du Moulon, F-91190 Gif sur Yvette, France
|
37
|
Dutheil JY, Galtier N, Romiguier J, Douzery EJ, Ranwez V, Boussau B. Efficient Selection of Branch-Specific Models of Sequence Evolution. Mol Biol Evol 2012; 29:1861-74. [DOI: 10.1093/molbev/mss059] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Indexed: 11/13/2022] Open
|
38
|
Abstract
Large amounts of genome data are being generated by second- and now also third-generation sequencing technologies. The challenge no longer lies in the generation of the data, but in their analysis. We present an overview of approaches and methods to compare complete sequences of related fungal genomes. We focus on evolutionary analyses of genome alignments to describe species divergence and to identify footprints of demography and natural selection within and between species.
Affiliation(s)
- Eva H Stukenbrock
- Max Planck Institute for Terrestrial Microbiology, Karl von Frisch Str, Marburg, Germany.
|
39
|
Abstract
The full genomes of several closely related species are now available, opening an emerging field of investigation borrowing both from population genetics and phylogenetics. Provided we can properly model sequence evolution within populations undergoing speciation events, this resource enables us to estimate key population genetics parameters, such as ancestral population sizes and split times. Furthermore, we can enhance our understanding of the recombination process and investigate various selective forces. We discuss the basic speciation models for closely related species, including the isolation and isolation-with-migration models. A major point in our discussion is that only a few complete genomes contain much information about the whole population. The reason is that recombination unlinks genomic regions, and therefore a few genomes contain many segments with distinct histories. The challenge of population genomics is to decode this mosaic of histories in order to infer scenarios of demography and selection. We survey different approaches for understanding ancestral species from analyses of genomic data from closely related species. In particular, we emphasize core assumptions and working hypotheses. Finally, we discuss computational and statistical challenges that arise in the analysis of population genomics data sets.
Affiliation(s)
- Julien Y Dutheil
- Institut des Sciences de l'Évolution Montpellier (ISE-M), UMR 5554, CNRS, Université Montpellier, Montpellier, France.
|
40
|
Stukenbrock EH, Bataillon T, Dutheil JY, Hansen TT, Li R, Zala M, McDonald BA, Wang J, Schierup MH. The making of a new pathogen: insights from comparative population genomics of the domesticated wheat pathogen Mycosphaerella graminicola and its wild sister species. Genome Res 2011; 21:2157-66. [PMID: 21994252 DOI: 10.1101/gr.118851.110] [Citation(s) in RCA: 159] [Impact Index Per Article: 12.2] [Indexed: 01/23/2023]
Abstract
The fungus Mycosphaerella graminicola emerged as a new pathogen of cultivated wheat during its domestication ~11,000 yr ago. We assembled 12 high-quality full genome sequences to investigate the genetic footprints of selection in this wheat pathogen and closely related sister species that infect wild grasses. We demonstrate a strong effect of natural selection in shaping the pathogen genomes with only ~3% of nonsynonymous mutations being effectively neutral. Forty percent of all fixed nonsynonymous substitutions, on the other hand, are driven by positive selection. Adaptive evolution has affected M. graminicola to the highest extent, consistent with recent host specialization. Positive selection has prominently altered genes encoding secreted proteins and putative pathogen effectors, supporting the premise that molecular host-pathogen interaction is a strong driver of pathogen evolution. Recent divergence between pathogen sister species is attested by the high degree of incomplete lineage sorting (ILS) in their genomes. We exploit ILS to generate a genetic map of the species without any crossing data, document recent times of species divergence relative to genome divergence, and show that gene-rich regions or regions with low recombination experience stronger effects of natural selection on neutral diversity. Emergence of a new agricultural host selected a highly specialized and fast-evolving pathogen with unique evolutionary patterns compared with its wild relatives. The strong impact of natural selection that we document is at odds with the small effective population sizes estimated and suggests that population sizes were historically large but likely unstable.
Affiliation(s)
- Eva H Stukenbrock
- Bioinformatics Research Center, Aarhus University, C.F. Moellers Alle, DK-8000 Aarhus C, Denmark.
|
41
|
Abstract
Positions in a molecule that share a common constraint do not evolve independently, and therefore leave a signature in the patterns of homologous sequences. Exhibiting such positions with a coevolution pattern from a sequence alignment has great potential for predicting functional and structural properties of molecules through comparative analysis. This task is complicated by the existence of additional correlation sources, leading to false predictions. The nature of the data is a major source of noise correlation: sequences are taken from individuals with different degrees of relatedness, and are therefore intrinsically correlated. This has led to several method developments in different fields that are potentially confusing for non-expert users interested in these methodologies. It also explains why coevolution detection methods are rarely used despite the importance of the biological questions they address. In this article, I focus on the role of shared ancestry for understanding molecular coevolution patterns. I review and classify existing coevolution detection methods according to their ability to handle shared ancestry. Using a ribosomal RNA benchmark data set, for which detailed knowledge of the structure and coevolution patterns is available, I demonstrate and explain why taking the underlying evolutionary history of sequences into account is the only way to extract the full coevolution signal in the data. I also evaluate, using rigorous statistical procedures, the best approaches to do so, and discuss several important biological aspects to consider when performing coevolution analyses.
Affiliation(s)
- Julien Y Dutheil
- Institut des Sciences de l'Evolution - Montpellier (I.S.E.-M.) Unité Mixte de Recherche UMII - CNRS (UMR 5554) Université de Montpellier II - CC 065 34095 Montpellier Cedex 05.
|
42
|
Abstract
In this paper, we introduce a new Graphical User Interface that estimates evolutionary rates on protein sequences by assessing changes in biochemical constraints. We describe IMPACT, a platform-independent (tested in Linux, Windows, and MacOS), easy to install software written in Java. IMPACT integrates the use of a built-in multiple sequence alignment editor, with programs that perform phylogenetic and protein structure analyses (ConTest, PhyML, ATV, and Jmol) allowing the user to quickly and efficiently perform evolutionary analyses on protein sequences, including the detection of selection (negative and positive) signatures at the amino acid scale, which can provide fundamental insight about species evolution and ecological fitness. IMPACT provides the user with a working platform that combines a number of bioinformatics tools and utilities in one place, transferring information directly among the various programs and therefore increasing the overall performance of evolutionary analyses on proteins.
Affiliation(s)
- Emanuel Maldonado
- CIMAR/CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Porto, Portugal
|
43
|
Mailund T, Dutheil JY, Hobolth A, Lunter G, Schierup MH. Estimating divergence time and ancestral effective population size of Bornean and Sumatran orangutan subspecies using a coalescent hidden Markov model. PLoS Genet 2011; 7:e1001319. [PMID: 21408205 PMCID: PMC3048369 DOI: 10.1371/journal.pgen.1001319] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Received: 11/17/2009] [Accepted: 01/25/2011] [Indexed: 12/01/2022] Open
Abstract
Due to genetic variation in the ancestor of two populations or two species, the divergence time for DNA sequences from two populations is variable along the genome. Within genomic segments all bases will share the same divergence—because they share a most recent common ancestor—when no recombination event has occurred to split them apart. The size of these segments of constant divergence depends on the recombination rate, but also on the speciation time, the effective population size of the ancestral population, as well as demographic effects and selection. Thus, inference of these parameters may be possible if we can decode the divergence times along a genomic alignment. Here, we present a new hidden Markov model that infers the changing divergence (coalescence) times along the genome alignment using a coalescent framework, in order to estimate the speciation time, the recombination rate, and the ancestral effective population size. The model is efficient enough to allow inference on whole-genome data sets. We first investigate the power and consistency of the model with coalescent simulations and then apply it to the whole-genome sequences of the two orangutan sub-species, Bornean (P. p. pygmaeus) and Sumatran (P. p. abelii) orangutans from the Orangutan Genome Project. We estimate the speciation time between the two sub-species to be thousand years ago and the effective population size of the ancestral orangutan species to be , consistent with recent results based on smaller data sets. We also report a negative correlation between chromosome size and ancestral effective population size, which we interpret as a signature of recombination increasing the efficacy of selection. We present a hidden Markov model that uses variation in coalescence times between two distantly related populations, or closely related species, to infer population genetics parameters in ancestral population or species. The model infers the divergence times in segments along the alignment. 
Using coalescent simulations, we show that the model accurately estimates the divergence time between the two populations and the effective population size of the ancestral population. We apply the model to the recently sequenced orangutan sub-species and estimate their divergence time and the effective population size of their ancestor population.
Affiliation(s)
- Thomas Mailund
- Bioinformatics Research Centre, Aarhus University, Denmark.
|
44
|
Hobolth A, Dutheil JY, Hawks J, Schierup MH, Mailund T. Incomplete lineage sorting patterns among human, chimpanzee, and orangutan suggest recent orangutan speciation and widespread selection. Genome Res 2011; 21:349-56. [PMID: 21270173 PMCID: PMC3044849 DOI: 10.1101/gr.114751.110] [Citation(s) in RCA: 168] [Impact Index Per Article: 12.9] [Indexed: 11/24/2022]
Abstract
We search the complete orangutan genome for regions where humans are more closely related to orangutans than to chimpanzees due to incomplete lineage sorting (ILS) in the ancestor of human and chimpanzees. The search uses our recently developed coalescent hidden Markov model (HMM) framework. We find ILS present in ∼1% of the genome, and that the ancestral species of human and chimpanzees never experienced a severe population bottleneck. The existence of ILS is validated with simulations, site pattern analysis, and analysis of rare genomic events. The existence of ILS allows us to disentangle the time of isolation of humans and orangutans (the speciation time) from the genetic divergence time, and we find speciation to be as recent as 9-13 million years ago (Mya; contingent on the calibration point). The analyses provide further support for a recent speciation of human and chimpanzee at ∼4 Mya and a diverse ancestor of human and chimpanzee with an effective population size of about 50,000 individuals. Posterior decoding infers ILS for each nucleotide in the genome, and we use this to deduce patterns of selection in the ancestral species. We demonstrate the effect of background selection in the common ancestor of humans and chimpanzees. In agreement with predictions from population genetics, ILS was found to be reduced in exons and gene-dense regions when we control for confounding factors such as GC content and recombination rate. Finally, we find the broad-scale recombination rate to be conserved through the complete ape phylogeny.
Affiliation(s)
- Asger Hobolth
- Bioinformatics Research Center, Aarhus University, DK-8000 Aarhus C, Denmark
- Julien Y. Dutheil
- Bioinformatics Research Center, Aarhus University, DK-8000 Aarhus C, Denmark
- John Hawks
- University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
- Mikkel H. Schierup
- Bioinformatics Research Center, Aarhus University, DK-8000 Aarhus C, Denmark
- Department of Biology, Aarhus University, DK-8000 Aarhus C, Denmark
- Corresponding authors; fax 45-8942-3077.
- Thomas Mailund
- Bioinformatics Research Center, Aarhus University, DK-8000 Aarhus C, Denmark
- Corresponding authors; fax 45-8942-3077.
|
45
|
Abstract
It has long been accepted that the structural constraints stemming from the 3D structure of ribosomal RNA (rRNA) lead to coevolution through compensating mutations between interacting sites. State-of-the-art methods for detecting coevolving sites, however, while reaching high levels of specificity and sensitivity for Watson-Crick (WC) pairs of the helices defining the secondary structure, only scarcely reveal tertiary interactions occurring at the level of the 3D structure. In order to understand the relative failure of coevolutionary methods to detect such interactions, we analyze 2,682 interacting sites derived from high-resolution structures, which include a comprehensive data set of rRNA sequences from Archaea and Bacteria. We report a striking difference in the amount of coevolution between WC and non-WC pairs. In order to understand this pattern, we derive fitness landscapes from the geometry of base pairing interactions and construct neutral networks of substitutions for each type of interaction. These networks show that coevolution is a property of WC pairs because, unlike non-WC pairs, their landscapes exhibit fitness valleys, a single mutation in a WC pair resulting in a fitness drop. Second, we used the inferred neutral networks to estimate the level of constraint acting on each type of base pair and show that it correlates negatively with the observed rate of substitutions for all non-WC pairs. WC pairs appear as outliers, fixing more substitutions than expected according to their level of constraint. We here propose that the rate of substitution in WC pairs is due to coevolution resulting from constraints acting at intermediate levels of organization, namely the one of the helical stem with its forming WC pairs. In agreement with this hypothesis, we report a significant excess of intrahelical, inter-WC pairs coevolution compared with interhelices pairs. 
Altogether, these results show that detailed biochemical knowledge is required and has to be incorporated into evolutionary reasoning in order to understand the fine patterns of variation at the molecular level. They also demonstrate that coevolutionary analysis provides almost exclusively 2D information and only little 3D signal.
Affiliation(s)
- Julien Y Dutheil
- Bioinformatics Research Center (BiRC), Aarhus University, Aarhus, Denmark.
|