1
Chen Y, Ma T, Zhang T, Ma L. Trends in the evolution of intronless genes in Poaceae. Front Plant Sci 2023; 14:1065631. [PMID: 36875616 PMCID: PMC9978806 DOI: 10.3389/fpls.2023.1065631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Accepted: 02/01/2023] [Indexed: 06/18/2023]
Abstract
Intronless genes (IGs), which are a feature of prokaryotes, are a fascinating group of genes that are also present in eukaryotes. In the current study, a comparison of Poaceae genomes revealed that the origin of IGs may have involved ancient intronic splicing, reverse transcription, and retrotranspositions. Additionally, IGs exhibit the typical features of rapid evolution, including recent duplications, variable copy numbers, low divergence between paralogs, and high non-synonymous to synonymous substitution ratios. By tracing IG families along the phylogenetic tree, we determined that the evolutionary dynamics of IGs differed among Poaceae subfamilies. IG families developed rapidly before the divergence of Pooideae and Oryzoideae and expanded slowly after the divergence. In contrast, they emerged gradually and consistently in the Chloridoideae and Panicoideae clades during evolution. Furthermore, IGs are expressed at low levels. Under relaxed selection pressure, retrotranspositions, intron loss, and gene duplications and conversions may promote the evolution of IGs. The comprehensive characterization of IGs is critical for in-depth studies on intron functions and evolution as well as for assessing the importance of introns in eukaryotes.
Affiliation(s)
- Yong Chen
- Lei Ma
- Correspondence: Tingting Zhang; Lei Ma
2
Lim CS, Weinstein BN, Roy SW, Brown CM. Analysis of fungal genomes reveals commonalities of intron gain or loss and functions in intron-poor species. Mol Biol Evol 2021; 38:4166-4186. [PMID: 33772558 PMCID: PMC8476143 DOI: 10.1093/molbev/msab094] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 11/12/2022]
Abstract
Previous evolutionary reconstructions have concluded that early eukaryotic ancestors including both the last common ancestor of eukaryotes and of all fungi had intron-rich genomes. By contrast, some extant eukaryotes have few introns, underscoring the complex histories of intron–exon structures, and raising the question as to why these few introns are retained. Here, we have used recently available fungal genomes to address a variety of questions related to intron evolution. Evolutionary reconstruction of intron presence and absence using 263 diverse fungal species supports the idea that massive intron reduction through intron loss has occurred in multiple clades. The intron densities estimated in various fungal ancestors differ from zero to 7.6 introns per 1 kb of protein-coding sequence. Massive intron loss has occurred not only in microsporidian parasites and saccharomycetous yeasts, but also in diverse smuts and allies. To investigate the roles of the remaining introns in highly-reduced species, we have searched for their special characteristics in eight intron-poor fungi. Notably, the introns of ribosome-associated genes RPL7 and NOG2 have conserved positions; both intron-containing genes encoding snoRNAs. Furthermore, both the proteins and snoRNAs are involved in ribosome biogenesis, suggesting that the expression of the protein-coding genes and noncoding snoRNAs may be functionally coordinated. Indeed, these introns are also conserved in three-quarters of fungi species. Our study shows that fungal introns have a complex evolutionary history and underappreciated roles in gene expression.
Affiliation(s)
- Chun Shen Lim
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Brooke N Weinstein
  - Quantitative & Systems Biology, School of Natural Sciences, University of California-Merced, Merced, CA, USA
  - Department of Biology, San Francisco State University, San Francisco, CA, USA
- Scott W Roy
  - Quantitative & Systems Biology, School of Natural Sciences, University of California-Merced, Merced, CA, USA
  - Department of Biology, San Francisco State University, San Francisco, CA, USA
- Chris M Brown
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
3
Schaefke B, Sun W, Li YS, Fang L, Chen W. The evolution of posttranscriptional regulation. Wiley Interdiscip Rev RNA 2018; 9:e1485. [PMID: 29851258 DOI: 10.1002/wrna.1485] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 02/08/2018] [Revised: 04/23/2018] [Accepted: 04/26/2018] [Indexed: 12/13/2022]
Abstract
"DNA makes RNA makes protein." After transcription, mRNAs undergo a series of intertwining processes to be finally translated into functional proteins. The "posttranscriptional" regulation (PTR) provides cells an extended option to fine-tune their proteomes. To meet the demands of complex organism development and the appropriate response to environmental stimuli, every step in these processes needs to be finely regulated. Moreover, changes in these regulatory processes are important driving forces underlying the evolution of phenotypic differences across different species. The major PTR mechanisms discussed in this review include the regulation of splicing, polyadenylation, decay, and translation. For alternative splicing and polyadenylation, we mainly discuss their evolutionary dynamics and the genetic changes underlying the regulatory differences in cis-elements versus trans-factors. For mRNA decay and translation, which, together with transcription, determine the cellular RNA or protein abundance, we focus our discussion on how their divergence coordinates with transcriptional changes to shape the evolution of gene expression. Then to highlight the importance of PTR in the evolution of higher complexity, we focus on their roles in two major phenomena during eukaryotic evolution: the evolution of multicellularity and the division of labor between different cell types and tissues; and the emergence of diverse, often highly specialized individual phenotypes, especially those concerning behavior in eusocial insects. This article is categorized under: RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution Translation > Translation Regulation RNA Processing > Splicing Regulation/Alternative Splicing.
Affiliation(s)
- Bernhard Schaefke
  - Department of Biology, Southern University of Science and Technology, Shenzhen, China
- Wei Sun
  - Department of Biology, Southern University of Science and Technology, Shenzhen, China
  - Department of Pharmaceutical Chemistry and Cardiovascular Research Institute, University of California San Francisco, San Francisco
- Yi-Sheng Li
  - Department of Biology, Southern University of Science and Technology, Shenzhen, China
- Liang Fang
  - Department of Biology, Southern University of Science and Technology, Shenzhen, China
  - Medi-X Institute, SUSTech Academy for Advanced Interdisciplinary Studies, Southern University of Science and Technology, Shenzhen, China
- Wei Chen
  - Department of Biology, Southern University of Science and Technology, Shenzhen, China
  - Medi-X Institute, SUSTech Academy for Advanced Interdisciplinary Studies, Southern University of Science and Technology, Shenzhen, China
4
Grau-Bové X, Torruella G, Donachie S, Suga H, Leonard G, Richards TA, Ruiz-Trillo I. Dynamics of genomic innovation in the unicellular ancestry of animals. eLife 2017; 6:26036. [PMID: 28726632 PMCID: PMC5560861 DOI: 10.7554/elife.26036] [Citation(s) in RCA: 94] [Impact Index Per Article: 11.8] [Received: 02/21/2017] [Accepted: 07/11/2017] [Indexed: 12/29/2022]
Abstract
Which genomic innovations underpinned the origin of multicellular animals is still an open debate. Here, we investigate this question by reconstructing the genome architecture and gene family diversity of ancestral premetazoans, aiming to date the emergence of animal-like traits. Our comparative analysis involves genomes from animals and their closest unicellular relatives (the Holozoa), including four new genomes: three Ichthyosporea and Corallochytrium limacisporum. Here, we show that the earliest animals were shaped by dynamic changes in genome architecture before the emergence of multicellularity: an early burst of gene diversity in the ancestor of Holozoa, enriched in transcription factors and cell adhesion machinery, was followed by multiple and differently-timed episodes of synteny disruption, intron gain and genome expansions. Thus, the foundations of animal genome architecture were laid before the origin of complex multicellularity – highlighting the necessity of a unicellular perspective to understand early animal evolution. DOI: http://dx.doi.org/10.7554/eLife.26036.001
Hundreds of millions of years ago, some single-celled organisms gained the ability to work together and form multicellular organisms. This transition was a major step in evolution and took place at separate times in several parts of the tree of life, including in animals, plants, fungi and algae. Animals are some of the most complex organisms on Earth. Their single-celled ancestors were also quite genetically complex themselves and their genomes (the complete set of the organism’s DNA) already contained many genes that now coordinate the activity of the cells in a multicellular organism. The genome of an animal typically has certain features: it is large, diverse and contains many segments (called introns) that are not genes. By seeing if the single-celled relatives of animals share these traits, it is possible to learn more about when specific genetic features first evolved, and whether they are linked to the origin of animals. Now, Grau-Bové et al. have studied the genomes of several of the animal kingdom’s closest single-celled relatives using a technique called whole genome sequencing. This revealed that there was a period of rapid genetic change in the single-celled ancestors of animals during which their genes became much more diverse. Another ‘explosion’ of diversity happened after animals had evolved. Furthermore, the overall amount of the genomic content inside cells and the number of introns found in the genome rapidly increased in separate, independent events in both animals and their single-celled ancestors. Future research is needed to investigate whether other multicellular life forms – such as plants, fungi and algae – originated in the same way as animal life. Understanding how the genetic material of animals evolved also helps us to understand the genetic structures that affect our health. For example, genes that coordinate the behavior of cells (and so are important for multicellular organisms) also play a role in cancer, where cells break free of this regulation to divide uncontrollably. DOI: http://dx.doi.org/10.7554/eLife.26036.002
Affiliation(s)
- Xavier Grau-Bové
  - Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Catalonia, Spain
  - Departament de Genètica, Microbiologia i Estadística, Universitat de Barcelona, Barcelona, Catalonia, Spain
- Guifré Torruella
  - Unité d'Ecologie, Systématique et Evolution, Université Paris-Sud/Paris-Saclay, AgroParisTech, Orsay, France
- Stuart Donachie
  - Department of Microbiology, University of Hawai'i at Mānoa, Honolulu, United States
  - Advanced Studies in Genomics, Proteomics and Bioinformatics, University of Hawai'i at Mānoa, Honolulu, United States
- Hiroshi Suga
  - Faculty of Life and Environmental Sciences, Prefectural University of Hiroshima, Hiroshima, Japan
- Guy Leonard
  - Department of Biosciences, University of Exeter, Exeter, United Kingdom
- Thomas A Richards
  - Department of Biosciences, University of Exeter, Exeter, United Kingdom
- Iñaki Ruiz-Trillo
  - Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Catalonia, Spain
  - Departament de Genètica, Microbiologia i Estadística, Universitat de Barcelona, Barcelona, Catalonia, Spain
  - ICREA, Passeig Lluís Companys, Barcelona, Catalonia, Spain
5
Wang Y, Xu L, Thilmony R, You FM, Gu YQ, Coleman-Derr D. PIECE 2.0: an update for the plant gene structure comparison and evolution database. Nucleic Acids Res 2016; 45:1015-1020. [PMID: 27742820 PMCID: PMC5210635 DOI: 10.1093/nar/gkw935] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 09/14/2016] [Revised: 10/04/2016] [Accepted: 10/12/2016] [Indexed: 11/30/2022]
Abstract
PIECE (Plant Intron Exon Comparison and Evolution) is a web-accessible database that houses intron and exon information of plant genes. PIECE serves as a resource for biologists interested in comparing intron–exon organization and provides valuable insights into the evolution of gene structure in plant genomes. Recently, we updated PIECE to a new version, PIECE 2.0 (http://probes.pw.usda.gov/piece or http://aegilops.wheat.ucdavis.edu/piece). PIECE 2.0 contains annotated genes from 49 sequenced plant species as compared to 25 species in the previous version. In the current version, we also added several new features: (i) a new viewer was developed to show phylogenetic trees displayed along with the structure of individual genes; (ii) genes in the phylogenetic tree can now be also grouped according to KOG (The annotation of Eukaryotic Orthologous Groups) and KO (KEGG Orthology) in addition to Pfam domains; (iii) information on intronless genes are now included in the database; (iv) a statistical summary of global gene structure information for each species and its comparison with other species was added; and (v) an improved GSDraw tool was implemented in the web server to enhance the analysis and display of gene structure. The updated PIECE 2.0 database will be a valuable resource for the plant research community for the study of gene structure and evolution.
Affiliation(s)
- Yi Wang
  - USDA-ARS, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA 94710, USA
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
  - USDA-ARS, Plant Gene Expression Center, Albany, CA 94710, USA
- Ling Xu
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
  - USDA-ARS, Plant Gene Expression Center, Albany, CA 94710, USA
- Roger Thilmony
  - USDA-ARS, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA 94710, USA
- Frank M You
  - Cereal Research Centre, Agriculture and Agri-Food Canada, Morden R6M 1Y5 MB, Canada
- Yong Q Gu
  - USDA-ARS, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA 94710, USA
- Devin Coleman-Derr
  - Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
  - USDA-ARS, Plant Gene Expression Center, Albany, CA 94710, USA
6
Kannan S, Rogozin IB, Koonin EV. MitoCOGs: clusters of orthologous genes from mitochondria and implications for the evolution of eukaryotes. BMC Evol Biol 2014; 14:237. [PMID: 25421434 PMCID: PMC4256733 DOI: 10.1186/s12862-014-0237-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 07/30/2014] [Accepted: 11/07/2014] [Indexed: 01/19/2023]
Abstract
Background: Mitochondria are ubiquitous membranous organelles of eukaryotic cells that evolved from an alpha-proteobacterial endosymbiont and possess a small genome that encompasses from 3 to 106 genes. Accumulation of thousands of mitochondrial genomes from diverse groups of eukaryotes provides an opportunity for a comprehensive reconstruction of the evolution of the mitochondrial gene repertoire.
Results: Clusters of orthologous mitochondrial protein-coding genes (MitoCOGs) were constructed from all available mitochondrial genomes and complemented with nuclear orthologs of mitochondrial genes. With minimal exceptions, the mitochondrial gene complements of eukaryotes are subsets of the superset of 66 genes found in jakobids. Reconstruction of the evolution of mitochondrial genomes indicates that the mitochondrial gene set of the last common ancestor of the extant eukaryotes was slightly larger than that of jakobids. This superset of mitochondrial genes likely represents an intermediate stage following the loss and transfer to the nucleus of most of the endosymbiont genes early in eukaryote evolution. Subsequent evolution in different lineages involved largely parallel transfer of ancestral endosymbiont genes to the nuclear genome. The intron density in nuclear orthologs of mitochondrial genes typically is nearly the same as in the rest of the genes in the respective genomes. However, in land plants, the intron density in nuclear orthologs of mitochondrial genes is almost 1.5-fold lower than the genomic mean, suggestive of ongoing transfer of functional genes from mitochondria to the nucleus.
Conclusions: The MitoCOGs are expected to become an important resource for the study of mitochondrial evolution. The nearly complete superset of mitochondrial genes in jakobids likely represents an intermediate stage in the evolution of eukaryotes after the initial, extensive loss and transfer of the endosymbiont genes. In addition, the bacterial multi-subunit RNA polymerase that is encoded in the jakobid mitochondrial genomes was replaced by a single-subunit phage-type RNA polymerase in the rest of the eukaryotes. These results are best compatible with the rooting of the eukaryotic tree between jakobids and the rest of the eukaryotes. The land plants are the only eukaryotic branch in which the gene transfer from the mitochondrial to the nuclear genome appears to be an active, ongoing process.
Affiliation(s)
- Sivakumar Kannan
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Igor B Rogozin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
7
Effects of taxon sampling in reconstructions of intron evolution. Int J Genomics 2013; 2013:671316. [PMID: 23671844 PMCID: PMC3647540 DOI: 10.1155/2013/671316] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/15/2012] [Accepted: 01/02/2013] [Indexed: 11/26/2022]
Abstract
Introns comprise a considerable portion of eukaryotic genomes; however, their evolution is understudied. Numerous works of the last years largely disagree on many aspects of intron evolution. Interpretation of these differences is hindered because different algorithms and taxon sampling strategies were used. Here, we present the first attempt of a systematic evaluation of the effects of taxon sampling on popular intron evolution estimation algorithms. Using the “taxon jackknife” method, we compared the effect of taxon sampling on the behavior of intron evolution inferring algorithms. We show that taxon sampling can dramatically affect the inferences and identify conditions where algorithms are prone to systematic errors. Presence or absence of some key species is often more important than the taxon sampling size alone. Criteria of representativeness of the taxonomic sampling for reliable reconstructions are outlined. Presence of the deep-branching species with relatively high intron density is more important than sheer number of species. According to these criteria, currently available genomic databases are representative enough to provide reliable inferences of the intron evolution in animals, land plants, and fungi, but they underrepresent many groups of unicellular eukaryotes, including the well-studied Alveolata.
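The abstract above does not spell out the "taxon jackknife" procedure, so the following is only a minimal sketch under one plausible reading: drop one taxon at a time, rerun the estimator, and record how far the estimate moves. The data, the function names, and the toy estimator (counting intron positions shared by at least two taxa) are all hypothetical stand-ins, not the algorithms evaluated in the paper.

```python
from typing import Callable, Dict, FrozenSet, Mapping

# Hypothetical input type: taxon name -> set of intron positions observed in it.
Presence = Mapping[str, FrozenSet[int]]


def toy_shared_intron_count(presence: Presence) -> int:
    """Toy stand-in estimator: number of positions present in >= 2 taxa."""
    counts: Dict[int, int] = {}
    for positions in presence.values():
        for pos in positions:
            counts[pos] = counts.get(pos, 0) + 1
    return sum(1 for c in counts.values() if c >= 2)


def taxon_jackknife(
    presence: Presence, estimator: Callable[[Presence], int]
) -> Dict[str, int]:
    """Leave one taxon out at a time; return the shift in the estimate."""
    full_estimate = estimator(presence)
    shifts: Dict[str, int] = {}
    for taxon in presence:
        reduced = {t: p for t, p in presence.items() if t != taxon}
        shifts[taxon] = estimator(reduced) - full_estimate
    return shifts


# Hypothetical example data (not from the paper).
example: Presence = {
    "A": frozenset({1, 2, 3}),
    "B": frozenset({2, 3, 4}),
    "C": frozenset({3, 5}),
    "D": frozenset({1, 5}),
}
shifts = taxon_jackknife(example, toy_shared_intron_count)
```

Taxa whose removal produces the largest shift are the ones the estimate depends on most, which loosely mirrors the abstract's point that the presence of key species can matter more than sample size.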
8
Hammesfahr B, Odronitz F, Mühlhausen S, Waack S, Kollmar M. GenePainter: a fast tool for aligning gene structures of eukaryotic protein families, visualizing the alignments and mapping gene structures onto protein structures. BMC Bioinformatics 2013; 14:77. [PMID: 23496949 PMCID: PMC3605371 DOI: 10.1186/1471-2105-14-77] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 09/10/2012] [Accepted: 02/24/2013] [Indexed: 11/10/2022]
Abstract
Background: All sequenced eukaryotic genomes have been shown to possess at least a few introns. This includes those unicellular organisms, which were previously suspected to be intron-less. Therefore, gene splicing must have been present at least in the last common ancestor of the eukaryotes. To explain the evolution of introns, basically two mutually exclusive concepts have been developed. The introns-early hypothesis says that already the very first protein-coding genes contained introns while the introns-late concept asserts that eukaryotic genes gained introns only after the emergence of the eukaryotic lineage. A very important aspect in this respect is the conservation of intron positions within homologous genes of different taxa.
Results: GenePainter is a standalone application for mapping gene structure information onto protein multiple sequence alignments. Based on the multiple sequence alignments the gene structures are aligned down to single nucleotides. GenePainter accounts for variable lengths in exons and introns, respects split codons at intron junctions and is able to handle sequencing and assembly errors, which are possible reasons for frame-shifts in exons and gaps in genome assemblies. Thus, even gene structures of considerably divergent proteins can properly be compared, as it is needed in phylogenetic analyses. Conserved intron positions can also be mapped to user-provided protein structures. For their visualization GenePainter provides scripts for the molecular graphics system PyMol.
Conclusions: GenePainter is a tool to analyse gene structure conservation providing various visualization options. A stable version of GenePainter for all operating systems as well as documentation and example data are available at http://www.motorprotein.de/genepainter.html.
Affiliation(s)
- Björn Hammesfahr
  - Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, Göttingen, 37077, Germany
9
Abstract
Gene structure data can substantially advance our understanding of metazoan evolution and deliver an independent approach to resolve conflicts among existing hypotheses. Here, we used changes of spliceosomal intron positions as a novel phylogenetic marker to reconstruct the animal tree. This kind of data is inferred from orthologous genes containing mutually exclusive introns at pairs of sequence positions in close proximity, so-called near intron pairs (NIPs). NIP data were collected for 48 species and utilized as binary genome-level characters in maximum parsimony (MP) analyses to reconstruct deep metazoan phylogeny. All groupings that were obtained with more than 80% bootstrap support are consistent with currently supported phylogenetic hypotheses. This includes monophyletic Chordata, Vertebrata, Nematoda, Platyhelminthes and Trochozoa. Several other clades such as Deuterostomia, Protostomia, Arthropoda, Ecdysozoa, Spiralia, and Eumetazoa, however, failed to be recovered due to a few problematic taxa such as the mite Ixodes and the warty comb jelly Mnemiopsis. The corresponding unexpected branchings can be explained by the paucity of synapomorphic changes of intron positions shared between some genomes, by the sensitivity of MP analyses to long-branch attraction (LBA), and by the very unequal evolutionary rates of intron loss and intron gain during evolution of the different subclades of metazoans. In addition, we obtained an assemblage of Cnidaria, Porifera, and Placozoa as sister group of Bilateria+Ctenophora with medium support, a disputable, but remarkable result. We conclude that NIPs can be used as phylogenetic characters also within a broader phylogenetic context, given that they have emerged regularly during evolution irrespective of the large variation of intron density across metazoan genomes.
Affiliation(s)
- Jörg Lehmann
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany
10
Koonin EV, Csuros M, Rogozin IB. Whence genes in pieces: reconstruction of the exon-intron gene structures of the last eukaryotic common ancestor and other ancestral eukaryotes. Wiley Interdiscip Rev RNA 2012; 4:93-105. [PMID: 23139082 DOI: 10.1002/wrna.1143] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Indexed: 01/27/2023]
Abstract
In eukaryotes, protein-coding sequences are interrupted by non-coding sequences known as introns. During mRNA maturation, introns are excised by the spliceosome and the coding regions, exons, are spliced to form the mature coding region. The intron densities widely differ between eukaryotic lineages, from 6 to 7 introns per kb of coding sequence in vertebrates, some invertebrates and green plants, to only a few introns across the entire genome in many unicellular eukaryotes. Evolutionary reconstructions using maximum likelihood methods suggest intron-rich ancestors for each major group of eukaryotes. For the last common ancestor of animals, the highest intron density of all extant and extinct eukaryotes was inferred, at 120-130% of the human intron density. Furthermore, an intron density within 53-74% of the human values was inferred for the last eukaryotic common ancestor. Accordingly, evolution of eukaryotic genes in all lines of descent involved primarily intron loss, with substantial gain only at the bases of several branches including plants and animals. These conclusions have substantial biological implications indicating that the common ancestor of all modern eukaryotes was a complex organism with a gene architecture resembling those in multicellular organisms. Alternative splicing most likely initially appeared as an inevitable result of splicing errors and only later was employed to generate structural and functional diversification of proteins.
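The abstract reports intron density as introns per kb of coding sequence and expresses ancestral densities as a percentage of the human value. A minimal sketch of that arithmetic follows; the function names and the numbers in the usage lines are illustrative assumptions, not values taken from the paper.

```python
def intron_density_per_kb(intron_count: int, coding_length_bp: int) -> float:
    """Introns per 1 kb of coding sequence."""
    if coding_length_bp <= 0:
        raise ValueError("coding_length_bp must be positive")
    return intron_count / (coding_length_bp / 1000.0)


def percent_of_reference(density: float, reference_density: float) -> float:
    """Express one density as a percentage of a reference density."""
    if reference_density <= 0:
        raise ValueError("reference_density must be positive")
    return 100.0 * density / reference_density


# Hypothetical numbers: 9 introns across 1,500 bp of coding sequence.
density = intron_density_per_kb(9, 1500)
# Hypothetical comparison of a density of 7.5 against a reference of 6.0.
relative = percent_of_reference(7.5, 6.0)
```

Normalising by coding length rather than gene count keeps the measure comparable across genomes whose genes differ in average length.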
Affiliation(s)
- Eugene V Koonin
  - National Center for Biotechnology Information NLM/NIH, Bethesda, MD, USA
11
Rogozin IB, Carmel L, Csuros M, Koonin EV. Origin and evolution of spliceosomal introns. Biol Direct 2012; 7:11. [PMID: 22507701 PMCID: PMC3488318 DOI: 10.1186/1745-6150-7-11] [Citation(s) in RCA: 245] [Impact Index Per Article: 18.8] [Received: 12/08/2011] [Accepted: 03/15/2012] [Indexed: 12/31/2022]
Abstract
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. 
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
Affiliation(s)
- Igor B Rogozin
  - National Center for Biotechnology Information NLM/NIH, 8600 Rockville Pike, Bldg, 38A, Bethesda, MD 20894, USA
12
Zhu B, Zhou S, Lou M, Zhu J, Li B, Xie G, Jin G, De Mot R. Characterization and inference of gene gain/loss along Burkholderia evolutionary history. Evol Bioinform Online 2011; 7:191-200. [PMID: 22084562 PMCID: PMC3210638 DOI: 10.4137/ebo.s7510] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/10/2023]
Abstract
A comparative analysis of 60 complete Burkholderia genomes was conducted to obtain insight in the evolutionary history behind the diversity and pathogenicity at species level. A concatenated multiprotein phyletic pattern and a dataset with Burkholderia clusters of orthologous genes (BuCOGs) were constructed. The extent of horizontal gene transfer (HGT) was assessed using a Markov based probabilistic method. A reconstruction of the gene gains and losses history shows that more than half of the Burkholderia genes families are inferred to have experienced HGT at least once during their evolution. Further analysis revealed that the number of gene gain and loss was correlated with the branch length. Genomic islands (GEIs) analysis based on evolutionary history reconstruction not only revealed that most genes in ancient GEIs were gained but also suggested that the fraction of the genome located in GEIs in the small chromosomes is higher than in the large chromosomes in Burkholderia. The mapping of coexpressed genes onto biological pathway schemes revealed that pathogenicity of Burkholderia strains is probably mainly determined by the gained genes in its ancestor. Taken together, our results strongly support that gene gain and loss especially in ancient evolutionary history play an important role in strain divergence, pathogenicity determinants of Burkholderia and GEIs formation.
Affiliation(s)
- Bo Zhu
- State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- Shengli Zhou
- Environmental Monitoring Center of Zhejiang Province, Hangzhou 310015, China
- Miaomiao Lou
- State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- Jun Zhu
- Institute of Bioinformatics, Zhejiang University, Hangzhou 310029, China
- Bin Li
- State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- Guanlin Xie
- State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- GuLei Jin
- State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- Institute of Bioinformatics, Zhejiang University, Hangzhou 310029, China
- René De Mot
- Centre of Microbial and Plant Genetics, Katholieke Universiteit Leuven, 3001 Heverlee-Leuven 3001, Belgium
13
A detailed history of intron-rich eukaryotic ancestors inferred from a global survey of 100 complete genomes. PLoS Comput Biol 2011; 7:e1002150. [PMID: 21935348 PMCID: PMC3174169 DOI: 10.1371/journal.pcbi.1002150] [Citation(s) in RCA: 125] [Impact Index Per Article: 8.9] [Received: 12/30/2010] [Accepted: 06/21/2011] [Indexed: 11/19/2022] Open
Abstract
Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing.
In eukaryotes, protein-coding genes are interrupted by non-coding introns. The intron densities widely differ, from 6–7 introns per kilobase of coding sequence in vertebrates, some invertebrates and plants, to only a few introns across the entire genome in many unicellular forms. We applied a robust statistical methodology, Markov Chain Monte Carlo, to reconstruct the history of intron gain and loss throughout the evolution of eukaryotes using a set of 245 homologous genes from 99 genomes that represent the diversity of eukaryotes. Intron-rich ancestors were confidently inferred for each major eukaryotic group including 53% to 74% of the human intron density for the last eukaryotic common ancestor, and 120% to 130% of the human value for the last common ancestor of animals. Evolution of eukaryotic genes involved primarily intron loss, with substantial gain only at the bases of several major branches including plants and animals. Thus, the common ancestor of all extant eukaryotes was a complex organism with a gene architecture resembling those in multicellular organisms. The line of descent from the last common ancestor to mammals was an uninterrupted intron-rich state that, given the error-prone splicing in intron-rich organisms, was conducive to the elaboration of functional alternative splicing.
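The abstract above compares MCMC and ML reconstructions with Dollo parsimony. As a rough illustration of the Dollo idea only (not the authors' pipeline), the following minimal Python sketch counts the losses implied for a single intron: the intron is assumed to be gained once, at the most recent common ancestor of all species that carry it, and every carrier-free subtree below that node is charged one loss. The nested-tuple tree encoding, the function names, and the toy species tree are all hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def _count_losses(node, present):
    """Count maximal carrier-free subtrees at or below `node`."""
    if not (leaves(node) & present):
        return 1  # the whole subtree lacks the intron: one loss on its stem branch
    if isinstance(node, str):
        return 0  # a leaf that still carries the intron
    return sum(_count_losses(child, present) for child in node)


def dollo_losses(tree, present):
    """Number of losses for one intron under Dollo parsimony (single gain)."""
    present = set(present) & leaves(tree)
    if not present:
        return 0  # the intron is never observed, so nothing to explain
    # Walk down to the most recent common ancestor of all carriers.
    node = tree
    while not isinstance(node, str):
        holders = [child for child in node if leaves(child) & present]
        if len(holders) != 1:
            break
        node = holders[0]
    return _count_losses(node, present)


# Hypothetical five-species tree, for illustration only.
toy_tree = ((("human", "mouse"), "fly"), ("yeast", "plant"))
print(dollo_losses(toy_tree, {"human", "fly"}))  # 1 (lost once, in mouse)
```

Because the single gain is never placed deeper than the common ancestor of the observed carriers, a sketch like this cannot assign an intron to an older ancestor, which fits the abstract's remark that Dollo parsimony tends to yield lower ancestral intron densities.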
14
Lehmann J, Eisenhardt C, Stadler PF, Krauss V. Some novel intron positions in conserved Drosophila genes are caused by intron sliding or tandem duplication. BMC Evol Biol 2010; 10:156. [PMID: 20500887 PMCID: PMC2891723 DOI: 10.1186/1471-2148-10-156] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 11/30/2009] [Accepted: 05/26/2010] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND: Positions of spliceosomal introns are often conserved between remotely related genes. Introns that reside in non-conserved positions are either novel or remnants of frequent losses of introns in some evolutionary lineages. A recent gain of such introns is difficult to prove. However, introns verified as novel are needed to evaluate contemporary processes of intron gain. RESULTS: We identified 25 unambiguous cases of novel intron positions in 31 Drosophila genes that exhibit near intron pairs (NIPs). Here, a NIP consists of an ancient and a novel intron position that are separated by less than 32 nt. Within a single gene, such closely-spaced introns are very unlikely to have coexisted. In most cases, therefore, the ancient intron position must have disappeared in favour of the novel one. A survey for NIPs among 12 Drosophila genomes identifies intron sliding (migration) as one of the more frequent causes of novel intron positions. Other novel introns seem to have been gained by regional tandem duplications of coding sequences containing a proto-splice site. CONCLUSIONS: Recent intron gains sometimes appear to have arisen by duplication of exonic sequences and subsequent intronization of one of the copies. Intron migration and exon duplication together may account for a significant amount of novel intron positions in conserved coding sequences.
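The NIP definition in this abstract (an ancient and a novel intron position separated by less than 32 nt) can be expressed as a simple distance filter. The sketch below is an illustration under assumptions, not the authors' software: positions are taken to be nucleotide coordinates within one aligned coding sequence, and the function name, parameter names, and example coordinates are hypothetical.

```python
def near_intron_pairs(ancient, novel, max_gap=32):
    """Return (ancient, novel) position pairs separated by less than max_gap nt."""
    return [
        (a, n)
        for a in sorted(ancient)
        for n in sorted(novel)
        if 0 < abs(a - n) < max_gap
    ]


# Hypothetical coordinates within a single gene's coding sequence.
pairs = near_intron_pairs(ancient=[120, 480], novel=[135, 512, 700])
print(pairs)  # [(120, 135)]; 480 and 512 are exactly 32 nt apart, so they do not qualify
```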
Affiliation(s)
- Jörg Lehmann
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, 04107 Leipzig, Germany
15
Wilkerson MD, Ru Y, Brendel VP. Common introns within orthologous genes: software and application to plants. Brief Bioinform 2010; 10:631-44. [PMID: 19933210 DOI: 10.1093/bib/bbp051] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022] Open
Abstract
The residence of spliceosomal introns within protein-coding genes can fluctuate over time, with genes gaining, losing or conserving introns in a complex process that is not entirely understood. One approach for studying intron evolution is to compare introns with respect to position and type within closely related genes. Here, we describe new, freely available software called Common Introns Within Orthologous Genes (CIWOG), available at http://ciwog.gdcb.iastate.edu/, which detects common introns in protein-coding genes based on position and sequence conservation in the corresponding protein alignments. CIWOG provides dynamic web displays that facilitate detailed intron studies within orthologous genes. User-supplied options control how introns are clustered into sets of common introns. CIWOG also identifies special classes of introns, in particular those with GC- or U12-type donor sites, which enables analyses of these introns in relation to their counterparts in the other genes in orthologous groups. The software is demonstrated with application to a comprehensive study of eight plant transcriptomes. Three specific examples are discussed: intron class conversion from GT- to GC-donor-type introns in monocots, plant U12-type intron conservation and a global analysis of intron evolution across the eight plant species.
16
Nonsense-mediated decay enables intron gain in Drosophila. PLoS Genet 2010; 6:e1000819. [PMID: 20107520 PMCID: PMC2809761 DOI: 10.1371/journal.pgen.1000819] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Received: 10/19/2009] [Accepted: 12/18/2009] [Indexed: 12/03/2022] Open
Abstract
Intron number varies considerably among genomes, but despite their fundamental importance, the mutational mechanisms and evolutionary processes underlying the expansion of intron number remain unknown. Here we show that Drosophila, in contrast to most eukaryotic lineages, is still undergoing a dramatic rate of intron gain. These novel introns carry significantly weaker splice sites that may impede their identification by the spliceosome. Novel introns are more likely to encode a premature termination codon (PTC), indicating that nonsense-mediated decay (NMD) functions as a backup for weak splicing of new introns. Our data suggest that new introns originate when genomic insertions with weak splice sites are hidden from selection by NMD. This mechanism reduces the sequence requirement imposed on novel introns and implies that the capacity of the spliceosome to recognize weak splice sites was a prerequisite for intron gain during eukaryotic evolution.
The surprising observation 30 years ago that genes are interrupted by non-coding introns changed our view of gene architecture. Intron number varies dramatically among species, ranging from nine introns/gene in humans to fewer than one in some simple eukaryotes. Here we ask where new introns come from and how they are maintained in a population. We find that novel introns do not arise from pre-existing introns, although the mechanisms that generate novel introns remain unclear. We also show that novel introns carry only weak signals for their identification and removal, and therefore depend on nonsense-mediated decay (NMD). NMD maintains RNA quality control by degrading transcripts that have not been spliced properly. We propose that NMD shelters novel introns from natural selection. This increases the likelihood that a novel intron will rise in frequency and be maintained within a population, thus increasing the rate of intron gain.
17
Estimating trees from filtered data: identifiability of models for morphological phylogenetics. J Theor Biol 2009; 263:108-19. [PMID: 20004210 DOI: 10.1016/j.jtbi.2009.12.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 05/28/2009] [Revised: 12/01/2009] [Accepted: 12/01/2009] [Indexed: 11/23/2022]
Abstract
As an alternative to parsimony analyses, stochastic models have been proposed (Lewis, 2001; Nylander et al., 2004) for morphological characters, so that maximum likelihood or Bayesian analyses may be used for phylogenetic inference. A key feature of these models is that they account for ascertainment bias, in that only varying, or parsimony-informative characters are observed. However, statistical consistency of such model-based inference requires that the model parameters be identifiable from the joint distribution they entail, and this issue has not been addressed. Here we prove that parameters for several such models, with finite state spaces of arbitrary size, are identifiable, provided the tree has at least eight leaves. If the tree topology is already known, then seven leaves suffice for identifiability of the numerical parameters. The method of proof involves first inferring a full distribution of both parsimony-informative and non-informative pattern joint probabilities from the parsimony-informative ones, using phylogenetic invariants. The failure of identifiability of the tree parameter for four-taxon trees is also investigated.
18
Abstract
Fibroblast Growth Factors (FGFs) are polypeptides with diverse activities in development and physiology. The mammalian Fgf family can be divided into the intracellular Fgf11/12/13/14 subfamily (iFGFs), the hormone-like Fgf15/21/23 subfamily (hFGFs), and the canonical Fgf subfamilies, including Fgf1/2/5, Fgf3/4/6, Fgf7/10/22, Fgf8/17/18, and Fgf9/16/20. However, all Fgfs are evolutionarily related. We propose that an Fgf13-like gene is the ancestor of the iFgf subfamily and the most likely evolutionary ancestor of the entire Fgf family. Potential ancestors of the canonical and hFgf subfamilies, Fgf4-, Fgf5-, Fgf8-, Fgf9-, Fgf10-, and Fgf15-like, appear to have derived from an Fgf13-like ancestral gene. Canonical FGFs function in a paracrine manner, while hFGFs function in an endocrine manner. We conclude that the ancestral Fgfs for these subfamilies acquired this functional diversity before the evolution of vertebrates. During the evolution of early vertebrates, the Fgf subfamilies further expanded to contain three or four members in each subfamily.
Affiliation(s)
- Nobuyuki Itoh
- Department of Genetic Biochemistry, Kyoto University Graduate School of Pharmaceutical Sciences, Sakyo, Kyoto, Japan.
19
Abstract
Summary: Malin is a software package for the analysis of eukaryotic gene structure evolution. It provides a graphical user interface for various tasks commonly used to infer the evolution of exon–intron structure in protein-coding orthologs. Implemented tasks include the identification of conserved homologous intron sites in protein alignments, as well as the estimation of ancestral intron content, lineage-specific intron losses and gains. Estimates are computed either with parsimony, or with a probabilistic model that incorporates rate variation across lineages and intron sites. Availability: Malin is available as a stand-alone Java application, as well as an application bundle for MacOS X, at the website http://www.iro.umontreal.ca/~csuros/introns/malin/. The software is distributed under a BSD-style license. Contact: csuros@iro.umontreal.ca
Affiliation(s)
- Miklós Csurös
- Department of Computer Science and Operations Research, University of Montréal, Montréal, Québec, Canada.
20
Abstract
Spliceosomal introns, a hallmark of eukaryotic gene organization, were an unexpected discovery. After three decades, crucial issues such as when and how introns first appeared in evolution remain unsettled. An issue yet to be answered is how intron positions arise de novo. Phylogenetic investigations concur that intron positions continue to emerge, at least in some lineages. Yet genomic scans for the sources of introns occupying new positions have been fruitless. Two alternative solutions to this paradox are: (i) formation of new intron positions halted before the recent past and (ii) it continues to occur, but through processes different from those generally assumed. One process generally dismissed is intron sliding (the relocation of a preexisting intron over short distances) because of supposed associated deleterious effects. The puzzle of intron gain arises owing to a pervasive operational definition of introns, which sees them as precisely demarcated segments of the genome separated from the neighboring nonintronic DNA by unmovable limits. Intron homology is defined as position homology. Recent studies of pre-mRNA processing indicate that this assumption needs to be revised. We incorporate recent advances on the evolutionarily frequent process of alternative splicing, by which exons of primary transcripts are spliced in different patterns, into a new model of intron sliding that accounts for the diversity of intron positions. We posit that intron positional diversity is driven by two overlapping processes: (i) background process of continuous relocation of preexisting introns by sliding and (ii) spurts of extensive gain/loss of new intron sequences.
21
Irimia M, Roy SW. Spliceosomal introns as tools for genomic and evolutionary analysis. Nucleic Acids Res 2008; 36:1703-12. [PMID: 18263615 PMCID: PMC2275149 DOI: 10.1093/nar/gkn012] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.5] [Indexed: 11/14/2022] Open
Abstract
Over the past 5 years, the availability of dozens of whole genomic sequences from a wide variety of eukaryotic lineages has revealed a very large amount of information about the dynamics of intron loss and gain through eukaryotic history, as well as the evolution of intron sequences. Implicit in these advances is a great deal of information about the structure and evolution of surrounding sequences. Here, we review the wealth of ways in which structures of spliceosomal introns as well as their conservation and change through evolution may be harnessed for evolutionary and genomic analysis. First, we discuss uses of intron length distributions and positions in sequence assembly and annotation, and for improving alignment of homologous regions. Second, we review uses of introns in evolutionary studies, including the utility of introns as indicators of rates of sequence evolution, for inferences about molecular evolution, as signatures of orthology and paralogy, and for estimating rates of nucleotide substitution. We conclude with a discussion of phylogenetic methods utilizing intron sequences and positions.
Affiliation(s)
- Manuel Irimia
- Departament de Genètica, Universitat de Barcelona, Barcelona, Spain