1
Xie YY, Wen B, Bai MZ, Guo YY. De Novo Creation of Two Novel Spliceosomal Introns of RECG1 by Intronization of Formerly Exonic Sequences in Orchidaceae. J Mol Evol 2025; 93:267-277. [PMID: 40202594 DOI: 10.1007/s00239-025-10242-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2024] [Accepted: 03/21/2025] [Indexed: 04/10/2025]
Abstract
Spliceosomal introns are a key characteristic of eukaryotic genes. However, the origins and mechanisms of new spliceosomal introns remain elusive, and definitive case studies documenting intron creation are still limited. This study examined the RECG1 genes of 49 land plants, including 21 orchids and 28 non-orchid species. Sequence comparison revealed that the fourth intron of Gastrodia and Platanthera (Orchidaceae) is a newly gained spliceosomal intron, originating from the intronization of former exonic sequences. This intronization event was accompanied by the creation of novel recognizable GT/AG splice sites. In contrast, other orchid species lack the corresponding splice sites in the counterpart regions. Moreover, the secondary and tertiary protein structures implied that the intronization events do not affect the protein function. Given the diverse trophic modes of the two genera, we infer that relaxed selection may have contributed to the fluidity of gene structures. This study provides a typical example of de novo lineage-specific intron creation via intronization in orchids supported by multiple lines of evidence, and the two intronization events occurred independently in the same gene. This research enhances our understanding of gene evolution in orchids and provides valuable insights that may assist the annotation of structurally complex genes.
Affiliation(s)
- Yuan-Yuan Xie
  - College of Plant Protection, Henan Agricultural University, Zhengzhou, China
- Bin Wen
  - College of Plant Protection, Henan Agricultural University, Zhengzhou, China
- Ming-Zhu Bai
  - College of Plant Protection, Henan Agricultural University, Zhengzhou, China
- Yan-Yan Guo
  - College of Plant Protection, Henan Agricultural University, Zhengzhou, China

2
Brattig-Correia R, Almeida JM, Wyrwoll MJ, Julca I, Sobral D, Misra CS, Di Persio S, Guilgur LG, Schuppe HC, Silva N, Prudêncio P, Nóvoa A, Leocádio AS, Bom J, Laurentino S, Mallo M, Kliesch S, Mutwil M, Rocha LM, Tüttelmann F, Becker JD, Navarro-Costa P. The conserved genetic program of male germ cells uncovers ancient regulators of human spermatogenesis. eLife 2024; 13:RP95774. [PMID: 39388236 PMCID: PMC11466473 DOI: 10.7554/elife.95774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024] Open
Abstract
Male germ cells share a common origin across animal species, therefore they likely retain a conserved genetic program that defines their cellular identity. However, the unique evolutionary dynamics of male germ cells coupled with their widespread leaky transcription pose significant obstacles to the identification of the core spermatogenic program. Through network analysis of the spermatocyte transcriptome of vertebrate and invertebrate species, we describe the conserved evolutionary origin of metazoan male germ cells at the molecular level. We estimate the average functional requirement of a metazoan male germ cell to correspond to the expression of approximately 10,000 protein-coding genes, a third of which defines a genetic scaffold of deeply conserved genes that has been retained throughout evolution. Such scaffold contains a set of 79 functional associations between 104 gene expression regulators that represent a core component of the conserved genetic program of metazoan spermatogenesis. By genetically interfering with the acquisition and maintenance of male germ cell identity, we uncover 161 previously unknown spermatogenesis genes and three new potential genetic causes of human infertility. These findings emphasize the importance of evolutionary history on human reproductive disease and establish a cross-species analytical pipeline that can be repurposed to other cell types and pathologies.
Affiliation(s)
- Rion Brattig-Correia
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Department of Systems Science and Industrial Engineering, Binghamton University, New York, United States
- Joana M Almeida
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - EvoReproMed Lab, Environmental Health Institute (ISAMB), Associate Laboratory TERRA, Faculty of Medicine, University of Lisbon, Lisbon, Portugal
- Margot Julia Wyrwoll
  - Centre of Medical Genetics, Institute of Reproductive Genetics, University and University Hospital of Münster, Münster, Germany
- Irene Julca
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Daniel Sobral
  - Associate Laboratory i4HB - Institute for Health and Bioeconomy, NOVA School of Science and Technology, NOVA University Lisbon, Lisbon, Portugal
  - UCIBIO - Applied Molecular Biosciences Unit, Department of Life Sciences, NOVA School of Science and Technology, NOVA University Lisbon, Caparica, Portugal
- Chandra Shekhar Misra
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
- Sara Di Persio
  - Centre of Reproductive Medicine and Andrology, University Hospital Münster, Münster, Germany
- Hans-Christian Schuppe
  - Clinic of Urology, Pediatric Urology and Andrology, Justus-Liebig-University, Giessen, Germany
- Neide Silva
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Pedro Prudêncio
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Ana Nóvoa
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Joana Bom
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Sandra Laurentino
  - Centre of Reproductive Medicine and Andrology, University Hospital Münster, Münster, Germany
- Sabine Kliesch
  - Centre of Reproductive Medicine and Andrology, University Hospital Münster, Münster, Germany
- Marek Mutwil
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Luis M Rocha
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Department of Systems Science and Industrial Engineering, Binghamton University, New York, United States
- Frank Tüttelmann
  - Centre of Medical Genetics, Institute of Reproductive Genetics, University and University Hospital of Münster, Münster, Germany
- Jörg D Becker
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Oeiras, Portugal
- Paulo Navarro-Costa
  - Instituto Gulbenkian de Ciência, Oeiras, Portugal
  - EvoReproMed Lab, Environmental Health Institute (ISAMB), Associate Laboratory TERRA, Faculty of Medicine, University of Lisbon, Lisbon, Portugal

3
Kozłowska-Masłoń J, Ciomborowska-Basheer J, Kubiak MR, Makałowska I. Evolution of retrocopies in the context of HUSH silencing. Biol Direct 2024; 19:60. [PMID: 39095906 PMCID: PMC11295320 DOI: 10.1186/s13062-024-00507-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Accepted: 07/29/2024] [Indexed: 08/04/2024] Open
Abstract
Retrotransposition is one of the main factors responsible for gene duplication and thus genome evolution. However, the sequences that undergo this process are not only an excellent source of biological diversity, but in certain cases also pose a threat to the integrity of the DNA. One of the mechanisms that protects against the incorporation of mobile elements is the HUSH complex, which is responsible for silencing long, intronless, transcriptionally active transposed sequences that are rich in adenine on the sense strand. In this study, broad sets of human and porcine retrocopies were analysed with respect to the above factors, taking into account evolution of these molecules. Analysis of expression pattern, genomic structure, transcript length, and nucleotide substitution frequency showed the strong relationship between the expression level and exon length as well as the protective nature of introns. The results of the studies also showed that there is no direct correlation between the expression level and adenine content. However, protein-coding retrocopies, which have a lower adenine content, have a significantly higher expression level than the adenine-rich non-coding but expressed retrocopies. Therefore, although the mechanism of HUSH silencing may be an important part of the regulation of retrocopy expression, it is one component of a more complex molecular network that remains to be elucidated.
Affiliation(s)
- Joanna Kozłowska-Masłoń
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
  - Laboratory of Cancer Genetics, Greater Poland Cancer Centre, Garbary 15, Poznań, Poland
- Joanna Ciomborowska-Basheer
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
  - Laboratory of Nature Education and Conservation, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
- Magdalena Regina Kubiak
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
- Izabela Makałowska
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland

4
Hoh C, Salzberg SL. Discovering Intron Gain Events in Humans through Large-Scale Evolutionary Comparisons. bioRxiv 2024:2024.05.02.592247. [PMID: 38746259 PMCID: PMC11092651 DOI: 10.1101/2024.05.02.592247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The rapid growth in the number of sequenced genomes makes it possible to search for the appearance of entirely new introns in the human lineage. In this study, we compared the genomic sequences for 19,120 human protein-coding genes to a collection of 3493 vertebrate genomes, mapping the patterns of intron alignments onto a phylogenetic tree. This mapping allowed us to trace many intron gain events to precise locations in the tree, corresponding to distinct points in evolutionary history. We discovered 584 intron gain events, all of them relatively recent, in 514 distinct human genes. Among these events, we explored the hypothesis that intronization was the mechanism responsible for intron gain. Intronization events were identified by locating instances where human introns correspond to exonic sequences in homologous vertebrate genes. Although apparently rare, we found three compelling cases of intronization, and for each of those we compared the human protein sequence and structure to homologous genes that lack the introns.
Affiliation(s)
- Celine Hoh
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21211, USA
- Steven L Salzberg
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21211, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21211, USA
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA

5
Gabrielli F, Antinucci M, Tofanelli S. Gene Structure Evolution of the Short-Chain Dehydrogenase/Reductase (SDR) Family. Genes (Basel) 2022; 14:110. [PMID: 36672851 PMCID: PMC9859523 DOI: 10.3390/genes14010110] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/11/2022] [Revised: 12/12/2022] [Accepted: 12/22/2022] [Indexed: 12/31/2022] Open
Abstract
SDR (Short-chain Dehydrogenases/Reductases) are one of the oldest and heterogeneous superfamily of proteins, whose classification is problematic because of the low percent identity, even within families. To get clearer insights into SDR molecular evolution, we explored the splicing site organization of the 75 human SDR genes across their vertebrate and invertebrate orthologs. We found anomalous gene structures in members of the human SDR7C and SDR42E families that provide clues of retrogene properties and independent evolutionary trajectories from a common invertebrate ancestor. The same analyses revealed that the identity value between human and invertebrate non-allelic variants is not necessarily associated with the homologous gene structure. Accordingly, a revision of the SDR nomenclature is proposed by including the human SDR40C1 and SDR7C gene in the same family.
Affiliation(s)
- Franco Gabrielli
  - Department of Biology, University of Pisa, Via Ghini, 13-56126 Pisa, Italy
- Marco Antinucci
  - Department of Medicine and Life Sciences, Institute of Evolutionary Biology (UPF-CSIC), Universitat Pompeu Fabra, 08002 Barcelona, Spain
- Sergio Tofanelli
  - Department of Biology, University of Pisa, Via Ghini, 13-56126 Pisa, Italy

6
Houston BJ, Lopes AM, Laan M, Nagirnaja L, O'Connor AE, Merriner DJ, Nguyen J, Punab M, Riera-Escamilla A, Krausz C, Aston KI, Conrad DF, O'Bryan MK. DDB1- and CUL4-associated factor 12-like protein 1 (Dcaf12l1) is not essential for male fertility in mice. Dev Biol 2022; 490:66-72. [PMID: 35850260 DOI: 10.1016/j.ydbio.2022.07.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 07/08/2022] [Accepted: 07/11/2022] [Indexed: 11/17/2022]
Abstract
Male infertility is a common condition affecting at least 7% of men worldwide and is often genetic in origin. Using whole exome sequencing, we recently discovered three hemizygous, likely damaging variants in DDB1- and CUL4-associated factor 12-like protein 1 (DCAF12L1) in men with azoospermia. DCAF12L1 is located on the X-chromosome and as identified by single cell sequencing studies, its expression is enriched in human testes and specifically in Sertoli cells and spermatogonia. However, very little is known about the role of DCAF12L1 in spermatogenesis, thus we generated a knockout mouse model to further explore the role of DCAF12L1 in male fertility. Knockout mice were generated using CRISPR/Cas9 technology to remove the entire coding region of Dcaf12l1 and were assessed for fertility over a broad range of ages (2-8 months of age). Despite outstanding genetic evidence in men, loss of DCAF12L1 had no discernible impact on male fertility in mice, as highlighted by breeding trials, histological assessment of the testis and epididymis, daily sperm production and evaluation of sperm motility using computer assisted methods. This disparity is likely due to the parallel evolution, and subsequent divergence, of DCAF12 family members in mice and men or the presence of compounding environmental factors in men.
Affiliation(s)
- Brendan J Houston
  - School of BioSciences and Bio21 Institute, The University of Melbourne, Parkville, VIC, Australia
- Alexandra M Lopes
  - Instituto de Investigação e Inovação em Saúde, Porto, Portugal; Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Portugal; Genetics of Male Infertility Initiative (GEMINI), USA
- Maris Laan
  - Genetics of Male Infertility Initiative (GEMINI), USA; Faculty of Medicine, Institute of Biomedicine and Translational Medicine, University of Tartu, Estonia
- Liina Nagirnaja
  - Genetics of Male Infertility Initiative (GEMINI), USA; Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA
- Anne E O'Connor
  - School of BioSciences and Bio21 Institute, The University of Melbourne, Parkville, VIC, Australia
- D Jo Merriner
  - School of BioSciences and Bio21 Institute, The University of Melbourne, Parkville, VIC, Australia
- Joseph Nguyen
  - School of BioSciences and Bio21 Institute, The University of Melbourne, Parkville, VIC, Australia
- Margus Punab
  - Genetics of Male Infertility Initiative (GEMINI), USA; Faculty of Medicine, Institute of Biomedicine and Translational Medicine, University of Tartu, Estonia; Andrology Centre, Tartu University Hospital, Tartu, Estonia; Faculty of Medicine, Institute of Clinical Medicine, University of Tartu, Tartu, Estonia
- Antoni Riera-Escamilla
  - Andrology Department, Fundació Puigvert, Universitat Autònoma de Barcelona, Instituto de Investigaciones Biomédicas Sant Pau (IIB-Sant Pau), Barcelona, Catalonia, Spain
- Csilla Krausz
  - Genetics of Male Infertility Initiative (GEMINI), USA; International Male Infertility Genomics Consortium (IMIGC); Department of Experimental and Clinical Biomedical Sciences "Mario Serio", Centre of Excellence DeNothe, University of Florence, Florence, Italy
- Kenneth Ivan Aston
  - Genetics of Male Infertility Initiative (GEMINI), USA; International Male Infertility Genomics Consortium (IMIGC); Division of Reproductive Endocrinology and Infertility, School of Medicine, Washington University, St Louis, MO, USA
- Donald F Conrad
  - Genetics of Male Infertility Initiative (GEMINI), USA; Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA; International Male Infertility Genomics Consortium (IMIGC)
- Moira K O'Bryan
  - School of BioSciences and Bio21 Institute, The University of Melbourne, Parkville, VIC, Australia; Genetics of Male Infertility Initiative (GEMINI), USA; International Male Infertility Genomics Consortium (IMIGC)

7
Troskie RL, Faulkner GJ, Cheetham SW. Processed pseudogenes: A substrate for evolutionary innovation: Retrotransposition contributes to genome evolution by propagating pseudogene sequences with rich regulatory potential throughout the genome. Bioessays 2021; 43:e2100186. [PMID: 34569081 DOI: 10.1002/bies.202100186] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/02/2021] [Revised: 09/09/2021] [Accepted: 09/13/2021] [Indexed: 11/08/2022]
Abstract
Processed pseudogenes may serve as a genetic reservoir for evolutionary innovation. Here, we argue that through the activity of long interspersed element-1 retrotransposons, processed pseudogenes disperse coding and noncoding sequences rich with regulatory potential throughout the human genome. While these sequences may appear to be non-functional, a lack of contemporary function does not prohibit future development of biological activity. Here, we discuss the dynamic evolution of certain processed pseudogenes into coding and noncoding genes and regulatory elements, and their implication in wide-ranging biological and pathological processes. Also see the video abstract here: https://youtu.be/iUY_mteVoPI.
Affiliation(s)
- Robin-Lee Troskie
  - Mater Research Institute, University of Queensland, Woolloongabba, Australia
- Geoffrey J Faulkner
  - Mater Research Institute, University of Queensland, Woolloongabba, Australia
  - Queensland Brain Institute, University of Queensland, Brisbane, Australia
- Seth W Cheetham
  - Mater Research Institute, University of Queensland, Woolloongabba, Australia

8
Lidak T, Baloghova N, Korinek V, Sedlacek R, Balounova J, Kasparek P, Cermak L. CRL4-DCAF12 Ubiquitin Ligase Controls MOV10 RNA Helicase during Spermatogenesis and T Cell Activation. Int J Mol Sci 2021; 22:5394. [PMID: 34065512 PMCID: PMC8161014 DOI: 10.3390/ijms22105394] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/24/2021] [Revised: 05/12/2021] [Accepted: 05/16/2021] [Indexed: 12/27/2022] Open
Abstract
Multisubunit cullin-RING ubiquitin ligase 4 (CRL4)-DCAF12 recognizes the C-terminal degron containing acidic amino acid residues. However, its physiological roles and substrates are largely unknown. Purification of CRL4-DCAF12 complexes revealed a wide range of potential substrates, including MOV10, an "ancient" RNA-induced silencing complex (RISC) complex RNA helicase. We show that DCAF12 controls the MOV10 protein level via its C-terminal motif in a proteasome- and CRL-dependent manner. Next, we generated Dcaf12 knockout mice and demonstrated that the DCAF12-mediated degradation of MOV10 is conserved in mice and humans. Detailed analysis of Dcaf12-deficient mice revealed that their testes produce fewer mature sperms, phenotype accompanied by elevated MOV10 and imbalance in meiotic markers SCP3 and γ-H2AX. Additionally, the percentages of splenic CD4+ T and natural killer T (NKT) cell populations were significantly altered. In vitro, activated Dcaf12-deficient T cells displayed inappropriately stabilized MOV10 and increased levels of activated caspases. In summary, we identified MOV10 as a novel substrate of CRL4-DCAF12 and demonstrated the biological relevance of the DCAF12-MOV10 pathway in spermatogenesis and T cell activation.
Affiliation(s)
- Tomas Lidak
  - Laboratory of Cancer Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 42 Vestec, Czech Republic
  - Faculty of Science, Charles University, 128 00 Prague, Czech Republic
- Nikol Baloghova
  - Laboratory of Cancer Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 42 Vestec, Czech Republic
- Vladimir Korinek
  - Laboratory of Cancer Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 42 Vestec, Czech Republic
  - Laboratory of Cell and Developmental Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 42 Vestec, Czech Republic
- Radislav Sedlacek
  - Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 50 Vestec, Czech Republic
- Jana Balounova
  - Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 50 Vestec, Czech Republic
- Petr Kasparek
  - Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 50 Vestec, Czech Republic
- Lukas Cermak
  - Laboratory of Cancer Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 42 Vestec, Czech Republic

9
Zhang X, Cvetkovska M, Morgan-Kiss R, Hüner NPA, Smith DR. Draft genome sequence of the Antarctic green alga Chlamydomonas sp. UWO241. iScience 2021; 24:102084. [PMID: 33644715 PMCID: PMC7887394 DOI: 10.1016/j.isci.2021.102084] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/06/2020] [Revised: 12/08/2020] [Accepted: 01/14/2021] [Indexed: 11/22/2022] Open
Abstract
Antarctica is home to an assortment of psychrophilic algae, which have evolved various survival strategies for coping with their frigid environments. Here, we explore Antarctic psychrophily by examining the ∼212 Mb draft nuclear genome of the green alga Chlamydomonas sp. UWO241, which resides within the water column of a perennially ice-covered, hypersaline lake. Like certain other Antarctic algae, UWO241 encodes a large number (≥37) of ice-binding proteins, putatively originating from horizontal gene transfer. Even more striking, UWO241 harbors hundreds of highly similar duplicated genes involved in diverse cellular processes, some of which we argue are aiding its survival in the Antarctic via gene dosage. Gene and partial gene duplication appear to be an ongoing phenomenon within UWO241, one which might be mediated by retrotransposons. Ultimately, we consider how such a process could be associated with adaptation to extreme environments but explore potential non-adaptive hypotheses as well.
Highlights
- Chlamydomonas sp. UWO241 is a green alga originating from Lake Bonney, Antarctica
- We present a draft nuclear genome sequence of UWO241 (∼212 Mb)
- The UWO genome contains hundreds of highly similar duplicated genes
- These duplicates, we argue, might be involved in cold adaptation
Affiliation(s)
- Xi Zhang
  - Department of Biology, University of Western Ontario, London, ON N6A 5B7, Canada
- Marina Cvetkovska
  - Department of Biology, University of Ottawa, Ottawa, ON K1N 6N5, Canada
- Norman P A Hüner
  - Department of Biology, University of Western Ontario, London, ON N6A 5B7, Canada
- David Roy Smith
  - Department of Biology, University of Western Ontario, London, ON N6A 5B7, Canada

10
Cancer, Retrogenes, and Evolution. Life (Basel) 2021; 11:life11010072. [PMID: 33478113 PMCID: PMC7835786 DOI: 10.3390/life11010072] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/19/2020] [Revised: 01/14/2021] [Accepted: 01/15/2021] [Indexed: 12/18/2022] Open
Abstract
This review summarizes the knowledge about retrogenes in the context of cancer and evolution. The retroposition, in which the processed mRNA from parental genes undergoes reverse transcription and the resulting cDNA is integrated back into the genome, results in additional copies of existing genes. Despite the initial misconception, retroposition-derived copies can become functional, and due to their role in the molecular evolution of genomes, they have been named the “seeds of evolution”. It is convincing that retrogenes, as important elements involved in the evolution of species, also take part in the evolution of neoplastic tumors at the cell and species levels. The occurrence of specific “resistance mechanisms” to neoplastic transformation in some species has been noted. This phenomenon has been related to additional gene copies, including retrogenes. In addition, the role of retrogenes in the evolution of tumors has been described. Retrogene expression correlates with the occurrence of specific cancer subtypes, their stages, and their response to therapy. Phylogenetic insights into retrogenes show that most cancer-related retrocopies arose in the lineage of primates, and the number of identified cancer-related retrogenes demonstrates that these duplicates are quite important players in human carcinogenesis.

11
Complex Analysis of Retroposed Genes' Contribution to Human Genome, Proteome and Transcriptome. Genes (Basel) 2020; 11:genes11050542. [PMID: 32408516 PMCID: PMC7290577 DOI: 10.3390/genes11050542] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/07/2020] [Revised: 05/06/2020] [Accepted: 05/08/2020] [Indexed: 02/07/2023] Open
Abstract
Gene duplication is a major driver of organismal evolution. One of the main mechanisms of gene duplications is retroposition, a process in which mRNA is first transcribed into DNA and then reintegrated into the genome. Most gene retrocopies are depleted of the regulatory regions. Nevertheless, examples of functional retrogenes are rapidly increasing. These functions come from the gain of new spatio-temporal expression patterns, imposed by the content of the genomic sequence surrounding inserted cDNA and/or by selectively advantageous mutations, which may lead to the switch from protein coding to regulatory RNA. As recent studies have shown, these genes may lead to new protein domain formation through fusion with other genes, new regulatory RNAs or other regulatory elements. We utilized existing data from high-throughput technologies to create a complex description of retrogenes functionality. Our analysis led to the identification of human retroposed genes that substantially contributed to transcriptome and proteome. These retrocopies demonstrated the potential to encode proteins or short peptides, act as cis- and trans- Natural Antisense Transcripts (NATs), regulate their progenitors’ expression by competing for the same microRNAs, and provide a sequence to lncRNA and novel exons to existing protein-coding genes. Our study also revealed that retrocopies, similarly to retrotransposons, may act as recombination hot spots. To our best knowledge this is the first complex analysis of these functions of retrocopies.

12
Jiao Y, Cao Y, Zheng Z, Liu M, Guo X. Massive expansion and diversity of nicotinic acetylcholine receptors in lophotrochozoans. BMC Genomics 2019; 20:937. [PMID: 31805848 PMCID: PMC6896357 DOI: 10.1186/s12864-019-6278-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 07/05/2019] [Accepted: 11/13/2019] [Indexed: 02/07/2023] Open
Abstract
Background: Nicotinic acetylcholine receptors (nAChRs) are among the oldest and most conserved transmembrane receptors involved in signal transduction. Despite the prevalence and significance of cholinergic signaling, the diversity and evolution of nAChRs are not fully understood.
Result: By comparative genomic analysis, we found massive expansions of nAChR genes in molluscs and some other lophotrochozoans. The expansion is particularly pronounced in stationary bivalve molluscs with simple nervous systems, with the number of nAChR genes ranging from 99 to 217 in five bivalves, compared with 10 to 29 in five ecdysozoans and vertebrates. The expanded molluscan nAChR genes tend to be intronless and in tandem arrays due to retroposition followed by tandem duplication. Phylogenetic analysis revealed diverse nAChR families in the common ancestor of bilaterians, which subsequently experienced lineage-specific expansions or contractions. The expanded molluscan nAChR genes are highly diverse in sequence, domain structure, temporal and spatial expression profiles, implying diversified functions. Some molluscan nAChR genes are expressed in early development before the development of the nervous system, while others are involved in immune and stress responses.
Conclusion: The massive expansion and diversification of nAChR genes in bivalve molluscs may be a compensation for reduced nervous systems as part of adaptation to stationary life under dynamic environments, while in vertebrates a subset of specialized nAChRs are retained to work with advanced nervous systems. The unprecedented diversity identified in molluscs broadens our view on the evolution and function of nAChRs that are critical to animal physiology and human health.
Affiliation(s)
- Yu Jiao
- Fishery College, Guangdong Ocean University, Zhanjiang, 524025, Guangdong, China; Haskin Shellfish Research Laboratory, Department of Marine and Coastal Sciences, Rutgers University, 6959 Miller Avenue, Port Norris, NJ, 08349, USA
- Yanfei Cao
- Fishery College, Guangdong Ocean University, Zhanjiang, 524025, Guangdong, China
- Zhe Zheng
- Fishery College, Guangdong Ocean University, Zhanjiang, 524025, Guangdong, China
- Ming Liu
- Haskin Shellfish Research Laboratory, Department of Marine and Coastal Sciences, Rutgers University, 6959 Miller Avenue, Port Norris, NJ, 08349, USA
- Ximing Guo
- Haskin Shellfish Research Laboratory, Department of Marine and Coastal Sciences, Rutgers University, 6959 Miller Avenue, Port Norris, NJ, 08349, USA.
|
13
|
Transposable Elements: Classification, Identification, and Their Use As a Tool For Comparative Genomics. Methods Mol Biol 2019; 1910:177-207. [PMID: 31278665 DOI: 10.1007/978-1-4939-9074-0_6] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Indexed: 12/16/2022]
Abstract
Most genomes are populated by hundreds of thousands of sequences originating from mobile elements. On the one hand, these sequences present a real challenge in the process of genome analysis and annotation. On the other hand, they are very interesting biological subjects involved in many cellular processes. Here we present an overview of transposable element biodiversity, and we discuss different approaches to transposable element detection and analysis.
|
14
|
Kini RM. Accelerated evolution of toxin genes: Exonization and intronization in snake venom disintegrin/metalloprotease genes. Toxicon 2018; 148:16-25. [PMID: 29634956 DOI: 10.1016/j.toxicon.2018.04.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/16/2018] [Revised: 03/21/2018] [Accepted: 04/01/2018] [Indexed: 12/20/2022]
Abstract
Toxin genes in animals undergo accelerated evolution compared to non-toxin genes to be effective and competitive in prey capture, as well as to enhance their predator defense. Several mechanisms have been proposed to explain this unusual phenomenon. These include (a) frequent mutations in exons compared to introns and nonsynonymous substitutions in exons; (b) a high frequency of point mutations due to the presence of more unstable triplets in exons compared to introns; (c) Accelerated Segment Switch in Exons to alter Targeting (ASSET); (d) Rapid Accumulation of Variations in Exposed Residues (RAVERs); (e) alteration in intron-exon boundary; (f) deletion of exons; and (g) loss/gain of domains through recombination. By systematic analyses of snake venom disintegrin/metalloprotease genes, I describe a new mechanism in the evolution of these genes through exonization and intronization. In the evolution of RTS/KTS disintegrins, a new exon (10a) is formed in intron 10 of the disintegrin/metalloprotease gene. Unlike the more than 90% of new exons that derive from repetitive elements in introns, exon 10a originated from a non-repetitive element. To incorporate exon 10a, part of exon 11 is intronized to retain the open reading frame. This is the first case of simultaneous exonization and intronization within a single gene. This new mechanism alters the function of toxins through drastic changes to the molecular surface via insertion of new exons and deletion of exons.
Affiliation(s)
- R Manjunatha Kini
- Protein Science Laboratory, Department of Biological Sciences, Faculty of Science, National University of Singapore, Singapore, 117543, Singapore.
|
15
|
Rosikiewicz W, Kabza M, Kosinski JG, Ciomborowska-Basheer J, Kubiak MR, Makalowska I. RetrogeneDB-a database of plant and animal retrocopies. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2018; 2017:3964680. [PMID: 29220443 PMCID: PMC5509963 DOI: 10.1093/database/bax038] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 01/20/2017] [Accepted: 04/14/2017] [Indexed: 01/08/2023]
Abstract
For a long time, retrocopies were considered ‘junk DNA’, but numerous studies have shown that retrocopies may gain functionality and become so-called retrogenes. Retrogenes may code fully functional proteins that coexist with parental gene products or may even replace them. Retrocopies may also function as regulatory RNAs and, for example, become a source of small interfering RNAs, act as trans natural antisense transcripts or as alternative targets for miRNAs. Numerous researchers have emphasized that retrogenes play a crucial role in various organisms’ developmental stages and diseases. Despite the ever-growing evidence of the importance of retrocopies, resources dedicated to retroposition are very limited. Here, we report an update of the RetrogeneDB, which, to the best of our knowledge, is the largest database dedicated to retrocopies. It provides annotations of 86 458 retrocopies in 62 animal and 37 plant species. The database contains information about the retrocopies’ localization, open reading frame conservation, expression, RNA Polymerase II activity and the alternative transcription start site studies. Orthologous relationships between retrogenes were also determined, which made retrocopy conservation studies much more valuable. Additionally, based on the RNA-Seq data from the Geuvadis project, the expression levels of retrocopies were estimated in a total of 50 individuals from 5 human populations. The information is now presented in a new, more user-friendly web interface, with easy access to the source data, which may be used for the downstream analysis. RetrogeneDB is freely available at http://yeti.amu.edu.pl/retrogenedb. Database URL: http://yeti.amu.edu.pl/retrogenedb. Secondary database URL: http://rhesus.amu.edu.pl/retrogenedb
Affiliation(s)
- Wojciech Rosikiewicz
- Department of Integrative Genomics, Institute of Anthropology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
- Michal Kabza
- Department of Integrative Genomics, Institute of Anthropology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
- Jan G Kosinski
- Department of Integrative Genomics, Institute of Anthropology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
- Joanna Ciomborowska-Basheer
- Department of Integrative Genomics, Institute of Anthropology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
- Magdalena R Kubiak
- Department of Integrative Genomics, Institute of Anthropology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
- Izabela Makalowska
- Department of Integrative Genomics, Institute of Anthropology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
|
16
|
Catania F. From intronization to intron loss: How the interplay between mRNA-associated processes can shape the architecture and the expression of eukaryotic genes. Int J Biochem Cell Biol 2017; 91:136-144. [PMID: 28673893 DOI: 10.1016/j.biocel.2017.06.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/14/2017] [Revised: 06/25/2017] [Accepted: 06/30/2017] [Indexed: 12/29/2022]
Abstract
Transcription-coupled processes such as capping, splicing, and cleavage/polyadenylation participate in the journey from genes to proteins. Although they are traditionally thought to serve only as steps in the generation of mature mRNAs, a synthesis of available data indicates that these processes could also act as a driving force for the evolution of eukaryotic genes. A theoretical framework for how mRNA-associated processes may shape gene structure and expression has recently been proposed. Factors that promote splicing and cleavage/polyadenylation in this framework compete for access to overlapping or neighboring signals throughout the transcription cycle. These antagonistic interactions allow mechanisms for intron gain and splice site recognition as well as common trends in eukaryotic gene structure and expression to be coherently integrated. Here, I extend this framework further. Observations that largely (but not exclusively) revolve around the formation of DNA-RNA hybrid structures, called R loops, and promoter directionality are integrated. Additionally, the interplay between splicing factors and cleavage/polyadenylation factors is theorized to also affect the formation of intragenic DNA double-stranded breaks thereby contributing to intron loss. The most notable prediction in this proposition is that RNA molecules can mediate intron loss by serving as a template to repair DNA double-stranded breaks. The framework presented here leverages a vast body of empirical observations, logically extending previous suggestions, and generating verifiable predictions to further substantiate the view that the intracellular environment plays an active role in shaping the structure and the expression of eukaryotic genes.
Affiliation(s)
- Francesco Catania
- Institute for Evolution and Biodiversity, University of Münster, Hüfferstraße 1, 48149 Münster, Germany.
|
17
|
Exploring the Impact of Cleavage and Polyadenylation Factors on Pre-mRNA Splicing Across Eukaryotes. G3-GENES GENOMES GENETICS 2017; 7:2107-2114. [PMID: 28500052 PMCID: PMC5499120 DOI: 10.1534/g3.117.041483] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 12/12/2022]
Abstract
In human, mouse, and Drosophila, the spliceosomal complex U1 snRNP (U1) protects transcripts from premature cleavage and polyadenylation at proximal intronic polyadenylation signals (PAS). These U1-mediated effects preserve transcription integrity, and are known as telescripting. The watchtower role of U1 throughout transcription is clear. What is less clear is whether cleavage and polyadenylation factors (CPFs) are simply patrolled or if they might actively antagonize U1 recruitment. In addressing this question, we found that, in the introns of human, mouse, and Drosophila, and of 14 other eukaryotes, including multi- and single-celled species, the conserved AATAAA PAS—a major target for CPFs—is selected against. This selective pressure, approximated using DNA strand asymmetry, is detected for peripheral and internal introns alike. Surprisingly, it is more pronounced within—rather than outside—the action range of telescripting, and particularly intense in the vicinity of weak 5′ splice sites. Our study uncovers a novel feature of eukaryotic genes: that the AATAAA PAS is universally counter-selected in spliceosomal introns. This pattern implies that CPFs may attempt to access introns at any time during transcription. However, natural selection operates to minimize this access. By corroborating and extending previous work, our study further indicates that CPF access to intronic PASs might perturb the recruitment of U1 to the adjacent 5′ splice sites. These results open the possibility that CPFs may impact the splicing process across eukaryotes.
|
18
|
Casola C, Betrán E. The Genomic Impact of Gene Retrocopies: What Have We Learned from Comparative Genomics, Population Genomics, and Transcriptomic Analyses? Genome Biol Evol 2017; 9:1351-1373. [PMID: 28605529 PMCID: PMC5470649 DOI: 10.1093/gbe/evx081] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Accepted: 05/18/2017] [Indexed: 02/07/2023]
Abstract
Gene duplication is a major driver of organismal evolution. Gene retroposition is a mechanism of gene duplication whereby a gene's transcript is used as a template to generate retroposed gene copies, or retrocopies. Intriguingly, the formation of retrocopies depends upon the enzymatic machinery encoded by retrotransposable elements, genomic parasites occurring in the majority of eukaryotes. Most retrocopies are depleted of the regulatory regions found upstream of their parental genes; therefore, they were initially considered transcriptionally incompetent gene copies, or retropseudogenes. However, examples of functional retrocopies, or retrogenes, have accumulated since the 1980s. Here, we review what we have learned about retrocopies in animals, plants and other eukaryotic organisms, with a particular emphasis on comparative and population genomic analyses complemented with transcriptomic datasets. In addition, these data have provided information about the dynamics of the different "life cycle" stages of retrocopies (i.e., polymorphic retrocopy number variants, fixed retropseudogenes and retrogenes) and have provided key insights into the retroduplication mechanisms, the patterns and evolutionary forces at work during the fixation process and the biological function of retrogenes. Functional genomic and transcriptomic data have also revealed that many retropseudogenes are transcriptionally active and a biological role has been experimentally determined for many. Finally, we have learned that not only non-long terminal repeat retroelements but also long terminal repeat retroelements play a role in the emergence of retrocopies across eukaryotes. This body of work has shown that mRNA-mediated duplication represents a widespread phenomenon that produces an array of new genes that contribute to organismal diversity and adaptation.
Affiliation(s)
- Claudio Casola
- Department of Ecosystem Science and Management, Texas A&M University, TX
- Esther Betrán
- Department of Biology, University of Texas at Arlington, Arlington, TX
|
19
|
Protein-Coding Genes' Retrocopies and Their Functions. Viruses 2017; 9:v9040080. [PMID: 28406439 PMCID: PMC5408686 DOI: 10.3390/v9040080] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 02/25/2017] [Revised: 04/07/2017] [Accepted: 04/11/2017] [Indexed: 12/11/2022]
Abstract
Transposable elements, often considered to be not important for survival, significantly contribute to the evolution of transcriptomes, promoters, and proteomes. Reverse transcriptase, encoded by some transposable elements, can be used in trans to produce a DNA copy of any RNA molecule in the cell. The retrotransposition of protein-coding genes requires the presence of reverse transcriptase, which could be delivered by either non-long terminal repeat (non-LTR) or LTR transposons. The majority of these copies are in a state of “relaxed” selection and remain “dormant” because they are lacking regulatory regions; however, many become functional. In the course of evolution, they may undergo subfunctionalization, neofunctionalization, or replace their progenitors. Functional retrocopies (retrogenes) can encode proteins, novel or similar to those encoded by their progenitors, can be used as alternative exons or create chimeric transcripts, and can also be involved in transcriptional interference and participate in the epigenetic regulation of parental gene expression. They can also act in trans as natural antisense transcripts, microRNA (miRNA) sponges, or a source of various small RNAs. Moreover, many retrocopies of protein-coding genes are linked to human diseases, especially various types of cancer.
|
20
|
Ma MY, Lan XR, Niu DK. Intron gain by tandem genomic duplication: a novel case in a potato gene encoding RNA-dependent RNA polymerase. PeerJ 2016; 4:e2272. [PMID: 27547574 PMCID: PMC4974935 DOI: 10.7717/peerj.2272] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/16/2015] [Accepted: 06/29/2016] [Indexed: 01/15/2023]
Abstract
The origin and subsequent accumulation of spliceosomal introns are prominent events in the evolution of eukaryotic gene structure. However, the mechanisms underlying intron gain remain unclear because there are few proven cases of recently gained introns. In an RNA-dependent RNA polymerase (RdRp) gene, we found that a tandem duplication occurred after the divergence of potato and its wild relatives among other Solanum plants. The duplicated sequence crosses the intron-exon boundary of the first intron and the second exon. A new intron was detected at this duplicated region, and it includes a small previously exonic segment of the upstream copy of the duplicated sequence and the intronic segment of the downstream copy of the duplicated sequence. The donor site of this new intron was directly obtained from the small previously exonic segment. Most of the splicing signals were inherited directly from the parental intron/exon structure, including a putative branch site, the polypyrimidine tract, the 3' splicing site, two putative exonic splicing enhancers, and the difference in GC content between the intron and exon. In the widely cited model of intron gain by tandem genomic duplication, the duplication of an AGGT-containing exonic segment provides the GT and AG splicing sites for the new intron. Our results illustrate that intron gain by tandem duplication can obtain the proper splicing signals in more than one way.
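The widely cited AGGT model mentioned in this abstract can be illustrated with a small sketch. It is not taken from the paper and does not reproduce the potato RdRp case, which the authors describe as departing from this model. The toy sequence, the coordinates, and the function names `tandem_duplicate` and `splice_between_copies` are all hypothetical. The sketch only shows why splicing from the GT of the upstream copy to the AG of the downstream copy restores the original exon sequence.

```python
def tandem_duplicate(seq: str, start: int, end: int) -> str:
    """Return seq with seq[start:end] duplicated in tandem."""
    return seq[:end] + seq[start:end] + seq[end:]


def splice_between_copies(dup_seq: str, start: int, end: int):
    """Splice from the GT of the upstream copy to the AG of the downstream copy.

    start and end delimit the upstream copy inside dup_seq; the downstream
    copy occupies the next (end - start) bases.
    """
    segment = dup_seq[start:end]
    motif = segment.find("AGGT")
    if motif < 0:
        raise ValueError("duplicated segment has no AGGT motif")
    donor = start + motif + 2         # first base of GT in the upstream copy
    acceptor_end = end + motif + 2    # one past AG in the downstream copy
    intron = dup_seq[donor:acceptor_end]
    mrna = dup_seq[:donor] + dup_seq[acceptor_end:]
    return intron, mrna


# Toy exon (hypothetical); the duplicated segment exon[4:13] contains AGGT.
exon = "ATGCCAAGGTTTGACCTGA"
dup = tandem_duplicate(exon, 4, 13)
intron, spliced = splice_between_copies(dup, 4, 13)
print(intron)            # new intron, bounded by GT ... AG
print(spliced == exon)   # splicing restores the pre-duplication sequence
```

Because the removed intron has exactly the length of the duplicated segment, the reading frame of the spliced product is unchanged in this idealized model.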
Affiliation(s)
- Ming-Yue Ma
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
- Xin-Ran Lan
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
- Deng-Ke Niu
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
|
21
|
Jąkalski M, Takeshita K, Deblieck M, Koyanagi KO, Makałowska I, Watanabe H, Makałowski W. Comparative genomic analysis of retrogene repertoire in two green algae Volvox carteri and Chlamydomonas reinhardtii. Biol Direct 2016; 11:35. [PMID: 27487948 PMCID: PMC4972966 DOI: 10.1186/s13062-016-0138-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 06/16/2016] [Accepted: 07/27/2016] [Indexed: 01/08/2023]
Abstract
Background: Retroposition, one of the processes of copying the genetic material, is an important RNA-mediated mechanism leading to the emergence of new genes. Because the transcription controlling segments are usually not copied to the new location in this mechanism, the duplicated gene copies (retrocopies) become pseudogenized. However, a few can still survive, e.g. by recruiting novel regulatory elements from the region of insertion. Subsequently, these duplicated genes can contribute to the formation of lineage-specific traits and phenotypic diversity. Despite the numerous studies of the functional retrocopies (retrogenes) in animals and plants, very little is known about their presence in green algae, including morphologically diverse species. The current availability of the genomes of both uni- and multicellular algae provides a good opportunity to conduct a genome-wide investigation in order to fill the knowledge gap regarding the retroposition phenomenon in this lineage. Results: Here we present a comparative genomic analysis of uni- and multicellular algae, Chlamydomonas reinhardtii and Volvox carteri, respectively, to explore their retrogene complements. By adopting a computational approach, we identified 141 retrogene candidates in total in both genomes, with their fraction being significantly higher in the multicellular Volvox. The majority of the retrogene candidates showed signatures of functional constraints, thus indicating their functionality. Detailed analyses of the identified retrogene candidates, their parental genes, and homologs of both, revealed that most of the retrogene candidates were derived from ancient retroposition events in the common ancestor of the two algae and that the parental genes were subsequently lost from the respective lineages, making many retrogenes ‘orphan’.
Conclusions: We revealed that the genomes of the green algae have maintained many possibly functional retrogenes in spite of experiencing various molecular evolutionary events during a long evolutionary time after the retroposition events. Our first report about the retrogene set in the green algae provides a good foundation for any future investigation of the repertoire of retrogenes and facilitates the assessment of the evolutionary impact of retroposition on diverse morphological traits in this lineage. Reviewers: This article was reviewed by William Martin and Piotr Zielenkiewicz. Electronic supplementary material: The online version of this article (doi:10.1186/s13062-016-0138-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Marcin Jąkalski
- Institute of Bioinformatics, Faculty of Medicine, University of Muenster, 48149, Muenster, Germany
- Kazutaka Takeshita
- Graduate School of Information Science and Technology, Hokkaido University, Sapporo, 060-0814, Japan; Present address: Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST) Hokkaido, Sapporo, 062-8517, Japan
- Mathieu Deblieck
- Institute of Bioinformatics, Faculty of Medicine, University of Muenster, 48149, Muenster, Germany; Present address: Julius Kühn-Institute, Institute for Resistance Research and Stress Tolerance, 06484, Quedlinburg, Germany
- Kanako O Koyanagi
- Graduate School of Information Science and Technology, Hokkaido University, Sapporo, 060-0814, Japan
- Izabela Makałowska
- Department of Bioinformatics, Faculty of Biology, Adam Mickiewicz University, 61-614, Poznań, Poland
- Hidemi Watanabe
- Graduate School of Information Science and Technology, Hokkaido University, Sapporo, 060-0814, Japan
- Wojciech Makałowski
- Institute of Bioinformatics, Faculty of Medicine, University of Muenster, 48149, Muenster, Germany.
|
22
|
Carelli FN, Hayakawa T, Go Y, Imai H, Warnefors M, Kaessmann H. The life history of retrocopies illuminates the evolution of new mammalian genes. Genome Res 2016; 26:301-14. [PMID: 26728716 PMCID: PMC4772013 DOI: 10.1101/gr.198473.115] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 08/18/2015] [Accepted: 12/21/2015] [Indexed: 02/03/2023]
Abstract
New genes contribute substantially to adaptive evolutionary innovation, but the functional evolution of new mammalian genes has been little explored at a broad scale. Previous work established mRNA-derived gene duplicates, known as retrocopies, as models for the study of new gene origination. Here we combine mammalian transcriptomic and epigenomic data to unveil the processes underlying the evolution of stripped-down retrocopies into complex new genes. We show that although some robustly expressed retrocopies are transcribed from preexisting promoters, most evolved new promoters from scratch or recruited proto-promoters in their genomic vicinity. In particular, many retrocopy promoters emerged from ancestral enhancers (or bivalent regulatory elements) or are located in CpG islands not associated with other genes. We detected 88–280 selectively preserved retrocopies per mammalian species, illustrating that these mechanisms facilitated the birth of many functional retrogenes during mammalian evolution. The regulatory evolution of originally monoexonic retrocopies was frequently accompanied by exon gain, which facilitated co-option of distant promoters and allowed expression of alternative isoforms. While young retrogenes are often initially expressed in the testis, increased regulatory and structural complexities allowed retrogenes to functionally diversify and evolve somatic organ functions, sometimes as complex as those of their parents. Thus, some retrogenes evolved the capacity to temporarily substitute for their parents during the process of male meiotic X inactivation, while others rendered parental functions superfluous, allowing for parental gene loss. Overall, our reconstruction of the “life history” of mammalian retrogenes highlights retroposition as a general model for understanding new gene birth and functional evolution.
Affiliation(s)
- Francesco Nicola Carelli
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Takashi Hayakawa
- Department of Wildlife Science (Nagoya Railroad Company, Limited), Primate Research Institute, Kyoto University, Inuyama, Aichi 484-8506, Japan; Japan Monkey Center, Inuyama, Aichi 484-0081, Japan
- Yasuhiro Go
- Department of Brain Sciences, Center for Novel Science Initiatives, National Institutes of Natural Sciences, Okazaki, Aichi 444-8585, Japan; Department of Developmental Physiology, National Institute for Physiological Sciences, Okazaki, Aichi 444-8585, Japan; Department of Physiological Sciences, School of Life Science, SOKENDAI (The Graduate University for Advanced Studies), Okazaki, Aichi 484-8585, Japan
- Hiroo Imai
- Department of Cellular and Molecular Biology, Primate Research Institute, Kyoto University, Inuyama, Aichi 484-8506, Japan
- Maria Warnefors
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Henrik Kaessmann
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
|
23
|
Gingerich TJ, Stumpo DJ, Lai WS, Randall TA, Steppan SJ, Blackshear PJ. Emergence and evolution of Zfp36l3. Mol Phylogenet Evol 2015; 94:518-530. [PMID: 26493225 DOI: 10.1016/j.ympev.2015.10.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 06/26/2015] [Revised: 10/06/2015] [Accepted: 10/13/2015] [Indexed: 11/19/2022]
Abstract
In most mammals, the Zfp36 gene family consists of three conserved members, with a fourth member, Zfp36l3, present only in rodents. The ZFP36 proteins regulate post-transcriptional gene expression at the level of mRNA stability in organisms from humans to yeasts, and appear to be expressed in all major groups of eukaryotes. In Mus musculus, Zfp36l3 expression is limited to the placenta and yolk sac, and is important for overall fecundity. We sequenced the Zfp36l3 gene from more than 20 representative species, from members of the Muridae, Cricetidae and Nesomyidae families. Zfp36l3 was not present in Dipodidae, or any families that branched earlier, indicating that this gene is exclusive to the Muroidea superfamily. We provide evidence that Zfp36l3 arose by retrotransposition of an mRNA encoded by a related gene, Zfp36l2, into an ancestral rodent X chromosome. Zfp36l3 has evolved rapidly since its origin, and numerous modifications have developed, including variations in start codon utilization, de novo intron formation by mechanisms including a nested retrotransposition, and the insertion of distinct repetitive regions. One of these repeat regions, a long alanine-rich sequence, is responsible for the full-time cytoplasmic localization of Mus musculus ZFP36L3. In contrast, this repeat sequence is lacking in Peromyscus maniculatus ZFP36L3, and this protein contains a novel nuclear export sequence that controls shuttling between the nucleus and cytosol. Zfp36l3 is an example of a recently acquired, rapidly evolving gene, and its various orthologues illustrate several different mechanisms by which new genes emerge and evolve.
Affiliation(s)
- Timothy J Gingerich
- Laboratory of Signal Transduction, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
- Deborah J Stumpo
- Laboratory of Signal Transduction, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
- Wi S Lai
- Laboratory of Signal Transduction, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
- Thomas A Randall
- Integrative Bioinformatics, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
- Scott J Steppan
- Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
- Perry J Blackshear
- Laboratory of Signal Transduction, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA; Department of Medicine, Duke University Medical Center, Durham, NC 27710, USA; Department of Biochemistry, Duke University Medical Center, Durham, NC 27710, USA.
|
24
|
Inter-population Differences in Retrogene Loss and Expression in Humans. PLoS Genet 2015; 11:e1005579. [PMID: 26474060 PMCID: PMC4608704 DOI: 10.1371/journal.pgen.1005579] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 06/03/2015] [Accepted: 09/15/2015] [Indexed: 11/19/2022]
Abstract
Gene retroposition leads to considerable genetic variation between individuals. Recent studies revealed the presence of at least 208 retroduplication variations (RDVs), a class of polymorphisms, in which a retrocopy is present or absent from individual genomes. Most of these RDVs resulted from recent retroduplications. In this study, we used the results of Phase 1 from the 1000 Genomes Project to investigate the variation in loss of ancestral (i.e. shared with other primates) retrocopies among different human populations. In addition, we examined retrocopy expression levels using RNA-Seq data derived from the Ilumina BodyMap project, as well as data from lymphoblastoid cell lines provided by the Geuvadis Consortium. We also developed a new approach to detect novel retrocopies absent from the reference human genome. We experimentally confirmed the existence of the detected retrocopies and determined their presence or absence in the human genomes of 17 different populations. Altogether, we were able to detect 193 RDVs; the majority resulted from retrocopy deletion. Most of these RDVs had not been previously reported. We experimentally confirmed the expression of 11 ancestral retrogenes that underwent deletion in certain individuals. The frequency of their deletion, with the exception of one retrogene, is very low. The expression, conservation and low rate of deletion of the remaining 10 retrocopies may suggest some functionality. Aside from the presence or absence of expressed retrocopies, we also searched for differences in retrocopy expression levels between populations, finding 9 retrogenes that undergo statistically significant differential expression.
25
Nowak DM, Gajecka M. Nonrandom Distribution of miRNAs Genes and Single Nucleotide Variants in Keratoconus Loci. PLoS One 2015; 10:e0132143. [PMID: 26176855 PMCID: PMC4503774 DOI: 10.1371/journal.pone.0132143] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 03/16/2015] [Accepted: 06/10/2015] [Indexed: 12/14/2022]
Abstract
Despite numerous studies, the causes of both the development and the progression of keratoconus remain elusive. Previous studies of this disorder focused mainly on only one or two genetic factors. However, in the analysis of such complex diseases all potential factors should be taken into consideration. The purpose of this study was a comprehensive analysis of known keratoconus loci to uncover genetic factors involved in causing this disease in the general population that may have been overlooked in the original studies. In this investigation, genomic data available in various databases and our own experimental data were assessed. The lists of single nucleotide variants and miRNA genes localized in reported keratoconus loci were obtained from Ensembl and miRBase, respectively. The potential impact of nonsynonymous amino acid substitutions on protein structure and function was assessed with PolyPhen-2 and SIFT. Selected protein-coding genes were ranked to choose those most promising for keratoconus development. Ranking results were based on topological features in the protein-protein interaction network. High specificity for the populations in which the causative sequence variants have been identified was found. In addition, the possibility of links between previously analyzed keratoconus loci was confirmed, including miRNA-gene interactions. The number of identified genes associated with oxidative stress and inflammatory agents corroborated the hypothesis of their effect on the disease etiology. The distribution of numerous sequence variants within both exons and mature miRNAs calls for a broader look at the determinants of keratoconus. Our findings highlight the complexity of keratoconus genetics.
Affiliation(s)
- Dorota M. Nowak
- Department of Genetics and Pharmaceutical Microbiology, Poznan University of Medical Sciences, Poznan, Poland
- Institute of Human Genetics, Polish Academy of Sciences, Poznan, Poland
- Marzena Gajecka
- Department of Genetics and Pharmaceutical Microbiology, Poznan University of Medical Sciences, Poznan, Poland
- Institute of Human Genetics, Polish Academy of Sciences, Poznan, Poland
26
Catania F, Schmitz J. On the path to genetic novelties: insights from programmed DNA elimination and RNA splicing. Wiley Interdiscip Rev RNA 2015; 6:547-61. [PMID: 26140477 DOI: 10.1002/wrna.1293] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 01/22/2015] [Revised: 04/29/2015] [Accepted: 06/06/2015] [Indexed: 12/17/2022]
Abstract
Understanding how genetic novelties arise is a central goal of evolutionary biology. To this end, programmed DNA elimination and RNA splicing deserve special consideration. While programmed DNA elimination reshapes genomes by eliminating chromatin during organismal development, RNA splicing rearranges genetic messages by removing intronic regions during transcription. Small RNAs help to mediate this class of sequence reorganization, which is not error-free. It is this imperfection that makes programmed DNA elimination and RNA splicing excellent candidates for generating evolutionary novelties. Leveraging a number of these two processes' mechanistic and evolutionary properties, which have been uncovered over the past years, we present recently proposed models and empirical evidence for how splicing can shape the structure of protein-coding genes in eukaryotes. We also chronicle a number of intriguing similarities between the processes of programmed DNA elimination and RNA splicing, and highlight the role that the variation in the population-genetic environment may play in shaping their target sequences.
Affiliation(s)
- Francesco Catania
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Jürgen Schmitz
- Institute of Experimental Pathology (ZMBE), University of Münster, Münster, Germany
27
Corbett MA, Dudding-Byth T, Crock PA, Botta E, Christie LM, Nardo T, Caligiuri G, Hobson L, Boyle J, Mansour A, Friend KL, Crawford J, Jackson G, Vandeleur L, Hackett A, Tarpey P, Stratton MR, Turner G, Gécz J, Field M. A novel X-linked trichothiodystrophy associated with a nonsense mutation in RNF113A. J Med Genet 2015; 52:269-74. [DOI: 10.1136/jmedgenet-2014-102418] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 01/11/2023]
28
Zhan L, Meng Q, Chen R, Yue Y, Jin Y. Origin and evolution of a new retained intron on the vulcan gene in Drosophila melanogaster subgroup species. Genome 2015; 57:567-72. [PMID: 25723758 DOI: 10.1139/gen-2014-0132] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
Although numerous intron gains have been discovered, the mechanisms of intron creation have proven to be elusive. A previous study revealed that the vulcan gene of Drosophila melanogaster contained four exons in its coding region. In the current study, a newly created intron (Intron L) was identified in exon 2 of vulcan in D. melanogaster by comparing expressed sequence tags. The RT-PCR experiment revealed that Intron L was associated with intron retention, in which two alternative transcripts of the gene differ by the inclusion or removal of an intron. It was found that Intron L was created by intronization of exonic sequence, and its donor and acceptor splice sites were created by synonymous mutation, leading to the origin of a new vulcan protein that is 22 amino acids shorter than the previously reported vulcan protein. Moreover, to track the origin of Intron L, 36 orthologous genes of Drosophila species were cloned or annotated, and phylogenetic analysis was carried out. It indicated that Intron L was created in the common ancestor of the D. melanogaster subgroup species about 15 million years ago.
Affiliation(s)
- Leilei Zhan
- Institute of Biochemistry, College of Life Sciences, Zhejiang University (Zijingang Campus), Hangzhou, Zhejiang, ZJ310058, P.R. of China
29
Zhou K, Kuo A, Grigoriev IV. Reverse transcriptase and intron number evolution. Stem Cell Investig 2014; 1:17. [PMID: 27358863 DOI: 10.3978/j.issn.2306-9759.2014.08.01] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/15/2014] [Accepted: 08/04/2014] [Indexed: 11/14/2022]
Abstract
BACKGROUND Introns are universal in eukaryotic genomes and play important roles in transcriptional regulation, mRNA export to the cytoplasm, nonsense-mediated decay as both a regulatory and a splicing quality control mechanism, R-loop avoidance, alternative splicing, chromatin structure, and evolution by exon-shuffling. METHODS Sixteen complete fungal genomes were used, 13 of which were sequenced and annotated by JGI. Ustilago maydis, Cryptococcus neoformans, and Coprinus cinereus (also named Coprinopsis cinerea) were from the Broad Institute. Gene models from JGI-annotated genomes were taken from the GeneCatalog track that contained the best representative gene models. Varying fractions of the GeneCatalog were manually curated by external users. For clarity, we used the JGI unique database identifier. RESULTS The last common ancestor of eukaryotes (LECA) has an estimated 6.4 coding exons per gene (EPG) and evolved into the diverse eukaryotic life forms, which is recapitulated by the development of a stem cell. We found a parallel between the simulated reverse transcriptase (RT)-mediated intron loss and the comparative analysis of 16 fungal genomes that spanned a wide range of intron density. Although footprints of RT (RTF) were dynamic, relative intron location (RIL) to the 5'-end of mRNA faithfully traced RT-mediated intron loss and revealed 7.7 EPG for LECA. The mode of exon length distribution was conserved in simulated intron loss, which was exemplified by the shared mode of 75 nt between fungal and Chlamydomonas genomes. The dominant ancient exon length was corroborated by the average exon length of the most intron-rich genes in fungal genomes and consistent with ancient protein modules being ~25 aa. Combined with the conservation of a protein length of 400 aa, the earliest ancestor of eukaryotes could have had 16 EPG. During earlier evolution, Ascomycota's ancestor had significantly more 3'-biased RT-mediated intron loss that was followed by dramatic RTF loss. There was a downward trend of EPG from more conserved to less conserved genes. Moreover, species-specific genes have higher exon densities, shorter exons, and longer introns when compared to genes conserved at the phylum level. However, intron length in species-specific genes became shorter than that of genes conserved in all species after genomes experienced drastic intron loss. The estimated EPG from the most frequent exon length is more than double that from the RIL method. CONCLUSIONS This implies significant intron loss during the very early period of eukaryotic evolution. De novo gene-birth contributes to shorter exons, longer introns, and higher exon-density in species-specific genes relative to conserved genes.
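The 16 EPG figure quoted in this abstract appears consistent with simple arithmetic: a 400 aa protein needs 1,200 coding nucleotides, and dividing by the modal exon length of 75 nt gives 16 exons. The sketch below only reproduces that arithmetic; the function name is illustrative and not taken from the paper, and ignoring the stop codon and untranslated regions is an assumption.

```python
def exons_per_gene(protein_length_aa: int, exon_length_nt: int) -> float:
    """Estimate coding exons per gene (EPG) from protein and exon length.

    Assumption: the coding sequence is exactly 3 nt per amino acid,
    with no stop codon or untranslated regions counted.
    """
    coding_nt = protein_length_aa * 3  # three nucleotides per codon
    return coding_nt / exon_length_nt


# Values quoted in the abstract: 400 aa proteins, 75 nt modal exon length.
print(exons_per_gene(400, 75))

# A 75 nt exon encodes 25 amino acids, matching the ~25 aa module size.
print(75 // 3)
```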
Affiliation(s)
- Kemin Zhou
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Alan Kuo
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Igor V Grigoriev
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
30
Abstract
Retrocopies of protein-coding genes, that is, copies of mature RNA that have been reverse transcribed and inserted into the genome, have commonly been categorized as pseudogenes with no biological importance. However, recent studies showed that they play an important role in genome evolution and in shaping interspecies differences. Here, we present RetrogeneDB, a database of retrocopies in 62 animal genomes. RetrogeneDB contains information about retrocopies, their genomic localization, parental genes, ORF conservation, and expression. To the best of our knowledge, this is the most complete retrocopy database, providing information for dozens of species never previously analyzed in the context of protein-coding gene retroposition. The database is available at http://retrogenedb.amu.edu.pl.
Affiliation(s)
- Michał Kabza
- Laboratory of Bioinformatics, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Joanna Ciomborowska
- Laboratory of Bioinformatics, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Izabela Makałowska
- Laboratory of Bioinformatics, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
31
Zhu T, Niu DK. Mechanisms of intron loss and gain in the fission yeast Schizosaccharomyces. PLoS One 2013; 8:e61683. [PMID: 23613904 PMCID: PMC3629103 DOI: 10.1371/journal.pone.0061683] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 01/17/2013] [Accepted: 03/13/2013] [Indexed: 11/24/2022]
Abstract
The fission yeast, Schizosaccharomyces pombe, is an important model species with a low intron density. Previous studies showed extensive intron losses during its evolution. To test the models of intron loss and gain in fission yeasts, we conducted a comparative genomic analysis in four Schizosaccharomyces species. Both intronization and de-intronization were observed, although both were at a low frequency. A de-intronization event was caused by a degenerative mutation in the branch site. Four cases of imprecise intron losses were identified, indicating that genomic deletion is not a negligible mechanism of intron loss. Most intron losses were precise deletions of introns, and were significantly biased to the 3′ sides of genes. Adjacent introns tended to be lost simultaneously. These observations indicated that the main force shaping the exon-intron structures of fission yeasts was precise intron losses mediated by reverse transcriptase. We found two cases of intron gains caused by tandem genomic duplication, but failed to identify the mechanisms for the majority of the intron gain events observed. In addition, we found that intron-lost and intron-gained genes had certain similar features, such as similar Gene Ontology categories and expression levels.
Affiliation(s)
- Tao Zhu
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
- Deng-Ke Niu
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and Beijing Key Laboratory of Gene Resource and Molecular Development, College of Life Sciences, Beijing Normal University, Beijing, China
32
Janice J, Jąkalski M, Makałowski W. Surprisingly high number of Twintrons in vertebrates. Biol Direct 2013; 8:4. [PMID: 23356793 PMCID: PMC3564746 DOI: 10.1186/1745-6150-8-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/07/2012] [Accepted: 01/22/2013] [Indexed: 11/10/2022]
Abstract
Twintrons represent a special intronic arrangement in which introns of two different types occupy the same gene position. Consequently, alternative splicing of these introns requires two different spliceosomes competing for the same RNA molecule. So far, only two twintrons have been described in insects. Surprisingly, we discovered several such arrangements in vertebrate genomes, which are quite conserved throughout the lineages.
Affiliation(s)
- Jessin Janice
- Institute of Bioinformatics, Faculty of Medicine, University of Muenster, Niels Stensen Strasse 14, Muenster 48149, Germany
33
Ciomborowska J, Rosikiewicz W, Szklarczyk D, Makałowski W, Makałowska I. "Orphan" retrogenes in the human genome. Mol Biol Evol 2012; 30:384-96. [PMID: 23066043 PMCID: PMC3548309 DOI: 10.1093/molbev/mss235] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Indexed: 12/16/2022]
Abstract
Gene duplicates generated via retroposition were long thought to be pseudogenized and consequently decayed. However, a significant number of these genes escaped their evolutionary destiny and evolved into functional genes. Despite multiple studies, the number of functional retrogenes in human and other genomes remains unclear. We performed a comparative analysis of human, chicken, and worm genomes to identify “orphan” retrogenes, that is, retrogenes that have replaced their progenitors. We located 25 such candidates in the human genome. All of these genes were previously known, and the majority have been intensively studied. Despite this, they have never been recognized as retrogenes. Analysis revealed that the phenomenon of replacing parental genes with their retrocopies has been taking place over the entire span of animal evolution. This process was often species specific and contributed to interspecies differences. Surprisingly, these retrogenes, which should evolve in a more relaxed mode, are subject to very strong purifying selection, which is, on average, two and a half times stronger than that acting on other human genes. In addition, they do not show the overall tendency toward testis-specific expression that is typical of retrogenes. Notably, seven of them are associated with human diseases. Recognizing them as “orphan” retrocopies, which have different regulatory machinery than their parents, is important for any disease studies in model organisms, especially when discoveries made in one species are transferred to humans.
Affiliation(s)
- Joanna Ciomborowska
- Laboratory of Bioinformatics, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
34
Yenerall P, Zhou L. Identifying the mechanisms of intron gain: progress and trends. Biol Direct 2012; 7:29. [PMID: 22963364 PMCID: PMC3443670 DOI: 10.1186/1745-6150-7-29] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Received: 05/23/2012] [Accepted: 08/22/2012] [Indexed: 12/22/2022]
Abstract
Continued improvements in Next-Generation DNA/RNA sequencing coupled with advances in gene annotation have provided researchers access to a plethora of annotated genomes. Subsequent analyses of orthologous gene structures have identified numerous intron gain and loss events that have occurred both recently and in the very distant past. This research has afforded exceptional insight into the temporal and lineage-specific rates of intron gain and loss among various species throughout evolution. Numerous studies have also attempted to identify the molecular mechanisms of intron gain and loss. However, even after considerable effort, very little is known about these processes. In particular, the mechanism(s) of intron gain have proven exceptionally enigmatic and remain topics of considerable debate. Currently, there exists no definitive consensus as to what mechanism(s) may generate introns. Because many introns are known to affect gene expression, it is necessary to understand the molecular process(es) by which introns may be gained. Here we review the seven most commonly purported mechanisms of intron gain and, when possible, summarize molecular evidence for or against the occurrence of each of these mechanisms. Furthermore, we catalogue indirect evidence that supports the occurrence of each mechanism. Finally, because these proposed mechanisms fail to explain the mechanistic origin of many recently gained introns, we also look at trends that may aid researchers in identifying other potential mechanism(s) of intron gain. Reviewers: This article was reviewed by Eugene Koonin, Scott Roy (nominated by W. Ford Doolittle), and John Logsdon.
Affiliation(s)
- Paul Yenerall
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
35
Kang LF, Zhu ZL, Zhao Q, Chen LY, Zhang Z. Newly evolved introns in human retrogenes provide novel insights into their evolutionary roles. BMC Evol Biol 2012; 12:128. [PMID: 22839428 PMCID: PMC3565874 DOI: 10.1186/1471-2148-12-128] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 03/30/2012] [Accepted: 07/19/2012] [Indexed: 01/14/2023]
Abstract
BACKGROUND Retrogenes generally do not contain introns. However, in some instances, retrogenes may recruit internal exonic sequences as introns, which is known as intronization. A retrogene that undergoes intronization is a good model with which to investigate the origin of introns. Nevertheless, previously, only two cases in vertebrates have been reported. RESULTS In this study, we systematically screened the human (Homo sapiens) genome for retrogenes that evolved introns and analyzed their patterns in structure, expression and origin. In total, we identified nine intron-containing retrogenes. Alignment of pairs of retrogenes and their parents indicated that, in addition to intronization (five cases), retrogenes also may have gained introns by insertion of external sequences into the genes (one case) or reversal of the orientation of transcription (three cases). Interestingly, many intronizations were promoted not by base substitutions but by cryptic splice sites, which were silent in the parental genes but active in the retrogenes. We also observed that the majority of introns generated by intronization did not involve frameshifts. CONCLUSIONS Intron gains in retrogenes are not as rare as previously thought. Furthermore, diverse mechanisms may lead to intron creation in retrogenes. The activation of cryptic splice sites in the intronization of retrogenes may be triggered by the change of gene structure after retroposition. A high percentage of non-frameshift introns in retrogenes may be because non-frameshift introns do not dramatically affect host proteins. Introns generated by intronization in human retrogenes are generally young, which is consistent with previous findings for Caenorhabditis elegans. Our results provide novel insights into the evolutionary role of introns.
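The distinction between frameshift and non-frameshift introns mentioned in this abstract can be illustrated with a minimal sketch, assuming the simplest criterion: when a formerly exonic segment is spliced out, the downstream reading frame is preserved only if the segment length is a multiple of three. The function name and the example lengths are hypothetical and are not taken from the study.

```python
def preserves_reading_frame(intron_length_nt: int) -> bool:
    """Return True if removing a segment of this length keeps the frame.

    Assumption: the segment lies entirely within the coding sequence,
    so only its length modulo 3 matters for the downstream frame.
    """
    return intron_length_nt % 3 == 0


# Hypothetical intronized segment lengths, for illustration only.
for length in (66, 100):
    label = "non-frameshift" if preserves_reading_frame(length) else "frameshift"
    print(length, label)
```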
Affiliation(s)
- Li-Fang Kang
- College of Life Sciences, Chongqing University, Chongqing 400044, China
36
Lozano JC, Vergé V, Schatt P, Juengel JL, Peaucellier G. Evolution of cyclin B3 shows an abrupt three-fold size increase, due to the extension of a single exon in placental mammals, allowing for new protein-protein interactions. Mol Biol Evol 2012; 29:3855-71. [PMID: 22826462 DOI: 10.1093/molbev/mss189] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 01/12/2023]
Abstract
Cyclin B3 evolution has the unique peculiarity of an abrupt 3-fold increase of the protein size in the mammalian lineage due to the extension of a single exon. We have analyzed the evolution of the gene to define the modalities of this event and the possible consequences on the function of the protein. Database searches can trace the appearance of the gene to the origin of metazoans. Most introns were already present in early metazoans, and the intron-exon structure as well as the protein size were fairly conserved in invertebrates and nonmammalian vertebrates. Although intron gains are considered as rare events, we identified two cases, one at the prochordate-chordate transition and one in murids, resulting from different mechanisms. At the emergence of mammals, the gene was relocated from chromosome 6 of platypus to the X chromosome in marsupials, but the exon extension occurred only in placental mammals. A repetitive structure of 18 amino acids, of uncertain origin, is detectable in the 3,000-nt mammalian exon-encoded sequence, suggesting an extension by multiple internal duplications, some of which are still detectable in the primate lineage. Structure prediction programs suggest that the repetitive structure has no associated three-dimensional structure but rather a tendency for disorder. Splice variant isoforms were detected in several mammalian species but without conserved pattern, notably excluding the constant coexistence of premammalian-like transcripts, without the extension. The yeast two-hybrid method revealed that, in human, the extension allowed new interactions with ten unrelated proteins, most of them with specific three-dimensional structures involved in protein-protein interactions, and some highly expressed in testis, as is cyclin B3. 
The interactions with activator of cAMP-responsive element modulator in testis (ACT), germ cell-less homolog 1, and chromosome 1 open reading frame 14 remain to be verified in vivo since they may not be expressed at the same stages of spermatogenesis as cyclin B3.
37
Mbanefo EC, Chuanxin Y, Kikuchi M, Shuaibu MN, Boamah D, Kirinoki M, Hayashi N, Chigusa Y, Osada Y, Hamano S, Hirayama K. Origin of a novel protein-coding gene family with similar signal sequence in Schistosoma japonicum. BMC Genomics 2012; 13:260. [PMID: 22716200 PMCID: PMC3434034 DOI: 10.1186/1471-2164-13-260] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/02/2012] [Accepted: 06/11/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Evolution of novel protein-coding genes is the bedrock of adaptive evolution. Recently, we identified six protein-coding genes with a similar signal sequence from Schistosoma japonicum egg stage mRNA using the signal sequence trap (SST). To find the mechanism underlying the origination of these genes with similar core promoter regions and signal sequence, we adopted an integrated approach utilizing whole genome, transcriptome and proteome database BLAST queries, other bioinformatics tools, and molecular analyses. RESULTS Our data, in combination with database analyses, showed evidence of expression of these genes both at the mRNA and protein levels exclusively in all developmental stages of S. japonicum. The signal sequence motif was identified in 27 distinct S. japonicum UniGene entries with multiple mRNA transcripts, and in 34 genome contigs distributed within 18 scaffolds with evidence of genome-wide dispersion. No homolog of these genes or similar domain was found in deposited data from any other organism. We observed a preponderance of flanking repetitive elements (REs), albeit partial copies, especially of the RTE-like and Perere class at either side of the duplication source locus. The role of REs as major mediators of DNA-level recombination leading to dispersive duplication is discussed with evidence from our analyses. We also identified a stepwise pathway towards functional selection in evolving genes by alternative splicing. Equally, the possible transcription models of some protein-coding representatives of the duplicons are presented with evidence of expression in vitro. CONCLUSION Our findings contribute to the accumulating evidence of the role of REs in the generation of evolutionary novelties in organisms' genomes.
Affiliation(s)
- Evaristus Chibunna Mbanefo
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Department of Parasitology and Entomology, Faculty of Bioscience, Nnamdi Azikiwe University, P.M.B. 5025, Awka, Nigeria
- Yu Chuanxin
- Laboratory on Technology for Parasitic Disease Prevention and Control, Jiangsu Institute of Parasitic Diseases, 117 Yangxiang, Meiyuan, Wuxi, 214064, People's Republic of China
- Mihoko Kikuchi
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Mohammed Nasir Shuaibu
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Daniel Boamah
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Masashi Kirinoki
- Laboratory of Tropical Medicine and Parasitology, Dokkyo Medical University, Tochigi, Japan
- Naoko Hayashi
- Laboratory of Tropical Medicine and Parasitology, Dokkyo Medical University, Tochigi, Japan
- Yuichi Chigusa
- Laboratory of Tropical Medicine and Parasitology, Dokkyo Medical University, Tochigi, Japan
- Yoshio Osada
- Department of Immunology and Parasitology, The University of Occupational and Environmental Health, Kitakyushu, Japan
- Shinjiro Hamano
- Department of Parasitology, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
- Kenji Hirayama
- Department of Immunogenetics, Institute of Tropical Medicine (NEKKEN), and Global COE Program, Nagasaki University, 1-12-4 Sakamoto, 852-8523, Nagasaki, Japan
38
Rogozin IB, Carmel L, Csuros M, Koonin EV. Origin and evolution of spliceosomal introns. Biol Direct 2012; 7:11. [PMID: 22507701 PMCID: PMC3488318 DOI: 10.1186/1745-6150-7-11] [Citation(s) in RCA: 245] [Impact Index Per Article: 18.8] [Received: 12/08/2011] [Accepted: 03/15/2012] [Indexed: 12/31/2022]
Abstract
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. 
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information NLM/NIH, 8600 Rockville Pike, Bldg. 38A, Bethesda, MD 20894, USA
|
39
|
Janice J, Pande A, Weiner J, Lin CF, Makałowski W. U12-type spliceosomal introns of Insecta. Int J Biol Sci 2012; 8:344-52. [PMID: 22393306 PMCID: PMC3291851 DOI: 10.7150/ijbs.3933] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 12/08/2011] [Accepted: 01/25/2012] [Indexed: 11/05/2022]
Abstract
Most eukaryotic genes are interrupted by introns that need to be removed from pre-mRNAs before they can perform their function. This is done by a complex machinery called the spliceosome. Many eukaryotes possess two separate spliceosomal systems that process separate sets of introns. The major (U2) spliceosome removes the majority of introns, while a minute fraction of the intron repertoire is processed by the minor (U12) spliceosome. These two populations of introns are called U2-type and U12-type, respectively. The latter fall into two subtypes based on the terminal dinucleotides. The minor spliceosomal system has been lost independently in some lineages, while in others a few U12-type introns persist. We investigated twenty insect genomes in order to better understand the evolutionary dynamics of U12-type introns. Our work confirms a dramatic drop in U12-type introns in Diptera, leaving these genomes with just a handful of cases. This is mostly the result of intron deletion, but in a number of dipteran cases, minor-type introns were switched to the major type as well. Insect genes that harbor U12-type introns belong to several functional categories, among which proteins binding ions and nucleic acids are enriched; these categories are also overrepresented among the genes that preserved minor-type introns in Diptera.
Affiliation(s)
- Jessin Janice
- Institute of Bioinformatics, University of Muenster, Muenster, Germany
|
40
|
Abstract
Most genomes are populated by thousands of sequences that originated from mobile elements. On the one hand, these sequences present a real challenge in the process of genome analysis and annotation. On the other hand, they are very interesting biological subjects involved in many cellular processes. Here, we present an overview of the biodiversity of transposable elements (TEs) and their impact on genomic evolution. Finally, we discuss different approaches to TE detection and analysis.
|
41
|
Czugala M, Karolak JA, Nowak DM, Polakowski P, Pitarque J, Molinari A, Rydzanicz M, Bejjani BA, Yue BYJT, Szaflik JP, Gajecka M. Novel mutation and three other sequence variants segregating with phenotype at keratoconus 13q32 susceptibility locus. Eur J Hum Genet 2011; 20:389-97. [PMID: 22045297 DOI: 10.1038/ejhg.2011.203] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 12/13/2022]
Abstract
Keratoconus (KTCN), a non-inflammatory corneal disorder characterized by stromal thinning, represents a major cause of corneal transplantations. Genetic and environmental factors have a role in the etiology of this complex disease. Previously reported linkage analysis revealed that chromosomal region 13q32 is likely to contain causative gene(s) for familial KTCN. Consequently, we have chosen eight positional candidate genes in this region: MBNL1, IPO5, FARP1, RNF113B, STK24, DOCK9, ZIC5 and ZIC2, and sequenced all of them in 51 individuals from Ecuadorian KTCN families and 105 matching controls. The mutation screening identified one mutation and three sequence variants showing 100% segregation under a dominant model with KTCN phenotype in one large Ecuadorian family. These substitutions were found in three different genes: c.2262A>C (p.Gln754His) and c.720+43A>G in DOCK9; c.2377-132A>C in IPO5 and c.1053+29G>C in STK24. PolyPhen analyses predicted that c.2262A>C (Gln754His) is possibly damaging for the protein function and structure. Our results suggest that c.2262A>C (p.Gln754His) mutation in DOCK9 may contribute to the KTCN phenotype in the large KTCN-014 family.
Affiliation(s)
- Marta Czugala
- Institute of Human Genetics, Polish Academy of Sciences, Strzeszynska 32, Poznan, Poland
|
42
|
Sakai H, Mizuno H, Kawahara Y, Wakimoto H, Ikawa H, Kawahigashi H, Kanamori H, Matsumoto T, Itoh T, Gaut BS. Retrogenes in rice (Oryza sativa L. ssp. japonica) exhibit correlated expression with their source genes. Genome Biol Evol 2011; 3:1357-68. [PMID: 22042334 PMCID: PMC3240961 DOI: 10.1093/gbe/evr111] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Indexed: 01/07/2023]
Abstract
Gene duplication occurs by either DNA- or RNA-based processes; the latter duplicates single genes via retroposition of messenger RNA. The expression of a retroposed gene copy (retrocopy) is expected to be uncorrelated with its source gene because upstream promoter regions are usually not part of the retroposition process. In contrast, DNA-based duplication often encompasses both the coding and the intergenic (promoter) regions; hence, expression is often correlated, at least initially, between DNA-based duplicates. In this study, we identified 150 retrocopies in rice (Oryza sativa L. ssp. japonica), most of which represent ancient retroposition events. We measured their expression from high-throughput RNA sequencing (RNAseq) data generated from seven tissues. At least 66% of the retrocopies were expressed, but at lower levels than their source genes. However, the tissue specificity of retrogenes was similar to that of their source genes, and expression between retrocopies and source genes was correlated across tissues. The level of correlation was similar between RNA- and DNA-based duplicates, and it decreased over time at statistically indistinguishable rates. We extended these observations to previously identified retrocopies in Arabidopsis thaliana, suggesting they may be general features of the process of retention of plant retrogenes.
Affiliation(s)
- Hiroaki Sakai
- Agrogenomics Research Center, National Institute of Agrobiological Sciences, Tsukuba, Ibaraki, Japan
|