1
Wu B, Xu W, Wu K, Li Y, Hu M, Feng C, Zhu C, Zheng J, Cui X, Li J, Fan D, Zhang F, Liu Y, Chen J, Liu C, Li G, Qiu Q, Qu K, Wang W, Wang K. Single-cell analysis of the amphioxus hepatic caecum and vertebrate liver reveals genetic mechanisms of vertebrate liver evolution. Nat Ecol Evol 2024; 8:1972-1990. [PMID: 39152328 DOI: 10.1038/s41559-024-02510-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Accepted: 07/19/2024] [Indexed: 08/19/2024]
Abstract
The evolution of the vertebrate liver is a prime example of the evolution of complex organs, yet the driving genetic factors behind it remain unknown. Here we study the evolutionary genetics of liver by comparing the amphioxus hepatic caecum and the vertebrate liver, as well as examining the functional transition within vertebrates. Using in vivo and in vitro experiments, single-cell/nucleus RNA-seq data and gene knockout experiments, we confirm that the amphioxus hepatic caecum and vertebrate liver are homologous organs and show that the emergence of ohnologues from two rounds of whole-genome duplications greatly contributed to the functional complexity of the vertebrate liver. Two ohnologues, kdr and flt4, play an important role in the development of liver sinusoidal endothelial cells. In addition, we found that liver-related functions such as coagulation and bile production evolved in a step-by-step manner, with gene duplicates playing a crucial role. We reconstructed the genetic footprint of the transfer of haem detoxification from the liver to the spleen during vertebrate evolution. Together, these findings challenge the previous hypothesis that organ evolution is primarily driven by regulatory elements, underscoring the importance of gene duplicates in the emergence and diversification of a complex organ.
Affiliation(s)
- Baosheng Wu
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Institute of Zoology, Guangdong Academy of Sciences, Guangzhou, China
- Wenjie Xu
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Kunjin Wu
  - Key Laboratory of Surgical Critical Care and Life Support (Xi'an Jiaotong University), Ministry of Education, Xi'an, China
  - Department of Hepatobiliary Surgery and Liver Transplantation, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Ye Li
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Mingliang Hu
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Chenguang Feng
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Chenglong Zhu
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Jiangmin Zheng
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Xinxin Cui
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Jing Li
  - Key Laboratory of Surgical Critical Care and Life Support (Xi'an Jiaotong University), Ministry of Education, Xi'an, China
  - Department of Hepatobiliary Surgery and Liver Transplantation, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Deqian Fan
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Fenghua Zhang
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Yuxuan Liu
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Jinping Chen
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Institute of Zoology, Guangdong Academy of Sciences, Guangzhou, China
- Chang Liu
  - Key Laboratory of Surgical Critical Care and Life Support (Xi'an Jiaotong University), Ministry of Education, Xi'an, China
  - Department of Hepatobiliary Surgery and Liver Transplantation, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Guang Li
  - State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Xiamen University, Xiamen, China
- Qiang Qiu
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
- Kai Qu
  - Key Laboratory of Surgical Critical Care and Life Support (Xi'an Jiaotong University), Ministry of Education, Xi'an, China
  - Department of Hepatobiliary Surgery and Liver Transplantation, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Wen Wang
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
  - New Cornerstone Science Laboratory, Xi'an, China
- Kun Wang
  - Shaanxi Key Laboratory of Qinling Ecological Intelligent Monitoring and Protection, School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
  - Laboratory for Marine Biology and Biotechnology, Qingdao Marine Science and Technology Center, Qingdao, China
2
Kozłowska-Masłoń J, Ciomborowska-Basheer J, Kubiak MR, Makałowska I. Evolution of retrocopies in the context of HUSH silencing. Biol Direct 2024; 19:60. [PMID: 39095906 PMCID: PMC11295320 DOI: 10.1186/s13062-024-00507-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Accepted: 07/29/2024] [Indexed: 08/04/2024] Open
Abstract
Retrotransposition is one of the main factors responsible for gene duplication and thus genome evolution. However, the sequences that undergo this process are not only an excellent source of biological diversity, but in certain cases also pose a threat to the integrity of the DNA. One of the mechanisms that protects against the incorporation of mobile elements is the HUSH complex, which is responsible for silencing long, intronless, transcriptionally active transposed sequences that are rich in adenine on the sense strand. In this study, broad sets of human and porcine retrocopies were analysed with respect to the above factors, taking into account the evolution of these molecules. Analysis of expression pattern, genomic structure, transcript length, and nucleotide substitution frequency showed a strong relationship between the expression level and exon length as well as the protective nature of introns. The results also showed that there is no direct correlation between the expression level and adenine content. However, protein-coding retrocopies, which have a lower adenine content, have a significantly higher expression level than the adenine-rich non-coding but expressed retrocopies. Therefore, although the mechanism of HUSH silencing may be an important part of the regulation of retrocopy expression, it is one component of a more complex molecular network that remains to be elucidated.
Affiliation(s)
- Joanna Kozłowska-Masłoń
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
  - Laboratory of Cancer Genetics, Greater Poland Cancer Centre, Garbary 15, Poznań, Poland
- Joanna Ciomborowska-Basheer
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
  - Laboratory of Nature Education and Conservation, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
- Magdalena Regina Kubiak
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
- Izabela Makałowska
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznańskiego 6, Poznań, Poland
3
Yan Y, Tian Y, Wu Z, Zhang K, Yang R. Interchromosomal Colocalization with Parental Genes Is Linked to the Function and Evolution of Mammalian Retrocopies. Mol Biol Evol 2023; 40:msad265. [PMID: 38060983 PMCID: PMC10733166 DOI: 10.1093/molbev/msad265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 10/25/2023] [Accepted: 11/29/2023] [Indexed: 12/22/2023] Open
Abstract
Retrocopies are gene duplicates arising from reverse transcription of mature mRNA transcripts and their insertion back into the genome. Although long regarded as processed pseudogenes, more and more functional retrocopies have been discovered. How the stripped-down retrocopies recover expression capability and become functional paralogs continually intrigues evolutionary biologists. Here, we investigated the function and evolution of retrocopies in the context of 3D genome organization. By mapping retrocopy-parent pairs onto sequencing-based and imaging-based chromatin contact maps in human and mouse cell lines and onto Hi-C interaction maps in 5 other mammals, we found that retrocopies and their parental genes show a higher-than-expected interchromosomal colocalization frequency. The spatial interactions between retrocopies and parental genes occur frequently at loci in active subcompartments and near nuclear speckles. Accordingly, colocalized retrocopies are more actively transcribed and translated and are more evolutionarily conserved than noncolocalized ones. The active transcription of colocalized retrocopies may result from their permissive epigenetic environment and shared regulatory elements with parental genes. Population genetic analysis of retroposed gene copy number variants in human populations revealed that retrocopy insertions are not entirely random in regard to interchromosomal interactions and that colocalized retroposed gene copy number variants are more likely to reach high frequencies, suggesting that both insertion bias and natural selection contribute to the colocalization of retrocopy-parent pairs. Further dissection implies that reduced selection efficacy, rather than positive selection, contributes to the elevated allele frequency of colocalized retroposed gene copy number variants. Overall, our results hint at a role of interchromosomal colocalization in the "resurrection" of initially neutral retrocopies.
Affiliation(s)
- Yubin Yan
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Yuhan Tian
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Zefeng Wu
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Kunling Zhang
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Ruolin Yang
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
4
Batcher K, Varney S, Raudsepp T, Jevit M, Dickinson P, Jagannathan V, Leeb T, Bannasch D. Ancient segmentally duplicated LCORL retrocopies in equids. PLoS One 2023; 18:e0286861. [PMID: 37289743 PMCID: PMC10249811 DOI: 10.1371/journal.pone.0286861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 05/25/2023] [Indexed: 06/10/2023] Open
Abstract
LINE-1 is an active transposable element encoding proteins capable of inserting host gene retrocopies, resulting in retro-copy number variants (retroCNVs) between individuals. Here, we performed retroCNV discovery using 86 equids and identified 437 retrocopy insertions. Only 5 retroCNVs were shared between horses and other equids, indicating that the majority of retroCNVs inserted after the species diverged. A large number (17-35 copies) of segmentally duplicated Ligand Dependent Nuclear Receptor Corepressor Like (LCORL) retrocopies were present in all equids but absent from other extant perissodactyls. The majority of LCORL transcripts in horses and donkeys originate from the retrocopies. The initial LCORL retrotransposition occurred 18 million years ago (95% CI: 17-19), which is coincident with the increase in body size, reduction in digit number, and changes in dentition that characterized equid evolution. Evolutionary conservation of the LCORL retrocopy segmental amplification in the Equidae family, high expression levels and the ancient timeline for LCORL retrotransposition support a functional role for this structural variant.
Affiliation(s)
- Kevin Batcher
  - Department of Population Health and Reproduction, University of California Davis, Davis, CA, United States of America
- Scarlett Varney
  - Department of Population Health and Reproduction, University of California Davis, Davis, CA, United States of America
- Terje Raudsepp
  - Veterinary Integrative Biosciences, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, United States of America
- Matthew Jevit
  - Veterinary Integrative Biosciences, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, United States of America
- Peter Dickinson
  - Department of Surgical and Radiological Sciences, University of California Davis, Davis, CA, United States of America
- Vidhya Jagannathan
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Tosso Leeb
  - Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, Switzerland
- Danika Bannasch
  - Department of Population Health and Reproduction, University of California Davis, Davis, CA, United States of America
5
Batcher K, Varney S, Affolter VK, Friedenberg SG, Bannasch D. An SNN retrocopy insertion upstream of GPR22 is associated with dark red coat color in Poodles. G3 (Bethesda) 2022; 12:jkac227. [PMID: 36047852 PMCID: PMC9635648 DOI: 10.1093/g3journal/jkac227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Accepted: 08/27/2022] [Indexed: 11/21/2022]
Abstract
Pigment production and distribution is controlled through multiple genes, resulting in a wide range of coat color phenotypes in dogs. Dogs that produce only the pheomelanin pigment vary in intensity from white to deep red. The Poodle breed has a wide range of officially recognized coat colors, including the pheomelanin-based white, cream, apricot, and red coat colors, which are not fully explained by the previously identified genetic variants involved in pigment intensity. Here, a genome-wide association study for pheomelanin intensity was performed in Poodles which identified an association on canine chromosome 18. Whole-genome sequencing data revealed an SNN retrocopy insertion (SNNL1) in apricot and red Poodles within the associated region on chromosome 18. While equal numbers of melanocytes were observed in all Poodle skin hair bulbs, higher melanin content was observed in the darker Poodles. Several genes involved in melanogenesis were also identified as highly overexpressed in red Poodle skin. The most differentially expressed gene however was GPR22, which was highly expressed in red Poodle skin while unexpressed in white Poodle skin (log2 fold change in expression 6.1, P < 0.001). GPR22 is an orphan G-protein-coupled receptor normally expressed exclusively in the brain and heart. The SNNL1 retrocopy inserted 2.8 kb upstream of GPR22 and is likely disrupting regulation of the gene, resulting in atypical expression in the skin. Thus, we identify the SNNL1 insertion as a candidate variant for the CFA18 pheomelanin intensity locus in red Poodles.
Affiliation(s)
- Kevin Batcher
  - Department of Population Health and Reproduction, University of California, Davis, Davis, CA 95616, USA
- Scarlett Varney
  - Department of Population Health and Reproduction, University of California, Davis, Davis, CA 95616, USA
- Verena K Affolter
  - Department of Pathology, Microbiology, & Immunology, University of California, Davis, Davis, CA 95616, USA
- Steven G Friedenberg
  - Department of Veterinary Clinical Sciences, University of Minnesota, St Paul, MN 55455, USA
- Danika Bannasch
  - Department of Population Health and Reproduction, University of California, Davis, Davis, CA 95616, USA
6
Zhang X, Smith DR. An overview of online resources for intra-species detection of gene duplications. Front Genet 2022; 13:1012788. [PMID: 36313461 PMCID: PMC9606816 DOI: 10.3389/fgene.2022.1012788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open
Abstract
Gene duplication is an important evolutionary mechanism that can act as a new source of genetic material in genome evolution. However, detecting duplicate genes from genomic data can be challenging. Various bioinformatics resources have been developed to identify duplicate genes from single and/or multiple species. Here, we summarize the metrics used to measure sequence identity among gene duplicates within species, compare several computational approaches that have been used to predict gene duplicates, and review recent advancements of a Basic Local Alignment Search Tool (BLAST)-based web tool and database, allowing future researchers to easily identify intra-species gene duplications. This article is a quick reference guide for research tools used for detecting gene duplicates.
Affiliation(s)
- Xi Zhang
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
  - Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
- David Roy Smith
  - Department of Biology, Western University, London, ON, Canada
7
Zhang X, Hu Y, Smith DR. HSDatabase-a database of highly similar duplicate genes from plants, animals, and algae. Database (Oxford) 2022; 2022:baac086. [PMID: 36208223 PMCID: PMC9547538 DOI: 10.1093/database/baac086] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/05/2022] [Revised: 08/16/2022] [Accepted: 09/20/2022] [Indexed: 11/30/2022]
Abstract
Gene duplication is an important evolutionary mechanism capable of providing new genetic material, which in some instances can help organisms adapt to various environmental conditions. Recent studies, for example, have indicated that highly similar duplicate genes (HSDs) are aiding adaptation to extreme conditions via gene dosage. However, for most eukaryotic genomes HSDs remain uncharacterized, partly because they can be hard to identify and categorize efficiently and effectively. Here, we collected and curated HSDs in nuclear genomes from various model animals, land plants and algae and indexed them in an online, open-access sequence repository called HSDatabase. Currently, this database contains 117 864 curated HSDs from 40 distinct genomes; it includes statistics on the total number of HSDs per genome as well as individual HSD copy numbers/lengths and provides sequence alignments of the duplicate gene copies. HSDatabase also allows users to download sequences of gene copies, access genome browsers, and link out to other databases, such as Pfam and Kyoto Encyclopedia of Genes and Genomes. What is more, a built-in Basic Local Alignment Search Tool option is available to conveniently explore potential homologous sequences of interest within and across species. HSDatabase has a user-friendly interface and provides easy access to the source data. It can be used on its own for comparative analyses of gene duplicates or in conjunction with HSDFinder, a newly developed bioinformatics tool for identifying, annotating, categorizing and visualizing HSDs. Database URL: http://hsdfinder.com/database/.
Affiliation(s)
- Xi Zhang
  - Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
- Yining Hu
  - Department of Computer Science, University of Western Ontario, London, Ontario N6A 3K7, Canada
- David Roy Smith
  - Department of Biology, University of Western Ontario, London, Ontario N6A 3K7, Canada
8
Batcher K, Varney S, York D, Blacksmith M, Kidd JM, Rebhun R, Dickinson P, Bannasch D. Recent, full-length gene retrocopies are common in canids. Genome Res 2022; 32:1602-1611. [PMID: 35961775 PMCID: PMC9435743 DOI: 10.1101/gr.276828.122] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/08/2022] [Accepted: 07/19/2022] [Indexed: 02/03/2023]
Abstract
Gene retrocopies arise from the reverse transcription and insertion into the genome of processed mRNA transcripts. Although many retrocopies have acquired mutations that render them functionally inactive, most mammals retain active LINE-1 sequences capable of producing new retrocopies. New retrocopies, referred to as retro copy number variants (retroCNVs), may not be identified by standard variant calling techniques in high-throughput sequencing data. Although multiple functional FGF4 retroCNVs have been associated with skeletal dysplasias in dogs, the full landscape of canid retroCNVs has not been characterized. Here, retroCNV discovery was performed on a whole-genome sequencing data set of 293 canids from 76 breeds. We identified retroCNV parent genes via the presence of mRNA-specific 30-mers, and then identified retroCNV insertion sites through discordant read analysis. In total, we resolved insertion sites for 1911 retroCNVs from 1179 parent genes, 1236 of which appeared identical to their parent genes. Dogs had on average 54.1 total retroCNVs and 1.4 private retroCNVs. We found evidence of expression in testes for 12% (14/113) of the retroCNVs identified in six Golden Retrievers, including four chimeric transcripts, and 97 retroCNVs also had significantly elevated FST across dog breeds, possibly indicating selection. We applied our approach to a subset of human genomes and detected an average of 4.2 retroCNVs per sample, highlighting a 13-fold relative increase of retroCNV frequency in dogs. Particularly in canids, retroCNVs are a largely unexplored source of genetic variation that can contribute to genome plasticity and that should be considered when investigating traits and diseases.
Affiliation(s)
- Kevin Batcher
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
- Scarlett Varney
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
- Daniel York
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Matthew Blacksmith
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Jeffrey M Kidd
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
- Robert Rebhun
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Peter Dickinson
  - Department of Surgical and Radiological Sciences, University of California, Davis, Davis, California 95616, USA
- Danika Bannasch
  - Department of Population Health and Reproduction, University of California, Davis, Davis, California 95616, USA
9
Zhang W, Tautz D. Tracing the origin and evolutionary fate of recent gene retrocopies in natural populations of the house mouse. Mol Biol Evol 2021; 39:6481550. [PMID: 34940842 PMCID: PMC8826619 DOI: 10.1093/molbev/msab360] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/03/2022] Open
Abstract
Although the contribution of retrogenes to the evolution of genes and genomes has long been recognized, the evolutionary patterns of very recently derived retrocopies that are still polymorphic within natural populations have not been much studied so far. We use here a set of 2,025 such retrocopies in nine house mouse populations from three subspecies (Mus musculus domesticus, M. m. musculus, and M. m. castaneus) to trace their origin and evolutionary fate. We find that ancient house-keeping genes are significantly more likely to generate retrocopies than younger genes and that the propensity to generate a retrocopy depends on its level of expression in the germline. Although most retrocopies are detrimental and quickly purged, we focus here on the subset that appears to be neutral or even adaptive. We show that retrocopies from X-chromosomal parental genes have a higher likelihood to reach elevated frequencies in the populations, confirming the notion of adaptive effects for “out-of-X” retrogenes. Also, retrocopies in intergenic regions are more likely to reach higher population frequencies than those in introns of genes, implying a more detrimental effect when they land within transcribed regions. For a small subset of retrocopies, we find signatures of positive selection, indicating they were involved in a recent adaptation process. We show that the population-specific distribution pattern of retrocopies is phylogenetically informative and can be used to infer population history with a better resolution than with SNP markers.
Affiliation(s)
- Wenyu Zhang
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, Plön, D-24306, Germany
- Diethard Tautz
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, Plön, D-24306, Germany
10
Zhang X, Hu Y, Smith DR. HSDFinder: A BLAST-Based Strategy for Identifying Highly Similar Duplicated Genes in Eukaryotic Genomes. Front Bioinform 2021; 1:803176. [PMID: 36303740 PMCID: PMC9580922 DOI: 10.3389/fbinf.2021.803176] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/27/2021] [Accepted: 11/25/2021] [Indexed: 01/01/2023] Open
Abstract
Gene duplication is an important evolutionary mechanism capable of providing new genetic material for adaptive and nonadaptive evolution. However, bioinformatics tools for identifying duplicate genes are often limited to the detection of paralogs in multiple species or to specific types of gene duplicates, such as retrocopies. Here, we present a user-friendly, BLAST-based web tool, called HSDFinder, which can identify, annotate, categorize, and visualize highly similar duplicate genes (HSDs) in eukaryotic nuclear genomes. HSDFinder includes an online heatmap plotting option, allowing users to compare HSDs among different species and visualize the results in different Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway functional categories. The external software requirements are BLAST, InterProScan, and KEGG. The utility of HSDFinder was tested on various model eukaryotic species, including Chlamydomonas reinhardtii, Arabidopsis thaliana, Oryza sativa, and Zea mays as well as the psychrophilic green alga Chlamydomonas sp. UWO241, and was proven to be a practical and accurate tool for gene duplication analyses. The web tool is free to use at http://hsdfinder.com. Documentation and tutorials can be found via the GitHub: https://github.com/zx0223winner/HSDFinder.
Affiliation(s)
- Xi Zhang
  - Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
  - Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
  - *Correspondence: Xi Zhang; David Roy Smith
- Yining Hu
  - Department of Computer Science, Western University, London, ON, Canada
- David Roy Smith
  - Department of Biology, Western University, London, ON, Canada
  - *Correspondence: Xi Zhang; David Roy Smith
11
Wei Z, Sun J, Li Q, Yao T, Zeng H, Wang Y. RetroScan: An Easy-to-Use Pipeline for Retrocopy Annotation and Visualization. Front Genet 2021; 12:719204. [PMID: 34484306 PMCID: PMC8415311 DOI: 10.3389/fgene.2021.719204] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2021] [Accepted: 07/26/2021] [Indexed: 11/13/2022] Open
Abstract
Retrocopies, which are considered “junk genes,” are occasionally formed via the insertion of reverse-transcribed mRNAs at new positions in the genome. However, an increasing number of recent studies have shown that some retrocopies exhibit new biological functions and may contribute to genome evolution. Hence, the identification of retrocopies has become very meaningful for studying gene duplication and new gene generation. Current pipelines identify retrocopies through complex operations using alignment programs and filter scripts in a step-by-step manner. Therefore, there is an urgent need for a simple and convenient retrocopy annotation tool. Here, we report the development of RetroScan, a publicly available and easy-to-use tool for scanning, annotating and displaying retrocopies, consisting of two components: an analysis pipeline and a visual interface. The pipeline integrates a series of bioinformatics software programs and scripts for identifying retrocopies in just one line of command. Compared with previous methods, RetroScan increases accuracy and reduces false-positive results. We also provide a Shiny app for visualization. It displays information on retrocopies and their parental genes that can be used for the study of retrocopy structure and evolution. RetroScan is available at https://github.com/Vicky123wzy/RetroScan.
Affiliation(s)
- Zhaoyuan Wei
- State Key Laboratory of Silkworm Genome Biology, Biological Science Research Center, Southwest University, Chongqing, China
- Biological Science Research Center, Southwest University, Chongqing, China
- Jiahe Sun
- Biological Science Research Center, Southwest University, Chongqing, China
- Qinhui Li
- State Key Laboratory of Silkworm Genome Biology, Biological Science Research Center, Southwest University, Chongqing, China
- Ting Yao
- State Key Laboratory of Silkworm Genome Biology, Biological Science Research Center, Southwest University, Chongqing, China
- Haiyue Zeng
- Biological Science Research Center, Southwest University, Chongqing, China
- Yi Wang
- State Key Laboratory of Silkworm Genome Biology, Biological Science Research Center, Southwest University, Chongqing, China
- Biological Science Research Center, Southwest University, Chongqing, China
|
12
|
Rowley PA, Ellahi A, Han K, Patel JS, Van Leuven JT, Sawyer SL. Nuku, a family of primate retrocopies derived from KU70. G3 (Bethesda) 2021; 11:jkab163. [PMID: 34849803 PMCID: PMC8496227 DOI: 10.1093/g3journal/jkab163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2020] [Accepted: 04/30/2021] [Indexed: 11/16/2022]
Abstract
The gene encoding the ubiquitous DNA repair protein, Ku70p, has undergone extensive copy number expansion during primate evolution. Gene duplications of KU70 have the hallmark of long interspersed element-1 mediated retrotransposition with evidence of target-site duplications, the poly-A tails, and the absence of introns. Evolutionary analysis of this expanded family of KU70-derived "NUKU" retrocopies reveals that these genes are both ancient and also actively being created in extant primate species. NUKU retrocopies show evidence of functional divergence away from KU70, as evinced by their altered pattern of tissue expression and possible tissue-specific translation. Molecular modeling predicted that amino acid changes in Nuku2p at the interaction interface with Ku80p would prevent the assembly of the Ku heterodimer. The lack of Nuku2p-Ku80p interaction was confirmed by yeast two-hybrid assay, which contrasts the robust interaction of Ku70p-Ku80p. While several NUKU retrocopies appear to have been degraded by mutation, NUKU2 shows evidence of positive natural selection, suggesting that this retrocopy is undergoing neofunctionalization. Although Nuku proteins do not appear to antagonize retrovirus transduction in cell culture, the observed expansion and rapid evolution of NUKUs could be being driven by alternative selective pressures related to infectious disease or an undefined role in primate physiology.
Affiliation(s)
- Paul A Rowley
- Department of Biological Sciences, University of Idaho, Moscow, ID 83844, USA
- Aisha Ellahi
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78751, USA
- Kyudong Han
- Department of Microbiology, College of Science & Technology, Dankook University, Cheonan 31116, Republic of Korea
- Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Republic of Korea
- Jagdish Suresh Patel
- Center for Modeling Complex Interactions, University of Idaho, Moscow, ID 83844, USA
- James T Van Leuven
- Center for Modeling Complex Interactions, University of Idaho, Moscow, ID 83844, USA
- Sara L Sawyer
- Department of Molecular, Cellular, and Developmental Biology, University of Colorado Boulder, Boulder, CO 80302, USA
|
13
|
Garewal N, Goyal N, Pathania S, Kaur J, Singh K. Gauging the trends of pseudogenes in plants. Crit Rev Biotechnol 2021; 41:1114-1129. [PMID: 33993808 DOI: 10.1080/07388551.2021.1901648] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/21/2022]
Abstract
Pseudogenes, the debilitated parts of ancient genes, were previously scrapped off as junk or discarded genes with no functional significance. Pseudogenes have come under scrutiny for their functionality, since recent studies have unveiled their importance in the regulation of their corresponding parent genes and various biological mechanisms. Despite the enormous occurrence of pseudogenes in plants, the lack of experimental validation has contributed toward their unresolved roles in gene regulation. Contrarily, most of the studies associated with gene regulation have been mainly reported for humans, mice, and other mammalian genomes. Consequently, in order to present a cumulative report on plant-based pseudogenes research, an attempt has been made to assemble multiple studies presenting the pseudogene classification, the prediction and the determination of comparative accuracies of various computational pipelines, and recent trends in analyzing their biological functions, and regulatory mechanisms. This review represents the classical, as well as the recent advances on pseudogene identification and their potential roles in transcriptional regulation, which could possibly invigorate the quality of genome annotation, evolutionary analysis, and complexity surrounding the regulatory pathways in plants. Thus, when the ambiguous boundary girdling the pseudogenes eventually recedes on account of their explicit orchestration role, research in flora would no longer saunter compared to that on fauna.
Affiliation(s)
- Naina Garewal
- Department of Biotechnology, Panjab University, Chandigarh, India
- Neetu Goyal
- Department of Biotechnology, Panjab University, Chandigarh, India
- Jagdeep Kaur
- Department of Biotechnology, Panjab University, Chandigarh, India
- Kashmir Singh
- Department of Biotechnology, Panjab University, Chandigarh, India
|
14
|
Ciomborowska-Basheer J, Staszak K, Kubiak MR, Makałowska I. Not So Dead Genes-Retrocopies as Regulators of Their Disease-Related Progenitors and Hosts. Cells 2021; 10:912. [PMID: 33921034 PMCID: PMC8071448 DOI: 10.3390/cells10040912] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/07/2021] [Revised: 03/30/2021] [Accepted: 04/13/2021] [Indexed: 12/12/2022] [Open access]
Abstract
Retroposition is RNA-based gene duplication leading to the creation of single exon nonfunctional copies. Nevertheless, over time, many of these duplicates acquire transcriptional capabilities. In human in most cases, these so-called retrogenes do not code for proteins but function as regulatory long noncoding RNAs (lncRNAs). The mechanisms by which they can regulate other genes include microRNA sponging, modulation of alternative splicing, epigenetic regulation and competition for stabilizing factors, among others. Here, we summarize recent findings related to lncRNAs originating from retrocopies that are involved in human diseases such as cancer and neurodegenerative, mental or cardiovascular disorders. Special attention is given to retrocopies that regulate their progenitors or host genes. Presented evidence from the literature and our bioinformatics analyses demonstrates that these retrocopies, often described as unimportant pseudogenes, are significant players in the cell’s molecular machinery.
|
15
|
The mutational load in natural populations is significantly affected by high primary rates of retroposition. Proc Natl Acad Sci U S A 2021; 118:e2013043118. [PMID: 33526666 PMCID: PMC8017666 DOI: 10.1073/pnas.2013043118] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/24/2022] [Open access]
Abstract
The phenomenon of retroposition (the reintegration of reverse-transcribed RNA into the genome) has been well studied in comparisons between species and has been identified as a source of evolutionary innovation. However, less attention has been paid to possible negative effects of retroposition. To trace the evolutionary dynamics of these negative effects, our study uses a unique genomic dataset of house mouse populations. It reveals that the initial retroposition rate is very high and that most of these newly transposed retrocopies have a deleterious impact, apparently through modifying the expression of their parental genes. In humans, this effect is expected to cause disease alleles, and we propose that genetic screening should include the search for newly transposed retrocopies.
Gene retroposition is known to contribute to patterns of gene evolution and adaptations. However, possible negative effects of gene retroposition remain largely unexplored since most previous studies have focused on between-species comparisons where negatively selected copies are mostly not observed, as they are quickly lost from populations. Here, we show for natural house mouse populations that the primary rate of retroposition is orders of magnitude higher than the long-term rate. Comparisons with single-nucleotide polymorphism distribution patterns in the same populations show that most retroposition events are deleterious. Transcriptomic profiling analysis shows that new retroposed copies become easily subject to transcription and have an influence on the expression levels of their parental genes, especially when transcribed in the antisense direction. Our results imply that the impact of retroposition on the mutational load has been highly underestimated in natural populations. This has additional implications for strategies of disease allele detection in humans.
|
16
|
Jorquera R, González C, Clausen PTLC, Petersen B, Holmes DS. SinEx DB 2.0 update 2020: database for eukaryotic single-exon coding sequences. Database (Oxford) 2021; 2021:6122466. [PMID: 33507271 PMCID: PMC7904048 DOI: 10.1093/database/baab002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2020] [Revised: 12/01/2020] [Accepted: 01/05/2021] [Indexed: 11/27/2022]
Abstract
Single-exon coding sequences (CDSs), also known as ‘single-exon genes’ (SEGs), are defined as nuclear, protein-coding genes that lack introns in their CDSs. They have been studied not only to determine their origin and evolution but also because their expression has been linked to several types of human cancers and neurological/developmental disorders, and many exhibit tissue-specific transcription. We developed SinEx DB that houses DNA and protein sequence information of SEGs from 10 mammalian genomes including human. SinEx DB includes their functional predictions (KOG (euKaryotic Orthologous Groups)) and the relative distribution of these functions within species. Here, we report SinEx 2.0, a major update of SinEx DB that includes information of the occurrence, distribution and functional prediction of SEGs from 60 completely sequenced eukaryotic genomes, representing animals, fungi, protists and plants. The information is stored in a relational database built with MySQL Server 5.7, and the complete dataset of SEG sequences and their GO (Gene Ontology) functional assignations are available for downloading. SinEx DB 2.0 was built with a novel pipeline that helps disambiguate single-exon isoforms from SEGs. SinEx DB 2.0 is the largest available database for SEGs and provides a rich source of information for advancing our understanding of the evolution, function of SEGs and their associations with disorders including cancers and neurological and developmental diseases. Database URL: http://v2.sinex.cl/
Affiliation(s)
- R Jorquera
- Center for Bioinformatics and Genome Biology, Fundacion Ciencia & Vida, Zañartu 1482, Ñuñoa Santiago 7780132, Chile
- Laboratorio Medicina Traslacional, Fundación Arturo López Pérez, José Manuel Infante 805, Providencia, Santiago 7500691, Chile
- C González
- Center for Bioinformatics and Genome Biology, Fundacion Ciencia & Vida, Zañartu 1482, Ñuñoa Santiago 7780132, Chile
- Centro de Genómica y Bioinformática, Universidad Mayor, Camino la pirámide 5750, Huechuraba, Santiago 8580745, Chile
- P T L C Clausen
- Department of Global Surveillance, Technical University of Denmark, Kemitorvet building 204, 2800 Kgs. Lyngby, Denmark
- B Petersen
- Section for Evolutionary Genomics, The GLOBE Institute, University of Copenhagen, Hovedstaden, Øster Voldgade 5–7, Copenhagen 1350, Denmark
- Centre of Excellence for Omics-Driven Computational Biodiscovery (COMBio), AIMST University, Batu 3 1/2, Jalan Bukit Air Nasi, 08100 Bedong, Kedah, Malaysia
- D S Holmes
- *Corresponding author: Tel: +56 2 22398969
|
17
|
Cancer, Retrogenes, and Evolution. Life (Basel) 2021; 11:72. [PMID: 33478113 PMCID: PMC7835786 DOI: 10.3390/life11010072] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/19/2020] [Revised: 01/14/2021] [Accepted: 01/15/2021] [Indexed: 12/18/2022] [Open access]
Abstract
This review summarizes the knowledge about retrogenes in the context of cancer and evolution. The retroposition, in which the processed mRNA from parental genes undergoes reverse transcription and the resulting cDNA is integrated back into the genome, results in additional copies of existing genes. Despite the initial misconception, retroposition-derived copies can become functional, and due to their role in the molecular evolution of genomes, they have been named the “seeds of evolution”. It is convincing that retrogenes, as important elements involved in the evolution of species, also take part in the evolution of neoplastic tumors at the cell and species levels. The occurrence of specific “resistance mechanisms” to neoplastic transformation in some species has been noted. This phenomenon has been related to additional gene copies, including retrogenes. In addition, the role of retrogenes in the evolution of tumors has been described. Retrogene expression correlates with the occurrence of specific cancer subtypes, their stages, and their response to therapy. Phylogenetic insights into retrogenes show that most cancer-related retrocopies arose in the lineage of primates, and the number of identified cancer-related retrogenes demonstrates that these duplicates are quite important players in human carcinogenesis.
|
18
|
Zeng H, Chen X, Li H, Zhang J, Wei Z, Wang Y. Interpopulation differences of retroduplication variations (RDVs) in rice retrogenes and their phenotypic correlations. Comput Struct Biotechnol J 2021; 19:600-611. [PMID: 33510865 PMCID: PMC7811064 DOI: 10.1016/j.csbj.2020.12.046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2020] [Revised: 12/29/2020] [Accepted: 12/31/2020] [Indexed: 11/21/2022] [Open access]
Abstract
Retroduplication variation (RDV), a type of retrocopy polymorphism, is considered to have essential biological significance, but its effect on gene function and species phenotype is still poorly understood. To this end, we analyzed the retrocopies and RDVs in 3,010 rice genomes. We calculated the RDV frequencies in the genome of each rice population; detected the mutated, ancestral and expressed retrogenes in rice genomes; and analyzed their RDV influence on rice phenotypic traits. Collectively, 73 RDVs were identified, and 14 RDVs in ancestral retrogenes can significantly affect rice phenotypes. Our research reveals that RDV plays an important role in rice migration, domestication and evolution. We think that RDV is a good molecular breeding marker candidate. To our knowledge, this is the first study on the relationship between retrogene function, expression, RDV and species phenotype.
Affiliation(s)
- Haiyue Zeng
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Shennong Class, Southwest University, Chongqing 400715, China
- Xingyu Chen
- Shennong Class, Southwest University, Chongqing 400715, China
- Hongbo Li
- College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
- Jun Zhang
- College of Computer & Information Science, Southwest University, Chongqing 400715, China
- Zhaoyuan Wei
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
- Yi Wang
- State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China
- Biological Science Research Center, Southwest University, Chongqing 400715, China
|
19
|
Abstract
The number of complete genome sequences explodes more and more with each passing year. Thus, methods for genome annotation need to be honed constantly to handle the deluge of information. Annotation of pseudogenes (i.e., gene copies that appear not to make a functional protein) in genomes is a persistent problem; here, we overview pseudogene annotation methods that are based on the detection of sequence homology in genomic DNA.
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University, Montreal, QC, Canada.
|
20
|
Palazzo AF, Kang YM. GC-content biases in protein-coding genes act as an "mRNA identity" feature for nuclear export. Bioessays 2020; 43:e2000197. [PMID: 33165929 DOI: 10.1002/bies.202000197] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/22/2020] [Revised: 09/30/2020] [Accepted: 10/01/2020] [Indexed: 01/11/2023]
Abstract
It has long been observed that human protein-coding genes have a particular distribution of GC-content: the 5' end of these genes has high GC-content while the 3' end has low GC-content. In 2012, it was proposed that this pattern of GC-content could act as an mRNA identity feature that would lead to it being better recognized by the cellular machinery to promote its nuclear export. In contrast, junk RNA, which largely lacks this feature, would be retained in the nucleus and targeted for decay. Now two recent papers have provided evidence that GC-content does promote the nuclear export of many mRNAs in human cells.
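As a rough illustration of the 5'-high, 3'-low GC pattern described in this abstract, the following minimal Python sketch compares the GC fraction of the two ends of a sequence. The function names, the window size and the toy sequence are assumptions made for illustration only; they do not come from the cited paper and say nothing about how its authors measured GC-content.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases in seq (case-insensitive); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def end_gc_profile(seq: str, window: int) -> tuple[float, float]:
    """GC fraction of the first and the last `window` bases of seq."""
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    return gc_fraction(seq[:window]), gc_fraction(seq[-window:])


# Toy sequence invented for illustration (hypothetical, not real gene data).
toy = "GCGCGGCCGCATGCATATATTTAAATATTA"
five_prime, three_prime = end_gc_profile(toy, 10)
print(f"5' GC = {five_prime:.2f}, 3' GC = {three_prime:.2f}")
```

On real transcripts one would presumably use larger windows and average over many genes; the fixed window here is only a simplification.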
Affiliation(s)
- Alexander F Palazzo
- Department of Biochemistry, University of Toronto, Toronto, ON, M5G 1M1, Canada
- Yoon Mo Kang
- Department of Biochemistry, University of Toronto, Toronto, ON, M5G 1M1, Canada
|
21
|
Batcher K, Dickinson P, Maciejczyk K, Brzeski K, Rasouliha SH, Letko A, Drögemüller C, Leeb T, Bannasch D. Multiple FGF4 Retrocopies Recently Derived within Canids. Genes (Basel) 2020; 11:839. [PMID: 32717834 PMCID: PMC7465015 DOI: 10.3390/genes11080839] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/30/2020] [Revised: 07/21/2020] [Accepted: 07/21/2020] [Indexed: 12/17/2022] [Open access]
Abstract
Two transcribed retrocopies of the fibroblast growth factor 4 (FGF4) gene have previously been described in the domestic dog. An FGF4 retrocopy on chr18 is associated with disproportionate dwarfism, while an FGF4 retrocopy on chr12 is associated with both disproportionate dwarfism and intervertebral disc disease (IVDD). In this study, whole-genome sequencing data were queried to identify other FGF4 retrocopies that could be contributing to phenotypic diversity in canids. Additionally, dogs with surgically confirmed IVDD were assayed for novel FGF4 retrocopies. Five additional and distinct FGF4 retrocopies were identified in canids including a copy unique to red wolves (Canis rufus). The FGF4 retrocopies identified in domestic dogs were identical to domestic dog FGF4 haplotypes, which are distinct from modern wolf FGF4 haplotypes, indicating that these retrotransposition events likely occurred after domestication. The identification of multiple, full length FGF4 retrocopies with open reading frames in canids indicates that gene retrotransposition events occur much more frequently than previously thought and provide a mechanism for continued genetic and phenotypic diversity in canids.
Affiliation(s)
- Kevin Batcher
- Department of Population Health and Reproduction, University of California-Davis, Davis, CA 95616, USA
- Peter Dickinson
- Department of Surgical and Radiological Sciences, University of California-Davis, Davis, CA 95616, USA
- Kimberly Maciejczyk
- Department of Population Health and Reproduction, University of California-Davis, Davis, CA 95616, USA
- Kristin Brzeski
- College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA
- Sheida Hadji Rasouliha
- Institute of Genetics, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Anna Letko
- Institute of Genetics, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Cord Drögemüller
- Institute of Genetics, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Tosso Leeb
- Institute of Genetics, Vetsuisse Faculty, University of Bern, 3012 Bern, Switzerland
- Danika Bannasch
- Department of Population Health and Reproduction, University of California-Davis, Davis, CA 95616, USA
- Correspondence:
|
22
|
Complex Analysis of Retroposed Genes' Contribution to Human Genome, Proteome and Transcriptome. Genes (Basel) 2020; 11:542. [PMID: 32408516 PMCID: PMC7290577 DOI: 10.3390/genes11050542] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/07/2020] [Revised: 05/06/2020] [Accepted: 05/08/2020] [Indexed: 02/07/2023] [Open access]
Abstract
Gene duplication is a major driver of organismal evolution. One of the main mechanisms of gene duplications is retroposition, a process in which mRNA is first transcribed into DNA and then reintegrated into the genome. Most gene retrocopies are depleted of the regulatory regions. Nevertheless, examples of functional retrogenes are rapidly increasing. These functions come from the gain of new spatio-temporal expression patterns, imposed by the content of the genomic sequence surrounding inserted cDNA and/or by selectively advantageous mutations, which may lead to the switch from protein coding to regulatory RNA. As recent studies have shown, these genes may lead to new protein domain formation through fusion with other genes, new regulatory RNAs or other regulatory elements. We utilized existing data from high-throughput technologies to create a complex description of retrogenes functionality. Our analysis led to the identification of human retroposed genes that substantially contributed to transcriptome and proteome. These retrocopies demonstrated the potential to encode proteins or short peptides, act as cis- and trans- Natural Antisense Transcripts (NATs), regulate their progenitors’ expression by competing for the same microRNAs, and provide a sequence to lncRNA and novel exons to existing protein-coding genes. Our study also revealed that retrocopies, similarly to retrotransposons, may act as recombination hot spots. To our best knowledge this is the first complex analysis of these functions of retrocopies.
|
23
|
Mordstein C, Savisaar R, Young RS, Bazile J, Talmane L, Luft J, Liss M, Taylor MS, Hurst LD, Kudla G. Codon Usage and Splicing Jointly Influence mRNA Localization. Cell Syst 2020; 10:351-362.e8. [PMID: 32275854 PMCID: PMC7181179 DOI: 10.1016/j.cels.2020.03.001] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 02/04/2019] [Revised: 12/19/2019] [Accepted: 03/05/2020] [Indexed: 12/11/2022]
Abstract
In the human genome, most genes undergo splicing, and patterns of codon usage are splicing dependent: guanine and cytosine (GC) content is the highest within single-exon genes and within first exons of multi-exon genes. However, the effects of codon usage on gene expression are typically characterized in unspliced model genes. Here, we measured the effects of splicing on expression in a panel of synonymous reporter genes that varied in nucleotide composition. We found that high GC content increased protein yield, mRNA yield, cytoplasmic mRNA localization, and translation of unspliced reporters. Splicing did not affect the expression of GC-rich variants. However, splicing promoted the expression of AT-rich variants by increasing their steady-state protein and mRNA levels, in part through promoting cytoplasmic localization of mRNA. We propose that splicing promotes the nuclear export of AU-rich mRNAs and that codon- and splicing-dependent effects on expression are under evolutionary pressure in the human genome.
Affiliation(s)
- Christine Mordstein
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK; Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Rosina Savisaar
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK; Instituto de Medicina Molecular, João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Robert S Young
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK; Centre for Global Health Research, Usher Institute, The University of Edinburgh, Edinburgh, UK
- Jeanne Bazile
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Lana Talmane
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Juliet Luft
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Michael Liss
- Thermo Fisher Scientific, GENEART GmbH, Regensburg, Germany
- Martin S Taylor
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Laurence D Hurst
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Grzegorz Kudla
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK.
|
24
|
Aoto S, Katagiri S, Wang Y, Pagnamenta AT, Sakamoto-Abutani R, Toyoda M, Umezawa A, Okamura K. Frequent retrotransposition of endogenous genes in ERCC2-deficient cells derived from a patient with xeroderma pigmentosum. Stem Cell Res Ther 2019; 10:273. [PMID: 31455402 PMCID: PMC6712803 DOI: 10.1186/s13287-019-1381-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/24/2019] [Revised: 08/04/2019] [Accepted: 08/12/2019] [Indexed: 11/17/2022] [Open access]
Abstract
BACKGROUND: Retrotransposition of protein-coding genes is thought to occur due to the existence of numerous processed pseudogenes in both animals and plants. Unlike retrotransposons including Alu and LINE-1, direct evidence of such retrotransposition events has not been reported to date. Even if such an event occurs in a somatic cell, it is almost impossible to detect it using bulk of cells as a sample. Single-cell analyses or other techniques are needed.
METHODS: In order to examine genetic stability of stem cells, we have established induced pluripotent stem cell (iPSC) lines from several patients with DNA repair-deficiency disorders, such as ataxia telangiectasia and xeroderma pigmentosum, along with healthy controls. Performing whole-exome sequencing analyses of these parental and iPSC lines, we compiled somatic mutations accumulated by the deficiency of DNA repair mechanisms. Whereas most somatic mutations cannot be detected in bulk, cell reprogramming enabled us to observe all the somatic mutations which had occurred in the cell line. Patterns of somatic mutations should be distinctive depending on which DNA repair gene is impaired.
RESULTS: The comparison revealed that deficiency of ATM and XPA preferentially gives rise to indels and single-nucleotide substitutions, respectively. On the other hand, deficiency of ERCC2 caused not only single-nucleotide mutations but also many retrotranspositions of endogenous genes, which were readily identified by examining removal of introns in whole-exome sequencing. Although the number was limited, those events were also detected in healthy control samples.
CONCLUSIONS: The present study exploits clonality of iPSCs to unveil somatic mutation sets that are usually hidden in bulk cell analysis. Whole-exome sequencing analysis facilitated the detection of retrotransposition mutations. The results suggest that retrotranspositions of human endogenous genes are more frequent than expected in somatic cells and that ERCC2 plays a defensive role against transposition of endogenous and exogenous DNA fragments.
Affiliation(s)
- Saki Aoto
- Medical Genome Center, National Center for Child Health and Development Research Institute, Setagaya, Tokyo, Japan
- Saki Katagiri
- Department of Biology, Faculty of Science, Ochanomizu University, Bunkyo, Tokyo, Japan
- Present address: Division of Embryology, National Institute for Basic Biology, Okazaki, Aichi, Japan
- Yi Wang
- Ministry of Education Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, Shanghai, China
- Human Phenome Institute, Fudan University, Shanghai, China
- Rie Sakamoto-Abutani
- Department of Reproductive Biology, National Center for Child Health and Development Research Institute, Setagaya, Tokyo, Japan
- Masashi Toyoda
- Research team for Geriatric Medicine, Tokyo Metropolitan Institute of Gerontology, Setagaya, Tokyo, Japan
- Akihiro Umezawa
- Department of Reproductive Biology, National Center for Child Health and Development Research Institute, Setagaya, Tokyo, Japan
- Center for Regenerative Medicine, National Center for Child Health and Development Research Institute, 2-10-1 Okura, Setagaya, Tokyo, 157-8535, Japan
- Kohji Okamura
- Department of Systems BioMedicine, National Center for Child Health and Development Research Institute, Tokyo, Japan
|
25
|
Szcześniak MW, Wanowska E, Mukherjee N, Ohler U, Makałowska I. Towards a deeper annotation of human lncRNAs. Biochim Biophys Acta Gene Regul Mech 2019; 1863:194385. [PMID: 31128317 DOI: 10.1016/j.bbagrm.2019.05.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/03/2018] [Revised: 05/13/2019] [Accepted: 05/14/2019] [Indexed: 01/05/2023]
Abstract
A substantial fraction of the human transcriptome is composed of the so-called long noncoding RNAs (lncRNAs), yet the available catalogs of known lncRNAs are far from complete. Moreover, functional studies of these RNAs are challenged by several factors, such as their tissue-specific expression and functional heterogeneity, resulting in only ca. 1% of them being well characterized. Here, we describe a set of 41,400 novel lncRNAs discovered with RNA-Seq data from 1463 samples encompassing diverse tissues and cell lines. We utilized publicly available transcriptomic and genomic data to provide their characteristics, such as tissue specificity, cellular abundance, polyA status, cellular localization, evolutionary conservation and transcript stability, which allowed us to speculate on their possible biological roles. We also pinpointed 24 novel lncRNAs as candidates for breast cancer biomarkers. The results bring us closer to a comprehensive annotation of human lncRNAs, though vast amounts of further work are needed to validate the predictions and fully decipher their biology. This article is part of a Special Issue entitled: ncRNA in control of gene expression edited by Kotb Abdelmohsen.
Affiliation(s)
- Michał Wojciech Szcześniak
- Adam Mickiewicz University in Poznan, Institute of Anthropology, Laboratory of Integrative Genomics, Uniwersytetu Poznańskiego 6, 61-614 Poznan, Poland; Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Hannoversche Str. 28, 10115 Berlin, Germany
- Elżbieta Wanowska
- Adam Mickiewicz University in Poznan, Institute of Anthropology, Laboratory of Integrative Genomics, Uniwersytetu Poznańskiego 6, 61-614 Poznan, Poland
- Neelanjan Mukherjee
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Hannoversche Str. 28, 10115 Berlin, Germany; Department of Biochemistry and Molecular Genetics, RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, CO 80045, USA
- Uwe Ohler
- Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Hannoversche Str. 28, 10115 Berlin, Germany; Humboldt University, Department of Computer Science, Unter den Linden 6, 10099 Berlin, Germany
- Izabela Makałowska
- Adam Mickiewicz University in Poznan, Institute of Anthropology, Laboratory of Integrative Genomics, Uniwersytetu Poznańskiego 6, 61-614 Poznan, Poland
|
26
|
Tsai KL, Evans JM, Noorai RE, Starr-Moss AN, Clark LA. Novel Y Chromosome Retrocopies in Canids Revealed through a Genome-Wide Association Study for Sex. Genes (Basel) 2019; 10:genes10040320. [PMID: 31027231 PMCID: PMC6523286 DOI: 10.3390/genes10040320] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/25/2019] [Revised: 04/17/2019] [Accepted: 04/18/2019] [Indexed: 12/12/2022] Open
Abstract
The lack of an annotated reference sequence for the canine Y chromosome has limited evolutionary studies, as well as our understanding of the role of Y-linked sequences in phenotypes with a sex bias. In genome-wide association studies (GWASs), we observed spurious associations with autosomal SNPs when sex was unbalanced in case-control cohorts and hypothesized that a subset of SNPs mapped to autosomes are in fact sex-linked. Using the Illumina 230K CanineHD array in a GWAS for sex, we identified SNPs that amplify in both sexes but possess significant allele frequency differences between males and females. We found 48 SNPs mapping to 14 regions of eight autosomes and the X chromosome that are Y-linked, appearing heterozygous in males and monomorphic in females. Within these 14 regions are eight genes: three autosomal and five X-linked. We investigated the autosomal genes (MITF, PPP2CB, and WNK1) and determined that the SNPs are diverged nucleotides in retrocopies that have transposed to the Y chromosome. MITFY and WNK1Y are expressed and appeared recently in the Canidae lineage, whereas PPP2CBY represents a much older insertion with no evidence of expression in the dog. This work reveals novel canid Y chromosome sequences and provides evidence for gene transposition to the Y from autosomes and the X.
Affiliation(s)
- Kate L Tsai
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA
- Jacquelyn M Evans
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-2152, USA
- Rooksana E Noorai
- Clemson University Genomics and Bioinformatics Facility, Clemson University, Clemson, SC 29634, USA
- Alison N Starr-Moss
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA
- Leigh Anne Clark
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA
|
27
|
Protein-Coding Genes' Retrocopies and Their Functions. Viruses 2017; 9:v9040080. [PMID: 28406439 PMCID: PMC5408686 DOI: 10.3390/v9040080] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 02/25/2017] [Revised: 04/07/2017] [Accepted: 04/11/2017] [Indexed: 12/11/2022] Open
Abstract
Transposable elements, often considered to be not important for survival, significantly contribute to the evolution of transcriptomes, promoters, and proteomes. Reverse transcriptase, encoded by some transposable elements, can be used in trans to produce a DNA copy of any RNA molecule in the cell. The retrotransposition of protein-coding genes requires the presence of reverse transcriptase, which could be delivered by either non-long terminal repeat (non-LTR) or LTR transposons. The majority of these copies are in a state of “relaxed” selection and remain “dormant” because they are lacking regulatory regions; however, many become functional. In the course of evolution, they may undergo subfunctionalization, neofunctionalization, or replace their progenitors. Functional retrocopies (retrogenes) can encode proteins, novel or similar to those encoded by their progenitors, can be used as alternative exons or create chimeric transcripts, and can also be involved in transcriptional interference and participate in the epigenetic regulation of parental gene expression. They can also act in trans as natural antisense transcripts, microRNA (miRNA) sponges, or a source of various small RNAs. Moreover, many retrocopies of protein-coding genes are linked to human diseases, especially various types of cancer.
|