1
Zhao P, Peng C, Fang L, Wang Z, Liu GE. Taming transposable elements in livestock and poultry: a review of their roles and applications. Genet Sel Evol 2023; 55:50. [PMID: 37479995 PMCID: PMC10362595 DOI: 10.1186/s12711-023-00821-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/16/2023] [Accepted: 06/30/2023] [Indexed: 07/23/2023] Open
Abstract
Livestock and poultry play a significant role in human nutrition by converting agricultural by-products into high-quality proteins. To meet the growing demand for safe animal protein, genetic improvement of livestock must be done sustainably while minimizing negative environmental impacts. Transposable elements (TE) are important components of livestock and poultry genomes, contributing to their genetic diversity, chromatin states, gene regulatory networks, and complex traits of economic value. However, compared to other species, research on TE in livestock and poultry is still in its early stages. In this review, we analyze 72 studies published in the past 20 years, summarize the TE composition in livestock and poultry genomes, and focus on their potential roles in functional genomics. We also discuss bioinformatic tools and strategies for integrating multi-omics data with TE, and explore future directions, feasibility, and challenges of TE research in livestock and poultry. In addition, we suggest strategies to apply TE in basic biological research and animal breeding. Our goal is to provide a new perspective on the importance of TE in livestock and poultry genomes.
Affiliation(s)
- Pengju Zhao
  - Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
  - College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
- Chen Peng
  - Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
  - College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
- Lingzhao Fang
  - Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark
- Zhengguang Wang
  - Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
  - College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
- George E Liu
  - Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD, 20705, USA
2
Eskier D, Arıbaş A, Karakülah G. PlanTEnrichment: A How-to Guide on Rapid Identification of Transposable Elements Associated with Regions of Interest in Select Plant Genomes. Methods Mol Biol 2023; 2703:59-70. [PMID: 37646937 DOI: 10.1007/978-1-0716-3389-2_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
Transposable elements (TEs) are repeat elements that can relocate or create novel copies of themselves in the genome and contribute to genomic complexity and expansion, via events such as chromosome recombination or regulation of gene expression. However, given the large number of such repeats across the genome, identifying repeats of interest can be a challenge in even well-annotated genomes, especially in more complex, TE-rich plant genomes. Here, we describe a protocol for PlanTEnrichment, a database we created comprising information on 11 plant genomes to analyze stress-associated TEs using publicly available data. By selecting a genome and providing a list of genes or genomic regions whose TE associations the user wants to identify, the user can rapidly obtain TE subfamilies found near the provided regions, as well as their superfamily and class, and the enrichment values of the repeats. The results also provide the locations of individual repeat instances found, alongside the input regions or genes they are associated with, and a bar graph of the top ten most significant repeat subfamilies identified. PlanTEnrichment is freely available at http://tools.ibg.deu.edu.tr/plantenrichment/ and can be used by researchers with rudimentary or no proficiency in computational analysis of TEs, allowing for expedience in the identification of TEs of interest and helping further our understanding of the potential contributions of TEs in plant genomes.
Affiliation(s)
- Doğa Eskier
  - İzmir International Biomedicine and Genome Institute, Dokuz Eylül University, İnciraltı, İzmir, Turkey
  - Bioinformatics Platform, İzmir Biomedicine and Genome Center (IBG), İnciraltı, İzmir, Turkey
- Alirıza Arıbaş
  - Bioinformatics Platform, İzmir Biomedicine and Genome Center (IBG), İnciraltı, İzmir, Turkey
- Gökhan Karakülah
  - İzmir International Biomedicine and Genome Institute, Dokuz Eylül University, İnciraltı, İzmir, Turkey
  - Bioinformatics Platform, İzmir Biomedicine and Genome Center (IBG), İnciraltı, İzmir, Turkey
3
Valdebenito-Maturana B, Rojas-Tapia MI, Carrasco M, Tapia JC. Dysregulated Expression of Transposable Elements in TDP-43 M337V Human Motor Neurons That Recapitulate Amyotrophic Lateral Sclerosis In Vitro. Int J Mol Sci 2022; 23:16222. [PMID: 36555863 PMCID: PMC9784876 DOI: 10.3390/ijms232416222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 12/01/2022] [Accepted: 12/02/2022] [Indexed: 12/24/2022] Open
Abstract
Amyotrophic lateral sclerosis (ALS) is a disease that progressively annihilates spinal cord motor neurons, causing severe motor decline and death. The disease is divided into familial and sporadic ALS. Mutations in the TAR DNA binding protein 43 (TDP-43) have been implicated in the pathological emergence and progression of ALS, although the molecular mechanisms eliciting the disease are unknown. Transposable elements (TEs), DNA sequences capable of transposing within the genome, become dysregulated and transcribed in the presence of TDP-43 mutations. We performed RNA-Seq in human motor neurons (iMNs) derived from induced pluripotent stem cells (iPSCs) of TDP-43 wild-type (iMNs-TDP-43WT) and mutant (iMNs-TDP-43M337V) genotypes at 7 and 14 days in vitro (DIV), and, with state-of-the-art bioinformatic tools, analyzed whether TDP-43M337V alters both gene expression and TE activity. Our results show that TDP-43M337V induced global changes in gene expression and TE levels at all in vitro stages studied. Interestingly, many genetic pathways overlapped with those of TE activity, suggesting that TEs control the expression of several genes. TEs correlated with genes that play key roles in the extracellular matrix and RNA processing, all of them regulatory pathways affected in ALS. Thus, the loss of TE regulation is present in TDP-43 mutations and is a critical determinant of the disease in human motor neurons. Overall, our results support the evidence that indicates TEs are critical regulatory sequences contributing to ALS neurodegeneration.
Affiliation(s)
- Braulio Valdebenito-Maturana
  - Instituto de Investigación Interdisciplinaria, Vicerrectoría Académica, Universidad de Talca, Campus Talca, Talca 3460000, Chile
- Mónica Carrasco
  - Escuela de Medicina, Universidad de Talca, Campus Talca, Talca 3460000, Chile
  - Correspondence: (M.C.); (J.C.T.)
- Juan Carlos Tapia
  - Escuela de Medicina, Universidad de Talca, Campus Talca, Talca 3460000, Chile
  - Correspondence: (M.C.); (J.C.T.)
4
Rodríguez-Quiroz R, Valdebenito-Maturana B. SoloTE for improved analysis of transposable elements in single-cell RNA-Seq data using locus-specific expression. Commun Biol 2022; 5:1063. [PMID: 36202992 DOI: 10.1038/s42003-022-04020-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/30/2022] [Accepted: 09/21/2022] [Indexed: 11/08/2022] Open
Abstract
Transposable Elements (TEs) contribute to the repetitive fraction in almost every eukaryotic genome known to date, and their transcriptional activation can influence the expression of neighboring genes in healthy and disease states. Single cell RNA-Seq (scRNA-Seq) is a technical advance that allows the study of gene expression on a cell-by-cell basis. Although a computational approach is currently available for the single cell analysis of TE expression, it omits their genomic location. Here we present SoloTE, a pipeline that outperforms the previous approach in terms of computational resources and by allowing the inclusion of locus-specific TE activity in scRNA-Seq expression matrices. We then apply SoloTE to several datasets to reveal the repertoire of TEs that become transcriptionally active in different cell groups, and based on their genomic location, we predict their potential impact on gene expression. As our tool takes as input the resulting files from standard scRNA-Seq processing pipelines, we expect it to be widely adopted in single cell studies to help researchers discover patterns of cellular diversity associated with TE expression.
5
Powell AF, Feder A, Li J, Schmidt MHW, Courtney L, Alseekh S, Jobson EM, Vogel A, Xu Y, Lyon D, Dumschott K, McHale M, Sulpice R, Bao K, Lal R, Duhan A, Hallab A, Denton AK, Bolger ME, Fernie AR, Hind SR, Mueller LA, Martin GB, Fei Z, Martin C, Giovannoni JJ, Strickler SR, Usadel B. A Solanum lycopersicoides reference genome facilitates insights into tomato specialized metabolism and immunity. Plant J 2022; 110:1791-1810. [PMID: 35411592 DOI: 10.1111/tpj.15770] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/11/2021] [Revised: 03/10/2022] [Accepted: 03/27/2022] [Indexed: 06/14/2023]
Abstract
Wild relatives of tomato are a valuable source of natural variation in tomato breeding, as many can be hybridized to the cultivated species (Solanum lycopersicum). Several, including Solanum lycopersicoides, have been crossed to S. lycopersicum for the development of ordered introgression lines (ILs), facilitating breeding for desirable traits. Despite the utility of these wild relatives and their associated ILs, few finished genome sequences have been produced to aid genetic and genomic studies. Here we report a chromosome-scale genome assembly for S. lycopersicoides LA2951, which contains 37 938 predicted protein-coding genes. With the aid of this genome assembly, we have precisely delimited the boundaries of the S. lycopersicoides introgressions in a set of S. lycopersicum cv. VF36 × LA2951 ILs. We demonstrate the usefulness of the LA2951 genome by identifying several quantitative trait loci for phenolics and carotenoids, including underlying candidate genes, and by investigating the genome organization and immunity-associated function of the clustered Pto gene family. In addition, syntenic analysis of R2R3MYB genes sheds light on the identity of the Aubergine locus underlying anthocyanin production. The genome sequence and IL map provide valuable resources for studying fruit nutrient/quality traits, pathogen resistance, and environmental stress tolerance. We present a new genome resource for the wild species S. lycopersicoides, which we use to shed light on the Aubergine locus responsible for anthocyanin production. We also provide IL boundary mappings, which facilitated identifying novel carotenoid quantitative trait loci of which one was likely driven by an uncharacterized lycopene β-cyclase whose function we demonstrate.
Affiliation(s)
- Ari Feder
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
- Jie Li
  - Department of Biochemistry and Metabolism, The John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK
- Maximilian H-W Schmidt
  - Institute for Biology I, BioSC, RWTH Aachen University, 52474, Aachen, Germany
  - IBG-4 Bioinformatics, Forschungszentrum Jülich, 52428, Jülich, Germany
- Lance Courtney
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
  - Plant Biology Section, School of Integrative Plant Sciences, Cornell University, Ithaca, NY, 14853, USA
- Saleh Alseekh
  - Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany
  - Center of Plant Systems Biology and Biotechnology, 4000, Plovdiv, Bulgaria
- Emma M Jobson
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
- Alexander Vogel
  - Institute for Biology I, BioSC, RWTH Aachen University, 52474, Aachen, Germany
- Yimin Xu
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
- David Lyon
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab, Berkeley, CA, 94720, USA
- Kathryn Dumschott
  - IBG-4 Bioinformatics, Forschungszentrum Jülich, 52428, Jülich, Germany
- Marcus McHale
  - Plant Systems Biology Lab, Ryan Institute, National University of Ireland, H91 TK33, Galway, Ireland
- Ronan Sulpice
  - Plant Systems Biology Lab, Ryan Institute, National University of Ireland, H91 TK33, Galway, Ireland
- Kan Bao
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
- Rohit Lal
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
- Asha Duhan
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
- Asis Hallab
  - IBG-4 Bioinformatics, Forschungszentrum Jülich, 52428, Jülich, Germany
- Alisandra K Denton
  - Institute for Biology I, BioSC, RWTH Aachen University, 52474, Aachen, Germany
- Marie E Bolger
  - IBG-4 Bioinformatics, Forschungszentrum Jülich, 52428, Jülich, Germany
- Alisdair R Fernie
  - Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476, Potsdam-Golm, Germany
  - Center of Plant Systems Biology and Biotechnology, 4000, Plovdiv, Bulgaria
- Sarah R Hind
  - Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Gregory B Martin
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
  - Plant Pathology and Plant-Microbe Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, 14853, USA
- Zhangjun Fei
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
  - US Department of Agriculture-Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY, 14853, USA
- Cathie Martin
  - Department of Biochemistry and Metabolism, The John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK
- James J Giovannoni
  - Boyce Thompson Institute, Ithaca, New York, 14853, USA
  - US Department of Agriculture-Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY, 14853, USA
- Björn Usadel
  - Institute for Biology I, BioSC, RWTH Aachen University, 52474, Aachen, Germany
  - IBG-4 Bioinformatics, Forschungszentrum Jülich, 52428, Jülich, Germany
6
Yandım C, Karakülah G. Repeat expression is linked to patient survival and exhibits single nucleotide variation in pancreatic cancer revealing LTR70:r.879A>G. Gene X 2022; 822:146344. [PMID: 35183687 DOI: 10.1016/j.gene.2022.146344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Revised: 02/03/2022] [Accepted: 02/14/2022] [Indexed: 11/04/2022] Open
Abstract
Despite an overwhelming body of cancer literature reporting the links between patient survival and the expression levels of genes or the mutations/single nucleotide variations (SNVs) in them, there is only limited information on repeat elements, which make up at least half of the human genome. Here, we analysed RNA-seq data obtained from primary pancreatic cancer tissues of 51 patients and revealed that two transposons, HERVI-int and X6A_LINE, showed an upregulation trend in the patients with shorter survival, along with 56 other potential repeats which were linked to survival. We also detected expressed SNVs on repeats, among which LTR70:r.879A>G stands out for the effect of its presence on this particular repeat's expression levels and for a significant link to overall patient survival. Interestingly, the expression of LTR70:r.879A>G correlated with different cancer genes in comparison to its reference version, highlighting the involvement of BRAF and Fumarate Hydratase with this expressed SNV. This is one of the first studies revealing possible links between repeat expression and survival in cancer, and it warrants further research in this avenue.
Affiliation(s)
- Cihangir Yandım
  - İzmir University of Economics, Faculty of Engineering, Department of Genetics and Bioengineering, 35330 Balçova, İzmir, Turkey
  - İzmir Biomedicine and Genome Center (IBG), Dokuz Eylül University Health Campus, 35340 İnciraltı, İzmir, Turkey
- Gökhan Karakülah
  - İzmir Biomedicine and Genome Center (IBG), Dokuz Eylül University Health Campus, 35340 İnciraltı, İzmir, Turkey
  - İzmir International Biomedicine and Genome Institute, Dokuz Eylül University, 35340 İnciraltı, İzmir, Turkey
7
Lerat E. Recent Bioinformatic Progress to Identify Epigenetic Changes Associated to Transposable Elements. Front Genet 2022; 13:891194. [PMID: 35646069 PMCID: PMC9140218 DOI: 10.3389/fgene.2022.891194] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/07/2022] [Accepted: 04/25/2022] [Indexed: 11/13/2022] Open
Abstract
Transposable elements (TEs) are recognized for their great impact on the functioning and evolution of their host genomes. They are associated with various deleterious effects, which has led to the evolution of regulatory epigenetic mechanisms to control their activity. Despite these negative effects, TEs are also important actors in the evolution of genomes by promoting genetic diversity and new regulatory elements. Consequently, it is important to study the epigenetic modifications associated with TEs, especially at a locus-specific level, to determine their individual influence on gene functioning. To this end, this short review presents the current bioinformatic tools for this task.
8
Valdebenito-Maturana B, Torres F, Carrasco M, Tapia JC. Differential regulation of transposable elements (TEs) during the murine submandibular gland development. Mob DNA 2021; 12:23. [PMID: 34686213 PMCID: PMC8540199 DOI: 10.1186/s13100-021-00251-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/23/2021] [Accepted: 10/05/2021] [Indexed: 11/29/2022] Open
Abstract
The submandibular gland (SG) is a relatively simple organ formed by three cell types: acinar, myoepithelial, and an intricate network of duct-forming epithelial cells, which together fulfill several physiological functions, from assisting food digestion to acting as an immune barrier against pathogens. Successful SG organogenesis is the product of highly controlled and orchestrated genetic and transcriptional programs. Mounting evidence links Transposable Elements (TEs), originally thought to be selfish genetic elements, to different aspects of gene regulation in mammalian development and disease. To our knowledge, the role of TEs during murine SG organogenesis has not been studied. Using novel bioinformatic tools and publicly available RNA-Seq datasets, we show that a significant number of genic and intergenic TEs are differentially expressed during SG development. Furthermore, changes in expression of specific TEs correlated with those of genes involved in cellular division and differentiation, critical aspects of SG maturation. Altogether, we propose that TEs modulate gene networks that operate during SG development.
Affiliation(s)
- Francisca Torres
  - Stem Cells and Neuroscience Center, School of Medicine, University of Talca, Campus Talca, Talca, Chile
- Mónica Carrasco
  - Stem Cells and Neuroscience Center, School of Medicine, University of Talca, Campus Talca, Talca, Chile
- Juan Carlos Tapia
  - Stem Cells and Neuroscience Center, School of Medicine, University of Talca, Campus Talca, Talca, Chile
9
Wang Y, Zhao B, Choi J, Lee EA. Genomic approaches to trace the history of human brain evolution with an emerging opportunity for transposon profiling of ancient humans. Mob DNA 2021; 12:22. [PMID: 34663455 PMCID: PMC8525043 DOI: 10.1186/s13100-021-00250-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/21/2021] [Accepted: 09/27/2021] [Indexed: 12/17/2022] Open
Abstract
Transposable elements (TEs) significantly contribute to shaping the diversity of the human genome, and several lines of evidence suggest that TEs are one of the driving forces of human brain evolution. Existing computational approaches, including cross-species comparative genomics and population genetic modeling, can be adapted for the study of the role of TEs in evolution. In particular, diverse ancient and archaic human genome sequences are increasingly available, allowing reconstruction of past human migration events and holding the promise of identifying and tracking TEs among other evolutionarily important genetic variants at an unprecedented spatiotemporal resolution. However, highly degraded short DNA templates and other unique challenges presented by ancient human DNA call for major changes in current experimental and computational procedures to enable the identification of evolutionarily important TEs. Ancient human genomes are valuable resources for investigating TEs in the evolutionary context, and efforts to explore ancient human genomes will potentially provide a novel perspective on the genetic mechanism of human brain evolution and inspire a variety of technological and methodological advances. In this review, we summarize computational and experimental approaches that can be adapted to identify and validate evolutionarily important TEs, especially for human brain evolution. We also highlight strategies that leverage ancient genomic data and discuss unique challenges in ancient transposon genomics.
Affiliation(s)
- Yilan Wang
  - Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
- Boxun Zhao
  - Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
- Jaejoon Choi
  - Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
- Eunjung Alice Lee
  - Division of Genetics and Genomics, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
  - The Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, MA, USA
10
Valdebenito-Maturana B, Arancibia E, Riadi G, Tapia JC, Carrasco M. Locus-specific analysis of Transposable Elements during the progression of ALS in the SOD1G93A mouse model. PLoS One 2021; 16:e0258291. [PMID: 34614020 PMCID: PMC8494334 DOI: 10.1371/journal.pone.0258291] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/28/2021] [Accepted: 09/24/2021] [Indexed: 11/19/2022] Open
Abstract
Transposable Elements (TEs) are ubiquitous genetic elements with the ability to move within a genome. TEs contribute a large fraction of the repetitive elements of a genome, and because of their repetitive nature, they are not routinely analyzed in RNA-Seq gene expression studies. Amyotrophic Lateral Sclerosis (ALS) is a lethal neurodegenerative disease, and a well-accepted model for its study is the mouse harboring the human SOD1G93A mutant. In this model, landmark stages of the disease can be recapitulated at specific time points, making it possible to understand changes in gene expression across time. While several works report TE activity in ALS models, they have not explored this activity through the disease progression. Moreover, they have done so at the expense of losing the locus of expression. Depending on their genomic location, TEs can regulate genes in cis and in trans, which makes locus-specific analysis of TEs important for understanding their role in modulating gene expression. In particular, the locus-specific role of TEs in ALS has not been fully elucidated. In this work, we analyzed publicly available RNA-Seq datasets of the SOD1G93A mouse model to understand the locus-specific role of TEs. We show that TEs become up-regulated at the early stages of the disease, and via statistical associations, we speculate that they can regulate several genes, which in turn might be contributing to the genetic dysfunction observed in ALS.
Affiliation(s)
- Esteban Arancibia
  - Centre for Bioinformatics, Simulation and Modelling, CBSM, Department of Bioinformatics, Faculty of Engineering, University of Talca, Talca, Chile
- Gonzalo Riadi
  - ANID – Millennium Science Initiative Program Millennium Nucleus of Ion Channels-Associated Diseases (MiNICAD), Centre for Bioinformatics, Simulation and Modelling, CBSM, Department of Bioinformatics, Faculty of Engineering, University of Talca, Talca, Chile
- Juan Carlos Tapia
  - School of Medicine, Universidad de Talca, Talca, Chile
  - * E-mail: (JCT); (MC)
- Mónica Carrasco
  - School of Medicine, Universidad de Talca, Talca, Chile
  - * E-mail: (JCT); (MC)
11
Stockwell PA, Lynch-Sutherland CF, Chatterjee A, Macaulay EC, Eccles MR. RepExpress: A Novel Pipeline for the Quantification and Characterization of Transposable Element Expression from RNA-seq Data. Curr Protoc 2021; 1:e206. [PMID: 34387946 DOI: 10.1002/cpz1.206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
Transposable elements (TEs) are key regulators of both development and disease; however, their repetitive nature presents substantial computational challenges to their analysis. Due to a lack of computational tools and suitable analysis frameworks, TE expression is often not quantified at the locus level. Therefore, we have developed RepExpress, a novel pipeline that enables locus-level TE quantification and characterization. RepExpress enables the characterization of TE expression in a genomic context, and is the first tool focusing on the identification of tissue-specific TE-derived and TE-regulated genes. RepExpress identifies expressed TEs overlapping with annotated genomic features and enables tissue-specific profiles of TE-derived genes. TEs that are expressed with no overlap with any known genomic features are characterized by the closest downstream genomic feature enabling identification of novel TE-gene regulatory relationships. RepExpress takes standard RNA-seq data as input and performs genomic alignment optimized for TEs. Our novel pipeline quantifies expression of both TEs and genes using featureCounts and Stringtie, respectively. RepExpress then filters expressed repeats and characterizes their genomic context, enabling the identification of TEs that overlap with genes, or that may be influencing gene expression. Here, we describe RepExpress, and provide a step-by-step protocol detailing its workflow. We also discuss other TE analysis tools and their applicability to addressing different biological questions. © 2021 Wiley Periodicals LLC. Basic Protocol: RepExpress workflow.
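The genomic-context step described in this abstract (report annotated features that an expressed TE overlaps; otherwise report the closest downstream feature) can be illustrated with a minimal Python sketch. This is not RepExpress code: the `Interval` type, the function names, the half-open coordinate convention, and the strand-aware reading of "downstream" are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    """Hypothetical record for a TE locus or an annotated gene."""
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # exclusive
    strand: str  # "+" or "-"
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def closest_downstream(te: Interval, genes: List[Interval]) -> Optional[Interval]:
    """Nearest gene lying entirely downstream of the TE, relative to the TE's strand."""
    best, best_dist = None, None
    for gene in genes:
        if gene.chrom != te.chrom:
            continue
        # Downstream means higher coordinates on "+", lower coordinates on "-".
        dist = gene.start - te.end if te.strand == "+" else te.start - gene.end
        if dist < 0:
            continue  # gene is upstream of (or overlapping) the TE
        if best_dist is None or dist < best_dist:
            best, best_dist = gene, dist
    return best


def annotate(te: Interval, genes: List[Interval]) -> Tuple[str, List[str]]:
    """Label a TE by the genes it overlaps, else by its closest downstream gene."""
    hits = [g.name for g in genes if overlaps(te, g)]
    if hits:
        return ("overlap", hits)
    nearest = closest_downstream(te, genes)
    return ("downstream", [nearest.name] if nearest else [])


# Toy annotation (made-up coordinates and names).
GENES = [
    Interval("chr1", 100, 200, "+", "geneA"),
    Interval("chr1", 500, 600, "+", "geneB"),
]
```

For example, `annotate(Interval("chr1", 250, 300, "+", "te1"), GENES)` assigns the intergenic TE to `geneB`, while the same coordinates on the minus strand would be assigned to `geneA`. A linear scan is used here for clarity; a real annotation of genome-scale repeat tables would more likely rely on sorted intervals or an interval tree.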
Affiliation(s)
- Peter A Stockwell
  - Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
- Aniruddha Chatterjee
  - Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
  - Maurice Wilkins Centre for Molecular Biodiscovery, Auckland, New Zealand
- Erin C Macaulay
  - Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
- Michael R Eccles
  - Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
  - Maurice Wilkins Centre for Molecular Biodiscovery, Auckland, New Zealand
12
Rostami MR, Bradic M. The derepression of transposable elements in lung cells is associated with the inflammatory response and gene activation in idiopathic pulmonary fibrosis. Mob DNA 2021; 12:14. [PMID: 34108012 PMCID: PMC8191028 DOI: 10.1186/s13100-021-00241-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/18/2020] [Accepted: 04/26/2021] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND: Transposable elements (TEs) are repetitive sequences of viral origin that compose almost half of the human genome. These elements are tightly controlled within cells, and if activated, they can cause changes in both gene regulation and immune viral responses that have been associated with several chronic inflammatory diseases in humans. As oxidants are potent activators of TEs, and because oxidative injury is a major risk factor in relation to idiopathic pulmonary fibrosis (IPF), we hypothesized that TEs might be involved in the regulation of gene expression and so contribute to inflammation in cases of IPF. IPF is a fatal lung disease that involves the gradual replacement of the alveolar tissue with fibrotic scars as well as the accumulation of inflammatory cells in the lower respiratory tract. Although IPF is known to occur as a result of the complex interaction between age, environmental risk factors (i.e., oxidative stress) and genetics, the relative contributions of these factors to the disease remain unclear. To determine whether TEs are associated with IPF, we compared the transcriptional profiles of the genes and TEs of lung cells obtained from both healthy donors and IPF patients.
RESULTS: We quantified TE and gene expression levels using a published bulk RNA-seq dataset containing 24 subjects (16 donors and eight IPF patients), including three lung-cell types per subject, as well as an scRNA-seq dataset concerning 16 subjects (eight donors and eight IPF patients). We found evidence of TE dysregulation in the alveolar type II lung cells and alveolar macrophages of the IPF patients. In addition, the activation of the LINE1 family of elements in IPF is associated with the increased expression of TE cellular regulators (MOV10, IFI16, SAMHD1, and APOBEC3G), interferon-stimulated genes (ISG15, IFI6, IFI27, IFI44, and OAS1), chemokines (CX3CL1 and CXCL9), and interleukins (IL15RA). We also propose that TE derepression might be involved in the regulation of previously reported IPF candidate genes (MUC5B, CHL1, SPP1, and MMP7).
CONCLUSION: Based on our findings, we propose that TE derepression plays an important role in the regulation of gene expression and can also prompt both the recruitment of inflammatory processes and the disruption of the immunological balance, which can lead to chronic inflammation in IPF.
Affiliation(s)
- Mahboubeh R Rostami
- Department of Genetic Medicine, Weill Cornell Medical College, 1300 York Avenue, Box 164, New York, NY, 10065, USA
| | - Martina Bradic
- Department of Genetic Medicine, Weill Cornell Medical College, 1300 York Avenue, Box 164, New York, NY, 10065, USA
- Marie-Josee and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| |
|
13
|
Karakülah G, Yandim C. Signature changes in the expressions of protein-coding genes, lncRNAs, and repeat elements in early and late cellular senescence. Turk J Biol 2021; 44:356-370. [PMID: 33402863 PMCID: PMC7759191 DOI: 10.3906/biy-2005-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/04/2020] [Accepted: 08/24/2020] [Indexed: 12/13/2022]
Abstract
Replicative cellular senescence is the main cause of aging. Early senescence is linked to tissue regeneration, whereas late senescence is known to trigger a chronically inflammatory phenotype. Despite various genome-wide studies, little information is available for distinguishing early and late senescent phenotypes at the transcriptome level. In particular, the changes in the noncoding RNA portion of the aging cell have not been fully elucidated. Using RNA sequencing data from fibroblasts, we report not only changes in gene expression profiles and relevant biological processes in the early and late senescent phenotypes but also significant differences in the expression of many unravelled long noncoding RNAs (lncRNAs) and transcripts arising from repetitive DNA. Our results indicate that, in addition to previously reported L1 elements, various LTR and DNA transposons, as well as members of the classical satellites including HSAT5 and α-satellites (ALR/Alpha), are expressed at higher levels in late senescence. Moreover, we revealed finer links between the expression levels of repeats and those of nearby genes known to be involved in the cell cycle and senescence. The noncoding elements reported here provide a new perspective to be explored in further experimental studies.
Affiliation(s)
- Gökhan Karakülah
- İzmir Biomedicine and Genome Center, İzmir, Turkey; İzmir International Biomedicine and Genome Institute, Dokuz Eylül University, İzmir, Turkey
| | - Cihangir Yandim
- İzmir Biomedicine and Genome Center, İzmir, Turkey; Department of Genetics and Bioengineering, Faculty of Engineering, İzmir University of Economics, İzmir, Turkey
| |
|
14
|
Hao M, Liu W, Ding C, Peng X, Zhang Y, Chen H, Dong L, Liu X, Zhao Y, Chen X, Khatoon S, Zheng Y. Identification of hub genes and small molecule therapeutic drugs related to breast cancer with comprehensive bioinformatics analysis. PeerJ 2020; 8:e9946. [PMID: 33083112 PMCID: PMC7556247 DOI: 10.7717/peerj.9946] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/11/2020] [Accepted: 08/25/2020] [Indexed: 12/21/2022] Open
Abstract
Breast cancer is one of the most common malignant tumors among women worldwide and has high morbidity and mortality. This research aimed to identify hub genes and small molecule drugs for breast cancer by integrated bioinformatics analysis. After downloading multiple gene expression datasets from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) database, we obtained 283 overlapping differentially expressed genes (DEGs), significantly enriched in different cancer-related functions and pathways, using the limma, VennDiagram, and clusterProfiler R packages. We then analyzed the topology of the protein–protein interaction (PPI) network of the overlapping DEGs and obtained six hub genes (RRM2, CDC20, CCNB2, BUB1B, CDK1, and CCNA2) from the network via STRING and Cytoscape. Subsequently, we conducted gene expression verification, evaluation of genetic alterations, immune infiltration prediction, analysis of clinicopathological parameters, identification of transcriptional and post-transcriptional regulatory molecules, and survival analysis for these hub genes. Meanwhile, 29 possible drug candidates (e.g., Cladribine, Gallium nitrate, Alvocidib, 1β-hydroxyalantolactone, Berberine hydrochloride, Nitidine chloride) were identified from the DGIdb database and the GSE85871 dataset. In addition, some transcription factors and miRNAs (e.g., E2F1, PTTG1, TP53, ZBTB16, hsa-miR-130a-3p, hsa-miR-204-5p) targeting the hub genes were identified as key regulators in the progression of breast cancer. In conclusion, our study identified six hub genes and 29 potential drug candidates for breast cancer. These findings may advance understanding of the diagnosis, prognosis and treatment of breast cancer.
Affiliation(s)
- Mingqian Hao
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Wencong Liu
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Chuanbo Ding
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Xiaojuan Peng
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Yue Zhang
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Huiying Chen
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Ling Dong
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Xinglong Liu
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Yingchun Zhao
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Xueyan Chen
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Sadia Khatoon
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| | - Yinan Zheng
- School of Chinese Medicinal Materials, Jilin Agricultural University, Changchun, Jilin, China
| |
|
15
|
Xu LL, Qi M, Ye FY. Identifying Scientific and Technical “Unicorns”. Journal of Data and Information Science 2021; 6:96-115. [DOI: 10.2478/jdis-2021-0002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] Open
Abstract
Purpose
Using the metaphor of “unicorn,” we identify the scientific papers and technical patents characterized by the informetric feature of very high citations in the first ten years after publishing, which may provide a new pattern to understand very high impact works in science and technology.
Design/methodology/approach
We define CT as the total citations a paper or patent receives in the first ten years after publication. Setting CT ≥ 5,000 for a scientific “unicorn” and CT ≥ 500 for a technical “unicorn” gives an absolute standard for identifying scientific and technical “unicorn” publications.
Findings
We identify 165 scientific “unicorns” in 14,301,875 WoS papers and 224 technical “unicorns” in 13,728,950 DII patents during 2001–2012. About 50% of “unicorns” belong to biomedicine, and selected cases from this field are discussed individually. The number of these rare “unicorns” increases following a linear model; the fitted data show 95% confidence, with an RMSE of 0.2127 for scientific “unicorns” and 0.0923 for technical “unicorns”.
Research limitations
A “unicorn” is a purely quantitative designation that does not consider quality, and “potential unicorns”, with CT < 5,000 for papers and CT < 500 for patents, are left for future studies.
Practical implications
Scientific and technical “unicorns” provide a new pattern to understand high-impact works in science and technology.
Originality/value
The “unicorn” pattern supplies a concise approach to identify very high impact scientific papers and technical patents.
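The thresholding rule stated under Design/methodology/approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Publication` record, its field names, and the per-year citation tuples are hypothetical and are not taken from the paper; only the two thresholds (5,000 for papers, 500 for patents) and the ten-year window come from the abstract.

```python
from dataclasses import dataclass
from typing import Tuple

# Thresholds on CT (total citations in the first ten years), as stated in the abstract.
THRESHOLDS = {"paper": 5000, "patent": 500}


@dataclass(frozen=True)
class Publication:
    # Hypothetical record type; the field names are illustrative assumptions.
    title: str
    kind: str  # "paper" or "patent"
    yearly_citations: Tuple[int, ...]  # citations per year, starting at publication


def ten_year_citations(pub: Publication) -> int:
    """Return CT: citations summed over the first ten years only."""
    return sum(pub.yearly_citations[:10])


def is_unicorn(pub: Publication) -> bool:
    """Apply the absolute CT threshold that matches the publication kind."""
    if pub.kind not in THRESHOLDS:
        raise ValueError(f"unknown publication kind: {pub.kind!r}")
    return ten_year_citations(pub) >= THRESHOLDS[pub.kind]


# Made-up example records, not data from the study.
examples = [
    Publication("paper A", "paper", (600,) * 10),
    Publication("paper B", "paper", (400,) * 10 + (5000,)),  # late citations fall outside the window
    Publication("patent C", "patent", (50,) * 10),
]
unicorn_titles = [p.title for p in examples if is_unicorn(p)]
```

Because the window is fixed at ten years, citations received later (as for the hypothetical "paper B") do not count toward CT, which is what makes the standard absolute rather than dependent on when the count is taken.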
|
16
|
Zhai X, Yang Z, Liu X, Dong Z, Zhou D. Identification of NUF2 and FAM83D as potential biomarkers in triple-negative breast cancer. PeerJ 2020; 8:e9975. [PMID: 33005492 PMCID: PMC7513746 DOI: 10.7717/peerj.9975] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/12/2020] [Accepted: 08/26/2020] [Indexed: 12/12/2022] Open
Abstract
Background Breast cancer is a heterogeneous disease. Compared with other subtypes of breast cancer, triple-negative breast cancer (TNBC) metastasizes readily, has a short survival time, and offers fewer treatment options. Here, we aimed to identify potential biomarkers for TNBC diagnosis and prognosis. Material/Methods Three independent data sets (GSE45827, GSE38959, GSE65194) were downloaded from the Gene Expression Omnibus (GEO). R software packages were used to integrate the gene profiles and identify differentially expressed genes (DEGs). A variety of bioinformatics tools were used to explore the hub genes, including the DAVID database, the STRING database and Cytoscape software. Reverse transcription quantitative PCR (RT-qPCR) was used to verify the hub genes in 14 pairs of TNBC tissues. Results In this study, we screened out 161 DEGs between 222 non-TNBC and 126 TNBC samples, of which 105 genes were up-regulated and 56 were down-regulated. These DEGs were enriched for 27 GO terms and two pathways. The enriched GO terms were mainly “cell division”, “chromosome, centromeric region” and “microtubule motor activity”. The enriched KEGG pathways were “Cell cycle” and “Oocyte meiosis”. A PPI network was constructed and the top 10 hub genes were then screened. According to the Kaplan-Meier survival analysis, the expression levels of only NUF2, FAM83D and CENPH were associated with recurrence-free survival in TNBC samples (P < 0.05). RT-qPCR confirmed that the expression levels of NUF2 and FAM83D in TNBC tissues were indeed significantly up-regulated. Conclusions The comprehensive analysis showed that NUF2 and FAM83D could be used as potential biomarkers for the diagnosis and prognosis of TNBC.
Affiliation(s)
- Xiuming Zhai
- Department of Laboratory Medicine, The Third Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Zhaowei Yang
- Department of Breast and Thyroid, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
| | - Xiji Liu
- Department of Laboratory Medicine, The Third Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Zihe Dong
- Department of Laboratory Medicine, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
| | - Dandan Zhou
- Department of Laboratory Medicine, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, China
| |
|
17
|
Abstract
Transposable elements (TEs) are insertional mutagens that contribute greatly to the plasticity of eukaryotic genomes, influencing the evolution and adaptation of species as well as physiology or disease in individuals. Measuring TE expression helps to understand not only when and where TE mobilization can occur but also how this process alters gene expression, chromatin accessibility or cellular signalling pathways. Although genome-wide gene expression assays such as RNA sequencing include transposon-derived transcripts, most computational analytical tools discard or misinterpret TE-derived reads. Emerging approaches are improving the identification of expressed TE loci and helping to discriminate TE transcripts that permit TE mobilization from chimeric gene-TE transcripts or pervasive transcription. Here we review the main challenges associated with the detection of TE expression, including mappability, insertional and internal sequence polymorphisms, and the diversity of the TE transcriptional landscape, as well as the different experimental and computational strategies to solve them.
|
18
|
Abstract
Breast cancer is a highly heterogeneous disease. Cancer is usually caused not by a single gene but by multiple genes and their interactions with one another and with their surroundings. Estimating breast cancer-specific gene–gene interaction networks is critical to elucidating the mechanisms of breast cancer from a biological network perspective. In this study, sample-specific gene–gene interaction networks of breast cancer samples were established using a sample-specific network analysis method based on gene expression profiles. Then, gene–gene interaction networks and pathways related to breast cancer and its subtypes and stages were identified. The similarities and differences among these subtype-related (and stage-related) networks and pathways were studied; they proved highly specific for the Basal-like subtype and for Stages IV and V. Finally, pairwise gene interactions associated with breast cancer prognosis were identified with a Cox proportional hazards regression model, and a risk prediction model based on these gene pairs was established, which also performed very well on an independent validation data set. This work will help us to better understand the mechanism underlying the occurrence of breast cancer from the sample-specific network perspective.
Affiliation(s)
- Ke Zhu
- College of Science, Nanjing Agricultural University, Nanjing, Jiangsu, China
| | - Cong Pian
- College of Science, Nanjing Agricultural University, Nanjing, Jiangsu, China
| | - Qiong Xiang
- College of Science, Nanjing Agricultural University, Nanjing, Jiangsu, China
| | - Xin Liu
- College of Science, Nanjing Agricultural University, Nanjing, Jiangsu, China
| | - Yuanyuan Chen
- College of Science, Nanjing Agricultural University, Nanjing, Jiangsu, China; State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, China
| |
|