1
Vendrell-Mir P, Leduque B, Quadrana L. Ultra-sensitive detection of transposon insertions across multiple families by transposable element display sequencing. Genome Biol 2025; 26:48. [PMID: 40050910 PMCID: PMC11887134 DOI: 10.1186/s13059-025-03512-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2024] [Accepted: 02/24/2025] [Indexed: 03/09/2025] [Open Access]
Abstract
BACKGROUND Mobilization of transposable elements (TEs) can generate large effect mutations. However, due to the difficulty of detecting new TE insertions in genomes and the typically rare occurrence of transposition, the actual rate, distribution, and population dynamics of new insertions remain largely unexplored. RESULTS We present a TE display sequencing approach that leverages target amplification of TE extremities to detect non-reference TE insertions with high specificity and sensitivity, enabling the detection of insertions at frequencies as low as 1 in 250,000 within a DNA sample. Moreover, this method allows the simultaneous detection of insertions for distinct TE families, including both retrotransposons and DNA transposons, enhancing its versatility and cost-effectiveness for investigating complex "mobilomes." When combined with nanopore sequencing, this approach enables the identification of insertions using long-read information and achieves a turnaround time from DNA extraction to insertion identification of less than 24 h, significantly reducing the time-to-answer. By analyzing a population of Arabidopsis thaliana plants undergoing a transposition burst, we demonstrate the power of the multiplex TE display sequencing to analyze "evolve and resequence" experiments. Notably, we find that 3-4% of de novo TE insertions exhibit recurrent allele frequency changes indicative of either positive or negative selection. CONCLUSIONS TE display sequencing is an ultra-sensitive, specific, simple, and cost-effective approach for investigating the rate and landscape of new TE insertions across multiple families in large-scale population experiments. We provide a step-by-step experimental protocol and ready-to-use bioinformatic pipelines to facilitate its straightforward implementation.
Affiliation(s)
- Pol Vendrell-Mir
  - Institute of Plant Sciences Paris-Saclay (IPS2), Centre National de la Recherche Scientifique, Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement, Université Evry, Université Paris-Saclay, Gif Sur Yvette, 91190, France
- Basile Leduque
  - Institute of Plant Sciences Paris-Saclay (IPS2), Centre National de la Recherche Scientifique, Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement, Université Evry, Université Paris-Saclay, Gif Sur Yvette, 91190, France
- Leandro Quadrana
  - Institute of Plant Sciences Paris-Saclay (IPS2), Centre National de la Recherche Scientifique, Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement, Université Evry, Université Paris-Saclay, Gif Sur Yvette, 91190, France
2
Daigle A, Whitehouse LS, Zhao R, Emerson JJ, Schrider DR. Leveraging long-read assemblies and machine learning to enhance short-read transposable element detection and genotyping. bioRxiv 2025:2025.02.11.637720. [PMID: 39990489 PMCID: PMC11844559 DOI: 10.1101/2025.02.11.637720] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
Transposable elements (TEs) are parasitic genomic elements that are ubiquitous across the tree of life and play a crucial role in genome evolution. Advances in long-read sequencing have allowed highly accurate TE detection, though at a higher cost than short-read sequencing. Recent studies using long reads have shown that existing short-read TE detection methods perform inadequately when applied to real data. In this study, we use a machine learning approach (called TEforest) to discover and genotype TE insertions and deletions with short-read data by using TEs detected from long-read genome assemblies as training data. Our method first uses a highly sensitive algorithm to discover potential TE insertion or deletion sites in the genome, extracting relevant features from short-read alignments. To discriminate between true and false TE insertions, we train a random forest model with a labeled ground-truth dataset for which we have calculated the same set of short-read features. We conduct a comprehensive benchmark of TEforest and traditional TE detection methods using real data, finding that TEforest identifies more true positives and fewer false positives across datasets with different read lengths and coverages, while also accurately inferring genotypes and the precise breakpoints of insertions. By learning short-read signatures of TEs previously only discoverable using long reads, our approach bridges the gap between large-scale population genetic studies and the accuracy of long-read assemblies. This work provides a user-friendly tool to study the prevalence and phenotypic effects of TE insertions across the genome.
Affiliation(s)
- Austin Daigle
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC 27599
- Logan S. Whitehouse
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
  - Curriculum in Bioinformatics and Computational Biology, University of North Carolina, Chapel Hill, NC 27599
- Roy Zhao
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697
- JJ Emerson
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697
- Daniel R. Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599
3
Azad MF, Tong T, Lau NC. Transposable Element (TE) insertion predictions from RNAseq inputs and TE impact on RNA splicing and gene expression in Drosophila brain transcriptomes. Mob DNA 2024; 15:20. [PMID: 39385293 PMCID: PMC11462757 DOI: 10.1186/s13100-024-00330-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2024] [Accepted: 09/23/2024] [Indexed: 10/12/2024] [Open Access]
Abstract
Recent studies have suggested that Transposable Elements (TEs) residing in introns frequently splice into and alter primary gene-coding transcripts. To re-examine the exonization frequency of TEs into protein-coding gene transcripts, we re-analyzed a Drosophila neuron circadian rhythm RNAseq dataset and a deep long RNA fly midbrain RNAseq dataset using our Transposon Insertion and Depletion Analyzer (TIDAL) program. Our TIDAL results predicted several TE insertions from RNAseq data that were consistent with previously published studies. However, we also uncovered many discrepancies in TE-exonization calls, such as reads that mainly support intron retention of the TE and provide little support for chimeric mRNA spliced to the TE. We then deployed rigorous genomic DNA-PCR (gDNA-PCR) and RT-PCR procedures on TE-mRNA fusion candidates to see how many of the bioinformatics predictions could be validated. By testing a w1118 strain from which the deeper long RNAseq data was derived and comparing it to an OreR strain, only 9 of 23 TIDAL candidates (< 40%) could be validated as novel TE insertions by gDNA-PCR, indicating that deeper study is needed when using RNAseq data as inputs into current TE-insertion prediction programs. Of these validated calls, our RT-PCR results only supported TE-intron retention. Lastly, in the Dscam2 and Bx genes of the w1118 strain that contained intronic TEs, gene expression was 2-3 times higher than in the OreR genes lacking the TEs. This study's validation approach indicates that chimeric TE-mRNAs are infrequent and cautions that more optimization is required in bioinformatics programs to call TE insertions using RNAseq datasets.
Affiliation(s)
- Md Fakhrul Azad
  - Department of Biochemistry and Cell Biology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, 02118, USA
- Tong Tong
  - Graduate Program in Bioinformatics, Boston University, Boston, MA, 02118, USA
- Nelson C Lau
  - Department of Biochemistry and Cell Biology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, 02118, USA
  - Graduate Program in Bioinformatics, Boston University, Boston, MA, 02118, USA
  - Genome Science Institute, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, 02118, USA
4
Jansen G, Gebert D, Kumar TR, Simmons E, Murphy S, Teixeira FK. Tolerance thresholds underlie responses to DNA damage during germline development. Genes Dev 2024; 38:631-654. [PMID: 39054057 PMCID: PMC11368186 DOI: 10.1101/gad.351701.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2024] [Accepted: 07/05/2024] [Indexed: 07/27/2024]
Abstract
Selfish DNA modules like transposable elements (TEs) are particularly active in the germline, the lineage that passes genetic information across generations. New TE insertions can disrupt genes and impair the functionality and viability of germ cells. However, we found that in P-M hybrid dysgenesis in Drosophila, a sterility syndrome triggered by the P-element DNA transposon, germ cells harbor unexpectedly few new TE insertions despite accumulating DNA double-strand breaks (DSBs) and inducing cell cycle arrest. Using an engineered CRISPR-Cas9 system, we show that generating DSBs at silenced P-elements or other noncoding sequences is sufficient to induce germ cell loss independently of gene disruption. Indeed, we demonstrate that both developing and adult mitotic germ cells are sensitive to DSBs in a dosage-dependent manner. Following the mitotic-to-meiotic transition, however, germ cells become more tolerant to DSBs, completing oogenesis regardless of the accumulated genome damage. Our findings establish DNA damage tolerance thresholds as crucial safeguards of genome integrity during germline development.
Affiliation(s)
- Gloria Jansen
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
  - Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
- Daniel Gebert
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
  - Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
- Emily Simmons
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Sarah Murphy
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Felipe Karam Teixeira
  - Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
  - Department of Physiology, Development, and Neuroscience, University of Cambridge, Cambridge CB2 3DY, United Kingdom
5
Chu C, Ljungström V, Tran A, Jin H, Park PJ. Contribution of de novo retroelements to birth defects and childhood cancers. medRxiv 2024:2024.04.15.24305733. [PMID: 38699361 PMCID: PMC11065029 DOI: 10.1101/2024.04.15.24305733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Insertion of active retroelements (L1s, Alus, and SVAs) can disrupt proper genome function and lead to various disorders including cancer. However, the role of de novo retroelements (DNRTs) in birth defects and childhood cancers has not been well characterized due to the lack of adequate data and efficient computational tools. Here, we examine whole-genome sequencing data of 3,244 trios from 12 birth defect and childhood cancer cohorts in the Gabriella Miller Kids First Pediatric Research Program. Using an improved version of our tool xTea (x-Transposable element analyzer) that incorporates a deep-learning module, we identified 162 DNRTs, as well as 2 pseudogene insertions. Several variants are likely to be causal, such as a de novo Alu insertion that led to the ablation of a whole exon in the NF1 gene in a proband with brain tumor. We observe a high de novo SVA insertion burden in both high-intolerance loss-of-function genes and exons as well as more frequent de novo Alu insertions of paternal origin. We also identify potential mosaic DNRTs from embryonic stages. Our study reveals the important roles of DNRTs in causing birth defects and predisposition to childhood cancers.
Affiliation(s)
- Chong Chu
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Viktor Ljungström
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Antuan Tran
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Hu Jin
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Peter J. Park
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
6
Fukuda K. The role of transposable elements in human evolution and methods for their functional analysis: current status and future perspectives. Genes Genet Syst 2024; 98:289-304. [PMID: 37866889 DOI: 10.1266/ggs.23-00140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2023] [Open Access]
Abstract
Transposable elements (TEs) are mobile DNA sequences that can insert themselves into various locations within the genome, causing mutations that may provide advantages or disadvantages to individuals and species. The insertion of TEs can result in genetic variation that may affect a wide range of human traits including genetic disorders. Understanding the role of TEs in human biology is crucial for both evolutionary and medical research. This review discusses the involvement of TEs in human traits and disease susceptibility, as well as methods for functional analysis of TEs.
Affiliation(s)
- Kei Fukuda
  - Integrative Genomics Unit, The University of Melbourne
7
Devine SE. Emerging Opportunities to Study Mobile Element Insertions and Their Source Elements in an Expanding Universe of Sequenced Human Genomes. Genes (Basel) 2023; 14:1923. [PMID: 37895272 PMCID: PMC10606232 DOI: 10.3390/genes14101923] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/02/2023] [Revised: 09/29/2023] [Accepted: 09/30/2023] [Indexed: 10/29/2023] [Open Access]
Abstract
Three mobile element classes, namely Alu, LINE-1 (L1), and SVA elements, remain actively mobile in human genomes and continue to produce new mobile element insertions (MEIs). Historically, MEIs have been discovered and studied using several methods, including: (1) Southern blots, (2) PCR (including PCR display), and (3) the detection of MEI copies from young subfamilies. We are now entering a new phase of MEI discovery where these methods are being replaced by whole genome sequencing and bioinformatics analysis to discover novel MEIs. We expect that the universe of sequenced human genomes will continue to expand rapidly over the next several years, both with short-read and long-read technologies. These resources will provide unprecedented opportunities to discover MEIs and study their impact on human traits and diseases. They also will allow the MEI community to discover and study the source elements that produce these new MEIs, which will facilitate our ability to study source element regulation in various tissue contexts and disease states. This, in turn, will allow us to better understand MEI mutagenesis in humans and the impact of this mutagenesis on human biology.
Affiliation(s)
- Scott E Devine
  - Institute for Genome Sciences, Department of Medicine, and Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201, USA
8
Yin Z, Yang Q, Shen D, Liu J, Huang W, Dou D. Online data resource for exploring transposon insertion polymorphisms in public soybean germplasm accessions. Plant Physiol 2023; 193:1036-1044. [PMID: 37399251 DOI: 10.1093/plphys/kiad386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 05/30/2023] [Accepted: 06/11/2023] [Indexed: 07/05/2023]
Abstract
Soybean (Glycine max L. Merrill) is one of the most important economic crops. A large and growing number of whole-genome resequencing datasets have been generated for exploring genetic diversity and mining important quantitative trait loci. Most genome-wide association studies have focused on single-nucleotide polymorphisms and short insertions and deletions. Nevertheless, structural variants mainly caused by transposable element mobilization are not fully considered. To fill this gap, we uniformly processed the publicly available whole-genome resequencing data from 5,521 soybean germplasm accessions and built an online database named Soybean Transposon Insertion Polymorphisms Database (SoyTIPdb) (https://biotec.njau.edu.cn/soytipdb). The collected germplasm accessions derive from more than 45 countries and 160 regions, representing the most comprehensive genetic diversity of soybean. SoyTIPdb implements easy-to-use query, analysis, and browse functions to help users understand and find meaningful structural variations arising from TE insertions. In conclusion, SoyTIPdb is a valuable data resource that will help soybean breeders and researchers take advantage of the whole-genome sequencing datasets available in public repositories.
Affiliation(s)
- Zhiyuan Yin
  - Department of Plant Pathology, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
- Qingjie Yang
  - Bioinformatics Center, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
- Danyu Shen
  - Department of Plant Pathology, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
- Jinding Liu
  - Bioinformatics Center, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Department of Animal Science, Michigan State University, East Lansing, MI 48824, USA
- Wen Huang
  - Department of Animal Science, Michigan State University, East Lansing, MI 48824, USA
- Daolong Dou
  - Department of Plant Pathology, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
  - Bioinformatics Center, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, Jiangsu 210095, China
9
Chen J, Basting PJ, Han S, Garfinkel DJ, Bergman CM. Reproducible evaluation of transposable element detectors with McClintock 2 guides accurate inference of Ty insertion patterns in yeast. Mob DNA 2023; 14:8. [PMID: 37452430 PMCID: PMC10347736 DOI: 10.1186/s13100-023-00296-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/21/2023] [Accepted: 06/09/2023] [Indexed: 07/18/2023] [Open Access]
Abstract
BACKGROUND Many computational methods have been developed to detect non-reference transposable element (TE) insertions using short-read whole genome sequencing data. The diversity and complexity of such methods often present challenges to new users seeking to reproducibly install, execute, or evaluate multiple TE insertion detectors. RESULTS We previously developed the McClintock meta-pipeline to facilitate the installation, execution, and evaluation of six first-generation short-read TE detectors. Here, we report a completely re-implemented version of McClintock written in Python using Snakemake and Conda that improves its installation, error handling, speed, stability, and extensibility. McClintock 2 now includes 12 short-read TE detectors, auxiliary pre-processing and analysis modules, interactive HTML reports, and a simulation framework to reproducibly evaluate the accuracy of component TE detectors. When applied to the model microbial eukaryote Saccharomyces cerevisiae, we find substantial variation in the ability of McClintock 2 components to identify the precise locations of non-reference TE insertions, with RelocaTE2 showing the highest recall and precision in simulated data. We find that RelocaTE2, TEMP, TEMP2 and TEBreak provide consistent estimates of ∼50 non-reference TE insertions per strain and that Ty2 has the highest number of non-reference TE insertions in a species-wide panel of ∼1000 yeast genomes. Finally, we show that best-in-class predictors for yeast applied to resequencing data have sufficient resolution to reveal a dyad pattern of integration in nucleosome-bound regions upstream of yeast tRNA genes for Ty1, Ty2, and Ty4, allowing us to extend knowledge about fine-scale target preferences revealed previously for experimentally-induced Ty1 insertions to spontaneous insertions for other copia-superfamily retrotransposons in yeast.
CONCLUSION McClintock (https://github.com/bergmanlab/mcclintock/) provides a user-friendly pipeline for the identification of TEs in short-read WGS data using multiple TE detectors, which should benefit researchers studying TE insertion variation in a wide range of different organisms. Application of the improved McClintock system to simulated and empirical yeast genome data reveals best-in-class methods and novel biological insights for one of the most widely-studied model eukaryotes and provides a paradigm for evaluating and selecting non-reference TE detectors in other species.
Affiliation(s)
- Jingxuan Chen
  - Institute of Bioinformatics, University of Georgia, Athens, GA USA
- Shunhua Han
  - Institute of Bioinformatics, University of Georgia, Athens, GA USA
- David J. Garfinkel
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA USA
- Casey M. Bergman
  - Institute of Bioinformatics, University of Georgia, Athens, GA USA
  - Department of Genetics, University of Georgia, Athens, GA USA
10
Chen J, Basting PJ, Han S, Garfinkel DJ, Bergman CM. Reproducible evaluation of transposable element detectors with McClintock 2 guides accurate inference of Ty insertion patterns in yeast. bioRxiv 2023:2023.02.13.528343. [PMID: 36824955 PMCID: PMC9948991 DOI: 10.1101/2023.02.13.528343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
Abstract
BACKGROUND Many computational methods have been developed to detect non-reference transposable element (TE) insertions using short-read whole genome sequencing data. The diversity and complexity of such methods often present challenges to new users seeking to reproducibly install, execute, or evaluate multiple TE insertion detectors. RESULTS We previously developed the McClintock meta-pipeline to facilitate the installation, execution, and evaluation of six first-generation short-read TE detectors. Here, we report a completely re-implemented version of McClintock written in Python using Snakemake and Conda that improves its installation, error handling, speed, stability, and extensibility. McClintock 2 now includes 12 short-read TE detectors, auxiliary pre-processing and analysis modules, interactive HTML reports, and a simulation framework to reproducibly evaluate the accuracy of component TE detectors. When applied to the model microbial eukaryote Saccharomyces cerevisiae, we find substantial variation in the ability of McClintock 2 components to identify the precise locations of non-reference TE insertions, with RelocaTE2 showing the highest recall and precision in simulated data. We find that RelocaTE2, TEMP, TEMP2 and TEBreak provide a consistent and biologically meaningful view of non-reference TE insertions in a species-wide panel of ∼1000 yeast genomes, as evaluated by coverage-based abundance estimates and expected patterns of tRNA promoter targeting. Finally, we show that best-in-class predictors for yeast have sufficient resolution to reveal a dyad pattern of integration in nucleosome-bound regions upstream of yeast tRNA genes for Ty1, Ty2, and Ty4, allowing us to extend knowledge about fine-scale target preferences first revealed experimentally for Ty1 to natural insertions and related copia-superfamily retrotransposons in yeast. 
CONCLUSION McClintock (https://github.com/bergmanlab/mcclintock/) provides a user-friendly pipeline for the identification of TEs in short-read WGS data using multiple TE detectors, which should benefit researchers studying TE insertion variation in a wide range of different organisms. Application of the improved McClintock system to simulated and empirical yeast genome data reveals best-in-class methods and novel biological insights for one of the most widely-studied model eukaryotes and provides a paradigm for evaluating and selecting non-reference TE detectors for other species.
Affiliation(s)
- Jingxuan Chen
  - Institute of Bioinformatics, University of Georgia, Athens, GA
- Shunhua Han
  - Institute of Bioinformatics, University of Georgia, Athens, GA
- David J. Garfinkel
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA
- Casey M. Bergman
  - Institute of Bioinformatics, University of Georgia, Athens, GA
  - Department of Genetics, University of Georgia, Athens, GA
11
Chen X, Bourque G, Goubert C. Genotyping of Transposable Element Insertions Segregating in Human Populations Using Short-Read Realignments. Methods Mol Biol 2023; 2607:63-83. [PMID: 36449158 DOI: 10.1007/978-1-0716-2883-6_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Transposable element (TE) insertions are a major source of structural variation in the human genome. Due to the repetitive nature and biological importance of TEs, many bioinformatic tools have been developed to identify and genotype TE insertion polymorphisms using high-throughput short reads. In this chapter, we outline recently developed methods to characterize TE insertion polymorphisms in human populations. We also provide detailed protocols to tackle this question, primarily using three software tools: MELT2, ERVcaller, and TypeTE.
Affiliation(s)
- Xun Chen
  - Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan
- Guillaume Bourque
  - Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan
  - Canadian Centre for Computational Genomics, McGill University, Montreal, QC, Canada
  - McGill Genome Centre, Montreal, QC, Canada
  - Human Genetics, McGill University, Montreal, QC, Canada
- Clément Goubert
  - Canadian Centre for Computational Genomics, McGill University, Montreal, QC, Canada
  - McGill Genome Centre, Montreal, QC, Canada
  - Human Genetics, McGill University, Montreal, QC, Canada
12
Han S, Dias GB, Basting PJ, Viswanatha R, Perrimon N, Bergman CM. Local assembly of long reads enables phylogenomics of transposable elements in a polyploid cell line. Nucleic Acids Res 2022; 50:e124. [PMID: 36156149 PMCID: PMC9757076 DOI: 10.1093/nar/gkac794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Revised: 07/21/2022] [Accepted: 09/16/2022] [Indexed: 12/24/2022] [Open Access]
Abstract
Animal cell lines often undergo extreme genome restructuring events, including polyploidy and segmental aneuploidy that can impede de novo whole-genome assembly (WGA). In some species like Drosophila, cell lines also exhibit massive proliferation of transposable elements (TEs). To better understand the role of transposition during animal cell culture, we sequenced the genome of the tetraploid Drosophila S2R+ cell line using long-read and linked-read technologies. WGAs for S2R+ were highly fragmented and generated variable estimates of TE content across sequencing and assembly technologies. We therefore developed a novel WGA-independent bioinformatics method called TELR that identifies, locally assembles, and estimates allele frequency of TEs from long-read sequence data (https://github.com/bergmanlab/telr). Application of TELR to a ∼130x PacBio dataset for S2R+ revealed many haplotype-specific TE insertions that arose by transposition after initial cell line establishment and subsequent tetraploidization. Local assemblies from TELR also allowed phylogenetic analysis of paralogous TEs, which revealed that proliferation of TE families in vitro can be driven by single or multiple source lineages. Our work provides a model for the analysis of TEs in complex heterozygous or polyploid genomes that are recalcitrant to WGA and yields new insights into the mechanisms of genome evolution in animal cell culture.
Affiliation(s)
- Preston J Basting
  - Institute of Bioinformatics, University of Georgia, 120 E. Green St., Athens, GA, USA
- Raghuvir Viswanatha
  - Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA, USA
- Norbert Perrimon
  - Department of Genetics, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA, USA
  - Howard Hughes Medical Institute, Boston, MA, USA
- Casey M Bergman
  - To whom correspondence should be addressed. Tel: +1 706 542 1764; Fax: +1 706 542 3910
13
Bhat A, Ghatage T, Bhan S, Lahane GP, Dhar A, Kumar R, Pandita RK, Bhat KM, Ramos KS, Pandita TK. Role of Transposable Elements in Genome Stability: Implications for Health and Disease. Int J Mol Sci 2022; 23:7802. [PMID: 35887150 PMCID: PMC9319628 DOI: 10.3390/ijms23147802] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/13/2022] [Revised: 07/07/2022] [Accepted: 07/12/2022] [Indexed: 12/11/2022] [Open Access]
Abstract
Most living organisms have in their genome a sizable proportion of DNA sequences capable of mobilization; these sequences are commonly referred to as transposons, transposable elements (TEs), or jumping genes. Although TEs were long thought to have no biological significance, advances in DNA sequencing and analytical technologies have enabled their precise characterization and confirmed their ubiquitous presence across all forms of life. These findings have ignited intense debates over their biological significance. The available evidence now supports the notion that TEs exert major influence over many biological aspects of organismal life. Transposable elements contribute significantly to the evolution of the genome by giving rise to genetic variations in both active and passive modes. Due to their intrinsic mobility within the genome, TEs primarily cause gene disruption and large-scale genomic alterations including inversions, deletions, and duplications. Besides genomic instability, growing evidence also points to many physiologically important functions of TEs, such as gene regulation through cis-acting control elements and modulation of the transcriptome through epigenetic control. In this review, we discuss the latest evidence demonstrating the impact of TEs on genome stability and the underlying mechanisms, including those developed to mitigate the deleterious impact of TEs on genomic stability and human health. We also highlight the potential therapeutic application of TEs.
Affiliation(s)
- Audesh Bhat
- Centre for Molecular Biology, Central University of Jammu, Jammu 181143, India
- Trupti Ghatage
- Department of Pharmacy, BITS-Pilani Hyderabad Campus, Hyderabad 500078, India
- Sonali Bhan
- Centre for Molecular Biology, Central University of Jammu, Jammu 181143, India
- Ganesh P. Lahane
- Department of Pharmacy, BITS-Pilani Hyderabad Campus, Hyderabad 500078, India
- Arti Dhar
- Department of Pharmacy, BITS-Pilani Hyderabad Campus, Hyderabad 500078, India
- Rakesh Kumar
- Department of Biotechnology, Shri Mata Vaishnav Devi University, Katra 182320, India
- Raj K. Pandita
- Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Krishna M. Bhat
- Department of Molecular Medicine, University of South Florida, Tampa, FL 33612, USA
- Kenneth S. Ramos
- Center for Genomics and Precision Medicine, Texas A&M College of Medicine, Houston, TX 77030, USA
- Tej K. Pandita
- Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
- Center for Genomics and Precision Medicine, Texas A&M College of Medicine, Houston, TX 77030, USA
14
Han S, Dias GB, Basting PJ, Nelson MG, Patel S, Marzo M, Bergman CM. Ongoing transposition in cell culture reveals the phylogeny of diverse Drosophila S2 sublines. Genetics 2022; 221:iyac077. [PMID: 35536183 PMCID: PMC9252272 DOI: 10.1093/genetics/iyac077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/10/2022] [Accepted: 04/28/2022] [Indexed: 11/13/2022] Open
Abstract
Cultured cells are widely used in molecular biology despite poor understanding of how cell line genomes change in vitro over time. Previous work has shown that Drosophila cultured cells have a higher transposable element content than whole flies, but whether this increase in transposable element content resulted from an initial burst of transposition during cell line establishment or ongoing transposition in cell culture remains unclear. Here, we sequenced the genomes of 25 sublines of Drosophila S2 cells and show that transposable element insertions provide abundant markers for the phylogenetic reconstruction of diverse sublines in a model animal cell culture system. DNA copy number evolution across S2 sublines revealed dramatically different patterns of genome organization that support the overall evolutionary history reconstructed using transposable element insertions. Analysis of transposable element insertion site occupancy and ancestral states support a model of ongoing transposition dominated by episodic activity of a small number of retrotransposon families. Our work demonstrates that substantial genome evolution occurs during long-term Drosophila cell culture, which may impact the reproducibility of experiments that do not control for subline identity.
Affiliation(s)
- Shunhua Han
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Guilherme B Dias
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
- Preston J Basting
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Michael G Nelson
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
- Sanjai Patel
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
- Mar Marzo
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, UK
- Casey M Bergman
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
15
Yan H, Haak DC, Li S, Huang L, Bombarely A. Exploring transposable element-based markers to identify allelic variations underlying agronomic traits in rice. Plant Communications 2022; 3:100270. [PMID: 35576152 PMCID: PMC9251385 DOI: 10.1016/j.xplc.2021.100270] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/26/2021] [Revised: 10/29/2021] [Accepted: 12/16/2021] [Indexed: 06/10/2023]
Abstract
Transposable elements (TEs) are a major force in the production of new alleles during domestication; nevertheless, their use in association studies has been limited because of their complexity. We have developed a TE genotyping pipeline (TEmarker) and applied it to whole-genome genome-wide association study (GWAS) data from 176 Oryza sativa subsp. japonica accessions to identify genetic elements associated with specific agronomic traits. TE markers recovered a large proportion (69%) of single-nucleotide polymorphism (SNP)-based GWAS peaks, and these TE peaks retained ca. 25% of the SNPs. The use of TEs in GWASs may reduce false positives associated with linkage disequilibrium (LD) among SNP markers. A genome scan revealed positive selection on TEs associated with agronomic traits. We found several cases of insertion and deletion variants that potentially resulted from the direct action of TEs, including an allele of LOC_Os11g08410 associated with plant height and panicle length traits. Together, these findings reveal the utility of TE markers for connecting genotype to phenotype and suggest a potential role for TEs in influencing phenotypic variations in rice that impact agronomic traits.
Affiliation(s)
- Haidong Yan
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- David C Haak
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA; Graduate Program in Genetics, Bioinformatics and Computational Biology (GBCB), Virginia Tech, Blacksburg, VA 24061, USA
- Song Li
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA; Graduate Program in Genetics, Bioinformatics and Computational Biology (GBCB), Virginia Tech, Blacksburg, VA 24061, USA
- Linkai Huang
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu 611130, China
- Aureliano Bombarely
- Department of Bioscience, Università degli Studi di Milano (UNIMI), 20133 Milano, Italy; Instituto de Biología Molecular y Celular de Plantas (IBMCP), UPV-CSIC, 46022 Valencia, Spain.
16
Navarro-Dominguez B, Chang CH, Brand CL, Muirhead CA, Presgraves DC, Larracuente AM. Epistatic selection on a selfish Segregation Distorter supergene - drive, recombination, and genetic load. eLife 2022; 11:e78981. [PMID: 35486424 PMCID: PMC9122502 DOI: 10.7554/elife.78981] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/03/2022] [Accepted: 04/20/2022] [Indexed: 11/13/2022] Open
Abstract
Meiotic drive supergenes are complexes of alleles at linked loci that together subvert Mendelian segregation resulting in preferential transmission. In males, the most common mechanism of drive involves the disruption of sperm bearing one of a pair of alternative alleles. While at least two loci are important for male drive (the driver and the target), linked modifiers can enhance drive, creating selection pressure to suppress recombination. In this work, we investigate the evolution and genomic consequences of an autosomal, multilocus, male meiotic drive system, Segregation Distorter (SD) in the fruit fly, Drosophila melanogaster. In African populations, the predominant SD chromosome variant, SD-Mal, is characterized by two overlapping, paracentric inversions on chromosome arm 2R and nearly perfect (~100%) transmission. We study the SD-Mal system in detail, exploring its components, chromosomal structure, and evolutionary history. Our findings reveal a recent chromosome-scale selective sweep mediated by strong epistatic selection for haplotypes carrying Sd, the main driving allele, and one or more factors within the double inversion. While most SD-Mal chromosomes are homozygous lethal, SD-Mal haplotypes can recombine with other, complementing haplotypes via crossing over, and with wildtype chromosomes via gene conversion. SD-Mal chromosomes have nevertheless accumulated lethal mutations, excess non-synonymous mutations, and excess transposable element insertions. Therefore, SD-Mal haplotypes evolve as a small, semi-isolated subpopulation with a history of strong selection. These results may explain the evolutionary turnover of SD haplotypes in different populations around the world and have implications for supergene evolution broadly.
Affiliation(s)
- Ching-Ho Chang
- Department of Biology, University of Rochester, Rochester, United States
- Cara L Brand
- Department of Biology, University of Rochester, Rochester, United States
- Christina A Muirhead
- Department of Biology, University of Rochester, Rochester, United States
- Ronin Institute, Montclair, United States
17
Rech GE, Radío S, Guirao-Rico S, Aguilera L, Horvath V, Green L, Lindstadt H, Jamilloux V, Quesneville H, González J. Population-scale long-read sequencing uncovers transposable elements associated with gene expression variation and adaptive signatures in Drosophila. Nat Commun 2022; 13:1948. [PMID: 35413957 PMCID: PMC9005704 DOI: 10.1038/s41467-022-29518-8] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 03/22/2021] [Accepted: 03/15/2022] [Indexed: 12/16/2022] Open
Abstract
High quality reference genomes are crucial to understanding genome function, structure and evolution. The availability of reference genomes has allowed us to start inferring the role of genetic variation in biology, disease, and biodiversity conservation. However, analyses across organisms demonstrate that a single reference genome is not enough to capture the global genetic diversity present in populations. In this work, we generate 32 high-quality reference genomes for the well-known model species D. melanogaster and focus on the identification and analysis of transposable element variation as they are the most common type of structural variant. We show that integrating the genetic variation across natural populations from five climatic regions increases the number of detected insertions by 58%. Moreover, 26% to 57% of the insertions identified using long-reads were missed by short-reads methods. We also identify hundreds of transposable elements associated with gene expression variation and new TE variants likely to contribute to adaptive evolution in this species. Our results highlight the importance of incorporating the genetic variation present in natural populations to genomic studies, which is essential if we are to understand how genomes function and evolve.
Affiliation(s)
- Gabriel E Rech
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
- Santiago Radío
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
- Sara Guirao-Rico
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
- Laura Aguilera
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
- Vivien Horvath
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
- Llewellyn Green
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
- Hannah Lindstadt
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain
- Josefa González
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), 08003, Barcelona, Spain.
18
Yang N, Srivastav SP, Rahman R, Ma Q, Dayama G, Li S, Chinen M, Lei EP, Rosbash M, Lau NC. Transposable element landscapes in aging Drosophila. PLoS Genet 2022; 18:e1010024. [PMID: 35239675 PMCID: PMC8893327 DOI: 10.1371/journal.pgen.1010024] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 10/07/2021] [Accepted: 01/10/2022] [Indexed: 11/28/2022] Open
Abstract
Genetic mechanisms that repress transposable elements (TEs) in young animals decline during aging, as reflected by increased TE expression in aged animals. Does increased TE expression during aging lead to more genomic TE copies in older animals? To address this question, we quantified TE Landscapes (TLs) via whole genome sequencing of young and aged Drosophila strains of wild-type and mutant backgrounds. We quantified TLs in whole flies and dissected brains and validated the feasibility of our approach in detecting new TE insertions in aging Drosophila genomes when small RNA and RNA interference (RNAi) pathways are compromised. We also describe improved sequencing methods to quantify extra-chromosomal DNA circles (eccDNAs) in Drosophila as an additional source of TE copies that accumulate during aging. Lastly, to combat the natural progression of aging-associated TE expression, we show that knocking down PAF1, a conserved transcription elongation factor that antagonizes RNAi pathways, may bolster suppression of TEs during aging and extend lifespan. Our study suggests that in addition to a possible influence by different genetic backgrounds, small RNA and RNAi mechanisms may mitigate genomic TL expansion despite the increase in TE transcripts during aging.
Transposable elements, also called transposons, are genetic parasites found in all animal genomes. Normally, transposons are compacted away in silent chromatin in young animals. But, as animals age and transposon-silencing defense mechanisms break down, transposon RNAs accumulate to significant levels in old animals like fruit flies. An open question is whether the increased levels of transposon RNAs in older animals also correspond to increased genomic copies of transposons. This study approached this question by sequencing the whole genomes of young and old wild-type and mutant flies lacking a functional RNA interference (RNAi) pathway, which naturally silences transposon RNAs. Although the wild-type flies with intact RNAi activity had little new accumulation of transposon copies, the sequencing approach was able to detect several transposon accumulation occurrences in some RNAi mutants. In addition, we found that some fly transposon families can also accumulate as extra-chromosomal circular DNA copies. Lastly, we showed that genetically augmenting the expression of RNAi factors can counteract the rising transposon RNA levels in aging and promote longevity. This study improves our understanding of the animal host genome relationship with transposons during natural aging processes.
Affiliation(s)
- Nachen Yang
- Boston University School of Medicine, Department of Biochemistry, Boston, Massachusetts, United States of America
- Satyam P. Srivastav
- Boston University School of Medicine, Department of Biochemistry, Boston, Massachusetts, United States of America
- Reazur Rahman
- Brandeis University, Department of Biology and Howard Hughes Medical Institute, Waltham, Massachusetts, United States of America
- Qicheng Ma
- Boston University School of Medicine, Department of Biochemistry, Boston, Massachusetts, United States of America
- Gargi Dayama
- Boston University School of Medicine, Department of Biochemistry, Boston, Massachusetts, United States of America
- Sizheng Li
- Boston University School of Medicine, Department of Biochemistry, Boston, Massachusetts, United States of America
- Madoka Chinen
- Nuclear Organization and Gene Expression Section, NIDDK, NIH, Bethesda, Maryland, United States of America
- Elissa P. Lei
- Nuclear Organization and Gene Expression Section, NIDDK, NIH, Bethesda, Maryland, United States of America
- Michael Rosbash
- Brandeis University, Department of Biology and Howard Hughes Medical Institute, Waltham, Massachusetts, United States of America
- Nelson C. Lau
- Boston University School of Medicine, Department of Biochemistry, Boston, Massachusetts, United States of America
- Boston University Genome Science Institute, Boston, Massachusetts, United States of America
19
Finding and Characterizing Repeats in Plant Genomes. Methods in Molecular Biology (Clifton, N.J.) 2022; 2443:327-385. [PMID: 35037215 DOI: 10.1007/978-1-0716-2067-0_18] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
Abstract
Plant genomes contain a particularly high proportion of repeated structures of various types. This chapter proposes a guided tour of the available software that can help biologists to scan automatically for these repeats in sequence data or check hypothetical models intended to characterize their structures. Since transposable elements (TEs) are a major source of repeats in plants, many methods have been used or developed for this broad class of sequences. They are representative of the range of tools available for other classes of repeats and we have provided two sections on this topic (for the analysis of genomes or directly of sequenced reads), as well as a selection of the main existing software. It may be hard to keep up with the profusion of proposals in this dynamic field and the rest of the chapter is devoted to the foundations of an efficient search for repeats and more complex patterns. We first introduce the key concepts of the art of indexing and mapping or querying sequences. We end the chapter with the more prospective issue of building models of repeat families. We present the Machine Learning approach first, seeking to build predictors automatically for some families of TEs, from a set of sequences known to belong to the family. A second approach, the linguistic (or syntactic) approach, allows biologists themselves to describe models of their favorite repeat family and check their validity.
20
Gebert D, Neubert LK, Lloyd C, Gui J, Lehmann R, Teixeira FK. Large Drosophila germline piRNA clusters are evolutionarily labile and dispensable for transposon regulation. Mol Cell 2021; 81:3965-3978.e5. [PMID: 34352205 PMCID: PMC8516431 DOI: 10.1016/j.molcel.2021.07.011] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 02/05/2021] [Revised: 05/23/2021] [Accepted: 07/10/2021] [Indexed: 12/13/2022]
Abstract
PIWI proteins and their guiding Piwi-interacting small RNAs (piRNAs) are crucial for fertility and transposon defense in the animal germline. In most species, the majority of piRNAs are produced from distinct large genomic loci, called piRNA clusters. It is assumed that germline-expressed piRNA clusters, particularly in Drosophila, act as principal regulators to control transposons dispersed across the genome. Here, using synteny analysis, we show that large clusters are evolutionarily labile, arise at loci characterized by recurrent chromosomal rearrangements, and are mostly species-specific across the Drosophila genus. By engineering chromosomal deletions in D. melanogaster, we demonstrate that the three largest germline clusters, which account for the accumulation of >40% of all transposon-targeting piRNAs in ovaries, are neither required for fertility nor for transposon regulation in trans. We provide further evidence that dispersed elements, rather than the regulatory action of large Drosophila germline clusters in trans, may be central for transposon defense.
Affiliation(s)
- Daniel Gebert
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
- Lena K Neubert
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
- Catrin Lloyd
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
- Jinghua Gui
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
- Ruth Lehmann
- Howard Hughes Medical Institute (HHMI) and Kimmel Center for Biology and Medicine of the Skirball Institute, Department of Cell Biology, New York University School of Medicine, New York, NY 10016, USA.
21
Han S, Basting PJ, Dias GB, Luhur A, Zelhof AC, Bergman CM. Transposable element profiles reveal cell line identity and loss of heterozygosity in Drosophila cell culture. Genetics 2021; 219:6321957. [PMID: 34849875 PMCID: PMC8633141 DOI: 10.1093/genetics/iyab113] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/01/2021] [Accepted: 07/01/2021] [Indexed: 11/28/2022] Open
Abstract
Cell culture systems allow key insights into biological mechanisms yet suffer from irreproducible outcomes in part because of cross-contamination or mislabeling of cell lines. Cell line misidentification can be mitigated by the use of genotyping protocols, which have been developed for human cell lines but are lacking for many important model species. Here, we leverage the classical observation that transposable elements (TEs) proliferate in cultured Drosophila cells to demonstrate that genome-wide TE insertion profiles can reveal the identity and provenance of Drosophila cell lines. We identify multiple cases where TE profiles clarify the origin of Drosophila cell lines (Sg4, mbn2, and OSS_E) relative to published reports, and also provide evidence that insertions from only a subset of long-terminal repeat retrotransposon families are necessary to mark Drosophila cell line identity. We also develop a new bioinformatics approach to detect TE insertions and estimate intra-sample allele frequencies in legacy whole-genome sequencing data (called ngs_te_mapper2), which revealed loss of heterozygosity as a mechanism shaping the unique TE profiles that identify Drosophila cell lines. Our work contributes to the general understanding of the forces impacting metazoan genomes as they evolve in cell culture and paves the way for high-throughput protocols that use TE insertions to authenticate cell lines in Drosophila and other organisms.
Affiliation(s)
- Shunhua Han
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Preston J Basting
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
- Guilherme B Dias
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA; Department of Genetics, University of Georgia, Athens, GA 30602, USA
- Arthur Luhur
- Drosophila Genomics Resource Center, Indiana University, Bloomington, IN 47405, USA; Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Andrew C Zelhof
- Drosophila Genomics Resource Center, Indiana University, Bloomington, IN 47405, USA; Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Casey M Bergman
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA; Department of Genetics, University of Georgia, Athens, GA 30602, USA
22
Zhang G, Yu T, Parhad SS, Ho S, Weng Z, Theurkauf WE. piRNA-independent transposon silencing by the Drosophila THO complex. Dev Cell 2021; 56:2623-2635.e5. [PMID: 34547226 DOI: 10.1016/j.devcel.2021.08.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/04/2021] [Revised: 06/18/2021] [Accepted: 08/27/2021] [Indexed: 12/19/2022]
Abstract
piRNAs guide Piwi/Panoramix-dependent H3K9me3 chromatin modification and transposon silencing during Drosophila germline development. The THO RNA export complex is composed of Hpr1, Tho2, and Thoc5-7. Null thoc7 mutations, which displace Thoc5 and Thoc6 from a Tho2-Hpr1 subcomplex, reduce expression of a subset of germline piRNAs and increase transposon expression, suggesting that THO silences transposons by promoting piRNA biogenesis. Here, we show that the thoc7-null mutant combination increases transposon transcription but does not reduce anti-sense piRNAs targeting half of the transcriptionally activated transposon families. These mutations also fail to reduce piRNA-guided H3K9me3 chromatin modification or block Panoramix-dependent silencing of a reporter transgene, and unspliced transposon transcripts co-precipitate with THO through a Piwi- and Panoramix-independent mechanism. Mutations in piwi also dominantly enhance germline defects associated with thoc7-null alleles. THO thus functions in a piRNA-independent transposon-silencing pathway, which acts cooperatively with Piwi to support germline development.
Affiliation(s)
- Gen Zhang
- Program in Molecular Medicine, University of Massachusetts Medical School, 373 Plantation Street, Worcester, MA 01605, USA
- Tianxiong Yu
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 373 Plantation Street, Worcester, MA 01605, USA; Department of Bioinformatics, School of Life Sciences and Technology, Tongji University, Shanghai, People's Republic of China
- Swapnil S Parhad
- Program in Molecular Medicine, University of Massachusetts Medical School, 373 Plantation Street, Worcester, MA 01605, USA
- Samantha Ho
- Program in Molecular Medicine, University of Massachusetts Medical School, 373 Plantation Street, Worcester, MA 01605, USA
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 373 Plantation Street, Worcester, MA 01605, USA.
- William E Theurkauf
- Program in Molecular Medicine, University of Massachusetts Medical School, 373 Plantation Street, Worcester, MA 01605, USA.
23
Tan S, Ma H, Wang J, Wang M, Wang M, Yin H, Zhang Y, Zhang X, Shen J, Wang D, Banes GL, Zhang Z, Wu J, Huang X, Chen H, Ge S, Chen CL, Zhang YE. DNA transposons mediate duplications via transposition-independent and -dependent mechanisms in metazoans. Nat Commun 2021; 12:4280. [PMID: 34257290 PMCID: PMC8277862 DOI: 10.1038/s41467-021-24585-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/12/2020] [Accepted: 06/23/2021] [Indexed: 01/06/2023] Open
Abstract
Despite long being considered as "junk", transposable elements (TEs) are now accepted as catalysts of evolution. One example is Mutator-like elements (MULEs, one type of terminal inverted repeat DNA TEs, or TIR TEs) capturing sequences as Pack-MULEs in plants. However, their origination mechanism remains perplexing, and whether TIR TEs mediate duplication in animals is almost unexplored. Here we identify 370 Pack-TIRs in 100 animal reference genomes and one Pack-TIR (Ssk-FB4) family in fly populations. We find that single-copy Pack-TIRs are mostly generated via transposition-independent gap filling, and multicopy Pack-TIRs are likely generated by transposition after replication fork switching. We show that a proportion of Pack-TIRs are transcribed and often form chimeras with hosts. We also find that Ssk-FB4s represent a young protein family, as supported by proteomics and signatures of positive selection. Thus, TIR TEs catalyze new gene structures and new genes in animals via both transposition-independent and -dependent mechanisms.
Collapse
Affiliation(s)
- Shengjun Tan
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Huijing Ma
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Jinbo Wang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Man Wang
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Center for Cancer Bioinformatics, Peking University Cancer Hospital & Institute, Beijing, China
| | - Mengxia Wang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Haodong Yin
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yaqiong Zhang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Xinying Zhang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Jieyu Shen
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Danyang Wang
- University of Chinese Academy of Sciences, Beijing, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, China
| | - Graham L Banes
- Wisconsin National Primate Research Center, University of Wisconsin-Madison, Madison, WI, USA
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
| | - Zhihua Zhang
- University of Chinese Academy of Sciences, Beijing, China
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, China
| | - Jianmin Wu
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Center for Cancer Bioinformatics, Peking University Cancer Hospital & Institute, Beijing, China
| | - Xun Huang
- University of Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
| | - Hua Chen
- University of Chinese Academy of Sciences, Beijing, China
- CAS Key Laboratory of Genomics and Precision Medicine, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, China
- CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
| | - Siqin Ge
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Chun-Long Chen
- Curie Institute, PSL Research University, CNRS UMR 3244, Paris, France.
- Sorbonne University, Paris, France.
| | - Yong E Zhang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China.
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China.
- University of Chinese Academy of Sciences, Beijing, China.
- CAS Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China.
- Chinese Institute for Brain Research, Beijing, China.
|
24
|
Fabry MH, Falconio FA, Joud F, Lythgoe EK, Czech B, Hannon GJ. Maternally inherited piRNAs direct transient heterochromatin formation at active transposons during early Drosophila embryogenesis. eLife 2021; 10:e68573. [PMID: 34236313 PMCID: PMC8352587 DOI: 10.7554/elife.68573] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/19/2021] [Accepted: 07/07/2021] [Indexed: 12/12/2022] Open
Abstract
The PIWI-interacting RNA (piRNA) pathway controls transposon expression in animal germ cells, thereby ensuring genome stability over generations. In Drosophila, piRNAs are intergenerationally inherited through the maternal lineage, and this has demonstrated importance in the specification of piRNA source loci and in silencing of I- and P-elements in the germ cells of daughters. Maternally inherited Piwi protein enters somatic nuclei in early embryos prior to zygotic genome activation and persists therein for roughly half of the time required to complete embryonic development. To investigate the role of the piRNA pathway in the embryonic soma, we created a conditionally unstable Piwi protein. This enabled maternally deposited Piwi to be cleared from newly laid embryos within 30 min and well ahead of the activation of zygotic transcription. Examination of RNA and protein profiles over time, and correlation with patterns of H3K9me3 deposition, suggests a role for maternally deposited Piwi in attenuating zygotic transposon expression in somatic cells of the developing embryo. In particular, robust deposition of piRNAs targeting roo, an element whose expression is mainly restricted to embryonic development, results in the deposition of transient heterochromatic marks at active roo insertions. We hypothesize that roo, an extremely successful mobile element, may have adopted a lifestyle of expression in the embryonic soma to evade silencing in germ cells.
Affiliation(s)
- Martin H Fabry
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
| | - Federica A Falconio
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
| | - Fadwa Joud
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
| | - Emily K Lythgoe
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
| | - Benjamin Czech
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
| | - Gregory J Hannon
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
|
25
|
Comprehensive identification of transposable element insertions using multiple sequencing technologies. Nat Commun 2021; 12:3836. [PMID: 34158502 PMCID: PMC8219666 DOI: 10.1038/s41467-021-24041-8] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 07/10/2020] [Accepted: 05/27/2021] [Indexed: 02/05/2023] Open
Abstract
Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. Whereas existing methods are mostly designed for short-read data, xTea can be applied to both short-read and long-read data. Our analysis shows that xTea outperforms other short read-based methods for both germline and somatic TE insertion discovery. With long-read data, we created a catalogue of polymorphic insertions with full assembly and annotation of insertional sequences for various types of retroelements, including pseudogenes and endogenous retroviruses. Notably, we find that individual genomes have an average of nine groups of full-length L1s in centromeres, suggesting that centromeres and other highly repetitive regions such as telomeres are a significant yet unexplored source of active L1s. xTea is available at https://github.com/parklab/xTea.
|
26
|
Yu T, Huang X, Dou S, Tang X, Luo S, Theurkauf WE, Lu J, Weng Z. A benchmark and an algorithm for detecting germline transposon insertions and measuring de novo transposon insertion frequencies. Nucleic Acids Res 2021; 49:e44. [PMID: 33511407 PMCID: PMC8096211 DOI: 10.1093/nar/gkab010] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/23/2020] [Revised: 12/28/2020] [Accepted: 01/06/2021] [Indexed: 02/01/2023] Open
Abstract
Transposons are genomic parasites, and their new insertions can cause instability and spur the evolution of their host genomes. Rapid accumulation of short-read whole-genome sequencing data provides a great opportunity for studying new transposon insertions and their impacts on the host genome. Although many algorithms are available for detecting transposon insertions, the task remains challenging and existing tools are not designed for identifying de novo insertions. Here, we present a new benchmark fly dataset based on PacBio long-read sequencing and a new method TEMP2 for detecting germline insertions and measuring de novo ‘singleton’ insertion frequencies in eukaryotic genomes. TEMP2 achieves high sensitivity and precision for detecting germline insertions when compared with existing tools using both simulated data in fly and experimental data in fly and human. Furthermore, TEMP2 can accurately assess the frequencies of de novo transposon insertions even with high levels of chimeric reads in simulated datasets; such chimeric reads often occur during the construction of short-read sequencing libraries. By applying TEMP2 to published data on hybrid dysgenic flies inflicted by de-repressed P-elements, we confirmed the continuous new insertions of P-elements in dysgenic offspring before they regain piRNAs for P-element repression. TEMP2 is freely available at Github: https://github.com/weng-lab/TEMP2.
Affiliation(s)
- Tianxiong Yu
- Department of Thoracic Surgery, Clinical Translational Research Center, Shanghai Pulmonary Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Xiao Huang
- Department of Thoracic Surgery, Clinical Translational Research Center, Shanghai Pulmonary Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Shengqian Dou
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
| | - Xiaolu Tang
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
| | - Shiqi Luo
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
| | - William E Theurkauf
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Jian Lu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
| | - Zhiping Weng
- Department of Thoracic Surgery, Clinical Translational Research Center, Shanghai Pulmonary Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai 200092, China; Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
|
27
|
Chu C, Zhao B, Park PJ, Lee EA. Identification and Genotyping of Transposable Element Insertions From Genome Sequencing Data. Curr Protoc Hum Genet 2021; 107:e102. [PMID: 32662945 DOI: 10.1002/cphg.102] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022]
Abstract
Transposable element (TE) mobilization is a significant source of genomic variation and has been associated with various human diseases. The exponential growth of population-scale whole-genome sequencing and rapid innovations in long-read sequencing technologies provide unprecedented opportunities to study TE insertions and their functional impact in human health and disease. Identifying TE insertions, however, is challenging due to the repetitive nature of the TE sequences. Here, we review computational approaches to detecting and genotyping TE insertions using short- and long-read sequencing and discuss the strengths and weaknesses of different approaches. © 2020 Wiley Periodicals LLC.
Affiliation(s)
- Chong Chu
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
| | - Boxun Zhao
- Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts; Department of Pediatrics, Harvard Medical School, Boston, Massachusetts; Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Peter J Park
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts
| | - Eunjung Alice Lee
- Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts; Department of Pediatrics, Harvard Medical School, Boston, Massachusetts; Broad Institute of MIT and Harvard, Cambridge, Massachusetts
|
28
|
Parhad SS, Yu T, Zhang G, Rice NP, Weng Z, Theurkauf WE. Adaptive Evolution Targets a piRNA Precursor Transcription Network. Cell Rep 2021; 30:2672-2685.e5. [PMID: 32101744 PMCID: PMC7061269 DOI: 10.1016/j.celrep.2020.01.109] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/17/2019] [Revised: 10/23/2019] [Accepted: 01/30/2020] [Indexed: 12/18/2022] Open
Abstract
In Drosophila, transposon-silencing piRNAs are derived from heterochromatic clusters and a subset of euchromatic transposon insertions, which are bound by the Rhino-Deadlock-Cutoff complex. The HP1 homolog Rhino binds to Deadlock, which recruits TRF2 to promote non-canonical transcription from both genomic strands. Cuff function is less well understood, but this Rai1 homolog shows hallmarks of adaptive evolution, which can remodel functional interactions within host defense systems. Supporting this hypothesis, Drosophila simulans Cutoff is a dominant-negative allele when expressed in Drosophila melanogaster, in which it traps Deadlock, TRF2, and the conserved transcriptional co-repressor CtBP in stable complexes. Cutoff functions with Rhino and Deadlock to drive non-canonical transcription. In contrast, CtBP suppresses canonical transcription of transposons and promoters flanking the major germline clusters, and canonical transcription interferes with downstream non-canonical transcription and piRNA production. Adaptive evolution thus targets interactions among Cutoff, TRF2, and CtBP that balance canonical and non-canonical piRNA precursor transcription.
Affiliation(s)
- Swapnil S Parhad
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Tianxiong Yu
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA; School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Gen Zhang
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Nicholas P Rice
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA.
| | - William E Theurkauf
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA.
|
29
|
Chen P, Zhang J. Asexual Experimental Evolution of Yeast Does Not Curtail Transposable Elements. Mol Biol Evol 2021; 38:2831-2842. [PMID: 33720342 PMCID: PMC8233515 DOI: 10.1093/molbev/msab073] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023] Open
Abstract
Compared with asexual reproduction, sex facilitates the transmission of transposable elements (TEs) from one genome to another, but boosts the efficacy of selection against deleterious TEs. Thus, theoretically, it is unclear whether sex has a positive net effect on TE’s proliferation. An empirical study concluded that sex is at the root of TE’s evolutionary success because the yeast TE load was found to decrease rapidly in approximately 1,000 generations of asexual but not sexual experimental evolution. However, this finding contradicts the maintenance of TEs in natural yeast populations where sexual reproduction occurs extremely infrequently. Here, we show that the purported TE load reduction during asexual experimental evolution is likely an artifact of low genomic sequencing coverages. We observe stable TE loads in both sexual and asexual experimental evolution from multiple yeast data sets with sufficient coverages. To understand the evolutionary dynamics of yeast TEs, we turn to asexual mutation accumulation lines that have been under virtually no selection. We find that both TE transposition and excision rates per generation, but not their difference, tend to be higher in environments where yeast grows more slowly. However, the transposition rate is not significantly higher than the excision rate and the variance of the TE number among natural strains is close to its neutral expectation, suggesting that selection against TEs is at best weak in yeast. We conclude that the yeast TE load is maintained largely by a transposition–excision balance and that the influence of sex remains unclear.
Affiliation(s)
- Piaopiao Chen
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
| | - Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
|
30
|
Sohrab V, López-Díaz C, Di Pietro A, Ma LJ, Ayhan DH. TEfinder: A Bioinformatics Pipeline for Detecting New Transposable Element Insertion Events in Next-Generation Sequencing Data. Genes (Basel) 2021; 12:genes12020224. [PMID: 33557410 PMCID: PMC7914406 DOI: 10.3390/genes12020224] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/17/2020] [Revised: 01/30/2021] [Accepted: 01/31/2021] [Indexed: 11/16/2022] Open
Abstract
Transposable elements (TEs) are mobile elements capable of introducing genetic changes rapidly. Their importance has been documented in many biological processes, such as introducing genetic instability, altering patterns of gene expression, and accelerating genome evolution. Increasing appreciation of TEs has resulted in a growing number of bioinformatics software to identify insertion events. However, the application of existing tools is limited by either narrow-focused design of the package, too many dependencies on other tools, or prior knowledge required as input files that may not be readily available to all users. Here, we reported a simple pipeline, TEfinder, developed for the detection of new TE insertions with minimal software and input file dependencies. The external software requirements are BEDTools, SAMtools, and Picard. Necessary input files include the reference genome sequence in FASTA format, an alignment file from paired-end reads, existing TEs in GTF format, and a text file of TE names. We tested TEfinder among several evolving populations of Fusarium oxysporum generated through a short-term adaptation study. Our results demonstrate that this easy-to-use tool can effectively detect new TE insertion events, making it accessible and practical for TE analysis.
Affiliation(s)
- Vista Sohrab
- Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, Amherst, MA 01003, USA
| | - Cristina López-Díaz
- Departamento de Genética, Universidad de Córdoba, 14071 Córdoba, Spain
| | - Antonio Di Pietro
- Departamento de Genética, Universidad de Córdoba, 14071 Córdoba, Spain
| | - Li-Jun Ma
- Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, Amherst, MA 01003, USA
- Molecular and Cellular Biology Graduate Program, University of Massachusetts Amherst, Amherst, MA 01003, USA
| | - Dilay Hazal Ayhan
- Department of Biochemistry and Molecular Biology, University of Massachusetts Amherst, Amherst, MA 01003, USA
- Molecular and Cellular Biology Graduate Program, University of Massachusetts Amherst, Amherst, MA 01003, USA
|
31
|
Bogaerts-Márquez M, Barrón MG, Fiston-Lavier AS, Vendrell-Mir P, Castanera R, Casacuberta JM, González J. T-lex3: an accurate tool to genotype and estimate population frequencies of transposable elements using the latest short-read whole genome sequencing data. Bioinformatics 2020; 36:1191-1197. [PMID: 31580402 PMCID: PMC7703783 DOI: 10.1093/bioinformatics/btz727] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/28/2019] [Revised: 09/16/2019] [Accepted: 09/25/2019] [Indexed: 12/22/2022] Open
Abstract
Motivation Transposable elements (TEs) constitute a significant proportion of the majority of genomes sequenced to date. TEs are responsible for a considerable fraction of the genetic variation within and among species. Accurate genotyping of TEs in genomes is therefore crucial for a complete identification of the genetic differences among individuals, populations and species. Results In this work, we present a new version of T-lex, a computational pipeline that accurately genotypes and estimates the population frequencies of reference TE insertions using short-read high-throughput sequencing data. In this new version, we have re-designed the T-lex algorithm to integrate the BWA-MEM short-read aligner, which is one of the most accurate short-read mappers and can be launched on longer short-reads (e.g. reads >150 bp). We have added new filtering steps to increase the accuracy of the genotyping, and new parameters that allow the user to control both the minimum and maximum number of reads, and the minimum number of strains to genotype a TE insertion. We also showed for the first time that T-lex3 provides accurate TE calls in a plant genome. Availability and implementation To test the accuracy of T-lex3, we called 1630 individual TE insertions in Drosophila melanogaster, 1600 individual TE insertions in humans, and 3067 individual TE insertions in the rice genome. We showed that this new version of T-lex is a broadly applicable and accurate tool for genotyping and estimating TE frequencies in organisms with different genome sizes and different TE contents. T-lex3 is available at Github: https://github.com/GonzalezLab/T-lex3. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- María Bogaerts-Márquez
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Paseo Maritimo Barceloneta 37-49, Barcelona, Spain
| | - Maite G Barrón
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Paseo Maritimo Barceloneta 37-49, Barcelona, Spain
| | - Anna-Sophie Fiston-Lavier
- Institut des Sciences de l'Evolution de Montpellier (UMR 5554, CNRS-UM-IRD-EPHE), 11 Université de Montpellier, Place Eugène Bataillon, Montpellier, France
| | - Pol Vendrell-Mir
- Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Raúl Castanera
- Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Josep M Casacuberta
- Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
| | - Josefa González
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Paseo Maritimo Barceloneta 37-49, Barcelona, Spain
|
32
|
Chen X, Li D. ERVcaller: identifying polymorphic endogenous retrovirus and other transposable element insertions using whole-genome sequencing data. Bioinformatics 2020; 35:3913-3922. [PMID: 30895294 DOI: 10.1093/bioinformatics/btz205] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/11/2018] [Revised: 02/28/2019] [Accepted: 03/19/2019] [Indexed: 12/12/2022] Open
Abstract
MOTIVATION Approximately 8% of the human genome is derived from endogenous retroviruses (ERVs). In recent years, an increasing number of human diseases have been found to be associated with ERVs. However, it remains challenging to accurately detect the full spectrum of polymorphic (unfixed) ERVs using whole-genome sequencing (WGS) data. RESULTS We designed a new tool, ERVcaller, to detect and genotype transposable element (TE) insertions, including ERVs, in the human genome. We evaluated ERVcaller using both simulated and real benchmark WGS datasets. Compared to existing tools, ERVcaller consistently obtained both the highest sensitivity and precision for detecting simulated ERV and other TE insertions derived from real polymorphic TE sequences. For the WGS data from the 1000 Genomes Project, ERVcaller detected the largest number of TE insertions per sample based on consensus TE loci. By analyzing the experimentally verified TE insertions, ERVcaller had 94.0% TE detection sensitivity and 96.6% genotyping accuracy. Polymerase chain reaction and Sanger sequencing in a small sample set verified 86.7% of examined insertion statuses and 100% of examined genotypes. In conclusion, ERVcaller is capable of detecting and genotyping TE insertions using WGS data with both high sensitivity and precision. This tool can be applied broadly to other species. AVAILABILITY AND IMPLEMENTATION http://www.uvm.edu/genomics/software/ERVcaller.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xun Chen
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT, USA
| | - Dawei Li
- Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT, USA; Neuroscience, Behavior, and Health Initiative, University of Vermont, Burlington, VT, USA; Department of Computer Science, University of Vermont, Burlington, VT, USA
|
33
|
Zhang S, Pointer B, Kelleher ES. Rapid evolution of piRNA-mediated silencing of an invading transposable element was driven by abundant de novo mutations. Genome Res 2020; 30:566-575. [PMID: 32238416 PMCID: PMC7197473 DOI: 10.1101/gr.251546.119] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 04/16/2019] [Accepted: 03/24/2020] [Indexed: 11/24/2022]
Abstract
The regulation of transposable element (TE) activity by small RNAs is a ubiquitous feature of germlines. However, despite the obvious benefits to the host in terms of ensuring the production of viable gametes and maintaining the integrity of the genomes they carry, it remains controversial whether TE regulation evolves adaptively. We examined the emergence and evolutionary dynamics of repressor alleles after P-elements invaded the Drosophila melanogaster genome in the mid-twentieth century. In many animals including Drosophila, repressor alleles are produced by transpositional insertions into piRNA clusters, genomic regions encoding the Piwi-interacting RNAs (piRNAs) that regulate TEs. We discovered that ∼94% of recently collected isofemale lines in the Drosophila melanogaster Genetic Reference Panel (DGRP) contain at least one P-element insertion in a piRNA cluster, indicating that repressor alleles are produced by de novo insertion at an exceptional rate. Furthermore, in our sample of approximately 200 genomes, we uncovered no fewer than 80 unique P-element insertion alleles in at least 15 different piRNA clusters. Finally, we observe no footprint of positive selection on P-element insertions in piRNA clusters, suggesting that the rapid evolution of piRNA-mediated repression in D. melanogaster was driven primarily by mutation. Our results reveal for the first time how the unique genetic architecture of piRNA production, in which numerous piRNA clusters can encode regulatory small RNAs upon transpositional insertion, facilitates the nonadaptive rapid evolution of repression.
Affiliation(s)
- Shuo Zhang
- Department of Biology and Biochemistry, University of Houston, Houston, Texas 77204, USA
| | - Beverly Pointer
- Department of Biology and Biochemistry, University of Houston, Houston, Texas 77204, USA
| | - Erin S Kelleher
- Department of Biology and Biochemistry, University of Houston, Houston, Texas 77204, USA
|
34
|
Ellison CE, Cao W. Nanopore sequencing and Hi-C scaffolding provide insight into the evolutionary dynamics of transposable elements and piRNA production in wild strains of Drosophila melanogaster. Nucleic Acids Res 2020; 48:290-303. [PMID: 31754714 PMCID: PMC6943127 DOI: 10.1093/nar/gkz1080] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/01/2019] [Revised: 10/29/2019] [Accepted: 11/01/2019] [Indexed: 01/29/2023] Open
Abstract
Illumina sequencing has allowed for population-level surveys of transposable element (TE) polymorphism via split alignment approaches, which has provided important insight into the population dynamics of TEs. However, such approaches are not able to identify insertions of uncharacterized TEs, nor can they assemble the full sequence of inserted elements. Here, we use nanopore sequencing and Hi-C scaffolding to produce de novo genome assemblies for two wild strains of Drosophila melanogaster from the Drosophila Genetic Reference Panel (DGRP). Ovarian piRNA populations and Illumina split-read TE insertion profiles have been previously produced for both strains. We find that nanopore sequencing with Hi-C scaffolding produces highly contiguous, chromosome-length scaffolds, and we identify hundreds of TE insertions that were missed by Illumina-based methods, including a novel micropia-like element that has recently invaded the DGRP population. We also find hundreds of piRNA-producing loci that are specific to each strain. Some of these loci are created by strain-specific TE insertions, while others appear to be epigenetically controlled. Our results suggest that Illumina approaches reveal only a portion of the repetitive sequence landscape of eukaryotic genomes and that population-level resequencing using long reads is likely to provide novel insight into the evolutionary dynamics of repetitive elements.
Affiliation(s)
- Christopher E Ellison
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Weihuan Cao
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
|
35
|
Miniature inverted-repeat transposable elements (MITEs), derived insertional polymorphism as a tool of marker systems for molecular plant breeding. Mol Biol Rep 2020; 47:3155-3167. [PMID: 32162128 DOI: 10.1007/s11033-020-05365-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/18/2019] [Accepted: 02/29/2020] [Indexed: 12/20/2022]
Abstract
Plant molecular breeding is expected to give significant gains in cultivar development through development and utilization of suitable molecular marker systems for genetic diversity analysis, rapid DNA fingerprinting, identification of true hybrids, trait mapping and marker-assisted selection. Transposable elements (TEs) are the most abundant component in a genome and are being used as genetic markers in plant molecular breeding. Here, we review the highly abundant class-II DNA TEs called "miniature inverted-repeat transposable elements" (MITEs). MITEs are ubiquitous, short, non-autonomous DNA transposable elements that tend to insert into genes and genic regions, which has paved the way for the development of functional DNA marker systems in plant genomes. This review summarises the characteristics of MITEs, principles and methodologies for development of MITE-based DNA markers, bioinformatics tools and resources for plant MITE discovery and their utilization in crop improvement.
|
36
|
Jiménez-Ruiz J, Ramírez-Tejero JA, Fernández-Pozo N, Leyva-Pérez MDLO, Yan H, Rosa RDL, Belaj A, Montes E, Rodríguez-Ariza MO, Navarro F, Barroso JB, Beuzón CR, Valpuesta V, Bombarely A, Luque F. Transposon activation is a major driver in the genome evolution of cultivated olive trees (Olea europaea L.). Plant Genome 2020; 13:e20010. [PMID: 33016633 DOI: 10.1002/tpg2.20010] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 07/26/2019] [Accepted: 01/15/2020] [Indexed: 05/25/2023]
Abstract
The primary domestication of olive (Olea europaea L.) in the Levant dates back to the Neolithic period, around 6,000-5,500 BC, as some archeological remains attest. Cultivated olive trees are reproduced clonally, with sexual crosses being the sporadic events that drive the development of new varieties. To determine the genomic changes that have occurred in a modern olive cultivar, the genome of 'Picual', one of the most popular olive varieties, was sequenced. An additional 40 cultivated and 10 wild accessions were re-sequenced to elucidate the evolution of the olive genome during the domestication process. The genome of the 'Picual' cultivar contains 79,667 gene models, of which 78,079 are protein-coding genes and 1,588 are tRNAs. Population analyses support two independent events in olive domestication, including a possible early genetic bottleneck. Despite genetic bottlenecks, cultivated accessions showed high genetic diversity driven by the activation of transposable elements (TEs). High expression of TE genes was observed in presently cultivated olives, which suggests that TEs are currently active in domesticated olives. Several TE families expanded in the last 5,000 to 6,000 years and produced insertions near genes that may have been involved in traits selected during domestication, such as reproduction, photosynthesis, seed development, and oil production. Therefore, great genetic variability has been found in cultivated olive as a result of a significant activation of TEs during the domestication process.
Affiliation(s)
- Jaime Jiménez-Ruiz
  - Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
- Jorge A Ramírez-Tejero
  - Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
- Noé Fernández-Pozo
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Marburg, Germany
- María de la O Leyva-Pérez
  - Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
- Haidong Yan
  - School of Plants and Environmental Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
- Raúl de la Rosa
  - Centro de Investigación y Formación Agraria de Alameda del Obispo, Instituto de Investigación y Formación Agraria y Pesquera (IFAPA), Córdoba, Spain
- Angjelina Belaj
  - Centro de Investigación y Formación Agraria de Alameda del Obispo, Instituto de Investigación y Formación Agraria y Pesquera (IFAPA), Córdoba, Spain
- Eva Montes
  - Instituto Universitario de Investigación en Arqueología Ibérica, University of Jaén, Jaén, 23071, Spain
- Mª Oliva Rodríguez-Ariza
  - Instituto Universitario de Investigación en Arqueología Ibérica, University of Jaén, Jaén, 23071, Spain
- Francisco Navarro
  - Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
- Juan Bautista Barroso
  - Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
- Carmen R Beuzón
  - Departamento de Biología Celular, Genética y Fisiología, Facultad de Ciencias, Instituto de Hortofruticultura Subtropical y Mediterránea, Universidad de Málaga - Consejo Superior de Investigaciones Científicas, Málaga, Spain
- Victoriano Valpuesta
  - Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Instituto de Hortofruticultura Subtropical y Mediterránea, Universidad de Málaga - Consejo Superior de Investigaciones Científicas, Málaga, Spain
- Aureliano Bombarely
  - School of Plants and Environmental Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
  - Present address: Department of Bioscience, Università degli Studi di Milano, Milan, 20133, Italy
- Francisco Luque
  - Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
37
Luo S, Zhang H, Duan Y, Yao X, Clark AG, Lu J. The evolutionary arms race between transposable elements and piRNAs in Drosophila melanogaster. BMC Evol Biol 2020; 20:14. [PMID: 31992188 PMCID: PMC6988346 DOI: 10.1186/s12862-020-1580-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 07/21/2019] [Accepted: 01/13/2020] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND The piwi-interacting RNAs (piRNAs) are small non-coding RNAs that specifically repress transposable elements (TEs) in the germline of Drosophila. Despite our expanding understanding of TE:piRNA interaction, whether there is an evolutionary arms race between TEs and piRNAs was unclear. RESULTS Here, we studied the population genomics of TEs and piRNAs in the worldwide strains of D. melanogaster. By conducting a correlation analysis between TE contents and the abundance of piRNAs from ovaries of representative strains of D. melanogaster, we find positive correlations between TEs and piRNAs in six TE families. Our simulations further highlight that TE activities and the strength of purifying selection against TEs are important factors shaping the interactions between TEs and piRNAs. Our studies also suggest that the de novo generation of piRNAs is an important mechanism to repress the newly invaded TEs. CONCLUSIONS Our results revealed the existence of an evolutionary arms race between the copy numbers of TEs and the abundance of antisense piRNAs at the population level. Although the interactions between TEs and piRNAs are complex and many factors should be considered to impact their interaction dynamics, our results suggest the emergence, repression specificity and strength of piRNAs on TEs should be considered in studying the landscapes of TE insertions in Drosophila. These results deepen our understanding of the interactions between piRNAs and TEs, and also provide novel insights into the nature of genomic conflicts of other forms.
Affiliation(s)
- Shiqi Luo
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, College of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China
  - College of Plant Protection, Beijing Advanced Innovation Center for Food Nutrition and Human Health, China Agricultural University, Beijing, 100193, China
- Hong Zhang
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, College of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China
- Yuange Duan
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, College of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- Xinmin Yao
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, College of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China
- Andrew G Clark
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA
- Jian Lu
  - State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, College of Life Sciences and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, 100871, China
38
Vendrell-Mir P, Barteri F, Merenciano M, González J, Casacuberta JM, Castanera R. A benchmark of transposon insertion detection tools using real data. Mob DNA 2019; 10:53. [PMID: 31892957 PMCID: PMC6937713 DOI: 10.1186/s13100-019-0197-9] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 06/28/2019] [Accepted: 12/17/2019] [Indexed: 02/01/2023] Open
Abstract
Background Transposable elements (TEs) are an important source of genomic variability in eukaryotic genomes. Their activity impacts genome architecture and gene expression and can lead to drastic phenotypic changes. Therefore, identifying TE polymorphisms is key to better understanding the link between genotype and phenotype. However, most genotype-to-phenotype analyses have concentrated on single nucleotide polymorphisms, as they are easier to detect reliably using short-read data. Many bioinformatic tools have been developed to identify transposon insertions from resequencing data using short reads. Nevertheless, the performance of most of these tools has been tested using simulated insertions, which do not accurately reproduce the complexity of natural insertions. Results We have overcome this limitation by building a dataset of insertions from the comparison of two high-quality rice genomes, followed by extensive manual curation. This dataset contains validated insertions of two very different types of TEs, LTR-retrotransposons and MITEs. Using this dataset, we have benchmarked the sensitivity and precision of 12 commonly used tools, and our results suggest that in general their sensitivity was previously overestimated when using simulated data. Our results also show that increasing coverage leads to better sensitivity but at a cost in precision. Moreover, we found important differences in tool performance, with some tools performing better on a specific type of TE. We have also used two sets of experimentally validated insertions in Drosophila and humans and show that this trend is maintained in genomes of different size and complexity. Conclusions We discuss the possible choice of tools depending on the goals of the study and show that the appropriate combination of tools could be an option for most approaches, increasing the sensitivity while maintaining good precision.
Affiliation(s)
- Pol Vendrell-Mir
  - Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Bellaterra, 08193 Barcelona, Spain
- Fabio Barteri
  - Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Bellaterra, 08193 Barcelona, Spain
- Miriam Merenciano
  - Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Passeig Maritim Barceloneta 37-49, 08003 Barcelona, Spain
- Josefa González
  - Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Passeig Maritim Barceloneta 37-49, 08003 Barcelona, Spain
- Josep M Casacuberta
  - Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Bellaterra, 08193 Barcelona, Spain
- Raúl Castanera
  - Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Bellaterra, 08193 Barcelona, Spain
39
The piRNA Response to Retroviral Invasion of the Koala Genome. Cell 2019; 179:632-643.e12. [PMID: 31607510 DOI: 10.1016/j.cell.2019.09.002] [Citation(s) in RCA: 72] [Impact Index Per Article: 12.0] [Received: 05/16/2019] [Revised: 07/19/2019] [Accepted: 08/30/2019] [Indexed: 12/20/2022]
Abstract
Antisense Piwi-interacting RNAs (piRNAs) guide silencing of established transposons during germline development, and sense piRNAs drive ping-pong amplification of the antisense pool, but how the germline responds to genome invasion is not understood. The KoRV-A gammaretrovirus infects the soma and germline and is sweeping through wild koalas by a combination of horizontal and vertical transfer, allowing direct analysis of retroviral invasion of the germline genome. Gammaretroviruses produce spliced Env mRNAs and unspliced transcripts encoding Gag, Pol, and the viral genome, but KoRV-A piRNAs are almost exclusively derived from unspliced genomic transcripts and are strongly sense-strand biased. Significantly, selective piRNA processing of unspliced proviral transcripts is conserved from insects to placental mammals. We speculate that bypassed splicing generates a conserved molecular pattern that directs proviral genomic transcripts to the piRNA biogenesis machinery and that this "innate" piRNA response suppresses transposition until antisense piRNAs are produced, establishing sequence-specific adaptive immunity.
40
Rajaby R, Sung WK. TranSurVeyor: an improved database-free algorithm for finding non-reference transpositions in high-throughput sequencing data. Nucleic Acids Res 2019; 46:e122. [PMID: 30137425 PMCID: PMC6237741 DOI: 10.1093/nar/gky685] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/09/2018] [Accepted: 07/19/2018] [Indexed: 01/21/2023] Open
Abstract
Transpositions transfer DNA segments between different loci within a genome; in particular, when a transposition is found in a sample but not in a reference genome, it is called a non-reference transposition. They are important structural variations that have clinical impact. Transpositions can be called by analyzing second generation high-throughput sequencing datasets. Current methods follow either a database-based or a database-free approach. Database-based methods require a database of transposable elements. Some of them have good specificity; however this approach cannot detect novel transpositions, and it requires a good database of transposable elements, which is not yet available for many species. Database-free methods perform de novo calling of transpositions, but their accuracy is low. We observe that this is due to the misalignment of the reads; since reads are short and the human genome has many repeats, false alignments create false positive predictions while missing alignments reduce the true positive rate. This paper proposes new techniques to improve database-free non-reference transposition calling: first, we propose a realignment strategy called one-end remapping that corrects the alignments of reads in interspersed repeats; second, we propose a SNV-aware filter that removes some incorrectly aligned reads. By combining these two techniques and other techniques like clustering and positive-to-negative ratio filter, our proposed transposition caller TranSurVeyor shows at least 3.1-fold improvement in terms of F1-score over existing database-free methods. More importantly, even though TranSurVeyor does not use databases of prior information, its performance is at least as good as existing database-based methods such as MELT, Mobster and Retroseq. We also illustrate that TranSurVeyor can discover transpositions that are not known in the current database.
Affiliation(s)
- Ramesh Rajaby
  - School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore; NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, 28 Medical Drive, 117456, Singapore
- Wing-Kin Sung
  - School of Computing, National University of Singapore, 13 Computing Drive, 117417, Singapore; Genome Institute of Singapore, 60 Biopolis Street, Genome, 138672, Singapore
41
Bourgeois Y, Boissinot S. On the Population Dynamics of Junk: A Review on the Population Genomics of Transposable Elements. Genes (Basel) 2019; 10:419. [PMID: 31151307 PMCID: PMC6627506 DOI: 10.3390/genes10060419] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 04/04/2019] [Revised: 05/05/2019] [Accepted: 05/21/2019] [Indexed: 01/18/2023] Open
Abstract
Transposable elements (TEs) play an important role in shaping genomic organization and structure, and may cause dramatic changes in phenotypes. Despite the genetic load they may impose on their host and their importance in microevolutionary processes such as adaptation and speciation, the number of population genetics studies focused on TEs has been rather limited so far compared to single nucleotide polymorphisms (SNPs). Here, we review the current knowledge about the dynamics of transposable elements at recent evolutionary time scales, and discuss the mechanisms that condition their abundance and frequency. We first discuss non-adaptive mechanisms such as purifying selection and the variable rates of transposition and elimination, and then focus on positive and balancing selection, to finally conclude on the potential role of TEs in causing genomic incompatibilities and eventually speciation. We also suggest possible ways to better model TEs dynamics in a population genomics context by incorporating recent advances in TEs into the rich information provided by SNPs about the demography, selection, and intrinsic properties of genomes.
Affiliation(s)
- Yann Bourgeois
  - New York University Abu Dhabi, P.O. 129188, Saadiyat Island, Abu Dhabi, United Arab Emirates
- Stéphane Boissinot
  - New York University Abu Dhabi, P.O. 129188, Saadiyat Island, Abu Dhabi, United Arab Emirates
42
Lerat E, Casacuberta J, Chaparro C, Vieira C. On the Importance to Acknowledge Transposable Elements in Epigenomic Analyses. Genes (Basel) 2019; 10:258. [PMID: 30935103 PMCID: PMC6523952 DOI: 10.3390/genes10040258] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/04/2019] [Revised: 03/27/2019] [Accepted: 03/27/2019] [Indexed: 12/21/2022] Open
Abstract
Eukaryotic genomes comprise a large proportion of repeated sequences, an important fraction of which are transposable elements (TEs). TEs are mobile elements that have a significant impact on genome evolution and on gene functioning. Although some TE insertions could provide adaptive advantages to species, transposition is a highly mutagenic event that has to be tightly controlled to ensure its viability. Genomes have evolved sophisticated mechanisms to control TE activity, the most important being epigenetic silencing. However, the epigenetic control of TEs can also affect genes located nearby that can become epigenetically regulated. It has been proposed that the combination of TE mobilization and the induced changes in the epigenetic landscape could allow a rapid phenotypic adaptation to global environmental changes. In this review, we argue the crucial need to take into account the repeated part of genomes when studying the global impact of epigenetic modifications on an organism. We emphasize more particularly why it is important to carefully consider TEs and what bioinformatic tools can be used to do so.
Affiliation(s)
- Emmanuelle Lerat
  - CNRS, Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, UMR 5558, F-69622 Villeurbanne, France
- Josep Casacuberta
  - Center for Research in Agricultural Genomics, CRAG (CSIC-IRTA-UAB-UB), Campus UAB, Cerdanyola del Vallès, 08193 Barcelona, Spain
- Cristian Chaparro
  - CNRS, IHPE UMR 5244, University of Perpignan Via Domitia, IFREMER, University Montpellier, F-66860 Perpignan, France
- Cristina Vieira
  - CNRS, Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, UMR 5558, F-69622 Villeurbanne, France
43
Liu Y, El-Kassaby YA. Novel Insights into Plant Genome Evolution and Adaptation as Revealed through Transposable Elements and Non-Coding RNAs in Conifers. Genes (Basel) 2019; 10:228. [PMID: 30889931 PMCID: PMC6470726 DOI: 10.3390/genes10030228] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/13/2019] [Revised: 03/08/2019] [Accepted: 03/11/2019] [Indexed: 01/03/2023] Open
Abstract
Plant genomes are punctuated by repeated bouts of proliferation of transposable elements (TEs), and these mobile bursts are followed by silencing and decay of most of the newly inserted elements. As such, plant genomes reflect TE-related genome expansion and shrinkage. In general, these genome activities involve two mechanisms: small RNA-mediated epigenetic repression and long-term mutational decay and deletion, that is, genome-purging. Furthermore, the spatial relationships between TE insertions and genes are an important force in shaping gene regulatory networks, their downstream metabolic and physiological outputs, and thus their phenotypes. Such cascading regulations finally set up a fitness differential among individuals. This brief review demonstrates factual evidence that unifies most updated conceptual frameworks covering genome size, architecture, epigenetic reprogramming, and gene expression. It aims to give an overview of the impact that TEs may have on genome and adaptive evolution and to provide novel insights into addressing possible causes and consequences of intimidating genome sizes (20-30 Gb) in a taxonomic group, conifers.
Affiliation(s)
- Yang Liu
  - Department of Forest and Conservation Sciences, The University of British Columbia, 2424 Main Mall, Vancouver, BC V6T 1Z4, Canada
- Yousry A El-Kassaby
  - Department of Forest and Conservation Sciences, The University of British Columbia, 2424 Main Mall, Vancouver, BC V6T 1Z4, Canada
44
Lerat E, Goubert C, Guirao‐Rico S, Merenciano M, Dufour A, Vieira C, González J. Population-specific dynamics and selection patterns of transposable element insertions in European natural populations. Mol Ecol 2019; 28:1506-1522. [PMID: 30506554 PMCID: PMC6849870 DOI: 10.1111/mec.14963] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 07/30/2018] [Revised: 10/30/2018] [Accepted: 11/05/2018] [Indexed: 01/02/2023]
Abstract
Transposable elements (TEs) are ubiquitous sequences in the genomes of virtually all species. While TEs have been investigated for several decades, only recently have we had the opportunity to study their genome-wide population dynamics. Most studies so far have been restricted either to the analysis of the insertions annotated in the reference genome or to the analysis of a limited number of populations. Taking advantage of the European Drosophila population genomics consortium (DrosEU) sequencing data set, we have identified and measured the dynamics of TEs in a large sample of European Drosophila melanogaster natural populations. We showed that the mobilome landscape is population-specific and highly diverse depending on the TE family. In contrast with previous studies based on SNP variants, no geographical structure was observed for TE abundance or TE divergence in European populations. We further identified de novo individual insertions using two available programs and, as expected, most of the insertions were present at low frequencies. Nevertheless, we identified a subset of TEs present at high frequencies and located in genomic regions with a high recombination rate. These TEs are candidates for being the target of positive selection, although neutral processes should be ruled out before reaching any conclusion on the type of selection acting on them. Finally, parallel patterns of association between the frequency of TE insertions and several geographical and temporal variables were found between European and North American populations, suggesting that TEs may be implicated in the adaptation of populations across continents.
Affiliation(s)
- Emmanuelle Lerat
  - Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Clément Goubert
  - Molecular Biology and Genetics, Cornell University, Ithaca, New York
- Sara Guirao‐Rico
  - Institute of Evolutionary Biology (CSIC‐Universitat Pompeu Fabra), Barcelona, Spain
- Miriam Merenciano
  - Institute of Evolutionary Biology (CSIC‐Universitat Pompeu Fabra), Barcelona, Spain
- Anne‐Béatrice Dufour
  - Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Cristina Vieira
  - Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Josefa González
  - Institute of Evolutionary Biology (CSIC‐Universitat Pompeu Fabra), Barcelona, Spain
45
Adrion JR, Begun DJ, Hahn MW. Patterns of transposable element variation and clinality in Drosophila. Mol Ecol 2019; 28:1523-1536. [DOI: 10.1111/mec.14961] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/29/2018] [Revised: 11/14/2018] [Accepted: 11/15/2018] [Indexed: 01/02/2023]
Affiliation(s)
- Jeffrey R. Adrion
  - Department of Biology, University of Oregon, Eugene, Oregon
  - Department of Biology, Indiana University, Bloomington, Indiana
- David J. Begun
  - Department of Evolution and Ecology, University of California Davis, Davis, California
- Matthew W. Hahn
  - Department of Biology, Indiana University, Bloomington, Indiana
  - Department of Computer Science, Indiana University, Bloomington, Indiana
46
Gramazio P, Yan H, Hasing T, Vilanova S, Prohens J, Bombarely A. Whole-Genome Resequencing of Seven Eggplant (Solanum melongena) and One Wild Relative (S. incanum) Accessions Provides New Insights and Breeding Tools for Eggplant Enhancement. Front Plant Sci 2019; 10:1220. [PMID: 31649694 PMCID: PMC6791922 DOI: 10.3389/fpls.2019.01220] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/23/2019] [Accepted: 09/04/2019] [Indexed: 05/20/2023]
Abstract
Whole-genome resequencing provides information of great relevance for crop genetics, evolution, and breeding. Here, we present the first whole-genome resequencing study using seven eggplant (Solanum melongena) and one wild relative (Solanum incanum) accessions. These eight accessions were selected for displaying high phenotypic and genetic diversity and for being the founder parents of an eggplant multiparent advanced generation intercross population. By resequencing at an average depth of 19.8× and comparing to the high-quality reference genome "67/3", over 10 million highly reliable polymorphisms were discovered, of which over 9 million (84.5%) were single nucleotide polymorphisms and more than 700,000 (6.5%) were InDels. However, while the variants identified for the S. melongena accessions ranged from 0.8 to 1.3 million, over 9 million were detected for the wild S. incanum. This confirms the narrow genetic diversity of the domesticated eggplant and suggests that introgression breeding using wild relatives can efficiently contribute to broadening the genetic basis of this crop. Differences were observed among accessions for the enrichment in genes regulating important biological processes. By analyzing the distribution of the variants, we identified the potential footprints of old introgressions from wild relatives that can help to unravel the unclear domestication and breeding history. The comprehensive annotation of these eight genomes and the information provided in this study represent a landmark in eggplant genomics and allow the development of tools for eggplant genetics and breeding.
Affiliation(s)
- Pietro Gramazio
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan
  - Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Valencia, Spain
  - *Correspondence: Pietro Gramazio
- Haidong Yan
  - School of Plant and Environmental Sciences (SPES), Virginia Tech, Blacksburg, VA, United States
- Tomas Hasing
  - School of Plant and Environmental Sciences (SPES), Virginia Tech, Blacksburg, VA, United States
- Santiago Vilanova
  - Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Valencia, Spain
- Jaime Prohens
  - Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Valencia, Spain
- Aureliano Bombarely
  - School of Plant and Environmental Sciences (SPES), Virginia Tech, Blacksburg, VA, United States
  - Department of Biosciences, Università degli Studi di Milano, Milan, Italy
47
Bae J, Lee KW, Islam MN, Yim HS, Park H, Rho M. iMGEins: detecting novel mobile genetic elements inserted in individual genomes. BMC Genomics 2018; 19:944. [PMID: 30563451 PMCID: PMC6299635 DOI: 10.1186/s12864-018-5290-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/03/2018] [Accepted: 11/20/2018] [Indexed: 11/10/2022] Open
Abstract
Background Recent advances in sequencing technology have allowed us to investigate personal genomes to find structural variations, which have been studied extensively to identify their association with the physiology of diseases such as cancer. In particular, mobile genetic elements (MGEs) are one of the major constituents of human genomes, and cause genome instability by insertion, mutation, and rearrangement. Result We have developed a new program, iMGEins, to identify such novel MGEs by using sequencing reads of individual genomes, and to explore the breakpoints with the supporting reads and MGEs detected. iMGEins is the first MGE detection program that integrates three algorithmic components: discordant read-pair mapping, split-read mapping, and insertion sequence assembly. Our evaluation results showed its outstanding performance in detecting novel MGEs from simulated genomes, as well as real personal genomes. In detail, the average recall and precision rates of iMGEins are 96.67% and 100%, respectively, which are the highest among the programs compared. In the testing with real human genomes of the NA12878 sample, iMGEins shows the highest accuracy in detecting MGEs within 20 bp proximity of the breakpoints annotated. Conclusion In order to study the dynamics of MGEs in individual genomes, iMGEins was developed to accurately detect breakpoints and report inserted MGEs. Compared with other programs, iMGEins has the valuable features of identifying novel MGEs and assembling the inserted MGEs.
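As a rough illustration of how recall and precision can be scored when predicted insertion breakpoints are matched to annotated ones within a fixed window (20 bp in the evaluation described above), the following Python sketch may help. The helper `score_calls`, the greedy nearest-match rule, and all coordinates are hypothetical and chosen only for illustration; this is not the iMGEins evaluation code.

```python
from typing import List, Tuple

Site = Tuple[str, int]  # (chromosome, breakpoint position); hypothetical representation


def score_calls(predicted: List[Site], truth: List[Site], window: int = 20) -> Tuple[float, float]:
    """Return (recall, precision); each annotated site matches at most one prediction."""
    unused = list(predicted)  # predictions not yet assigned to an annotated site
    matched = 0
    for chrom, pos in truth:
        # Predictions on the same chromosome within `window` bp of the annotated breakpoint
        candidates = [p for p in unused if p[0] == chrom and abs(p[1] - pos) <= window]
        if candidates:
            # Greedy choice: take the nearest candidate and consume it
            best = min(candidates, key=lambda p: abs(p[1] - pos))
            unused.remove(best)
            matched += 1
    recall = matched / len(truth) if truth else 0.0
    precision = matched / len(predicted) if predicted else 0.0
    return recall, precision


# Made-up coordinates: two of three annotated sites are recovered, one call is spurious
truth = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
predicted = [("chr1", 1012), ("chr2", 295), ("chr2", 9000)]
recall, precision = score_calls(predicted, truth)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

Greedy nearest matching is a simplification; when breakpoints are densely clustered, an optimal one-to-one assignment could give slightly different counts.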
Affiliation(s)
- Junwoo Bae
  - Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea
- Kyeong Won Lee
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea
- Mohammad Nazrul Islam
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea; Department of Marine Biotechnology, Korea University of Science and Technology, Daejeon, Korea; Department of Biotechnology, Sher-e-Bangla Agricultural University, Dhaka, 1207, Bangladesh
- Hyung-Soon Yim
  - Marine Biotechnology Research Center, Korea Institute of Ocean Science and Technology, Ansan, Korea; Department of Marine Biotechnology, Korea University of Science and Technology, Daejeon, Korea
- Heejin Park
  - Department of Computer Science and Engineering, Hanyang University, Seoul, Korea; Department of Biomedical Informatics, Hanyang University, Seoul, Korea
- Mina Rho
  - Department of Computer Science and Engineering, Hanyang University, Seoul, Korea; Department of Biomedical Informatics, Hanyang University, Seoul, Korea
|
48
|
Moon S, Cassani M, Lin YA, Wang L, Dou K, Zhang ZZ. A Robust Transposon-Endogenizing Response from Germline Stem Cells. Dev Cell 2018; 47:660-671.e3. [PMID: 30393075 DOI: 10.1016/j.devcel.2018.10.011] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 04/23/2018] [Revised: 08/06/2018] [Accepted: 10/05/2018] [Indexed: 01/30/2023]
Abstract
The heavy occupancy of transposons in the genome implies that existing organisms have survived multiple, independent rounds of transposon invasions. However, how and which host cell types survive the initial wave of transposon invasion remain unclear. We show that the germline stem cells can initiate a robust adaptive response that rapidly endogenizes invading P element transposons by activating the DNA damage checkpoint and piRNA production. We find that temperature modulates the P element activity in germline stem cells, establishing a powerful tool to trigger transposon hyper-activation. Facing vigorous invasion, Drosophila first shut down oogenesis and induce selective apoptosis. Interestingly, a robust adaptive response occurs in ovarian stem cells through activation of the DNA damage checkpoint. Within 4 days, the hosts amplify P element-silencing piRNAs, repair DNA damage, subdue the transposon, and reinitiate oogenesis. We propose that this robust adaptive response can bestow upon organisms the ability to survive recurrent transposon invasions throughout evolution.
Affiliation(s)
- Sungjin Moon
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD 21218, USA
- Madeline Cassani
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD 21218, USA
- Yu An Lin
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD 21218, USA
- Lu Wang
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD 21218, USA
- Kun Dou
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD 21218, USA
- Zz Zhao Zhang
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD 21218, USA
|
49
|
Manee MM, Jackson J, Bergman CM. Conserved Noncoding Elements Influence the Transposable Element Landscape in Drosophila. Genome Biol Evol 2018; 10:1533-1545. [PMID: 29850787 PMCID: PMC6007792 DOI: 10.1093/gbe/evy104] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Accepted: 05/22/2018] [Indexed: 12/15/2022] Open
Abstract
Highly conserved noncoding elements (CNEs) constitute a significant proportion of the genomes of multicellular eukaryotes. The function of most CNEs remains elusive, but growing evidence indicates they are under some form of purifying selection. Noncoding regions in many species also harbor large numbers of transposable element (TE) insertions, which are typically lineage specific and depleted in exons because of their deleterious effects on gene function or expression. However, it is currently unknown whether the landscape of TE insertions in noncoding regions is random or influenced by purifying selection on CNEs. Here, we combine comparative and population genomic data in Drosophila melanogaster to show that the abundance of TE insertions in intronic and intergenic CNEs is reduced relative to random expectation, supporting the idea that selective constraints on CNEs eliminate a proportion of TE insertions in noncoding regions. However, we find no evidence for differences in the allele frequency spectra for polymorphic TE insertions in CNEs versus those in unconstrained spacer regions, suggesting that the distribution of fitness effects acting on observable TE insertions is similar across different functional compartments in noncoding DNA. Our results provide evidence that selective constraints on CNEs contribute to shaping the landscape of TE insertion in eukaryotic genomes, and provide further evidence that CNEs are indeed functionally constrained and not simply mutational cold spots.
Affiliation(s)
- Manee M Manee
- Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom; National Center for Biotechnology, King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia; Center of Excellence for Genomics (CEG), King Abdulaziz City for Science and Technology, Riyadh, Saudi Arabia
- John Jackson
- Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom; Department of Animal and Plant Sciences, University of Sheffield, Sheffield, United Kingdom
- Casey M Bergman
- Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom; Department of Genetics, University of Georgia, Athens, GA; Institute of Bioinformatics, University of Georgia, Athens, GA
|
50
|
Stritt C, Gordon SP, Wicker T, Vogel JP, Roulin AC. Recent Activity in Expanding Populations and Purifying Selection Have Shaped Transposable Element Landscapes across Natural Accessions of the Mediterranean Grass Brachypodium distachyon. Genome Biol Evol 2018; 10:304-318. [PMID: 29281015 PMCID: PMC5786231 DOI: 10.1093/gbe/evx276] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Accepted: 12/20/2017] [Indexed: 01/05/2023] Open
Abstract
Transposable element (TE) activity has emerged as a major cause of variation in genome size and structure among species. To what extent TEs contribute to genetic variation and divergence within species, however, is much less clear, mainly because population genomic data have so far only been available for the classical model organisms. In this study, we use the annual Mediterranean grass Brachypodium distachyon to investigate TE dynamics in natural populations. Using whole-genome sequencing data for 53 natural accessions, we identified more than 5,400 TE polymorphisms across the studied genomes. We found, first, that while population bottlenecks and expansions have shaped genetic diversity in B. distachyon, these events did not lead to lineage-specific activations of TE families, as observed in other species. Instead, the same families have been active across the species range and TE activity is homogeneous across populations, indicating the presence of conserved regulatory mechanisms. Second, almost half of the TE insertion polymorphisms are accession-specific, most likely because of recent activity in expanding populations and the action of purifying selection. And finally, although TE insertion polymorphisms are underrepresented in and around genes, more than 1,000 of them occur in genic regions and could thus contribute to functional divergence. Our study shows that while TEs in B. distachyon are “well-behaved” compared with TEs in other species with larger genomes, they are an abundant source of lineage-specific genetic variation and may play an important role in population divergence and adaptation.
Affiliation(s)
- Christoph Stritt
- Institute for Plant and Microbial Biology, University of Zurich, Switzerland
- Sean P Gordon
- DOE Joint Genome Institute, Walnut Creek, California
- Thomas Wicker
- Institute for Plant and Microbial Biology, University of Zurich, Switzerland
- John P Vogel
- DOE Joint Genome Institute, Walnut Creek, California
- Anne C Roulin
- Institute for Plant and Microbial Biology, University of Zurich, Switzerland
|