1
Huang Y, Sahu SK, Liu X. Deciphering recent transposition patterns in plants through comparison of 811 genome assemblies. Plant Biotechnology Journal 2025; 23:1121-1132. [PMID: 39791953 PMCID: PMC11933835 DOI: 10.1111/pbi.14570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 10/25/2024] [Accepted: 12/23/2024] [Indexed: 01/12/2025]
Abstract
Transposable elements (TEs) are significant drivers of genome evolution, yet their recent dynamics and impacts within and among species, as well as the roles of host genes and non-coding RNAs in the transposition process, remain elusive. With advancements in large-scale pan-genome sequencing and the development of open data sharing, large-scale comparative genomics studies have become feasible. Here, we performed complete de novo TE annotations and identified active TEs in 310 plant genome assemblies across 119 species and seven crop populations. Using 811 high-quality genomes, we detected 13 844 553 TE-induced structural variants (TE-SVs), providing unprecedented resolution in delineating recent TE activities. Our integrative analysis revealed a mutual evolutionary relationship between TEs and host genomes. On one hand, host genes and ncRNAs are involved in the transposition process, as evidenced by their colocalization and coactivation with TEs, and may play a role in chromatin regulation. On the other hand, TEs drive genetic innovation by promoting the duplication of host genes and inserting into regulatory regions. Moreover, genes influenced by active TEs are linked to plant growth, nutrient absorption, storage metabolism and environmental adaptation, aiding in crop domestication and adaptation. This TE dynamics atlas not only reveals evolutionary and functional features linked to transposition activity but also highlights the role of TEs in crop domestication and adaptation, paving the way for future exploration of TE-mediated genome evolution and crop improvement strategies.
Affiliation(s)
- Yan Huang
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
  - State Key Laboratory of Agricultural Genomics, BGI Research, Shenzhen, China
  - BGI Research Beijing, BGI Research, Beijing, China
- Sunil Kumar Sahu
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
  - State Key Laboratory of Agricultural Genomics, BGI Research, Shenzhen, China
- Xin Liu
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
  - State Key Laboratory of Agricultural Genomics, BGI Research, Shenzhen, China
  - BGI Research Beijing, BGI Research, Beijing, China
2
Groza C, Chen X, Wheeler TJ, Bourque G, Goubert C. A unified framework to analyze transposable element insertion polymorphisms using graph genomes. Nat Commun 2024; 15:8915. [PMID: 39414821 PMCID: PMC11484939 DOI: 10.1038/s41467-024-53294-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 10/02/2024] [Indexed: 10/18/2024] Open
Abstract
Transposable elements are ubiquitous mobile DNA sequences generating insertion polymorphisms, contributing to genomic diversity. We present GraffiTE, a flexible pipeline to analyze polymorphic mobile element insertions. By integrating state-of-the-art structural variant detection algorithms and graph genomes, GraffiTE identifies polymorphic mobile elements from genomic assemblies or long-read sequencing data, and genotypes these variants using short or long read sets. Benchmarking on simulated and real datasets reports high precision and recall rates. GraffiTE is designed to allow non-expert users to perform comprehensive analyses, including in models with limited transposable element knowledge, and is compatible with various sequencing technologies. Here, we demonstrate the versatility of GraffiTE by analyzing human, Drosophila melanogaster, maize, and Cannabis sativa pangenome data. These analyses reveal the landscapes of polymorphic mobile elements and their frequency variations across individuals, strains, and cultivars.
Affiliation(s)
- Cristian Groza
  - Quantitative Life Sciences, McGill University, Montréal, QC, Canada
- Xun Chen
  - Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan
- Travis J Wheeler
  - R. Ken Coit College of Pharmacy, University of Arizona, Tucson, AZ, USA
- Guillaume Bourque
  - Institute for the Advanced Study of Human Biology (ASHBi), Kyoto University, Kyoto, Japan
  - Canadian Centre for Computational Genomics, McGill University, Montréal, QC, Canada
  - Victor Phillip Dahdaleh Institute of Genomic Medicine at McGill University, Montréal, QC, Canada
  - Human Genetics, McGill University, Montréal, QC, Canada
- Clément Goubert
  - Human Genetics, McGill University, Montréal, QC, Canada
  - R. Ken Coit College of Pharmacy, University of Arizona, Tucson, AZ, USA
3
Xu D, Zhang X, Yuan X, Han H, Xue Y, Guo X. Hazardous risk of antibiotic resistance genes: Host occurrence, distribution, mobility and vertical transmission from different environments to corn silage. Environmental Pollution (Barking, Essex: 1987) 2023; 338:122671. [PMID: 37788797 DOI: 10.1016/j.envpol.2023.122671] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/19/2023] [Revised: 09/25/2023] [Accepted: 09/30/2023] [Indexed: 10/05/2023]
Abstract
Antibiotic resistance genes (ARGs) are emerging contaminants that pose a considerable risk to public health. This study employed metagenomics to decipher the potential risk of the antibiotic resistome and its vertical transfer to ensiled whole-crop corn silage harvested from six climate zones in China, defined by the Köppen-Geiger climate classification: (1) warm temperate-fully humid-hot summer (Cfa), (2) arid-desert-cold arid (BWk), (3) snow-desert-cold summer (Dwc), (4) snow-desert-hot summer (Dwa), (5) arid-steppe-cold arid (BSk), and (6) equatorial-desert (Aw). The findings demonstrate a high diversity of ARGs, related to the drug classes tetracycline, ciprofloxacin, lincosamide, fosfomycin, and beta-lactam. Resistome variations are mostly related to variations in microbial composition and fermentation characteristics of the silages from different climate zones, which are indirectly influenced by environmental conditions. The dominant ARGs in corn silage were tetM, acrA, H-NS, lnuA, emrR, and KpnG, which were primarily hosted by Klebsiella and Lactobacilli. There were five high-risk ARGs (tetM, bacA, SHV-1, dfrA17, and QnrS1) in silage from different climate zones, and tetM was the most prevalent high-risk ARG. However, the abundances of ARGs and mobile ARGs decreased throughout the ensiling process. Resistome contamination in silage from Tibet (Dwc), a high-altitude region with a harsh environment, was relatively low owing to the low variety and abundance of ARGs and the low abundance of mobile and high-risk ARGs. In addition, most of the bacteria responsible for silage fermentation were also found to host ARGs, although their abundance decreased after 90 d of silage fermentation. Hence, we draw attention to the ARG-related biosafety risk in silages and call for more attention to silage ARGs, their hosts, and mobile genetic elements in order to curtail their possible risk to public health.
Affiliation(s)
- Dongmei Xu
  - School of Life Sciences, Lanzhou University, Lanzhou, 730000, PR China
- Xingguo Zhang
  - Bioyi Biotechnology Co., Ltd., Wuhan, 430075, PR China
- Xianjun Yuan
  - Institute of Ensiling and Processing of Grass, Nanjing Agricultural University, Nanjing, 210095, PR China
- Hongyan Han
  - The Research Center for Laboratory Animal Science, College of Life Science, Inner Mongolia University, Hohhot, 010070, PR China
- Yanlin Xue
  - Inner Mongolia Engineering Research Center of Development and Utilization of Microbial Resources in Silage, Inner Mongolia Academy of Agriculture and Animal Husbandry Science, Hohhot, 010031, PR China
- Xusheng Guo
  - School of Life Sciences, Lanzhou University, Lanzhou, 730000, PR China
4
Mohamed M, Sabot F, Varoqui M, Mugat B, Audouin K, Pélisson A, Fiston-Lavier AS, Chambeyron S. TrEMOLO: accurate transposable element allele frequency estimation using long-read sequencing data combining assembly and mapping-based approaches. Genome Biol 2023; 24:63. [PMID: 37013657 PMCID: PMC10069131 DOI: 10.1186/s13059-023-02911-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/21/2022] [Accepted: 03/23/2023] [Indexed: 04/05/2023] Open
Abstract
Transposable Element MOnitoring with LOng-reads (TrEMOLO) is a new software that combines assembly- and mapping-based approaches to robustly detect genetic elements called transposable elements (TEs). Using high- or low-quality genome assemblies, TrEMOLO can detect most TE insertions and deletions and estimate their allele frequency in populations. Benchmarking with simulated data revealed that TrEMOLO outperforms other state-of-the-art computational tools. TE detection and frequency estimation by TrEMOLO were validated using simulated and experimental datasets. Therefore, TrEMOLO is a comprehensive and suitable tool to accurately study TE dynamics. TrEMOLO is available under GNU GPL3.0 at https://github.com/DrosophilaGenomeEvolution/TrEMOLO.
Affiliation(s)
- Mourdas Mohamed
  - Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
- François Sabot
  - DIADE, University of Montpellier, CIRAD, IRD, Montpellier, France
  - IFB - Southgreen Bioversity, CIRAD, INRAE, IRD, Montpellier, France
- Marion Varoqui
  - Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
- Bruno Mugat
  - Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
- Alain Pélisson
  - Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
- Anna-Sophie Fiston-Lavier
  - ISEM, Université Montpellier, CNRS, IRD, CIRAD, EPHE, Montpellier, France
  - Institut Universitaire de France (IUF), Paris, France
- Séverine Chambeyron
  - Institute of Human Genetics, UMR9002, CNRS and Université de Montpellier, Montpellier, France
5
Finding and Characterizing Repeats in Plant Genomes. Methods in Molecular Biology (Clifton, N.J.) 2022; 2443:327-385. [PMID: 35037215 DOI: 10.1007/978-1-0716-2067-0_18] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
Abstract
Plant genomes contain a particularly high proportion of repeated structures of various types. This chapter proposes a guided tour of the available software that can help biologists to scan automatically for these repeats in sequence data or check hypothetical models intended to characterize their structures. Since transposable elements (TEs) are a major source of repeats in plants, many methods have been used or developed for this broad class of sequences. They are representative of the range of tools available for other classes of repeats and we have provided two sections on this topic (for the analysis of genomes or directly of sequenced reads), as well as a selection of the main existing software. It may be hard to keep up with the profusion of proposals in this dynamic field and the rest of the chapter is devoted to the foundations of an efficient search for repeats and more complex patterns. We first introduce the key concepts of the art of indexing and mapping or querying sequences. We end the chapter with the more prospective issue of building models of repeat families. We present the Machine Learning approach first, seeking to build predictors automatically for some families of TEs, from a set of sequences known to belong to a given family. A second approach, the linguistic (or syntactic) approach, allows biologists to describe models of their favorite repeat family themselves and to check the validity of those models.
6
Bogaerts‐Márquez M, Guirao‐Rico S, Gautier M, González J. Temperature, rainfall and wind variables underlie environmental adaptation in natural populations of Drosophila melanogaster. Mol Ecol 2021; 30:938-954. [PMID: 33350518 PMCID: PMC7986194 DOI: 10.1111/mec.15783] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/07/2020] [Revised: 12/16/2020] [Accepted: 12/18/2020] [Indexed: 02/06/2023]
Abstract
While several studies in a diverse set of species have shed light on the genes underlying adaptation, our knowledge of the selective pressures that explain the observed patterns lags behind. Drosophila melanogaster is a valuable organism to study environmental adaptation because this species originated in Southern Africa and has recently expanded worldwide, and also because it has a functionally well-annotated genome. In this study, we aimed to decipher which environmental variables are relevant for adaptation of D. melanogaster natural populations in Europe and North America. We analysed 36 whole-genome pool-seq samples of D. melanogaster natural populations collected in 20 European and 11 North American locations. We used the BayPass software to identify single nucleotide polymorphisms (SNPs) and transposable elements (TEs) showing signatures of adaptive differentiation across populations, as well as significant associations with 59 environmental variables related to temperature, rainfall, evaporation, solar radiation, wind, daylight hours, and soil type. We found that in addition to temperature and rainfall, wind-related variables are also relevant for D. melanogaster environmental adaptation. Interestingly, 23%-51% of the genes that showed significant associations with environmental variables were not found overly differentiated across populations. In addition to SNPs, we also identified 10 reference transposable element insertions associated with environmental variables. Our results showed that genome-environment association analysis can identify adaptive genetic variants that are undetected by population differentiation analysis while also allowing the identification of candidate environmental drivers of adaptation.
Affiliation(s)
- María Bogaerts‐Márquez
  - Institute of Evolutionary Biology (CSIC‐Universitat Pompeu Fabra), Barcelona, Spain
  - The European Drosophila Population Genomics Consortium (DrosEU), Université de Montpellier, Montpellier, France
- Sara Guirao‐Rico
  - Institute of Evolutionary Biology (CSIC‐Universitat Pompeu Fabra), Barcelona, Spain
  - The European Drosophila Population Genomics Consortium (DrosEU), Université de Montpellier, Montpellier, France
- Mathieu Gautier
  - CBGP, INRA, CIRAD, IRD, Montpellier SupAgro, Université de Montpellier, Montpellier, France
- Josefa González
  - Institute of Evolutionary Biology (CSIC‐Universitat Pompeu Fabra), Barcelona, Spain
  - The European Drosophila Population Genomics Consortium (DrosEU), Université de Montpellier, Montpellier, France
7
Salces-Ortiz J, Vargas-Chavez C, Guio L, Rech GE, González J. Transposable elements contribute to the genomic response to insecticides in Drosophila melanogaster. Philos Trans R Soc Lond B Biol Sci 2020; 375:20190341. [PMID: 32075557 PMCID: PMC7061994 DOI: 10.1098/rstb.2019.0341] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/22/2022] Open
Abstract
Most genotype–phenotype analyses to date have centred on single nucleotide polymorphisms. However, transposable element (TE) insertions have arisen as a plausible addition to the study of the genotype–phenotype link because of their role in genome function and evolution. In this work, we investigate the contribution of TE insertions to the regulation of gene expression in response to insecticides. We exposed four Drosophila melanogaster strains to malathion, a commonly used organophosphate insecticide. By combining information from different approaches, including RNA-seq and ATAC-seq, we found that TEs can contribute to the regulation of gene expression under insecticide exposure by rewiring cis-regulatory networks. This article is part of a discussion meeting issue ‘Crossroads between transposons and gene regulation’.
Affiliation(s)
- Judit Salces-Ortiz
  - Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Carlos Vargas-Chavez
  - Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Lain Guio
  - Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Gabriel E Rech
  - Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
- Josefa González
  - Institute of Evolutionary Biology (IBE), CSIC-Universitat Pompeu Fabra, Barcelona, Spain
8
Vendrell-Mir P, Barteri F, Merenciano M, González J, Casacuberta JM, Castanera R. A benchmark of transposon insertion detection tools using real data. Mob DNA 2019; 10:53. [PMID: 31892957 PMCID: PMC6937713 DOI: 10.1186/s13100-019-0197-9] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 06/28/2019] [Accepted: 12/17/2019] [Indexed: 02/01/2023] Open
Abstract
Background: Transposable elements (TEs) are an important source of genomic variability in eukaryotic genomes. Their activity impacts genome architecture and gene expression and can lead to drastic phenotypic changes. Therefore, identifying TE polymorphisms is key to better understanding the link between genotype and phenotype. However, most genotype-to-phenotype analyses have concentrated on single nucleotide polymorphisms as they are easier to detect reliably using short-read data. Many bioinformatic tools have been developed to identify transposon insertions from resequencing data using short reads. Nevertheless, the performance of most of these tools has been tested using simulated insertions, which do not accurately reproduce the complexity of natural insertions. Results: We have overcome this limitation by building a dataset of insertions from the comparison of two high-quality rice genomes, followed by extensive manual curation. This dataset contains validated insertions of two very different types of TEs, LTR-retrotransposons and MITEs. Using this dataset, we have benchmarked the sensitivity and precision of 12 commonly used tools, and our results suggest that in general their sensitivity was previously overestimated when using simulated data. Our results also show that increasing coverage leads to better sensitivity, but at a cost in precision. Moreover, we found important differences in tool performance, with some tools performing better on a specific type of TE. We have also used two sets of experimentally validated insertions in Drosophila and humans and show that this trend is maintained in genomes of different size and complexity. Conclusions: We discuss the possible choice of tools depending on the goals of the study and show that an appropriate combination of tools could be an option for most approaches, increasing sensitivity while maintaining good precision.
Affiliation(s)
- Pol Vendrell-Mir
  - Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Bellaterra, 08193 Barcelona, Spain
- Fabio Barteri
  - Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Bellaterra, 08193 Barcelona, Spain
- Miriam Merenciano
  - Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Passeig Maritim Barceloneta 37-49, 08003 Barcelona, Spain
- Josefa González
  - Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Passeig Maritim Barceloneta 37-49, 08003 Barcelona, Spain
- Josep M Casacuberta
  - Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Bellaterra, 08193 Barcelona, Spain
- Raúl Castanera
  - Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, Campus UAB, Edifici CRAG, Bellaterra, 08193 Barcelona, Spain