1. Ni H, Yong-Villalobos L, Gu M, López-Arredondo DL, Chen M, Geng L, Xu G, Herrera-Estrella LR. Adaptive dynamics of extrachromosomal circular DNA in rice under nutrient stress. Nat Commun 2025; 16:4150. [PMID: 40320403 PMCID: PMC12050283 DOI: 10.1038/s41467-025-59572-x] [Received: 07/25/2024] [Accepted: 04/23/2025] [Indexed: 05/08/2025] Open
Abstract
Extrachromosomal circular DNAs (eccDNAs) have been identified in various eukaryotic organisms and are known to play crucial roles in genomic plasticity. However, in crop plants, the role of eccDNAs in responses to environmental cues, particularly nutritional stresses, remains unexplored. Rice (Oryza sativa ssp. japonica), a vital crop for over half the world's population and an excellent model plant for genomic studies, faces numerous environmental challenges during growth. Therefore, we conduct comprehensive studies investigating the distribution, sequence, and potential responses of rice eccDNAs to nutritional stresses. We describe the changes in the eccDNA landscape at various developmental stages of rice under optimal growth conditions. We also identify eccDNAs overlapping with genes (ecGenes), transposable elements (ecTEs), and full-length repeat units (full-length ecRepeatUnits), whose prevalence responds to nitrogen (N) and phosphorus (P) deficiency. We analyze multiple-fragment eccDNAs and propose a potential TE-mediated homologous recombination mechanism as the origin of rice's multiple-fragment eccDNAs. We provide evidence for the role of eccDNAs in rice genome plasticity under nutritional stresses and underscore the significance of their abundance and specificity.
Affiliation(s)
- Hanfang Ni
- National Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China
- MOA Key Laboratory of Plant Nutrition and Fertilization in Lower-Middle Reaches of the Yangtze River, Nanjing, China
- Lenin Yong-Villalobos
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX, USA
- Mian Gu
- National Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China
- MOA Key Laboratory of Plant Nutrition and Fertilization in Lower-Middle Reaches of the Yangtze River, Nanjing, China
- Damar Lizbeth López-Arredondo
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX, USA
- Min Chen
- National Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China
- MOA Key Laboratory of Plant Nutrition and Fertilization in Lower-Middle Reaches of the Yangtze River, Nanjing, China
- Liyan Geng
- National Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China
- MOA Key Laboratory of Plant Nutrition and Fertilization in Lower-Middle Reaches of the Yangtze River, Nanjing, China
- Guohua Xu
- National Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing, China.
- MOA Key Laboratory of Plant Nutrition and Fertilization in Lower-Middle Reaches of the Yangtze River, Nanjing, China.
- Luis Rafael Herrera-Estrella
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX, USA.
- Unidad de Genómica Avanzada/Langebio, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Gto, Mexico.
2. Gholap AD, Omri A. Advances in artificial intelligence-envisioned technologies for protein and nucleic acid research. Drug Discov Today 2025; 30:104362. [PMID: 40252991 DOI: 10.1016/j.drudis.2025.104362] [Received: 09/28/2024] [Revised: 02/04/2025] [Accepted: 04/10/2025] [Indexed: 04/21/2025]
Abstract
Artificial intelligence (AI) and machine learning (ML) have revolutionized pharmaceutical research, particularly in protein and nucleic acid studies. This review summarizes the current status of AI and ML applications in the pharmaceutical sector, focusing on innovative tools, web servers, and databases. This paper highlights how these technologies address key challenges in drug development including high costs, lengthy timelines, and the complexity of biological systems. Furthermore, the potential of AI in personalized medicine, cancer drug response prediction, and biomarker identification is discussed. The integration of AI and ML in pharmaceutical research promises to accelerate drug discovery, reduce development costs, and ultimately lead to more effective and personalized therapeutic strategies.
Affiliation(s)
- Amol D Gholap
- Department of Pharmaceutics, St. John Institute of Pharmacy and Research, Palghar, Maharashtra 401404, India
- Abdelwahab Omri
- Department of Chemistry and Biochemistry, The Novel Drug and Vaccine Delivery Systems Facility, Laurentian University, Sudbury, ON P3E 2C6, Canada.
3. Huang Y, Sahu SK, Liu X. Deciphering recent transposition patterns in plants through comparison of 811 genome assemblies. Plant Biotechnology Journal 2025; 23:1121-1132. [PMID: 39791953 PMCID: PMC11933835 DOI: 10.1111/pbi.14570] [Received: 08/21/2024] [Revised: 10/25/2024] [Accepted: 12/23/2024] [Indexed: 01/12/2025]
Abstract
Transposable elements (TEs) are significant drivers of genome evolution, yet their recent dynamics and impacts within and among species, as well as the roles of host genes and non-coding RNAs in the transposition process, remain elusive. With advancements in large-scale pan-genome sequencing and the development of open data sharing, large-scale comparative genomics studies have become feasible. Here, we performed complete de novo TE annotations and identified active TEs in 310 plant genome assemblies across 119 species and seven crop populations. Using 811 high-quality genomes, we detected 13 844 553 TE-induced structural variants (TE-SVs), providing unprecedented resolution in delineating recent TE activities. Our integrative analysis revealed a mutual evolutionary relationship between TEs and host genomes. On one hand, host genes and ncRNAs are involved in the transposition process, as evidenced by their colocalization and coactivation with TEs, and may play a role in chromatin regulation. On the other hand, TEs drive genetic innovation by promoting the duplication of host genes and inserting into regulatory regions. Moreover, genes influenced by active TEs are linked to plant growth, nutrient absorption, storage metabolism and environmental adaptation, aiding in crop domestication and adaptation. This TE dynamics atlas not only reveals evolutionary and functional features linked to transposition activity but also highlights the role of TEs in crop domestication and adaptation, paving the way for future exploration of TE-mediated genome evolution and crop improvement strategies.
Affiliation(s)
- Yan Huang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Agricultural Genomics, BGI Research, Shenzhen, China
- BGI Research Beijing, BGI Research, Beijing, China
- Sunil Kumar Sahu
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Agricultural Genomics, BGI Research, Shenzhen, China
- Xin Liu
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- State Key Laboratory of Agricultural Genomics, BGI Research, Shenzhen, China
- BGI Research Beijing, BGI Research, Beijing, China
4. Kraege A, Chavarro-Carrero E, Schnell E, Heilmann-Heimbach S, Becker K, Köhrer K, Huettel B, Sargheini N, Schiffer P, Waldvogel AM, Thomma BPHJ, Rovenich H. High quality genome assembly and annotation (v1) of the eukaryotic freshwater microalga Coccomyxa elongata SAG 216-3b. G3 (Bethesda, Md.) 2025; 15:jkae294. [PMID: 39671565 PMCID: PMC11797067 DOI: 10.1093/g3journal/jkae294] [Received: 10/09/2024] [Revised: 11/25/2024] [Accepted: 11/29/2024] [Indexed: 12/15/2024]
Abstract
Unicellular green algae of the genus Coccomyxa are recognized for their worldwide distribution and ecological versatility. Coccomyxa elongata is a freshwater species of the Coccomyxa simplex clade, which also includes lichen symbionts. To facilitate future molecular and phylogenomic studies of this versatile clade of algae, we generated a high-quality genome assembly for C. elongata Chodat & Jaag SAG 216-3b within the framework of the Biodiversity Genomics Center Cologne (BioC2) initiative. A combination of long-read PacBio HiFi and Oxford Nanopore Technologies with chromatin conformation capture (Hi-C) sequencing led to the assembly of the genome into 21 scaffolds with a total length of 51.4 Mb and an N50 of 2.8 Mb. Nineteen of the scaffolds represent highly complete nuclear chromosomes delimited by telomeric repeats, while the two additional scaffolds represent the mitochondrial and plastid genomes. Transcriptome-guided gene annotation resulted in the identification of 14,811 protein-coding genes, of which 61% have annotated protein family domains and 841 are predicted to be secreted. Benchmarking universal single-copy orthologs analysis against the Chlorophyta database identified a total of 1,494 (98.4%) complete gene models, suggesting a highly complete genome annotation.
Affiliation(s)
- Anton Kraege
- Institute of Plant Sciences, Department of Biology, University of Cologne, Zülpicher Straße 47b, Cologne 50674, Germany
- Edgar Chavarro-Carrero
- Institute of Plant Sciences, Department of Biology, University of Cologne, Zülpicher Straße 47b, Cologne 50674, Germany
- Eva Schnell
- Institute of Plant Sciences, Department of Biology, University of Cologne, Zülpicher Straße 47b, Cologne 50674, Germany
- Stefanie Heilmann-Heimbach
- Institute of Human Genetics, University Hospital of Bonn, University of Bonn, Venusberg, Sigmund-Freud-Straße 25, Bonn 53127, Germany
- NGS Core Facility, Medical Faculty of the University of Bonn, University of Bonn, Venusberg-Campus 1, Bonn 53127, Germany
- Kerstin Becker
- Cologne Center for Genomics (CCG), Medical Faculty, University of Cologne, Weyertal 115b, Cologne 50931, Germany
- Biological and Medical Research Centre (BMFZ), Genomics and Transcriptomics Laboratory, Heinrich-Heine-University Düsseldorf, Universitätsstraße 1, Düsseldorf 40225, Germany
- Karl Köhrer
- Biological and Medical Research Centre (BMFZ), Genomics and Transcriptomics Laboratory, Heinrich-Heine-University Düsseldorf, Universitätsstraße 1, Düsseldorf 40225, Germany
- Bruno Huettel
- Max Planck Genome Centre, Max Planck Institute for Plant Breeding Research, Carl-von-Linne-Weg 10, Cologne 50829, Germany
- Nafiseh Sargheini
- Max Planck Genome Centre, Max Planck Institute for Plant Breeding Research, Carl-von-Linne-Weg 10, Cologne 50829, Germany
- Philipp Schiffer
- Institute of Zoology, Department of Biology, University of Cologne, Zülpicher Straße 47b, Cologne 50674, Germany
- Ann-Marie Waldvogel
- Institute of Zoology, Department of Biology, University of Cologne, Zülpicher Straße 47b, Cologne 50674, Germany
- Bart P H J Thomma
- Institute of Plant Sciences, Department of Biology, University of Cologne, Zülpicher Straße 47b, Cologne 50674, Germany
- Department of Biology, Cluster of Excellence on Plant Sciences (CEPLAS), Zülpicher Straße 47b, Cologne 50674, Germany
- Hanna Rovenich
- Institute of Plant Sciences, Department of Biology, University of Cologne, Zülpicher Straße 47b, Cologne 50674, Germany
5. Sharma A, Liu X, Yin J, Yu PJ, Qi L, He M, Li KJ, Zheng DQ. Genomic characteristics and genetic manipulation of the marine yeast Scheffersomyces spartinae. Appl Microbiol Biotechnol 2024; 108:539. [PMID: 39702830 PMCID: PMC11659333 DOI: 10.1007/s00253-024-13382-1] [Received: 09/19/2024] [Revised: 12/06/2024] [Accepted: 12/09/2024] [Indexed: 12/21/2024]
Abstract
The halotolerant yeast Scheffersomyces spartinae, commonly found in marine environments, holds significant potential for various industrial applications. Despite this, its genetic characteristics have been relatively underexplored. In this study, we isolated a strain of S. spartinae named YMxiao from seawater in Zhoushan City, China. Through scanning electron microscopy and flow cytometry, we characterized S. spartinae YMxiao cells as urn-shaped, demonstrating asymmetric division via budding, and possessing a diploid genome. Compared to the model yeast Saccharomyces cerevisiae, S. spartinae YMxiao exhibited greater tolerance to various stressful conditions. Furthermore, S. spartinae YMxiao was capable of utilizing xylose, mannitol, sorbitol, and arabinose as sole carbon sources for growth. We conducted whole-genome sequencing of S. spartinae YMxiao using a combination of Nanopore and Illumina technologies, resulting in a telomere-to-telomere complete genome assembly of 12 Mb. Genome annotation identified 5311 protein-coding genes, 214 tRNA genes, and 236 transposable elements distributed across 8 chromosomes. Comparative genomics between S. spartinae strains YMxiao and ARV011 revealed genomic variations and evolutionary patterns within this species. Notably, certain genes in S. spartinae strains were found to be under strong positive selection. Additionally, we developed a genetic manipulation protocol that successfully enabled gene knockouts in S. spartinae. Our findings not only enhance our understanding of the S. spartinae genome but also provide a foundation for future research into its potential biotechnological applications. KEY POINTS: • The unique phenotypes and genetic characteristics of S. spartinae were disclosed. • Comparative genomics showed vast genetic variations between S. spartinae strains. • Genetic manipulation protocol was established for S. spartinae strain.
Affiliation(s)
- Awkash Sharma
- National Key Laboratory of Biobased Transportation Fuel Technology, Ocean College, Zhejiang University, Hangzhou, 310027, China
- Xing Liu
- National Key Laboratory of Biobased Transportation Fuel Technology, Ocean College, Zhejiang University, Hangzhou, 310027, China
- Jun Yin
- National Key Laboratory of Biobased Transportation Fuel Technology, Ocean College, Zhejiang University, Hangzhou, 310027, China
- Pei-Jing Yu
- National Key Laboratory of Biobased Transportation Fuel Technology, Ocean College, Zhejiang University, Hangzhou, 310027, China
- Lei Qi
- National Key Laboratory of Biobased Transportation Fuel Technology, Ocean College, Zhejiang University, Hangzhou, 310027, China
- Min He
- National Key Laboratory of Biobased Transportation Fuel Technology, Ocean College, Zhejiang University, Hangzhou, 310027, China
- Ke-Jing Li
- National Key Laboratory of Biobased Transportation Fuel Technology, Ocean College, Zhejiang University, Hangzhou, 310027, China.
- Dao-Qiong Zheng
- National Key Laboratory of Biobased Transportation Fuel Technology, Ocean College, Zhejiang University, Hangzhou, 310027, China.
6. Li R, Yao J, Cai S, Fu Y, Lai C, Zhu X, Cui L, Li Y. Genome-wide characterization and evolution analysis of miniature inverted-repeat transposable elements in barley (Hordeum vulgare). Frontiers in Plant Science 2024; 15:1474846. [PMID: 39544535 PMCID: PMC11560428 DOI: 10.3389/fpls.2024.1474846] [Received: 08/02/2024] [Accepted: 10/14/2024] [Indexed: 11/17/2024]
Abstract
Miniature inverted-repeat transposable elements (MITEs) constitute a class of class II transposable elements (TEs) that are abundant in plant genomes, playing a crucial role in their evolution and diversity. Barley (Hordeum vulgare), the fourth-most important cereal crop globally, is widely used for brewing, animal feed, and human consumption. However, despite their significance, the mechanisms underlying the insertion or amplification of MITEs and their contributions to barley genome evolution and diversity remain poorly understood. Through our comprehensive analysis, we identified 32,258 full-length MITEs belonging to 2,992 distinct families, accounting for approximately 0.17% of the barley genome. These MITE families can be grouped into four well-known superfamilies (Tc1/Mariner-like, PIF/Harbinger-like, hAT-like, and Mutator-like) and one unidentified superfamily. Notably, we observed two major expansion events in the barley MITE population, occurring approximately 12-13 million years ago (Mya) and 2-3 Mya. Our investigation revealed a strong preference of MITEs for gene-related regions, particularly in promoters, suggesting their potential involvement in regulating host gene expression. Additionally, we discovered that 7.73% of miRNAs are derived from MITEs, thereby influencing the origin of certain miRNAs and potentially exerting a significant impact on post-transcriptional gene expression control. Evolutionary analysis demonstrated that MITEs exhibit lower conservation compared to genes, consistent with their dynamic mobility. We also identified a series of MITE insertions or deletions associated with domestication, highlighting these regions as promising targets for crop improvement strategies. These findings significantly advance our understanding of the fundamental characteristics and evolutionary patterns of MITEs in the barley genome. Moreover, they contribute to our knowledge of gene regulatory networks and provide valuable insights for crop improvement endeavors.
Affiliation(s)
- Ruiying Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Ju Yao
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Shaoshuai Cai
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Yi Fu
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Chongde Lai
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- The Public Instrument Platform of Jiangxi Agricultural University, Jiangxi Agricultural University, Nanchang, China
- Xiangdong Zhu
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Licao Cui
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
- Yihan Li
- College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, China
7. Rojas J, Hose J, Dutcher HA, Place M, Wolters JF, Hittinger CT, Gasch AP. Comparative modeling reveals the molecular determinants of aneuploidy fitness cost in a wild yeast model. Cell Genomics 2024; 4:100656. [PMID: 39317188 PMCID: PMC11602619 DOI: 10.1016/j.xgen.2024.100656] [Received: 04/23/2024] [Revised: 07/10/2024] [Accepted: 08/20/2024] [Indexed: 09/26/2024]
Abstract
Although implicated as deleterious in many organisms, aneuploidy can underlie rapid phenotypic evolution. However, aneuploidy will be maintained only if the benefit outweighs the cost, which remains incompletely understood. To quantify this cost and the molecular determinants behind it, we generated a panel of chromosome duplications in Saccharomyces cerevisiae and applied comparative modeling and molecular validation to understand aneuploidy toxicity. We show that 74%-94% of the variance in aneuploid strains' growth rates is explained by the cumulative cost of genes on each chromosome, measured for single-gene duplications using a genomic library, along with the deleterious contribution of small nucleolar RNAs (snoRNAs) and beneficial effects of tRNAs. Machine learning to identify properties of detrimental gene duplicates provided no support for the balance hypothesis of aneuploidy toxicity and instead identified gene length as the best predictor of toxicity. Our results present a generalized framework for the cost of aneuploidy with implications for disease biology and evolution.
Affiliation(s)
- Julie Rojas
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
- James Hose
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
- H Auguste Dutcher
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
- Michael Place
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA; Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
- John F Wolters
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Chris Todd Hittinger
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA; Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA; J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA
- Audrey P Gasch
- Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA; Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA; J.F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53706, USA.
8. Unneberg P, Larsson M, Olsson A, Wallerman O, Petri A, Bunikis I, Vinnere Pettersson O, Papetti C, Gislason A, Glenner H, Cartes JE, Blanco-Bercial L, Eriksen E, Meyer B, Wallberg A. Ecological genomics in the Northern krill uncovers loci for local adaptation across ocean basins. Nat Commun 2024; 15:6297. [PMID: 39090106 PMCID: PMC11294593 DOI: 10.1038/s41467-024-50239-7] [Received: 12/06/2023] [Accepted: 05/15/2024] [Indexed: 08/04/2024] Open
Abstract
Krill are vital as food for many marine animals but also impacted by global warming. To learn how they and other zooplankton may adapt to a warmer world we studied local adaptation in the widespread Northern krill (Meganyctiphanes norvegica). We assemble and characterize its large genome and compare genome-scale variation among 74 specimens from the colder Atlantic Ocean and warmer Mediterranean Sea. The 19 Gb genome likely evolved through proliferation of retrotransposons, now targeted for inactivation by extensive DNA methylation, and contains many duplicated genes associated with molting and vision. Analysis of 760 million SNPs indicates extensive homogenizing gene-flow among populations. Nevertheless, we detect signatures of adaptive divergence across hundreds of genes, implicated in photoreception, circadian regulation, reproduction and thermal tolerance, indicating polygenic adaptation to light and temperature. The top gene candidate for ecological adaptation was nrf-6, a lipid transporter with a Mediterranean variant that may contribute to early spring reproduction. Such variation could become increasingly important for fitness in Atlantic stocks. Our study underscores the widespread but uneven distribution of adaptive variation, necessitating characterization of genetic variation among natural zooplankton populations to understand their adaptive potential, predict risks and support ocean conservation in the face of climate change.
Affiliation(s)
- Per Unneberg
- Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Mårten Larsson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 751 23, Uppsala, Sweden
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden
- Anna Olsson
- Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 751 23, Uppsala, Sweden
- Ola Wallerman
- Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 751 23, Uppsala, Sweden
- Anna Petri
- Uppsala Genome Center, Department of Immunology, Genetics and Pathology, Uppsala University, National Genomics Infrastructure hosted by SciLifeLab, Uppsala, Sweden
- Ignas Bunikis
- Uppsala Genome Center, Department of Immunology, Genetics and Pathology, Uppsala University, National Genomics Infrastructure hosted by SciLifeLab, Uppsala, Sweden
- Olga Vinnere Pettersson
- Uppsala Genome Center, Department of Immunology, Genetics and Pathology, Uppsala University, National Genomics Infrastructure hosted by SciLifeLab, Uppsala, Sweden
- Astthor Gislason
- Marine and Freshwater Research Institute, Pelagic Division, Reykjavik, Iceland
- Henrik Glenner
- Department of Biological Sciences, University of Bergen, Bergen, Norway
- Center for Macroecology, Evolution and Climate Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Joan E Cartes
- Instituto de Ciencias del Mar (ICM-CSIC), Barcelona, Spain
- Bettina Meyer
- Section Polar Biological Oceanography, Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany
- Institute for Chemistry and Biology of the Marine Environment, Carl von Ossietzky University of Oldenburg, Oldenburg, Germany
- Helmholtz Institute for Functional Marine Biodiversity (HIFMB), University of Oldenburg, Oldenburg, Germany
- Andreas Wallberg
- Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 751 23, Uppsala, Sweden.
9. Kim J, Lim J, Kim M, Lee YK. Whole-genome sequencing of 13 Arctic plants and draft genomes of Oxyria digyna and Cochlearia groenlandica. Sci Data 2024; 11:793. [PMID: 39025921 PMCID: PMC11258133 DOI: 10.1038/s41597-024-03569-6] [Received: 12/06/2023] [Accepted: 06/24/2024] [Indexed: 07/20/2024] Open
Abstract
To understand the genomic characteristics of Arctic plants, we generated 28-44 Gb of short-read sequencing data from 13 Arctic plants collected from the High Arctic Svalbard. We successfully estimated the genome sizes of eight species by using the k-mer-based method (180-894 Mb). Among these plants, the mountain sorrel (Oxyria digyna) and Greenland scurvy grass (Cochlearia groenlandica) had relatively small genome sizes and chromosome numbers. We obtained 45× and 121× high-fidelity long-read sequencing data. We assembled their reads into high-quality draft genomes (genome size: 561 and 250 Mb; contig N50 length: 36.9 and 14.8 Mb, respectively), and correspondingly annotated 43,105 and 29,675 genes using ~46 and ~85 million RNA sequencing reads. We identified 765,012 and 88,959 single-nucleotide variants, and 18,082 and 7,698 structural variants (variant size ≥ 50 bp). This study provided high-quality genome assemblies of O. digyna and C. groenlandica, which are valuable resources for the population and molecular genetic studies of these plants.
Affiliation(s)
- Jun Kim
- Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, Daejeon, 34134, Korea
- Jiseon Lim
- Department of Convergent Bioscience and Informatics, College of Bioscience and Biotechnology, Chungnam National University, Daejeon, 34134, Korea
- Moonkyo Kim
- Korea Polar Research Institute, Incheon, 21990, Korea
- Department of Life Sciences, Incheon National University, Incheon, 22012, Korea
- Yoo Kyung Lee
- Korea Polar Research Institute, Incheon, 21990, Korea.
- Department of Polar Sciences, University of Science and Technology, Incheon, 21990, Korea.
10. Lax C, Mondo SJ, Osorio-Concepción M, Muszewska A, Corrochano-Luque M, Gutiérrez G, Riley R, Lipzen A, Guo J, Hundley H, Amirebrahimi M, Ng V, Lorenzo-Gutiérrez D, Binder U, Yang J, Song Y, Cánovas D, Navarro E, Freitag M, Gabaldón T, Grigoriev IV, Corrochano LM, Nicolás FE, Garre V. Symmetric and asymmetric DNA N6-adenine methylation regulates different biological responses in Mucorales. Nat Commun 2024; 15:6066. [PMID: 39025853 PMCID: PMC11258239 DOI: 10.1038/s41467-024-50365-2] [Received: 01/30/2024] [Accepted: 07/05/2024] [Indexed: 07/20/2024] Open
Abstract
DNA N6-adenine methylation (6mA) has recently gained importance as an epigenetic modification in eukaryotes. Its function in lineages with high levels, such as early-diverging fungi (EDF), is of particular interest. Here, we investigated the biological significance and evolutionary implications of 6mA in EDF, which exhibit divergent evolutionary patterns in 6mA usage. The analysis of two Mucorales species displaying extreme 6mA usage reveals that species with high 6mA levels show symmetric methylation enriched in highly expressed genes. In contrast, species with low 6mA levels show mostly asymmetric 6mA. Interestingly, transcriptomic regulation throughout development and in response to environmental cues is associated with changes in the 6mA landscape. Furthermore, we identify an EDF-specific methyltransferase, likely originated from endosymbiotic bacteria, as responsible for asymmetric methylation, while an MTA-70 methylation complex performs symmetric methylation. The distinct phenotypes observed in the corresponding mutants reinforced the critical role of both types of 6mA in EDF.
Affiliation(s)
- Carlos Lax
- Departamento de Genética y Microbiología, Facultad de Biología, Universidad de Murcia, Murcia, Spain
- Stephen J Mondo
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Department of Agricultural Biology, Colorado State University, Fort Collins, CO, 80523, USA
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Macario Osorio-Concepción
- Departamento de Genética y Microbiología, Facultad de Biología, Universidad de Murcia, Murcia, Spain
| | - Anna Muszewska
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5A, 02-106, Warsaw, Poland
| | | | - Gabriel Gutiérrez
- Departamento de Genética, Facultad de Biología, Universidad de Sevilla, Sevilla, Spain
| | - Robert Riley
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Anna Lipzen
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Jie Guo
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Hope Hundley
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Mojgan Amirebrahimi
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Vivian Ng
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Damaris Lorenzo-Gutiérrez
- Departamento de Genética y Microbiología, Facultad de Biología, Universidad de Murcia, Murcia, Spain
| | - Ulrike Binder
- Institute of Hygiene and Medical Microbiology, Medical University of Innsbruck, Innsbruck, Austria
| | - Junhuan Yang
- College of Food Science and Engineering, Lingnan Normal University, Zhanjiang, 524048, China
| | - Yuanda Song
- Colin Ratledge Center for Microbial Lipids, School of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo, 255049, China
| | - David Cánovas
- Departamento de Genética, Facultad de Biología, Universidad de Sevilla, Sevilla, Spain
| | - Eusebio Navarro
- Departamento de Genética y Microbiología, Facultad de Biología, Universidad de Murcia, Murcia, Spain
| | - Michael Freitag
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR, 97331, USA
| | - Toni Gabaldón
- Barcelona Supercomputing Centre (BSC-CNS), Plaça Eusebi Güell, 1-3, 08034, Barcelona, Spain
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028, Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
- Centro de Investigación Biomédica en Red de Enfermedades Infecciosas (CIBERINFEC), Barcelona, Spain
| | - Igor V Grigoriev
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, 94720, USA
| | - Luis M Corrochano
- Departamento de Genética, Facultad de Biología, Universidad de Sevilla, Sevilla, Spain.
| | - Francisco E Nicolás
- Departamento de Genética y Microbiología, Facultad de Biología, Universidad de Murcia, Murcia, Spain.
| | - Victoriano Garre
- Departamento de Genética y Microbiología, Facultad de Biología, Universidad de Murcia, Murcia, Spain.
11
Lebherz MK, Fouks B, Schmidt J, Bornberg-Bauer E, Grandchamp A. DNA Transposons Favor De Novo Transcript Emergence Through Enrichment of Transcription Factor Binding Motifs. Genome Biol Evol 2024; 16:evae134. [PMID: 38934893 PMCID: PMC11264136 DOI: 10.1093/gbe/evae134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 06/11/2024] [Accepted: 06/15/2024] [Indexed: 06/28/2024] Open
Abstract
De novo genes emerge from noncoding regions of genomes via succession of mutations. Among others, such mutations activate transcription and create a new open reading frame (ORF). Although the mechanisms underlying ORF emergence are well documented, relatively little is known about the mechanisms enabling new transcription events. Yet, in many species a continuum between absent and very prominent transcription has been reported for essentially all regions of the genome. In this study, we searched for de novo transcripts by using newly assembled genomes and transcriptomes of seven inbred lines of Drosophila melanogaster, originating from six European and one African population. This setup allowed us to detect sample specific de novo transcripts, and compare them to their homologous nontranscribed regions in other samples, as well as genic and intergenic control sequences. We studied the association with transposable elements (TEs) and the enrichment of transcription factor motifs upstream of de novo emerged transcripts and compared them with regulatory elements. We found that de novo transcripts overlap with TEs more often than expected by chance. The emergence of new transcripts correlates with regions of high guanine-cytosine content and TE expression. Moreover, upstream regions of de novo transcripts are highly enriched with regulatory motifs. Such motifs are more enriched in new transcripts overlapping with TEs, particularly DNA TEs, and are more conserved upstream de novo transcripts than upstream their 'nontranscribed homologs'. Overall, our study demonstrates that TE insertion is important for transcript emergence, partly by introducing new regulatory motifs from DNA TE families.
Affiliation(s)
- Bertrand Fouks
  - CEFE, Univ Montpellier, CNRS, EPHE, IRD, Montpellier, France
  - UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398, Montpellier, France
  - CIRAD, UMR AGAP Institut, F-34398, Montpellier, France
- Julian Schmidt
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
  - Department of Protein Evolution, Max Planck Institute for Biology, Tübingen, Germany
- Anna Grandchamp
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
12
Stuckert AMM, Chouteau M, McClure M, LaPolice TM, Linderoth T, Nielsen R, Summers K, MacManes MD. The genomics of mimicry: Gene expression throughout development provides insights into convergent and divergent phenotypes in a Müllerian mimicry system. Mol Ecol 2024; 33:e17438. [PMID: 38923007 DOI: 10.1111/mec.17438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 04/22/2024] [Accepted: 05/24/2024] [Indexed: 06/28/2024]
Abstract
A common goal in evolutionary biology is to discern the mechanisms that produce the astounding diversity of morphologies seen across the tree of life. Aposematic species, those with a conspicuous phenotype coupled with some form of defence, are excellent models to understand the link between vivid colour pattern variations, the natural selection shaping it, and the underlying genetic mechanisms underpinning this variation. Mimicry systems in which species share a conspicuous phenotype can provide an even better model for understanding the mechanisms of colour production in aposematic species, especially if comimics have divergent evolutionary histories. Here we investigate the genetic mechanisms by which mimicry is produced in poison frogs. We assembled a 6.02-Gbp genome with a contig N50 of 310 Kbp, a scaffold N50 of 390 Kbp and 85% of expected tetrapod genes. We leveraged this genome to conduct gene expression analyses throughout development of four colour morphs of Ranitomeya imitator and two colour morphs from both R. fantastica and R. variabilis which R. imitator mimics. We identified a large number of pigmentation and patterning genes differentially expressed throughout development, many of them related to melanophores/melanin, iridophore development and guanine synthesis. We also identify the pteridine synthesis pathway (including genes such as qdpr and xdh) as a key driver of the variation in colour between morphs of these species, and identify several plausible candidates for colouration in vertebrates (e.g. cd36, ep-cadherin and perlwapin). Finally, we hypothesise that keratin genes (e.g. krt8) are important for producing different structural colours within these frogs.
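For readers unfamiliar with the N50 statistic quoted in this abstract: it is commonly defined as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal Python sketch under that common definition; the function name `n50` and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumes the common definition: walk the sequences from longest to
    shortest and return the length at which the running total first
    reaches at least half of the summed length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one positive length")


# Toy example (made-up lengths, not data from the paper).
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)
```

Sorting in descending order keeps the sketch short; for very large assemblies a streaming or partial-sort approach might be preferable.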
Affiliation(s)
- Adam M M Stuckert
  - Department of Biology and Biochemistry, University of Houston, Houston, Texas, USA
  - Department of Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, New Hampshire, USA
  - Department of Biology, East Carolina University, Greenville, North Carolina, USA
- Mathieu Chouteau
  - Laboratoire Écologie, Évolution, Interactions Des Systèmes Amazoniens (LEEISA), CNRS, IFREMER, Université de Guyane, Cayenne, France
- Melanie McClure
  - Laboratoire Écologie, Évolution, Interactions Des Systèmes Amazoniens (LEEISA), CNRS, IFREMER, Université de Guyane, Cayenne, France
- Troy M LaPolice
  - Department of Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, New Hampshire, USA
  - Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA
- Tyler Linderoth
  - Department of Integrative Biology, University of California, Berkeley, California, USA
- Rasmus Nielsen
  - Department of Integrative Biology, University of California, Berkeley, California, USA
- Kyle Summers
  - Department of Biology, East Carolina University, Greenville, North Carolina, USA
- Matthew D MacManes
  - Department of Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, New Hampshire, USA
13
Sierra P, Durbin R. Identification of transposable element families from pangenome polymorphisms. Mob DNA 2024; 15:13. [PMID: 38926873 PMCID: PMC11202377 DOI: 10.1186/s13100-024-00323-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 06/13/2024] [Indexed: 06/28/2024] Open
Abstract
BACKGROUND: Transposable Elements (TEs) are segments of DNA, typically a few hundred base pairs up to several tens of thousands bases long, that have the ability to generate new copies of themselves in the genome. Most existing methods used to identify TEs in a newly sequenced genome are based on their repetitive character, together with detection based on homology and structural features. As new high quality assemblies become more common, including the availability of multiple independent assemblies from the same species, an alternative strategy for identification of TE families becomes possible in which we focus on the polymorphism at insertion sites caused by TE mobility.
RESULTS: We develop the idea of using the structural polymorphisms found in pangenomes to create a library of the TE families recently active in a species, or in a closely related group of species. We present a tool, pantera, that achieves this task, and illustrate its use both on species with well-curated libraries, and on new assemblies.
CONCLUSIONS: Our results show that pantera is sensitive and accurate, tending to correctly identify complete elements with precise boundaries, and is particularly well suited to detect larger, low copy number TEs that are often undetected with existing de novo methods.
Affiliation(s)
- Pío Sierra
  - Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
- Richard Durbin
  - Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK.
14
Yu Z, Li J, Wang H, Ping B, Li X, Liu Z, Guo B, Yu Q, Zou Y, Sun Y, Ma F, Zhao T. Transposable elements in Rosaceae: insights into genome evolution, expression dynamics, and syntenic gene regulation. Hortic Res 2024; 11:uhae118. [PMID: 38919560 PMCID: PMC11197308 DOI: 10.1093/hr/uhae118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 04/17/2024] [Indexed: 06/27/2024]
Abstract
Transposable elements (TEs) exert significant influence on plant genomic structure and gene expression. Here, we explored TE-related aspects across 14 Rosaceae genomes, investigating genomic distribution, transposition activity, expression patterns, and nearby differentially expressed genes (DEGs). Analyses unveiled distinct long terminal repeat retrotransposon (LTR-RT) evolutionary patterns, reflecting varied genome size changes among nine species over the past million years. In the past 2.5 million years, Rubus idaeus showed a transposition rate twice as fast as Fragaria vesca, while Pyrus bretschneideri displayed significantly faster transposition compared with Crataegus pinnatifida. Genes adjacent to recent TE insertions were linked to adversity resistance, while those near previous insertions were functionally enriched in morphogenesis, enzyme activity, and metabolic processes. Expression analysis revealed diverse responses of LTR-RTs to internal or external conditions. Furthermore, we identified 3695 pairs of syntenic DEGs proximal to TEs in Malus domestica cv. 'Gala' and M. domestica (GDDH13), suggesting TE insertions may contribute to varietal trait differences in these apple varieties. Our study across representative Rosaceae species underscores the pivotal role of TEs in plant genome evolution within this diverse family. It elucidates how these elements regulate syntenic DEGs on a genome-wide scale, offering insights into Rosaceae-specific genomic evolution.
Affiliation(s)
- Ze Yu
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Jiale Li
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Hanyu Wang
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Boya Ping
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Xinchu Li
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Zhiguang Liu
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Bocheng Guo
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Qiaoming Yu
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yangjun Zou
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Yaqiang Sun
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Fengwang Ma
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
- Tao Zhao
  - State Key Laboratory for Crop Stress Resistance and High-Efficiency Production/Shaanxi Key Laboratory of Apple, College of Horticulture, Northwest A&F University, Yangling, Shaanxi 712100, China
15
Garza AB, Lerat E, Girgis HZ. Look4LTRs: a Long terminal repeat retrotransposon detection tool capable of cross species studies and discovering recently nested repeats. Mob DNA 2024; 15:8. [PMID: 38627766 PMCID: PMC11020628 DOI: 10.1186/s13100-024-00317-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 03/08/2024] [Indexed: 04/20/2024] Open
Abstract
Plant genomes include large numbers of transposable elements. One particular type of these elements is flanked by two Long Terminal Repeats (LTRs) and can translocate using RNA. Such elements are known as LTR-retrotransposons; they are the most abundant type of transposons in plant genomes. They have many important functions involving gene regulation and the rise of new genes and pseudo genes in response to severe stress. Additionally, LTR-retrotransposons have several applications in biotechnology. Due to the abundance and the importance of LTR-retrotransposons, multiple computational tools have been developed for their detection. However, none of these tools take advantages of the availability of related genomes; they process one chromosome at a time. Further, recently nested LTR-retrotransposons (multiple elements of the same family are inserted into each other) cannot be annotated accurately - or cannot be annotated at all - by the currently available tools. Motivated to overcome these two limitations, we built Look4LTRs, which can annotate LTR-retrotransposons in multiple related genomes simultaneously and discover recently nested elements. The methodology of Look4LTRs depends on techniques imported from the signal-processing field, graph algorithms, and machine learning with a minimal use of alignment algorithms. Four plant genomes were used in developing Look4LTRs and eight plant genomes for evaluating it in contrast to three related tools. Look4LTRs is the fastest while maintaining better or comparable F1 scores (the harmonic average of recall and precision) to those obtained by the other tools. Our results demonstrate the added benefit of annotating LTR-retrotransposons in multiple related genomes simultaneously and the ability to discover recently nested elements. Expert human manual examination of six elements - not included in the ground truth - revealed that three elements belong to known families and two elements are likely from new families. 
With respect to examining recently nested LTR-retrotransposons, three out of five were confirmed to be valid elements. Look4LTRs - with its speed, accuracy, and novel features - represents a true advancement in the annotation of LTR-retrotransposons, opening the door to many studies focused on understanding their functions in plants.
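The abstract defines the F1 score as the harmonic average of recall and precision. A minimal Python sketch of that definition follows; the function name `f1_score` and the convention of returning 0.0 when both inputs are zero are illustrative assumptions, not part of Look4LTRs.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Returns 0.0 when both inputs are zero (an assumed convention that
    avoids division by zero).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy values (not results from the paper): equal precision and recall
# give an F1 equal to that shared value.
toy_f1 = f1_score(0.8, 0.8)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a tool cannot reach a high F1 by maximizing only recall or only precision.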
Affiliation(s)
- Anthony B Garza
  - Bioinformatics Toolsmith Laboratory, Department of Electrical Engineering and Computer Science, Texas A&M University-Kingsville, Kingsville, Texas, USA
- Emmanuelle Lerat
  - The Biometrics and Evolutionary Biology Laboratory, University Lyon 1, Lyon, France
- Hani Z Girgis
  - Bioinformatics Toolsmith Laboratory, Department of Electrical Engineering and Computer Science, Texas A&M University-Kingsville, Kingsville, Texas, USA.
16
Rojas J, Hose J, Auguste Dutcher H, Place M, Wolters JF, Hittinger CT, Gasch AP. Comparative modeling reveals the molecular determinants of aneuploidy fitness cost in a wild yeast model. bioRxiv 2024:2024.04.09.588778. [PMID: 38645209 PMCID: PMC11030387 DOI: 10.1101/2024.04.09.588778] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 04/23/2024]
Abstract
Although implicated as deleterious in many organisms, aneuploidy can underlie rapid phenotypic evolution. However, aneuploidy will only be maintained if the benefit outweighs the cost, which remains incompletely understood. To quantify this cost and the molecular determinants behind it, we generated a panel of chromosome duplications in Saccharomyces cerevisiae and applied comparative modeling and molecular validation to understand aneuploidy toxicity. We show that 74-94% of the variance in aneuploid strains' growth rates is explained by the additive cost of genes on each chromosome, measured for single-gene duplications using a genomic library, along with the deleterious contribution of snoRNAs and beneficial effects of tRNAs. Machine learning to identify properties of detrimental gene duplicates provided no support for the balance hypothesis of aneuploidy toxicity and instead identified gene length as the best predictor of toxicity. Our results present a generalized framework for the cost of aneuploidy with implications for disease biology and evolution.
Affiliation(s)
- Julie Rojas
  - Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
- James Hose
  - Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
- H Auguste Dutcher
  - Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
- Michael Place
  - Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
- John F Wolters
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Chris Todd Hittinger
  - Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
  - J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Audrey P Gasch
  - Center for Genomic Science Innovation, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53706, USA
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
  - J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI, 53706, USA
17
Loreto ELS, Melo ESD, Wallau GL, Gomes TMFF. The good, the bad and the ugly of transposable elements annotation tools. Genet Mol Biol 2024; 46:e20230138. [PMID: 38373163 PMCID: PMC10876081 DOI: 10.1590/1678-4685-gmb-2023-0138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 11/26/2023] [Indexed: 02/21/2024] Open
Abstract
Transposable elements are repetitive and mobile DNA segments that can be found in virtually all organisms investigated to date. Their complex structure and variable nature are particularly challenging from the genomic annotation point of view. Many softwares have been developed to automate and facilitate TEs annotation at the genomic level, but they are highly heterogeneous regarding documentation, usability and methods. In this review, we revisited the existing software for TE genomic annotation, concentrating on the most often used ones, the methodologies they apply, and usability. Building on the state of the art of TE annotation software we propose best practices and highlight the strengths and weaknesses from the available solutions.
Affiliation(s)
- Elgion L S Loreto
  - Universidade Federal do Rio Grande do Sul, Programa de Pós-Graduação em Genética e Biologia Molecular, Porto Alegre, RS, Brazil
  - Universidade Federal de Santa Maria, Departamento de Bioquímica e Biologia Molecular, Santa Maria, RS, Brazil
- Elverson S de Melo
  - Fundação Oswaldo Cruz, Instituto Aggeu Magalhães, Departamento de Entomologia, Recife, PE, Brazil
- Gabriel L Wallau
  - Fundação Oswaldo Cruz, Instituto Aggeu Magalhães, Departamento de Entomologia, Recife, PE, Brazil
- Tiago M F F Gomes
  - Universidade Federal do Rio Grande do Sul, Programa de Pós-Graduação em Genética e Biologia Molecular, Porto Alegre, RS, Brazil
18
Bernabeu M, Cabello-Yeves E, Flores E, Samarra A, Kimberley Summers J, Marina A, Collado MC. Role of vertical and horizontal microbial transmission of antimicrobial resistance genes in early life: insights from maternal-infant dyads. Curr Opin Microbiol 2024; 77:102424. [PMID: 38237429 DOI: 10.1016/j.mib.2023.102424] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/16/2023] [Revised: 12/19/2023] [Accepted: 12/20/2023] [Indexed: 02/12/2024]
Abstract
Early life represents a critical window for metabolic, cognitive and immune system development, which is influenced by the maternal microbiome as well as the infant gut microbiome. Antibiotic exposure, mode of delivery and breastfeeding practices modulate the gut microbiome and the reservoir of antibiotic resistance genes (ARGs). Vertical and horizontal microbial gene transfer during early life and the mechanisms behind these transfers are being uncovered. In this review, we aim to provide an overview of the current knowledge on the transfer of antibiotic resistance in the mother-infant dyad through vertical and horizontal transmission and to highlight the main gaps and challenges in this area.
Affiliation(s)
- Manuel Bernabeu
  - Institute of Agrochemistry and Food Technology - National Research Council (IATA-CSIC), 46980 Valencia, Spain.
- Elena Cabello-Yeves
  - Instituto de Biomedicina de Valencia-Consejo de Investigaciones Científicas (IBV-CSIC), CIBER de Enfermedades Raras (CIBERER), 46010 Valencia, Spain.
- Eduard Flores
  - Institute of Agrochemistry and Food Technology - National Research Council (IATA-CSIC), 46980 Valencia, Spain
- Anna Samarra
  - Institute of Agrochemistry and Food Technology - National Research Council (IATA-CSIC), 46980 Valencia, Spain
- Joanna Kimberley Summers
  - Wellington Lab, School of Life Sciences, University of Warwick, CV4 7AL Coventry, United Kingdom
- Alberto Marina
  - Instituto de Biomedicina de Valencia-Consejo de Investigaciones Científicas (IBV-CSIC), CIBER de Enfermedades Raras (CIBERER), 46010 Valencia, Spain
- M Carmen Collado
  - Institute of Agrochemistry and Food Technology - National Research Council (IATA-CSIC), 46980 Valencia, Spain
19
Feldmeyer B, Bornberg-Bauer E, Dohmen E, Fouks B, Heckenhauer J, Huylmans AK, Jones ARC, Stolle E, Harrison MC. Comparative Evolutionary Genomics in Insects. Methods Mol Biol 2024; 2802:473-514. [PMID: 38819569 DOI: 10.1007/978-1-0716-3838-5_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Genome sequencing quality, in terms of both read length and accuracy, is constantly improving. By combining long-read sequencing technologies with various scaffolding techniques, chromosome-level genome assemblies are now achievable at an affordable price for non-model organisms. Insects represent an exciting taxon for studying the genomic underpinnings of evolutionary innovations, due to ancient origins, immense species-richness, and broad phenotypic diversity. Here we summarize some of the most important methods for carrying out a comparative genomics study on insects. We describe available tools and offer concrete tips on all stages of such an endeavor from DNA extraction through genome sequencing, annotation, and several evolutionary analyses. Along the way we describe important insect-specific aspects, such as DNA extraction difficulties or gene families that are particularly difficult to annotate, and offer solutions. We describe results from several examples of comparative genomics analyses on insects to illustrate the fascinating questions that can now be addressed in this new age of genomics research.
Affiliation(s)
- Barbara Feldmeyer
  - Senckenberg Biodiversity and Climate Research Centre (SBiK-F), Molecular Ecology, Frankfurt, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
  - Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Elias Dohmen
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Bertrand Fouks
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Jacqueline Heckenhauer
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
  - Department of Terrestrial Zoology, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
- Ann Kathrin Huylmans
  - Institute of Organismic and Molecular Evolution, Johannes Gutenberg University, Mainz, Germany
- Alun R C Jones
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Eckart Stolle
  - Museum Koenig, Leibniz Institute for the Analysis of Biodiversity Change (LIB), Bonn, Germany
- Mark C Harrison
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany.
20
Gao D. Introduction of Plant Transposon Annotation for Beginners. Biology (Basel) 2023; 12:1468. [PMID: 38132293 PMCID: PMC10741241 DOI: 10.3390/biology12121468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 11/21/2023] [Accepted: 11/23/2023] [Indexed: 12/23/2023]
Abstract
Transposons are mobile DNA sequences that contribute large fractions of many plant genomes. They provide exclusive resources for tracking gene and genome evolution and for developing molecular tools for basic and applied research. Despite extensive efforts, it is still challenging to accurately annotate transposons, especially for beginners, as transposon prediction requires necessary expertise in both transposon biology and bioinformatics. Moreover, the complexity of plant genomes and the dynamic evolution of transposons also bring difficulties for genome-wide transposon discovery. This review summarizes the three major strategies for transposon detection including repeat-based, structure-based, and homology-based annotation, and introduces the transposon superfamilies identified in plants thus far, and some related bioinformatics resources for detecting plant transposons. Furthermore, it describes transposon classification and explains why the terms 'autonomous' and 'non-autonomous' cannot be used to classify the superfamilies of transposons. Lastly, this review also discusses how to identify misannotated transposons and improve the quality of the transposon database. This review provides helpful information about plant transposons and a beginner's guide on annotating these repetitive sequences.
Affiliation(s)
- Dongying Gao
  - Small Grains and Potato Germplasm Research Unit, USDA-ARS, Aberdeen, ID 83210, USA
21
Bush ZD, Naftaly AFS, Dinwiddie D, Albers C, Hillers KJ, Libuda DE. Comprehensive detection of structural variation and transposable element differences between wild type laboratory lineages of C. elegans. bioRxiv 2023:2023.01.13.523974. [PMID: 37961628 PMCID: PMC10634987 DOI: 10.1101/2023.01.13.523974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Genomic structural variations (SVs) and transposable elements (TEs) can be significant contributors to genome evolution, altered gene expression, and risk of genetic diseases. Recent advancements in long-read sequencing have greatly improved the quality of de novo genome assemblies and enhanced the detection of sequence variants at the scale of hundreds or thousands of bases. Comparisons between two diverged wild isolates of Caenorhabditis elegans, the Bristol and Hawaiian strains, have been widely utilized in the analysis of small genetic variations. Genetic drift, including SVs and rearrangements of repeated sequences such as TEs, can occur over time from long-term maintenance of wild type isolates within the laboratory. To comprehensively detect both large and small structural variations as well as TEs due to genetic drift, we generated de novo genome assemblies and annotations for each strain from our lab collection using both long- and short-read sequencing and compared our assemblies and annotations with that of other lab wild type strains. Within our lab assemblies, we annotate over 3.1Mb of sequence divergence between the Bristol and Hawaiian isolates: 337,584 SNPs, 94,503 small insertion-deletions (<50bp), and 4,334 structural variations (>50bp). Further, we define the location and movement of specific DNA TEs between N2 Bristol and CB4856 Hawaiian wild type isolates. Specifically, we find the N2 Bristol genome has 20.6% more TEs from the Tc1/mariner family than the CB4856 Hawaiian genome. Moreover, we identified Zator elements as the most abundant and mobile TE family in the genome. Using specific TE sequences with unique SNPs, we also identify 38 TEs that moved intrachromosomally and 9 TEs that moved interchromosomally between the N2 Bristol and CB4856 Hawaiian genomes. 
By comparing the de novo genome assembly of our lab collection Bristol isolate to the VC2010 Bristol assembly, we also reveal that lab lineages display over 2 Mb of total variation: 1,162 SNPs, 1,528 indels, and 897 SVs with 95% of the variation due to SVs. Overall, our work demonstrates the unique contribution of SVs and TEs to variation and genetic drift between wild type laboratory strains assumed to be isogenic despite growing evidence of genetic drift and phenotypic variation.
Affiliation(s)
- Zachary D. Bush
- Institute of Molecular Biology, Department of Biology, University of Oregon, 1229 Franklin Blvd Eugene, OR 97403, USA
- Alice F. S. Naftaly
- Institute of Molecular Biology, Department of Biology, University of Oregon, 1229 Franklin Blvd Eugene, OR 97403, USA
- Devin Dinwiddie
- Institute of Molecular Biology, Department of Biology, University of Oregon, 1229 Franklin Blvd Eugene, OR 97403, USA
- Cora Albers
- Institute of Molecular Biology, Department of Biology, University of Oregon, 1229 Franklin Blvd Eugene, OR 97403, USA
- Kenneth J. Hillers
- Biological Sciences Department, California Polytechnic State University, San Luis Obispo, California, USA
- Diana E. Libuda
- Institute of Molecular Biology, Department of Biology, University of Oregon, 1229 Franklin Blvd Eugene, OR 97403, USA

22
Holtz MA, Racicot R, Preininger D, Stuckert AMM, Mangiamele LA. Genome assembly of the foot-flagging frog, Staurois parvus: a resource for understanding mechanisms of behavior. G3 (Bethesda) 2023; 13:jkad193. [PMID: 37625789 PMCID: PMC10542557 DOI: 10.1093/g3journal/jkad193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 03/22/2023] [Accepted: 08/13/2023] [Indexed: 08/27/2023]
Abstract
Elaborate and skilled movements of the body have been selected in a variety of species as courtship and rivalry signals. One roadblock in studying these behaviors has been a lack of resources for understanding how they evolved at the genetic level. The Bornean rock frog (Staurois parvus) is an ideal species in which to address this issue. Males wave their hindlimbs in a "foot-flagging" display when competing for mates. The evolution of foot flagging in S. parvus and other species is accompanied by increases in the expression of the androgen receptor gene within its neuromuscular system, but it remains unclear what genetic or transcriptional changes are associated with this behavioral phenotype. We have now assembled the genome of S. parvus, resulting in 3.98 Gbp of 22,402 contigs with an N50 of 611,229 bp. The genome will be a resource for finding genes related to the physiology underlying foot flagging and to adaptations of the neuromuscular system. As a first application of the genome, we also began work in comparative genomics and differential gene expression analysis. We show that the androgen receptor is diverged from other anuran species, and we identify unique expression patterns of genes in the spinal cord and leg muscle that are important for axial patterning, cell specification and morphology, or muscle contraction. This genome will continue to be an important tool for future -omics studies to understand the evolution of elaborate signaling behaviors in this and potentially related species.
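The contig N50 quoted in this abstract is a standard assembly statistic: the largest length L such that contigs of length L or more together cover at least half of the assembly. A minimal sketch of the calculation follows; the function name `n50` and the example lengths are hypothetical and are not taken from the S. parvus assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in bp (illustrative only).
example_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_contigs))
```

Comparing `2 * running` with `total` keeps the arithmetic in integers, which avoids rounding issues for very large assemblies.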
Affiliation(s)
- Mika A Holtz
- Department of Biological Sciences, Smith College, Northampton, MA 01053, USA
- Riccardo Racicot
- Department of Biological Sciences, Smith College, Northampton, MA 01053, USA
- Doris Preininger
- Vienna Zoo, 1130 Vienna, Austria
- Department of Evolutionary Biology, University of Vienna, 1030 Vienna, Austria
- Adam M M Stuckert
- Department of Biology & Biochemistry, University of Houston, Houston, TX 77204, USA
- Lisa A Mangiamele
- Department of Biological Sciences, Smith College, Northampton, MA 01053, USA

23
Jiang K, Lim J, Sgrizzi S, Trinh M, Kayabolen A, Yutin N, Bao W, Kato K, Koonin EV, Gootenberg JS, Abudayyeh OO. Programmable RNA-guided DNA endonucleases are widespread in eukaryotes and their viruses. Science Advances 2023; 9:eadk0171. [PMID: 37756409 PMCID: PMC10530073 DOI: 10.1126/sciadv.adk0171] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 07/28/2023] [Accepted: 08/24/2023] [Indexed: 09/29/2023]
Abstract
Programmable RNA-guided DNA nucleases perform numerous roles in prokaryotes, but the extent of their spread outside prokaryotes is unclear. Fanzors, the eukaryotic homolog of prokaryotic TnpB proteins, have been detected in genomes of eukaryotes and large viruses, but their activity and functions in eukaryotes remain unknown. Here, we characterize Fanzors as RNA-programmable DNA endonucleases, using biochemical and cellular evidence. We found diverse Fanzors that frequently associate with various eukaryotic transposases. Reconstruction of Fanzors evolution revealed multiple radiations of RuvC-containing TnpB homologs in eukaryotes. Fanzor genes captured introns and proteins acquired nuclear localization signals, indicating extensive, long-term adaptation to functioning in eukaryotic cells. Fanzor nucleases contain a rearranged catalytic site of the RuvC domain, similar to a distinct subset of TnpBs, and lack collateral cleavage activity. We demonstrate that Fanzors can be harnessed for genome editing in human cells, highlighting the potential of these widespread eukaryotic RNA-guided nucleases for biotechnology applications.
Affiliation(s)
- Kaiyi Jiang
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Justin Lim
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Samantha Sgrizzi
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Michael Trinh
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Alisan Kayabolen
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalya Yutin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Weidong Bao
- Genetic Information Research Institute, 20380 Town Center Ln, Suite 240, Cupertino, CA, USA
- Kazuki Kato
- Structural Biology Division, Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo 153-8904, Japan
- Department of Molecular and Mechanistic Immunology, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo 113-8510, Japan
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Jonathan S. Gootenberg
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Omar O. Abudayyeh
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

24
Orozco-Arias S, Lopez-Murillo LH, Piña JS, Valencia-Castrillon E, Tabares-Soto R, Castillo-Ossa L, Isaza G, Guyot R. Genomic object detection: An improved approach for transposable elements detection and classification using convolutional neural networks. PLoS One 2023; 18:e0291925. [PMID: 37733731 PMCID: PMC10513252 DOI: 10.1371/journal.pone.0291925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 09/10/2023] [Indexed: 09/23/2023] Open
Abstract
Analysis of eukaryotic genomes requires the detection and classification of transposable elements (TEs), a crucial but complex and time-consuming task. To improve the performance of tools that accomplish these tasks, machine learning (ML) approaches that leverage computing resources, such as graphics processing units (GPUs) and multiple central processing unit (CPU) cores, have been adopted. However, until now, the use of ML techniques has mostly been limited to classification of TEs. Herein, a detection-classification strategy (named YORO) based on convolutional neural networks is adapted from computer vision (YOLO) to genomics. This approach enables the detection of genomic objects through the prediction of their position, length, and classification in large DNA sequences such as fully sequenced genomes. As a proof of concept, the internal protein-coding domains of LTR-retrotransposons are used to train the proposed neural network. Precision, recall, accuracy, F1-score, execution times and time ratios, as well as several graphical representations, were used as metrics to measure performance. These promising results open the door for a new generation of deep learning tools for genomics. The YORO architecture is available at https://github.com/simonorozcoarias/YORO.
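The precision, recall, accuracy, and F1-score named in this abstract all derive from the same four confusion-matrix counts. The sketch below shows the standard definitions only; the function name `classification_metrics` and the example counts are hypothetical and do not reproduce YORO's evaluation code or results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from raw counts.

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives. Undefined ratios are reported as 0.0.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts (illustrative only).
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```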
Affiliation(s)
- Simon Orozco-Arias
- Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Colombia
- Center for Technology Development Bioprocess and Agroindustry Plant, Department of Systems and Informatics, Universidad de Caldas, Manizales, Colombia
- Johan S. Piña
- Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Colombia
- Reinel Tabares-Soto
- Center for Technology Development Bioprocess and Agroindustry Plant, Department of Systems and Informatics, Universidad de Caldas, Manizales, Colombia
- Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Colombia
- Luis Castillo-Ossa
- Center for Technology Development Bioprocess and Agroindustry Plant, Department of Systems and Informatics, Universidad de Caldas, Manizales, Colombia
- Gustavo Isaza
- Center for Technology Development Bioprocess and Agroindustry Plant, Department of Systems and Informatics, Universidad de Caldas, Manizales, Colombia
- Romain Guyot
- Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Colombia
- Institut de Recherche pour le Développement, CIRAD, Univ. Montpellier, Montpellier, France

25
Jung J, Jhang SY, Kim B, Koh B, Ban C, Seo H, Park T, Chi WJ, Kim S, Kim H, Yu J. The first high-quality genome assembly and annotation of Patiria pectinifera. Sci Data 2023; 10:642. [PMID: 37730712 PMCID: PMC10511450 DOI: 10.1038/s41597-023-02508-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 08/29/2023] [Indexed: 09/22/2023] Open
Abstract
The blue bat star, a highly adaptive species in the East Sea of Korea, has displayed remarkable success in adapting to recent climate change. The genetic mechanisms behind this success were not well understood, prompting our report on the first chromosome-level assembly of the Patiria genus. We assembled the genome using Nanopore and Illumina sequences, yielding a total length of 615 Mb and a scaffold N50 of 24,204,423 bp. Hi-C analysis allowed us to anchor the scaffold sequences onto 22 pseudochromosomes. K-mer based analysis revealed a genome heterozygosity rate of 5.16%, higher than that of any previously reported echinoderm species. Our transposable element analysis revealed a substantial number of genome-wide retrotransposons and DNA transposons. These results offer valuable resources for understanding the evolutionary mechanisms behind P. pectinifera's successful adaptation to fluctuating environments.
Affiliation(s)
- Jaehoon Jung
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Department of Agricultural and Life Sciences and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
- So Yun Jhang
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Republic of Korea
- Bongsang Kim
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Department of Agricultural and Life Sciences and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
- Bomin Koh
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Department of Agricultural and Life Sciences and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
- Chaeyoung Ban
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Hyojung Seo
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Taeseo Park
- Animal Resources Division, National Institute of Biological Resources, Incheon, 22689, Republic of Korea
- Won-Jae Chi
- Microorganism Resources Division, National Institute of Biological Resources, Incheon, 22689, Republic of Korea
- Soonok Kim
- Microorganism Resources Division, National Institute of Biological Resources, Incheon, 22689, Republic of Korea
- Heebal Kim
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea
- Department of Agricultural and Life Sciences and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Republic of Korea
- Jaewoong Yu
- eGnome, Inc., 26 Beobwon-ro 9-gil, Songpa-gu, Seoul, 05836, Republic of Korea.

26
Liao X, Zhu W, Zhou J, Li H, Xu X, Zhang B, Gao X. Repetitive DNA sequence detection and its role in the human genome. Commun Biol 2023; 6:954. [PMID: 37726397 PMCID: PMC10509279 DOI: 10.1038/s42003-023-05322-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/29/2023] [Accepted: 09/04/2023] [Indexed: 09/21/2023] Open
Abstract
Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarize the definition, arrangement, and structural characteristics of repeats. We also introduce the diverse biological functions of repeats and review existing methods for automatic repeat detection, classification, and masking. Finally, we analyze the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of their association with human diseases.
Affiliation(s)
- Xingyu Liao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Wufei Zhu
- Department of Endocrinology, Yichang Central People's Hospital, The First College of Clinical Medical Science, China Three Gorges University, 443000, Yichang, P.R. China
- Juexiao Zhou
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Haoyang Li
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xiaopeng Xu
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Bin Zhang
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia.

27
Zhao P, Peng C, Fang L, Wang Z, Liu GE. Taming transposable elements in livestock and poultry: a review of their roles and applications. Genet Sel Evol 2023; 55:50. [PMID: 37479995 PMCID: PMC10362595 DOI: 10.1186/s12711-023-00821-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/16/2023] [Accepted: 06/30/2023] [Indexed: 07/23/2023] Open
Abstract
Livestock and poultry play a significant role in human nutrition by converting agricultural by-products into high-quality proteins. To meet the growing demand for safe animal protein, genetic improvement of livestock must be done sustainably while minimizing negative environmental impacts. Transposable elements (TE) are important components of livestock and poultry genomes, contributing to their genetic diversity, chromatin states, gene regulatory networks, and complex traits of economic value. However, compared to other species, research on TE in livestock and poultry is still in its early stages. In this review, we analyze 72 studies published in the past 20 years, summarize the TE composition in livestock and poultry genomes, and focus on their potential roles in functional genomics. We also discuss bioinformatic tools and strategies for integrating multi-omics data with TE, and explore future directions, feasibility, and challenges of TE research in livestock and poultry. In addition, we suggest strategies to apply TE in basic biological research and animal breeding. Our goal is to provide a new perspective on the importance of TE in livestock and poultry genomes.
Affiliation(s)
- Pengju Zhao
- Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
- College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
- Chen Peng
- Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China
- College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China
- Lingzhao Fang
- Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark.
- Zhengguang Wang
- Hainan Institute of Zhejiang University, Hainan Sanya, 572000, China.
- College of Animal Sciences, Zhejiang University, Zhejiang, Hangzhou, People's Republic of China.
- George E Liu
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, MD, 20705, USA.

28
Gonzalez‐García LN, Lozano‐Arce D, Londoño JP, Guyot R, Duitama J. Efficient homology-based annotation of transposable elements using minimizers. Applications in Plant Sciences 2023; 11:e11520. [PMID: 37601317 PMCID: PMC10439823 DOI: 10.1002/aps3.11520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2022] [Revised: 03/02/2023] [Accepted: 03/04/2023] [Indexed: 08/22/2023]
Abstract
Premise: Transposable elements (TEs) make up more than half of the genomes of complex plant species and can modulate the expression of neighboring genes, producing significant variability of agronomically relevant traits. The availability of long-read sequencing technologies allows the building of genome assemblies for plant species with large and complex genomes. Unfortunately, TE annotation currently represents a bottleneck in the annotation of genome assemblies.
Methods and Results: We present a new functionality of the Next-Generation Sequencing Experience Platform (NGSEP) to perform efficient homology-based TE annotation. Sequences in a reference library are treated as long reads and mapped to an input genome assembly. A hierarchical annotation is then assigned by homology using the annotation of the reference library. We tested the performance of our algorithm on genome assemblies of different plant species, including Arabidopsis thaliana, Oryza sativa, Coffea humblotiana, and Triticum aestivum (bread wheat). Our algorithm outperforms traditional homology-based annotation tools in speed by a factor of three to >20, reducing the annotation time of the T. aestivum genome from months to hours, and recovering up to 80% of TEs annotated with RepeatMasker with a precision of up to 0.95.
Conclusions: NGSEP allows rapid analysis of TEs, especially in very large and TE-rich plant genomes.
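The title of this entry refers to minimizers, a sampling scheme that keeps one representative k-mer per window so that two similar sequences can be matched on a small set of seeds. The sketch below is a generic illustration of (w, k)-minimizer selection under the assumption of plain lexicographic ordering; it is not NGSEP's implementation, and the function name, parameters, and example sequence are hypothetical.

```python
def minimizers(seq, k, w):
    """Return the sorted list of (position, k-mer) minimizers of seq.

    For every window of w consecutive k-mers, the lexicographically
    smallest k-mer (leftmost on ties) is selected.
    """
    if k <= 0 or w <= 0:
        raise ValueError("k and w must be positive")
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # min() returns the first minimal element, so ties go leftmost.
        offset = min(range(w), key=lambda j: window[j])
        selected.add((start + offset, window[offset]))
    return sorted(selected)


# Hypothetical sequence (illustrative only).
print(minimizers("ACGTTGCA", k=3, w=3))
```

Production mappers typically order k-mers by a hash rather than lexicographically, which avoids over-selecting low-complexity k-mers such as poly-A runs.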
Affiliation(s)
- Laura Natalia Gonzalez‐García
- Systems and Computing Engineering Department, Universidad de los Andes, Bogotá, Colombia
- UMR DIADE, Institut de Recherche pour le Développement, Université de Montpellier, CIRAD, 34394 Montpellier, France
- Daniela Lozano‐Arce
- Systems and Computing Engineering Department, Universidad de los Andes, Bogotá, Colombia
- Romain Guyot
- UMR DIADE, Institut de Recherche pour le Développement, Université de Montpellier, CIRAD, 34394 Montpellier, France
- Jorge Duitama
- Systems and Computing Engineering Department, Universidad de los Andes, Bogotá, Colombia

29
Jiang K, Lim J, Sgrizzi S, Trinh M, Kayabolen A, Yutin N, Koonin EV, Abudayyeh OO, Gootenberg JS. Programmable RNA-guided endonucleases are widespread in eukaryotes and their viruses. bioRxiv 2023:2023.06.13.544871. [PMID: 37398409 PMCID: PMC10312701 DOI: 10.1101/2023.06.13.544871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
TnpB proteins are RNA-guided nucleases that are broadly associated with IS200/605 family transposons in prokaryotes. TnpB homologs, named Fanzors, have been detected in genomes of some eukaryotes and large viruses, but their activity and functions in eukaryotes remain unknown. We searched genomes of diverse eukaryotes and their viruses for TnpB homologs and identified numerous putative RNA-guided nucleases that are often associated with various transposases, suggesting they are encoded in mobile genetic elements. Reconstruction of the evolution of these nucleases, which we rename Horizontally-transferred Eukaryotic RNA-guided Mobile Element Systems (HERMES), revealed multiple acquisitions of TnpBs by eukaryotes and subsequent diversification. In their adaptation and spread in eukaryotes, HERMES proteins acquired nuclear localization signals, and genes captured introns, indicating extensive, long-term adaptation to functioning in eukaryotic cells. Biochemical and cellular evidence shows that HERMES employ non-coding RNAs encoded adjacent to the nuclease for RNA-guided cleavage of double-stranded DNA. HERMES nucleases contain a rearranged catalytic site of the RuvC domain, similar to a distinct subset of TnpBs, and lack collateral cleavage activity. We demonstrate that HERMES can be harnessed for genome editing in human cells, highlighting the potential of these widespread eukaryotic RNA-guided nucleases for biotechnology applications.
Affiliation(s)
- Kaiyi Jiang
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Justin Lim
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Samantha Sgrizzi
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Michael Trinh
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Alisan Kayabolen
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalya Yutin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Omar O. Abudayyeh
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Jonathan S. Gootenberg
- McGovern Institute for Brain Research at MIT, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

30
Gao Y, Liao HB, Liu TH, Wu JM, Wang ZF, Cao HL. Draft genome and transcriptome of Nepenthes mirabilis, a carnivorous plant in China. BMC Genom Data 2023; 24:21. [PMID: 37060047 PMCID: PMC10103442 DOI: 10.1186/s12863-023-01126-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/11/2022] [Accepted: 04/06/2023] [Indexed: 04/16/2023] Open
Abstract
Objectives: Nepenthes belongs to the monotypic family Nepenthaceae, one of the largest carnivorous plant families. Nepenthes species show impressive adaptive radiation and suffer from being overexploited in nature. Nepenthes mirabilis is the most widely distributed species and the only Nepenthes species that is naturally distributed within China. Herein, we report the genome and transcriptome assemblies of N. mirabilis. The assemblies will be useful resources for comparative genomics and for understanding the adaptation and conservation of carnivorous species.
Data description: This work produced ~139.5 Gb of N. mirabilis whole genome sequencing reads using leaf tissues, and ~21.7 Gb and ~27.9 Gb of raw RNA-seq reads for its leaves and flowers, respectively. Transcriptome assembly yielded 339,802 transcripts, in which 79,758 open reading frames (ORFs) were identified. Function analysis indicated that these ORFs were mainly associated with proteolysis and DNA integration. The assembled genome was 691,409,685 bp with 159,555 contigs/scaffolds and an N50 of 10,307 bp. The BUSCO assessment of the assembled genome and transcriptome indicated 91.1% and 93.7% completeness, respectively. A total of 42,961 genes were predicted in the genome, coding for 45,461 proteins. The predicted genes were annotated using multiple databases, facilitating their future functional analysis. This is the first genome report on the Nepenthaceae family.
Affiliation(s)
- Yuan Gao
- Zhongshan Management Centre of the Natural Protected Area, Zhongshan, China
- Hao-Bin Liao
- Zhongshan Management Centre of the Natural Protected Area, Zhongshan, China
- Ting-Hong Liu
- Guangdong Provincial Key Laboratory of Applied Botany, Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Jia-Ming Wu
- Zhongshan Management Centre of the Natural Protected Area, Zhongshan, China
- Zheng-Feng Wang
- Guangdong Provincial Key Laboratory of Applied Botany, Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China.
- Hong-Lin Cao
- Guangdong Provincial Key Laboratory of Applied Botany, Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China.

31
Amiri S, Adibzadeh S, Ghanbari S, Rahmani B, Kheirandish MH, Farokhi-Fard A, Dastjerdeh MS, Davami F. CRISPR-interceded CHO cell line development approaches. Biotechnol Bioeng 2023; 120:865-902. [PMID: 36597180 DOI: 10.1002/bit.28329] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/06/2022] [Revised: 11/28/2022] [Accepted: 01/02/2023] [Indexed: 01/05/2023]
Abstract
For industrial production of recombinant protein biopharmaceuticals, Chinese hamster ovary (CHO) cells represent the most widely adopted host cell system, owing to their capacity to produce high-quality biologics with human-like posttranslational modifications. As opposed to random integration, targeted genome editing in genomic safe harbor sites has offered CHO cell line engineering a new perspective, ensuring production consistency in long-term culture and high biotherapeutic expression levels. In step with the remarkable advances in knowledge of CRISPR-Cas systems, CRISPR-Cas technology, together with donor design strategies, has been applied to a growing number of novel scenarios in cell line engineering, allowing scientists to modify mammalian genomes such as that of the CHO cell line quickly, readily, and efficiently. Depending on the strategies and production requirements, the gene of interest can also be incorporated at single or multiple loci. This review summarizes the most fundamental recent advances in CHO cell line development, including different cell line engineering approaches and donor design strategies for targeted integration of the desired construct into genomic hot spots, which could ultimately lead to a fast-track product development process with consistent, improved product yield and quality.
Affiliation(s)
- Shahin Amiri
- Department of Medical Biotechnology, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Setare Adibzadeh
- Department of Medical Biotechnology, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Samaneh Ghanbari
- Department of Medical Biotechnology, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Behnaz Rahmani
- Department of Medical Biotechnology, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Mohammad H Kheirandish
- Department of Medical Biotechnology, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Department of Medical Biotechnology, School of Advanced Technologies, Tehran University of Medical Sciences, Tehran, Iran
- Aref Farokhi-Fard
- Department of Medical Biotechnology, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Mansoureh S Dastjerdeh
- Department of Medical Biotechnology, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
- Fatemeh Davami
- Department of Medical Biotechnology, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran

32
Gao L, Xu W, Xin T, Song J. Application of third-generation sequencing to herbal genomics. Front Plant Sci 2023; 14:1124536. [PMID: 36959935 PMCID: PMC10027759 DOI: 10.3389/fpls.2023.1124536] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/15/2022] [Accepted: 02/02/2023] [Indexed: 06/18/2023]
Abstract
There is a long history of traditional medicine use. However, little genetic information is available for the plants used in traditional medicine, which limits the exploitation of these natural resources. Third-generation sequencing (TGS) techniques have made it possible to gather invaluable genetic information and develop herbal genomics. In this review, we introduce two main TGS techniques, PacBio SMRT technology and Oxford Nanopore technology, and compare the two techniques against Illumina, the predominant next-generation sequencing technique. In addition, we summarize the nuclear and organelle genome assemblies of commonly used medicinal plants, choose several examples from genomics, transcriptomics, and molecular identification studies to dissect the specific processes and summarize the advantages and disadvantages of the two TGS techniques when applied to medicinal organisms. Finally, we describe how we expect that TGS techniques will be widely utilized to assemble telomere-to-telomere (T2T) genomes and in epigenomics research involving medicinal plants.

33
Maurya A, Szymanski M, Karlowski WM. ARA: a flexible pipeline for automated exploration of NCBI SRA datasets. Gigascience 2023; 12:giad067. [PMID: 37589306 PMCID: PMC10433097 DOI: 10.1093/gigascience/giad067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 07/07/2023] [Accepted: 08/01/2023] [Indexed: 08/18/2023] Open
Abstract
BACKGROUND: One of the most effective and useful methods to explore the content of biological databases is searching with nucleotide or protein sequences as a query. However, especially in the case of nucleic acids, due to the large volume of data generated by the next-generation sequencing (NGS) technologies, this approach is often not available. The hierarchical organization of the NGS records is primarily designed for browsing or text-based searches of the information provided in metadata-related keywords, limiting the efficiency of database exploration. FINDINGS: We developed an automated pipeline that incorporates well-established NGS data-processing tools and procedures to allow easy and effective sampling of the NCBI SRA database records. Given a file with query nucleotide sequences, our tool estimates the matching content of SRA accessions by probing only a user-defined fraction of a record's sequences. Based on the selected parameters, it allows performing a full mapping experiment with records that meet the required criteria. The pipeline is designed to be easy to operate: it offers a fully automatic setup procedure and relies on a fixed set of tested supporting tools. The modular design and implemented usage modes allow a user to scale up the analyses into complex computational infrastructure. CONCLUSIONS: We present an easy-to-operate and automated tool that expands the way a user can access and explore the information contained within the records deposited in the NCBI SRA database.
Affiliation(s)
- Anand Maurya
- Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 6, 61-614 Poznan, Poland
- Maciej Szymanski
- Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 6, 61-614 Poznan, Poland
- Wojciech M Karlowski
- Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 6, 61-614 Poznan, Poland

34
Orozco-Arias S, Humberto Lopez-Murillo L, Candamil-Cortés MS, Arias M, Jaimes PA, Rossi Paschoal A, Tabares-Soto R, Isaza G, Guyot R. Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes. Brief Bioinform 2022; 24:6887110. [PMID: 36502372 PMCID: PMC9851300 DOI: 10.1093/bib/bbac511] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/01/2022] [Revised: 10/13/2022] [Accepted: 10/26/2022] [Indexed: 12/14/2022] Open
Abstract
LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Their characterization is of great importance to understand their dynamics. However, the identification and classification of these elements remains a challenge today. Moreover, current software can be relatively slow (from hours to days), sometimes involves a lot of manual work, and does not reach satisfactory levels of precision and sensitivity. Here we present Inpactor2, an accurate and fast application that creates LTR-retrotransposon reference libraries in a very short time. Inpactor2 takes an assembled genome as input and follows a hybrid approach (deep learning and structure-based) to detect elements, filter partial sequences and finally classify intact sequences into superfamilies and, as very few tools do, into lineages. This tool takes advantage of multi-core and GPU architectures to decrease execution times. Using the rice genome, Inpactor2 showed a run time of 5 minutes (faster than other tools) and has the best accuracy and F1-Score of the tools tested here, also having the second best accuracy and specificity only surpassed by EDTA, but achieving 28% higher sensitivity. For large genomes, Inpactor2 is up to seven times faster than other available bioinformatics tools.
Affiliation(s)
- Simon Orozco-Arias
- Corresponding author. Computer Science Department, Universidad Autónoma de Manizales, Antigua Estación del Ferrocarril, Manizales, Colombia. Tel.: +57(606)8727272 - 8727709 Ext 102
- Maradey Arias
- Department of Computer Science, Universidad Autónoma de Manizales, 170001, Caldas, Colombia
- Paula A Jaimes
- Department of Computer Science, Universidad Autónoma de Manizales, 170001, Caldas, Colombia
- Alexandre Rossi Paschoal
- Corresponding author. Department of Computer Science, Bioinformatics and Pattern Recognition Group, Graduation Program in Bioinformatics, Federal University of Technology - Paraná, UTFPR, Cornélio Procópio, Paraná, 86300-000, Brazil. Tel.: +433133-3790
- Reinel Tabares-Soto
- Department of Electronics and Automation, Universidad Autónoma de Manizales, 170001, Caldas, Colombia
- Gustavo Isaza
- Corresponding author. Systems and Informatics Department, Center for Technology Development - Bioprocess and Agro-industry Plant, Universidad de Caldas, St 65 #26-10, Manizales, Colombia. Tel.: +57(606)8781500 ext 13146
- Romain Guyot
- Corresponding author. IRD, 911 Av. Agropolis, 34394 Montpellier, France. Tel.: +334674160000

35
Berger J, Legendre F, Zelosko KM, Harrison MC, Grandcolas P, Bornberg-Bauer E, Fouks B. Eusocial Transition in Blattodea: Transposable Elements and Shifts of Gene Expression. Genes (Basel) 2022; 13:1948. [PMID: 36360186 PMCID: PMC9689775 DOI: 10.3390/genes13111948] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/23/2022] [Revised: 10/19/2022] [Accepted: 10/20/2022] [Indexed: 11/07/2023] Open
Abstract
(1) Unravelling the molecular basis underlying major evolutionary transitions can shed light on how complex phenotypes arise. The evolution of eusociality, a major evolutionary transition, has been demonstrated to be accompanied by enhanced gene regulation. Numerous pieces of evidence suggest the major impact of transposon insertion on gene regulation and its role in adaptive evolution. Transposons have been shown to play a role in gene duplication involved in the eusocial transition in termites. However, evidence of the molecular basis underlying the eusocial transition in Blattodea remains scarce. Could transposons have facilitated the eusocial transition in termites through shifts of gene expression? (2) Using available cockroach and termite genomes and transcriptomes, we investigated whether transposons insert more frequently in genes with differential expression in queens and workers and whether those genes could be linked to specific functions essential for eusocial transition. (3) The insertion rate of transposons differs among differentially expressed genes and displays opposite trends between termites and cockroaches. The functions of termite transposon-rich queen-biased genes are related to reproduction and ageing, and those of worker-biased genes to behaviour and gene expression. (4) Our study provides further evidence on the role of transposons in the evolution of eusociality, potentially through shifts in gene expression.
Affiliation(s)
- Juliette Berger
- Institut de Systématique, Évolution, Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, CP50, 57 rue Cuvier, 75005 Paris, France
- Frédéric Legendre
- Institut de Systématique, Évolution, Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, CP50, 57 rue Cuvier, 75005 Paris, France
- Kevin-Markus Zelosko
- Institute for Evolution and Biodiversity, Molecular Evolution and Bioinformatics, Westfälische Wilhelms-Universität, Hüfferstrasse 1, 48149 Münster, Germany
- Mark C. Harrison
- Institute for Evolution and Biodiversity, Molecular Evolution and Bioinformatics, Westfälische Wilhelms-Universität, Hüfferstrasse 1, 48149 Münster, Germany
- Philippe Grandcolas
- Institut de Systématique, Évolution, Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, CP50, 57 rue Cuvier, 75005 Paris, France
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, Molecular Evolution and Bioinformatics, Westfälische Wilhelms-Universität, Hüfferstrasse 1, 48149 Münster, Germany
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Max-Planck-Ring 5, 72076 Tübingen, Germany
- Bertrand Fouks
- Institute for Evolution and Biodiversity, Molecular Evolution and Bioinformatics, Westfälische Wilhelms-Universität, Hüfferstrasse 1, 48149 Münster, Germany

36
McLay TGB, Murphy DJ, Holmes GD, Mathews S, Brown GK, Cantrill DJ, Udovicic F, Allnutt TR, Jackson CJ. A genome resource for Acacia, Australia's largest plant genus. PLoS One 2022; 17:e0274267. [PMID: 36240205 PMCID: PMC9565413 DOI: 10.1371/journal.pone.0274267] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/04/2022] [Accepted: 08/24/2022] [Indexed: 11/05/2022] Open
Abstract
Acacia (Leguminosae, Caesalpinioideae, mimosoid clade) is the largest and most widespread genus of plants in the Australian flora, occupying and dominating a diverse range of environments, with an equally diverse range of forms. For a genus of its size and importance, Acacia currently has surprisingly few genomic resources. Acacia pycnantha, the golden wattle, is a woody shrub or tree occurring in south-eastern Australia and is the country's floral emblem. To assemble a genome for A. pycnantha, we generated long-read sequences using Oxford Nanopore Technology, 10x Genomics Chromium linked reads, and short-read Illumina sequences, and produced an assembly spanning 814 Mb, with a scaffold N50 of 2.8 Mb, and 98.3% of complete Embryophyta BUSCOs. Genome annotation predicted 47,624 protein-coding genes, with 62.3% of the genome predicted to comprise transposable elements. Evolutionary analyses indicated a shared genome duplication event in the Caesalpinioideae, and conflict in the relationships between Cercis (subfamily Cercidoideae) and subfamilies Caesalpinioideae and Papilionoideae (pea-flowered legumes). Comparative genomics identified a suite of expanded and contracted gene families in A. pycnantha, and these were annotated with both GO terms and KEGG functional categories. One expanded gene family of particular interest is involved in flowering time and may be associated with the characteristic synchronous flowering of Acacia. This genome assembly and annotation will be a valuable resource for all studies involving Acacia, including the evolution, conservation, breeding, invasiveness, and physiology of the genus, and for comparative studies of legumes.
Affiliation(s)
- Todd G. B. McLay
- Royal Botanic Gardens Victoria, South Yarra, Victoria, Australia
- School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
- Centre for Australian Biodiversity Research, CSIRO, Black Mountain, Australian Capital Territory, Australia
- Daniel J. Murphy
- Royal Botanic Gardens Victoria, South Yarra, Victoria, Australia
- Gareth D. Holmes
- Royal Botanic Gardens Victoria, South Yarra, Victoria, Australia
- Sarah Mathews
- Centre for Australian Biodiversity Research, CSIRO, Black Mountain, Australian Capital Territory, Australia
- Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Gillian K. Brown
- Queensland Herbarium, Department of Environment and Science, Toowong, Queensland, Australia
- Frank Udovicic
- Royal Botanic Gardens Victoria, South Yarra, Victoria, Australia
- Chris J. Jackson
- Royal Botanic Gardens Victoria, South Yarra, Victoria, Australia

37
Athanasouli M, Rödelsperger C. Analysis of repeat elements in the Pristionchus pacificus genome reveals an ancient invasion by horizontally transferred transposons. BMC Genomics 2022; 23:523. [PMID: 35854227 PMCID: PMC9297572 DOI: 10.1186/s12864-022-08731-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/29/2022] [Accepted: 07/01/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Repetitive sequences and mobile elements make up considerable fractions of individual genomes. While transposition events can be detrimental for organismal fitness, repetitive sequences form an enormous reservoir for molecular innovation. In this study, we aim to add repetitive elements to the annotation of the Pristionchus pacificus genome and assess their impact on novel gene formation. RESULTS: Different computational approaches define up to 24% of the P. pacificus genome as repetitive sequences. While retroelements are more frequently found at the chromosome arms, DNA transposons are distributed more evenly. We found multiple DNA transposons, as well as LTR and LINE elements with abundant evidence of expression as single-exon transcripts. When testing whether transposons disproportionately contribute towards new gene formation, we found that roughly 10-20% of genes across all age classes overlap transposable elements with the strongest trend being an enrichment of low complexity regions among the oldest genes. Finally, we characterized a horizontal gene transfer of Zisupton elements into diplogastrid nematodes. These DNA transposons invaded nematodes from eukaryotic donor species and experienced a recent burst of activity in the P. pacificus lineage. CONCLUSIONS: The comprehensive annotation of repetitive elements in the P. pacificus genome builds a resource for future functional genomic analyses as well as for more detailed investigations of molecular innovations.
Affiliation(s)
- Marina Athanasouli
- Max Planck Institute for Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany
- Christian Rödelsperger
- Max Planck Institute for Biology, Department for Integrative Evolutionary Biology, Max-Planck-Ring 9, 72076, Tübingen, Germany.