1
Deng X, Liu L. BiGM-lncLoc: Bi-level Multi-Graph Meta-Learning for Predicting Cell-Specific Long Noncoding RNAs Subcellular Localization. Interdiscip Sci 2025; 17:359-374. [PMID: 39724386 DOI: 10.1007/s12539-024-00679-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Revised: 11/11/2024] [Accepted: 11/18/2024] [Indexed: 12/28/2024]
Abstract
The precise spatiotemporal expression of long noncoding RNAs (lncRNAs) plays a pivotal role in biological regulation, and aberrant expression of lncRNAs in different subcellular localizations has been intricately linked to the onset and progression of a variety of cancers. Computational methods provide effective means for predicting lncRNA subcellular localization, but current studies either ignore cell line and tissue specificity or the correlation and shared information among cell lines. In this study, we propose a novel approach, BiGM-lncLoc, treating the prediction of lncRNA subcellular localization across cell lines as a multi-graph meta-learning task. Our investigation involves two categories of data: the localization data of nucleotide sequences in different cell lines and cell line expression data. BiGM-lncLoc comprises a cell line-specific optimization network learning specific knowledge from cell line expression data and a graph neural network optimized across cell lines. Subsequently, the specific and shared knowledge acquired through bi-level optimization is applied to a new cell-line prediction task without the need for re-training or fine-tuning. Additionally, through key feature analysis of the impact of different nucleotide combinations on the model, we confirm the necessity of cell line-specific studies based on correlation analysis. Finally, experiments conducted on various cell lines with different data sizes indicate that BiGM-lncLoc outperforms other methods in terms of prediction accuracy, with an average accuracy of 97.7%. After removing overlapping samples to ensure data independence for each cell line, the accuracy ranged from 82.4% to 94.7%, still surpassing existing models. Our code can be found at https://github.com/BioCL1/BiGM-lncLoc .
Affiliation(s)
- Xi Deng
- School of Information, Yunnan Normal University, Kunming, 650500, China
- Lin Liu
- School of Information, Yunnan Normal University, Kunming, 650500, China.
- Department of Education of Yunnan Province, Engineering Research Center of Computer Vision and Intelligent Control Technology, Kunming, 650500, China.
2
Natarajan S, Gehrke J, Pucker B. Mapping-based genome size estimation. BMC Genomics 2025; 26:482. [PMID: 40369445 PMCID: PMC12079912 DOI: 10.1186/s12864-025-11640-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2024] [Accepted: 04/25/2025] [Indexed: 05/16/2025] Open
Abstract
While the size of chromosomes can be measured under a microscope, obtaining the exact size of a genome remains a challenge. Biochemical methods and k-mer distribution-based approaches allow only estimations. An alternative approach to estimate the genome size based on high contiguity assemblies and read mappings is presented here. Analyses of Arabidopsis thaliana and Beta vulgaris data sets are presented to show the impact of different parameters. Oryza sativa, Brachypodium distachyon, Solanum lycopersicum, Vitis vinifera, and Zea mays were also analyzed to demonstrate the broad applicability of this approach. Further, MGSE was also used to analyze Escherichia coli, Saccharomyces cerevisiae, and Caenorhabditis elegans datasets to show its utility beyond plants. Mapping-based Genome Size Estimation (MGSE) and additional scripts are available on GitHub: https://github.com/bpucker/MGSE . MGSE predicts genome sizes based on short reads or long reads requiring a minimal coverage of 5-fold.
Affiliation(s)
- Shakunthala Natarajan
- Plant Biotechnology and Bioinformatics, Institute of Plant Biology & BRICS, TU Braunschweig, Mendelssohnstrasse 4, 38106, Braunschweig, Germany
- Molecular Plant Sciences, Institute for Cellular and Molecular Botany, University of Bonn, Kirschallee 1, 53115, Bonn, Germany
- Jessica Gehrke
- Plant Biotechnology and Bioinformatics, Institute of Plant Biology & BRICS, TU Braunschweig, Mendelssohnstrasse 4, 38106, Braunschweig, Germany
- Boas Pucker
- Plant Biotechnology and Bioinformatics, Institute of Plant Biology & BRICS, TU Braunschweig, Mendelssohnstrasse 4, 38106, Braunschweig, Germany.
- Molecular Plant Sciences, Institute for Cellular and Molecular Botany, University of Bonn, Kirschallee 1, 53115, Bonn, Germany.
3
Meng J, Wang Y, Guo R, Liu J, Jing K, Zuo J, Yuan Y, Jiang F, Dong N. Integrated genomic and transcriptomic analyses reveal the genetic and molecular mechanisms underlying hawthorn peel color and seed hardness diversity. J Genet Genomics 2025:S1673-8527(25)00097-9. [PMID: 40220858 DOI: 10.1016/j.jgg.2025.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2024] [Revised: 03/30/2025] [Accepted: 04/01/2025] [Indexed: 04/14/2025]
Abstract
Hawthorn (Crataegus pinnatifida) fruit peel color and seed hardness are key traits that significantly impact economic value. We present here the high-quality chromosome-scale genomes of two cultivars, including the hard-seed, yellow-peel C. pinnatifida "Jinruyi" (JRY) and the soft-seed, red-peel C. pinnatifida "Ruanzi" (RZ). The assembled genomes comprising 17 chromosomes are 809.1 Mb and 760.5 Mb in size, achieving scaffold N50 values of 48.5 Mb and 46.8 Mb for JRY and RZ, respectively. Comparative genomic analysis identifies 3.6-3.8 million single nucleotide polymorphisms, 8.5-9.3 million insertions/deletions, and approximately 30 Mb of presence/absence variations across different hawthorn genomes. Through integrating differentially expressed genes and accumulated metabolites, we filter candidate genes CpMYB114 and CpMYB44 associated with differences in hawthorn fruit peel color and seed hardness, respectively. Functional validation confirms that CpMYB114-CpANS regulates anthocyanin biosynthesis in hawthorn peels, contributing to the observed variation in peel color. CpMYB44-CpCOMT is significantly upregulated in JRY and is verified to promote lignin biosynthesis, resulting in the distinction in seed hardness. Overall, this study provides new insights into the distinct peel pigmentation and seed hardness of hawthorn and provides an abundant resource for molecular breeding.
Affiliation(s)
- Jiaxin Meng
- Beijing Engineering Research Center for Deciduous Fruit Trees, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture and Rural Affairs, Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China
- Yan Wang
- Beijing Engineering Research Center for Deciduous Fruit Trees, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture and Rural Affairs, Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China
- Rongkun Guo
- Beijing Engineering Research Center for Deciduous Fruit Trees, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture and Rural Affairs, Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China
- Jianyi Liu
- Beijing Engineering Research Center for Deciduous Fruit Trees, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture and Rural Affairs, Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China
- Kerui Jing
- Beijing Engineering Research Center for Deciduous Fruit Trees, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture and Rural Affairs, Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China
- Jiaqi Zuo
- Beijing Engineering Research Center for Deciduous Fruit Trees, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture and Rural Affairs, Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China
- Yanping Yuan
- College of Landscape Architecture and Arts, Northwest A&F University, Yangling, Shaanxi 712100, China
- Fengchao Jiang
- Beijing Engineering Research Center for Deciduous Fruit Trees, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture and Rural Affairs, Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China.
- Ningguang Dong
- Beijing Engineering Research Center for Deciduous Fruit Trees, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture and Rural Affairs, Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, China.
4
Dasgupta A, Saikia R, Kakoti BB, Handique PJ. Draft genome sequence of benzo[a]pyrene degrading Bacillus altitudinis strain AR19 isolated from Digboi oil refinery (India). Microbiol Resour Announc 2025; 14:e0095724. [PMID: 40079616 PMCID: PMC11984180 DOI: 10.1128/mra.00957-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2024] [Accepted: 02/15/2025] [Indexed: 03/15/2025] Open
Abstract
Bacillus altitudinis AR19, isolated from the Digboi oil refinery (India), has a genome size of 3,630,000 bp with a G+C content of 42.45%. The genome encodes 3,755 protein-coding genes, including those for ring-cleaving dioxygenases and biofilm formation. These genes likely play crucial roles in the bacterium's survival in hydrocarbon-enriched environments.
Affiliation(s)
- Abhisek Dasgupta
- Centre for Biotechnology and Bioinformatics, Dibrugarh University, Dibrugarh, India
- Ratul Saikia
- Biological Sciences and Technology Division, CSIR-North East Institute of Science and Technology, Jorhat, India
- Bibhuti B. Kakoti
- Centre for Biotechnology and Bioinformatics, Dibrugarh University, Dibrugarh, India
5
Lim AH, Ng CCY, Hong JH, Lee EC, Kwek K, Tan P, Kifle H, Ng I, Teh BT. Genomic garden: From societal and scientific impacts to biodiversity conservation. CELL GENOMICS 2025; 5:100779. [PMID: 40020688 PMCID: PMC12008809 DOI: 10.1016/j.xgen.2025.100779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2024] [Revised: 12/20/2024] [Accepted: 01/30/2025] [Indexed: 03/03/2025]
Abstract
Because of urbanization, deforestation, pollution, climate change, and natural disasters, the loss of biodiversity is a pressing concern globally. As part of our efforts toward biodiversity conservation, we propose the establishment of a genomic garden, where the genome of each plant in the garden is elucidated. Combining science, horticulture, and a digital content hub accessible with any handheld device, the genomic garden serves multiple purposes, from enhancing urban landscapes, facilitating biomedical research, and improving population health to providing entertainment and education for visitors.
Affiliation(s)
- Abner Herbert Lim
- SingHealth Duke-NUS Institute of Biodiversity Medicine, Singapore, Singapore
- Jing Han Hong
- SingHealth Duke-NUS Institute of Biodiversity Medicine, Singapore, Singapore; Duke-NUS Medical School, Singapore, Singapore
- Kenneth Kwek
- Singapore General Hospital, Singapore, Singapore
- Patrick Tan
- Duke-NUS Medical School, Singapore, Singapore; Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Hazri Kifle
- Universiti Brunei Darussalam, Brunei Darussalam, Brunei
- Ivy Ng
- Singapore Health Services, Singapore, Singapore
- Bin Tean Teh
- SingHealth Duke-NUS Institute of Biodiversity Medicine, Singapore, Singapore; Duke-NUS Medical School, Singapore, Singapore; Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore; Singapore Health Services, Singapore, Singapore.
6
Shen L, Qi Z, Dai X, Ai Y, Chen J, Chao Y, He H, Han L, Xu L. Chromosome-scale genome assembly of Zoysia japonica uncovers cold tolerance candidate genes. Sci Data 2025; 12:571. [PMID: 40180989 PMCID: PMC11968985 DOI: 10.1038/s41597-025-04827-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 03/13/2025] [Indexed: 04/05/2025] Open
Abstract
Zoysiagrass stands out as a crucial native turfgrass due to its exceptional abiotic stress tolerance, extensive adaptability, and high ornamental value. In this study, we generated a high-quality chromosome-level genome assembly of Compadre (COM) zoysiagrass, leveraging PacBio SMRT sequencing and Hi-C scaffolding technologies. The resulting genome assembly (312.42 Mb) is anchored on 20 chromosomes, with a Scaffold N50 of 18.72 Mb. In total, 49,074 genes and 306,768 repeat sequences were annotated in the assembled genome. The first chromosome-scale genome of Zoysia japonica 'Compadre' provides a critical genetic resource for cold-tolerant turfgrass breeding through identifying stress-responsive candidate genes. Additionally, we have successfully established a cell nucleus extraction and library construction protocol tailored for zoysiagrass ATAC-seq technology, and a total of 80 low temperature tolerance candidate genes were preliminarily identified via ATAC-seq and RNA-seq profiling, thereby initiating the exploration of turfgrass epigenomics.
Affiliation(s)
- Liangying Shen
- School of Grassland Science, Beijing Forestry University, Beijing, 100083, China
- Zewen Qi
- School of Grassland Science, Beijing Forestry University, Beijing, 100083, China
- Institute of Advanced Agricultural Sciences, Peking University, Shandong, 261325, China
- Xiuru Dai
- Institute of Advanced Agricultural Sciences, Peking University, Shandong, 261325, China
- Ye Ai
- School of Grassland Science, Beijing Forestry University, Beijing, 100083, China
- Jiabao Chen
- School of Grassland Science, Beijing Forestry University, Beijing, 100083, China
- Yuehui Chao
- School of Grassland Science, Beijing Forestry University, Beijing, 100083, China
- Hang He
- Institute of Advanced Agricultural Sciences, Peking University, Shandong, 261325, China
- Liebao Han
- School of Grassland Science, Beijing Forestry University, Beijing, 100083, China.
- Engineering and Technology Research Center for Sports Field and Slope Protection Turf, National Forestry and Grassland Administration, Beijing, 100083, China.
- Lixin Xu
- School of Grassland Science, Beijing Forestry University, Beijing, 100083, China.
7
Köhler G, Dost O, Than NL, Ohler A, Charunrochana PT, Chuaynkern Y, Chuaynkern C, Geiss K. A taxonomic revision of the genus Raorchestes in Myanmar and Thailand with the description of two new species from Myanmar (Amphibia, Anura, Rhacophoridae). Zootaxa 2025; 5613:47-81. [PMID: 40173518 DOI: 10.11646/zootaxa.5613.1.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2025] [Indexed: 04/04/2025]
Abstract
We revise the frogs of the genus Raorchestes from Myanmar and Thailand based on data of external morphology, bioacoustics, and molecular genetics. The results of this integrative study provide evidence for the recognition of seven species, two of which we describe as new: Raorchestes mindat sp. nov. from Mindat District, Chin State, western Myanmar, and Raorchestes leiktho sp. nov. from Hpa-an District, Kayin State, southeastern Myanmar. The other species that we recognize in Myanmar and Thailand are R. cangyuanensis, R. huanglianshan, R. longchuanensis, R. menglaensis, and R. parvulus. We have compared the external morphology of the lectotype and four paralectotypes of Ixalus parvulus Boulenger, 1893 with the species of the Raorchestes parvulus group currently recognized from South-east Asia. Although the type series of Ixalus parvulus is morphologically most similar to specimens of R. cangyuanensis from Thailand, we refrain from formally synonymizing these two taxa until genetic data for I. parvulus are available that would allow this hypothesis to be tested. Thus, R. parvulus remains an enigmatic taxon still only known from the original type series. As now defined, R. cangyuanensis is distributed across most of Myanmar except for the Malayan Peninsula, and also in adjacent Yunnan Province, China, and adjacent northeastern Bangladesh. Raorchestes longchuanensis occurs in northwestern Thailand as well as in eastern Myanmar and western Yunnan, China. Raorchestes menglaensis ranges from southern Yunnan, China, across Thailand, Laos, and Cambodia to northern Western Malaysia. Raorchestes huanglianshan is distributed in southern Yunnan, China, and northwestern Thailand. Often two, at some places even three species of this genus occur sympatrically (e.g., R. leiktho sp. nov., R. longchuanensis and R. parvulus near Leiktho, Kayin State, Myanmar; R. longchuanensis and R. huanglianshan at Doi Inthanon). We provide new bioacoustic data for R. longchuanensis, R. menglaensis, and R. leiktho sp. nov., and compare these with data of R. cangyuanensis and R. rezakhani.
Affiliation(s)
- Gunther Köhler
- Senckenberg Forschungsinstitut und Naturmuseum; Senckenberganlage 25; 60325 Frankfurt a.M.; Germany.
- Ole Dost
- Sonnenstraße 14; 72275 Alpirsbach-Römlinsdorf; Germany.
- Annemarie Ohler
- Institut de Systématique; Evolution; Biodiversité (ISYEB); Muséum National d'Histoire Naturelle; CNRS; Sorbonne Université; EPHE; Université des Antilles; 57 Rue Cuvier; 75005 Paris; France.
- Yodchaiy Chuaynkern
- Department of Biology; Faculty of Science; Khon Kaen University; Mueang; Khon Kaen; Thailand 40002.
- Chantip Chuaynkern
- Department of Biology; Faculty of Science; Khon Kaen University; Mueang; Khon Kaen; Thailand 40002.
- Katharina Geiss
- Senckenberg Forschungsinstitut und Naturmuseum; Senckenberganlage 25; 60325 Frankfurt a.M.; Germany.
8
Valencia-Pesqueira LM, Hoff SNK, Tørresen OK, Jentoft S, Lefevre S. Chromosome-level de novo genome assembly of wild, anoxia-tolerant crucian carp, Carassius carassius. Sci Data 2025; 12:491. [PMID: 40128231 PMCID: PMC11933416 DOI: 10.1038/s41597-025-04813-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2024] [Accepted: 03/11/2025] [Indexed: 03/26/2025] Open
Abstract
Crucian carp (Carassius carassius), a member of the carp family (Cyprinidae), is known for its remarkable anoxia tolerance. The physiological responses and adaptations to anoxia are well documented, but there is a need for better understanding of the molecular regulation and evolutionary mechanisms behind these adaptations. Here we present a high-quality, functionally annotated, chromosome-level genome assembly that can facilitate such further studies. Genomic DNA was obtained from a wild-caught crucian carp specimen and used for PacBio long-read, Illumina short-read and Hi-C sequencing. Short-read mRNA data were used for structural annotation using the BRAKER3 pipeline, while PacBio long-read RNA sequencing data were used for annotation of untranslated regions and refinement of gene-isoform relationships, using the PASA pipeline. The full assembly had a contig-level N50 of 15 Mbp in 290 scaffolds and 98.6% of the total length (1.65 Gbp) placed in 50 chromosomes. Structural annotation resulted in 82,557 protein-coding transcripts (in 45,667 genes), with a BUSCO completeness of 99.6% and of which 77,370 matched a protein in the UniProtKB/Swiss-Prot database.
Affiliation(s)
- Siv Nam Khang Hoff
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Ole K Tørresen
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Sissel Jentoft
- Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of Oslo, Oslo, Norway
- Sjannie Lefevre
- Section for Physiology and Cell Biology, Department of Biosciences, University of Oslo, Oslo, Norway.
9
Roberts MD, Davis O, Josephs EB, Williamson RJ. K-mer-based Approaches to Bridging Pangenomics and Population Genetics. Mol Biol Evol 2025; 42:msaf047. [PMID: 40111256 PMCID: PMC11925024 DOI: 10.1093/molbev/msaf047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Revised: 01/10/2025] [Accepted: 02/04/2025] [Indexed: 03/12/2025] Open
Abstract
Many commonly studied species now have more than one chromosome-scale genome assembly, revealing a large amount of genetic diversity previously missed by approaches that map short reads to a single reference. However, many species still lack multiple reference genomes and correctly aligning references to build pangenomes can be challenging for many species, limiting our ability to study this missing genomic variation in population genetics. Here, we argue that k-mers are a very useful but underutilized tool for bridging the reference-focused paradigms of population genetics with the reference-free paradigms of pangenomics. We review current literature on the uses of k-mers for performing three core components of most population genetics analyses: identifying, measuring, and explaining patterns of genetic variation. We also demonstrate how different k-mer-based measures of genetic variation behave in population genetic simulations according to the choice of k, depth of sequencing coverage, and degree of data compression. Overall, we find that k-mer-based measures of genetic diversity scale consistently with pairwise nucleotide diversity (π) up to values of about π=0.025 (R²=0.97) for neutrally evolving populations. For populations with even more variation, using shorter k-mers will maintain the scalability up to at least π=0.1. Furthermore, in our simulated populations, k-mer dissimilarity values can be reliably approximated from counting bloom filters, highlighting a potential avenue to decreasing the memory burden of k-mer-based genomic dissimilarity analyses. For future studies, there is a great opportunity to further develop methods for identifying selected loci using k-mers.
Affiliation(s)
- Miles D Roberts
- Genetics and Genome Sciences Program, Michigan State University, East Lansing, MI 48824, USA
- Olivia Davis
- Department of Computer Science and Software Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
- Emily B Josephs
- Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
- Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI 48824, USA
- Plant Resilience Institute, Michigan State University, East Lansing, MI 48824, USA
- Robert J Williamson
- Department of Computer Science and Software Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
- Department of Biology and Biomedical Engineering, Rose-Hulman Institute of Technology, Terre Haute, IN 47803, USA
10
Fu Y, Zhang X, Zhang T, Sun W, Yang W, Shi Y, Zhang J, He Q, Charlesworth D, Jiao Y, Chen Z, Xu B. Evidence for evolution of a new sex chromosome within the haploid-dominant Marchantiales plant lineage. JOURNAL OF INTEGRATIVE PLANT BIOLOGY 2025. [PMID: 39981726 DOI: 10.1111/jipb.13867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2025] [Accepted: 01/28/2025] [Indexed: 02/22/2025]
Abstract
Sex chromosomes have evolved independently in numerous lineages across the Tree of Life, in both diploid-dominant species, including many animals and plants, and the less studied haploid-dominant plants and algae. Strict genetic sex determination ensures that individuals reproduce by outcrossing. However, species with separate sexes (termed dioecy in diploid plants, and dioicy in haploid plants) may sometimes evolve different sex systems, and become monoicous, with the ability to self-fertilize. Here, we studied dioicy-monoicy transitions in the ancient liverwort haploid-dominant plant lineage, using three telomere-to-telomere gapless chromosome-scale reference genome assemblies from the Ricciaceae group of Marchantiales. Ancestral liverworts are believed to have been dioicous, with U and V chromosomes (chromosome 9) determining femaleness and maleness, respectively. We confirm the finding that monoicy in Ricciocarpos natans evolved from a dioicous ancestor, and most ancestrally U chromosomal genes have been retained on autosomes in this species. We also describe evidence suggesting the possible re-evolution of dioicy in the genus Riccia, with probable de novo establishment of a sex chromosome from an autosome (chromosome 5), and further translocations of genes from the new sex chromosome to autosomes. Our results also indicated that micro-chromosomes are consistent genomic features, and may have evolved independently from sex chromosomes in Ricciocarpos and Riccia lineages.
Affiliation(s)
- Yuan Fu
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Xiaoxia Zhang
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- China National Botanical Garden, Beijing, 100093, China
- Tian Zhang
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Wenjing Sun
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Wenjun Yang
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Yajing Shi
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Jian Zhang
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- China National Botanical Garden, Beijing, 100093, China
- Qiang He
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- China National Botanical Garden, Beijing, 100093, China
- Deborah Charlesworth
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, EH9 3FL, UK
- Yuannian Jiao
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- China National Botanical Garden, Beijing, 100093, China
- Zhiduan Chen
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- China National Botanical Garden, Beijing, 100093, China
- Bo Xu
- State Key Laboratory of Plant Diversity and Prominent Crop/State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, the Chinese Academy of Sciences, Beijing, 100093, China
- China National Botanical Garden, Beijing, 100093, China
11
D’Angelo D, Sorrentino R, Nkomo T, Zhou X, Vaghefi N, Sonnekus B, Bose T, Cerrato D, Cozzolino L, Creux N, D’Agostino N, Fourie G, Fusco G, Hammerbacher A, Idnurm A, Kiss L, Hu Y, Hu H, Lahoz E, Risteski J, Steenkamp ET, Viscardi M, van der Nest MA, Wu Y, Yu H, Zhou J, Karandeni Dewage CS, Kotta-Loizou LI, Stotz HU, Fitt BDL, Huang Y, Wingfield BD. IMA GENOME - F20 A draft genome assembly of Agroathelia rolfsii, Ceratobasidium papillatum, Pyrenopeziza brassicae, Neopestalotiopsis macadamiae, Sphaerellopsis filum and genomic resources for Colletotrichum spaethianum and Colletotrichum fructicola. IMA Fungus 2025; 16:e141732. [PMID: 40052082 PMCID: PMC11882029 DOI: 10.3897/imafungus.16.141732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Accepted: 11/13/2024] [Indexed: 03/09/2025] Open
Abstract
This is a genome announcement; there is no abstract.
Affiliation(s)
- Davide D’Angelo
- Department of Agricultural Sciences, University of Naples Federico II, piazza Carlo di Borbone 1, 80055, Portici, Naples, Italy
- Roberto Sorrentino
- Research Centre for Cereal and Industrial Crops (CREA-CI), via Torrino 3, 81100, Caserta, Italy
- Tiphany Nkomo
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
- Xianzhi Zhou
- Institute of Plant Protection, Fujian Academy of Agricultural Sciences, Wusi Road 247, Fuzhou 350003, China
- Niloofar Vaghefi
- School of Agriculture, Food and Ecosystem Sciences, Faculty of Science, The University of Melbourne, Parkville, Australia
| | - Byron Sonnekus
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
| | - Tanay Bose
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
| | - Domenico Cerrato
- Department of Zoology and Entomology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
| | - Loredana Cozzolino
- Istituto Zooprofilattico Sperimentale del Mezzogiorno, Via Salute 2, 80055, Portici, Naples, Italy
| | - Nicky Creux
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
| | - Nunzio D’Agostino
- Department of Agricultural Sciences, University of Naples Federico II, piazza Carlo di Borbone 1, 80055, Portici, Naples, Italy
| | - Gerda Fourie
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
| | - Giovanna Fusco
- Department of Plant and Soil Science, Forestry and Agricultural Biotechnology (FABI), University of Pretoria, Pretoria 0028, South Africa
| | - Almuth Hammerbacher
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
| | - Alexander Idnurm
- School of Agriculture, Food and Ecosystem Sciences, Faculty of Science, The University of Melbourne, Parkville, Australia
| | - Levente Kiss
- School of BioSciences, Faculty of Science, The University of Melbourne, Parkville, Australia
- Centre for Crop Health, University of Southern Queensland, Toowoomba, Australia
- Eszterházy Károly Catholic University, Eger, Hungary
| | - Yanping Hu
- Plant Protection Institute, Centre for Agricultural Research, HUN-REN, Budapest, Hungary
| | - Hongli Hu
- Plant Protection Institute, Centre for Agricultural Research, HUN-REN, Budapest, Hungary
| | - Ernesto Lahoz
- Research Centre for Cereal and Industrial Crops (CREA-CI), via Torrino 3, 81100, Caserta, Italy
| | - Jason Risteski
- School of Agriculture, Food and Ecosystem Sciences, Faculty of Science, The University of Melbourne, Parkville, Australia
| | - Emma T. Steenkamp
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
| | - Maurizio Viscardi
- Istituto Zooprofilattico Sperimentale del Mezzogiorno, Via Salute 2, 80055, Portici, Naples, Italy
| | - Magriet A. van der Nest
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
| | - Yuan Wu
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou 350002, China
| | - Hao Yu
- Hans Merensky Chair in Avocado Research, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria 0028, South Africa
| | - Jianjin Zhou
- Technology Center, Xiamen Customs, Xiamen 361026, China
| | - Chinthani S. Karandeni Dewage
- Sanming Academy of Agricultural Sciences/Fujian Key Laboratory of Crop Genetic Improvement and Innovative Utilization for Mountain Area, Sanming, Fujian 365051, China
| | - Loly I. Kotta-Loizou
- Sanming Academy of Agricultural Sciences/Fujian Key Laboratory of Crop Genetic Improvement and Innovative Utilization for Mountain Area, Sanming, Fujian 365051, China
| | - Henrik U. Stotz
- Sanming Academy of Agricultural Sciences/Fujian Key Laboratory of Crop Genetic Improvement and Innovative Utilization for Mountain Area, Sanming, Fujian 365051, China
| | - Bruce D. L. Fitt
- Sanming Academy of Agricultural Sciences/Fujian Key Laboratory of Crop Genetic Improvement and Innovative Utilization for Mountain Area, Sanming, Fujian 365051, China
| | - Yongju Huang
- Sanming Academy of Agricultural Sciences/Fujian Key Laboratory of Crop Genetic Improvement and Innovative Utilization for Mountain Area, Sanming, Fujian 365051, China
| | - Brenda D. Wingfield
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, 0028, South Africa
|
12
|
Serajian M, Testagrose C, Prosperi M, Boucher C. A comparative study of antibiotic resistance patterns in Mycobacterium tuberculosis. Sci Rep 2025; 15:5104. [PMID: 39934219 PMCID: PMC11814411 DOI: 10.1038/s41598-025-89087-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 02/03/2025] [Indexed: 02/13/2025] Open
Abstract
This study leverages the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) to analyze over 27,000 Mycobacterium tuberculosis (MTB) genomic strains, providing a comprehensive and large-scale overview of antibiotic resistance (AMR) prevalence and resistance patterns. We used MTB++, which is the newest and most comprehensive AI-based MTB drug resistance profiler tool, to predict the resistance profile of each of the 27,000 MTB isolates and then used feature analysis to identify key genes that were associated with the resistance. There are three main contributions to this study. Firstly, it provides a detailed picture of the prevalence of specific AMR genes in the BV-BRC dataset as well as their biological implications, providing critical insight into MTB's resistance mechanisms that can help identify genes of high priority for further investigation. The second aspect of this study is to compare the prevalence of antibiotic resistance across previous studies that have addressed both the temporal and geographical evolution of MTB drug resistance. Lastly, this study emphasizes the need for targeted diagnostics and personalized treatment plans. In addition to these contributions, the study acknowledges the limitations of computational prediction and recommends future experimental validation.
Affiliation(s)
- Mohammadali Serajian
- Department of Computer and Information Science and Engineering, University of Florida, 1889 Museum Road, Gainesville, 32611, FL, USA
| | - Conrad Testagrose
- Department of Computer and Information Science and Engineering, University of Florida, 1889 Museum Road, Gainesville, 32611, FL, USA
| | - Mattia Prosperi
- Department of Epidemiology, University of Florida, Gainesville, 32603, FL, USA
| | - Christina Boucher
- Department of Computer and Information Science and Engineering, University of Florida, 1889 Museum Road, Gainesville, 32611, FL, USA.
|
13
|
Gaston JM, Alm EJ, Zhang AN. X-Mapper: fast and accurate sequence alignment via gapped x-mers. Genome Biol 2025; 26:15. [PMID: 39844205 PMCID: PMC11755882 DOI: 10.1186/s13059-024-03473-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Accepted: 12/31/2024] [Indexed: 01/24/2025] Open
Abstract
Sequence alignment is foundational to many bioinformatic analyses. Many aligners start by splitting sequences into contiguous, fixed-length seeds, called k-mers. Alignment is faster with longer, unique seeds, but more accurate with shorter seeds avoiding mutations. Here, we introduce X-Mapper, aiming to offer high speed and accuracy via dynamic-length seeds containing gaps, called gapped x-mers. We observe 11-24-fold fewer suboptimal alignments analyzing a human reference and 3-579-fold lower inconsistency across bacterial references than other aligners, improving on 53% and 30% of reads aligned to non-target strains and species, respectively. Other seed-based analysis algorithms might benefit from gapped x-mers too.
Affiliation(s)
- Jeffry M Gaston
- Google, Cambridge, MA, USA
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
| | - Eric J Alm
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - An-Ni Zhang
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore.
|
14
|
Cruz-Laufer AJ, Vanhove MPM, Bachmann L, Barson M, Bassirou H, Bitja Nyom AR, Geraerts M, Hahn C, Huyse T, Kasembele GK, Njom S, Resl P, Smeets K, Kmentová N. Adaptive evolution of stress response genes in parasites aligns with host niche diversity. BMC Biol 2025; 23:10. [PMID: 39800686 PMCID: PMC11727194 DOI: 10.1186/s12915-024-02091-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Accepted: 12/09/2024] [Indexed: 01/16/2025] Open
Abstract
BACKGROUND: Stress responses are key to the survival of parasites and, consequently, also to the evolutionary success of these organisms. Despite this importance, our understanding of the evolution of molecular pathways dealing with environmental stressors in parasitic animals remains limited. Here, we tested the link between adaptive evolution of parasite stress response genes and their ecological diversity and species richness. We comparatively investigated antioxidant, heat shock, osmoregulatory, and behaviour-related genes (foraging) in two model parasitic flatworm lineages with contrasting ecological diversity, Cichlidogyrus and Kapentagyrus (Platyhelminthes: Monopisthocotyla), through whole-genome sequencing of 11 species followed by in silico exon bait capture as well as phylogenetic and codon analyses. RESULTS: We assembled the sequences of 48 stress-related genes and report the first foraging (For) gene orthologs in flatworms. We found duplications of heat shock (Hsp) and oxidative stress genes in Cichlidogyrus compared to Kapentagyrus. We also observed positive selection patterns in genes related to mitochondrial protein import (Hsp) and behaviour (For) in species of Cichlidogyrus infecting East African cichlids, a host lineage under adaptive radiation. These patterns are consistent with a potential adaptation linked to a co-radiation of these parasites and their hosts. Additionally, we report the absence in monogenean flatworms of cytochrome P450 and kappa- and sigma-class glutathione S-transferases, genes considered essential for metazoan life. CONCLUSIONS: This study potentially identifies the first molecular function linked to a flatworm radiation. Furthermore, the observed gene duplications and positive selection indicate the potentially important role of stress responses for the ecological adaptation of parasite species.
Affiliation(s)
- Armando J Cruz-Laufer
- Faculty of Sciences, Centre for Environmental Sciences, Research Group Zoology: Biodiversity and Toxicology, UHasselt - Hasselt University, Diepenbeek, Belgium.
- Systems Ecology and Resource Management Research Unit (SERM), Université Libre de Bruxelles-ULB, Brussels, Belgium.
| | - Maarten P M Vanhove
- Faculty of Sciences, Centre for Environmental Sciences, Research Group Zoology: Biodiversity and Toxicology, UHasselt - Hasselt University, Diepenbeek, Belgium
| | - Lutz Bachmann
- Natural History Museum, University of Oslo, Oslo, Norway
| | - Maxwell Barson
- Department of Biological Sciences, University of Botswana, Gaborone, Botswana
| | - Hassan Bassirou
- Department of Biological Sciences, University of Ngaoundéré, Ngaoundéré, Cameroon
| | - Arnold R Bitja Nyom
- Department of Biological Sciences, University of Ngaoundéré, Ngaoundéré, Cameroon
- Department of Management of Fisheries and Aquatic Ecosystems, Institute of Fisheries, University of Douala, Douala, Cameroon
| | - Mare Geraerts
- Faculty of Sciences, Centre for Environmental Sciences, Research Group Zoology: Biodiversity and Toxicology, UHasselt - Hasselt University, Diepenbeek, Belgium
- Department of Biology, Evolutionary Ecology Group - EVECO, University of Antwerp, Antwerp, Belgium
| | - Christoph Hahn
- Institute of Biology, University of Graz, Graz, Austria.
| | - Tine Huyse
- Department of Biology, Royal Museum for Central Africa, Tervuren, Belgium
| | - Gyrhaiss Kapepula Kasembele
- Faculty of Sciences, Centre for Environmental Sciences, Research Group Zoology: Biodiversity and Toxicology, UHasselt - Hasselt University, Diepenbeek, Belgium
- Unité de Recherche en Biodiversité Et Exploitation Durable Des Zones Humides (BEZHU), Faculté Des Sciences Agronomiques, Université de Lubumbashi, Lubumbashi, Democratic Republic of the Congo
| | - Samuel Njom
- Department of Biological Sciences, University of Ngaoundéré, Ngaoundéré, Cameroon
| | - Philipp Resl
- Institute of Biology, University of Graz, Graz, Austria
| | - Karen Smeets
- Faculty of Sciences, Centre for Environmental Sciences, Research Group Zoology: Biodiversity and Toxicology, UHasselt - Hasselt University, Diepenbeek, Belgium
| | - Nikol Kmentová
- Faculty of Sciences, Centre for Environmental Sciences, Research Group Zoology: Biodiversity and Toxicology, UHasselt - Hasselt University, Diepenbeek, Belgium
- Aquatic and Terrestrial Ecology, Operational Directorate Natural Environment, Royal Belgian Institute for Natural Sciences, Brussels, Belgium
|
15
|
Lei Y, Jiu S, Xu Y, Chen B, Dong X, Lv Z, Bernard A, Liu X, Wang L, Wang L, Wang J, Zhang Z, Cai Y, Zheng W, Zhang X, Li F, Li H, Liu C, Li M, Wang J, Zhu J, Peng L, Barreneche T, Yu F, Wang S, Dong Y, Elisabeth D, Duan S, Zhang C. Population sequencing of cherry accessions unravels the evolution of Cerasus species and the selection of genetic characteristics in edible cherries. MOLECULAR HORTICULTURE 2025; 5:6. [PMID: 39780235 PMCID: PMC11708008 DOI: 10.1186/s43897-024-00120-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Accepted: 10/16/2024] [Indexed: 01/11/2025]
Abstract
Cerasus is a subgenus of Prunus in the family Rosaceae that is popular owing to its ornamental, edible, and medicinal properties. Understanding the evolution of the Cerasus subgenus and identifying selective trait loci in edible cherries are crucial for the improvement of cherry cultivars to meet producer and consumer demands. In this study, we performed a de novo assembly of a chromosome-scale genome for the sweet cherry (Prunus avium L.) cultivar 'Burlat', covering 297.55 Mb and consisting of eight chromosomes with 33,756 protein-coding genes. The resequencing and population structural analysis of 384 Cerasus representative accessions revealed that they could be divided into four groups (Group 1, Group 2, Group 3, and Group 4). We inferred that Group 1 was the oldest population and Groups 2, 3, and 4 were clades derived from it. In addition, we found selective sweeps for fruit flavor and improved stress resistance in different varieties of edible cherries (P. avium, P. cerasus, and P. pseudocerasus). Transcriptome analysis revealed significant differential expression of genes associated with key pathways, such as starch and sucrose metabolism, fructose and mannose metabolism, and the pentose phosphate pathway, between the leaves and fruits of P. avium. This study enhances the understanding of the evolutionary processes of the Cerasus subgenus and provides resources for functional genomics research and the improvement of edible cherries.
Affiliation(s)
- Yahui Lei
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
- College of Science, Yunnan Agricultural University, Kunming, Yunnan, 650201, P. R. China
| | - Songtao Jiu
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China.
| | - Yan Xu
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
| | - Baozheng Chen
- College of Food Science and Technology, Yunnan Agricultural University, Kunming, 650201, Yunnan, P. R. China
| | - Xiao Dong
- College of Science, Yunnan Agricultural University, Kunming, Yunnan, 650201, P. R. China
| | - Zhengxin Lv
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
| | - Anthony Bernard
- UMR BFP, INRAE, Univ. Bordeaux, 71 Avenue Edouard Bourlaux, 33882, Villenave d'Ornon, France
| | - Xunju Liu
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
| | - Lei Wang
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
| | - Li Wang
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
| | - Jiyuan Wang
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
| | - Zhuo Zhang
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
| | - Yuliang Cai
- College of Horticulture, Northwest A&F University, Yangling, Shaanxi, 712100, P. R. China
| | - Wei Zheng
- Dalian Academy of Agricultural Sciences, Dalian, Liaoning, 116036, P. R. China
| | - Xu Zhang
- Yantai Academy of Agricultural Sciences, Yantai, Shandong, 265500, P. R. China
| | - Fangdong Li
- Yantai Academy of Agricultural Sciences, Yantai, Shandong, 265500, P. R. China
| | - Hongwen Li
- Horticulture Research Institute, Sichuan Academy of Agricultural Sciences, Sichuan, 610066, P. R. China
| | - Congli Liu
- Zhengzhou Fruit Tree Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou, Henan, 450009, P. R. China
| | - Ming Li
- Zhengzhou Fruit Tree Research Institute, Chinese Academy of Agricultural Sciences, Zhengzhou, Henan, 450009, P. R. China
| | - Jing Wang
- Forestry and Fruit Research Institute, Beijing Academy of Agriculture and Forestry Sciences, Beijing, P. R. China
| | - Jijun Zhu
- Shanghai Botanical Garden, Shanghai, 200231, P. R. China
| | - Lei Peng
- College of Landscape and Horticulture, Yunnan Agricultural University, Kunming, Yunnan, 650201, P. R. China
| | - Teresa Barreneche
- UMR BFP, INRAE, Univ. Bordeaux, 71 Avenue Edouard Bourlaux, 33882, Villenave d'Ornon, France
| | - Fei Yu
- Horticultural Research Institute, Yunnan Academy of Agricultural Sciences, Kunming, Yunnan, 650201, P. R. China
| | - Shiping Wang
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
| | - Yang Dong
- College of Science, Yunnan Agricultural University, Kunming, Yunnan, 650201, P. R. China
- College of Food Science and Technology, Yunnan Agricultural University, Kunming, 650201, Yunnan, P. R. China
| | - Dirlewanger Elisabeth
- UMR BFP, INRAE, Univ. Bordeaux, 71 Avenue Edouard Bourlaux, 33882, Villenave d'Ornon, France.
| | - Shengchang Duan
- College of Plant Protection, Yunnan Agricultural University, Kunming, Yunnan, 650201, P. R. China.
| | - Caixi Zhang
- Department of Plant Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China.
|
16
|
Mu W, Darian JC, Sung WK, Guo X, Yang T, Tang MWM, Chen Z, Tong SKH, Chik IWS, Davidson RL, Edmunds SC, Wei T, Tsui SKW. The haplotype-resolved T2T genome for Bauhinia × blakeana sheds light on the genetic basis of flower heterosis. Gigascience 2025; 14:giaf044. [PMID: 40276955 PMCID: PMC12012898 DOI: 10.1093/gigascience/giaf044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2024] [Revised: 02/20/2025] [Accepted: 03/20/2025] [Indexed: 04/26/2025] Open
Abstract
BACKGROUND The Hong Kong orchid tree Bauhinia × blakeana Dunn has long been proposed to be a sterile interspecific hybrid exhibiting flower heterosis when compared to its likely parental species, Bauhinia purpurea L. and Bauhinia variegata L. Here, we report comparative genomic and transcriptomic analyses of the 3 Bauhinia species. FINDINGS We generated chromosome-level assemblies for the parental species and applied a trio-binning approach to construct a haplotype-resolved telomere-to-telomere (T2T) genome for B. blakeana. Comparative chloroplast genome analysis confirmed B. purpurea as the maternal parent. Transcriptome profiling of flower tissues highlighted a closer resemblance of B. blakeana to its maternal parent. Differential gene expression analyses revealed distinct expression patterns among the 3 species, particularly in biosynthetic and metabolic processes. To investigate the genetic basis of flower heterosis observed in B. blakeana, we focused on gene expression patterns within pigment biosynthesis-related pathways. High-parent dominance and overdominance expression patterns were observed, particularly in genes associated with carotenoid biosynthesis. Additionally, allele-specific expression analysis revealed a balanced contribution of maternal and paternal alleles in shaping the gene expression patterns in B. blakeana. CONCLUSIONS Our study offers valuable insights into the genome architecture of hybrid B. blakeana, establishing a comprehensive genomic and transcriptomic resource for future functional genetics research within the Bauhinia genus. It also serves as a model for exploring the characteristics of hybrid species using T2T haplotype-resolved genomes, providing a novel approach to understanding genetic interactions and evolutionary mechanisms in complex genomes with high heterozygosity.
Affiliation(s)
- Weixue Mu
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
| | | | - Wing-Kin Sung
- Department of Chemical Pathology, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- JC STEM Laboratory of Computational Genomics, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Hong Kong Genome Institute, Hong Kong Science Park, Shatin, N.T., Hong Kong SAR, China
| | - Xing Guo
- BGI Research, East Lake High-Tech Development Zone, Wuhan 430074, China
| | - Tuo Yang
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen 518004, China
| | - Mandy Wai Man Tang
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
| | - Ziqiang Chen
- National Key Laboratory for Germplasm Innovation & Utilization of Horticultural Crops, College of Horticulture & Forestry Sciences, Huazhong Agricultural University, Wuhan 430070, China
| | - Steve Kwan Hok Tong
- BGI Genomics, Tai Po, N.T., Hong Kong SAR, China
- International DNA Research Centre, Kowloon, Hong Kong SAR, China
| | | | - Robert L Davidson
- School of Physics, Engineering & Computer Science, University of Hertfordshire, Hatfield AL10 9AB, United Kingdom
| | - Scott C Edmunds
- GigaScience Press, BGI Hong Kong Tech Co. Ltd., Sheung Wan, Hong Kong SAR, China
| | - Tong Wei
- BGI Research, East Lake High-Tech Development Zone, Wuhan 430074, China
| | - Stephen Kwok-Wing Tsui
- School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
- Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR, China
|
17
|
Patil MP, Kim JO, Yoo SH, Shin J, Yang JY, Kim K, Kim GD. Complete Mitochondrial Genome of Niphon spinosus (Perciformes: Niphonidae): Genome Characterization and Phylogenetic Analysis. Biomolecules 2025; 15:52. [PMID: 39858446 PMCID: PMC11764044 DOI: 10.3390/biom15010052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2024] [Revised: 12/31/2024] [Accepted: 01/01/2025] [Indexed: 01/27/2025] Open
Abstract
Niphon spinosus (Cuvier, 1829) is the only representative of the family Niphonidae and the genus Niphon; its taxonomic history is complicated, and its phylogenetic position remains unclear. In this study, we report the complete mitochondrial genome of N. spinosus (OP391482), determined using next-generation sequencing technology to be 16,503 bp long with a biased A + T content (53.8%). The mitochondrial genome contains the typical set of 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes, and one control region (D-loop). The H-strand encodes 28 genes (14 tRNA, 2 rRNA, and 12 PCGs) and the D-loop, whereas the L-strand encodes the remaining 9 genes (8 tRNA and ND6). Its nucleotide composition, gene arrangement, codon usage patterns, and tRNA secondary structures are identical to those of other members of the Percoidei suborder. Furthermore, we reconstructed phylogenetic trees based on the 13 PCGs. The resulting trees placed N. spinosus as a separate lineage within the family Niphonidae, showed its close relationship to Trachinus draco (Trachinidae), and clustered the major subfamilies of the Percoidei suborder, such as Luciopercinae and Percinae. These findings will contribute to future studies on the evolutionary history, population genetics, molecular taxonomy, and phylogeny of N. spinosus and related species.
Affiliation(s)
- Maheshkumar Prakash Patil
- Industry-University Cooperation Foundation, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
| | - Jong-Oh Kim
- Department of Microbiology, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
- School of Marine and Fisheries Life Science, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
| | - Seung Hyun Yoo
- School of Marine and Fisheries Life Science, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
| | - Jiyoung Shin
- Institute of Food Science, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
| | - Ji-Young Yang
- Department of Food Science and Technology, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
| | - Kyunghoi Kim
- Department of Ocean Engineering, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
| | - Gun-Do Kim
- Department of Microbiology, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
- School of Marine and Fisheries Life Science, Pukyong National University, 45 Yongso-ro, Nam-Gu, Busan 48513, Republic of Korea
|
18
|
Zhang N, Zhao P, Zhang W, Wang H, Wang K, Wang X, Zhang Z, Tan N, Chen L. A chromosome-level genome of Lobelia seguinii provides insights into the evolution of Campanulaceae and the lobeline biosynthesis. Genomics 2025; 117:110979. [PMID: 39675685 DOI: 10.1016/j.ygeno.2024.110979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Revised: 12/04/2024] [Accepted: 12/10/2024] [Indexed: 12/17/2024]
Abstract
Lobelia seguinii is a plant of great ecological and medicinal value that belongs to Campanulaceae. Lobelia contains lobeline, a well-known compound used to treat respiratory diseases. Nevertheless, lobeline biosynthesis needs further exploration. Moreover, whole-genome duplication (WGD) and karyotype evolution within Campanulaceae still need to be better understood. In this study, we obtained a chromosome-level genome of L. seguinii with a size of 1.4 Gb and 38,253 protein-coding genes. Analyses revealed two WGDs within Campanulaceae, one at the most recent common ancestor (MRCA) of Campanula and Adenophora, and another at the MRCA of Lobelioideae. Analyses further revealed that the karyotype of Platycodon grandiflorus represents the ancient type within Asterales. We proposed eight enzymes involved in the lobeline biosynthesis pathway of L. seguinii. Molecular cloning and heterologous expression of phenylalanine ammonia-lyase (PAL), a candidate enzyme involved in the first step of lobeline biosynthesis, verified that it catalyzes the deamination of phenylalanine to cinnamic acid. This study sheds light on the evolution of Campanulaceae and lobeline biosynthesis.
Affiliation(s)
- Na Zhang
- Department of Resources Science of Traditional Chinese Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Puguang Zhao
- Department of Resources Science of Traditional Chinese Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Wenda Zhang
- Department of Resources Science of Traditional Chinese Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Huiying Wang
- State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 211198, China
| | - Kaixuan Wang
- Department of Resources Science of Traditional Chinese Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Xiangyu Wang
- Department of Resources Science of Traditional Chinese Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, China
| | - Zhanjiang Zhang
- National Center for Traditional Chinese Medicine (TCM) Inheritance and Innovation, Guangxi Botanical Garden of Medicinal Plants, 530023 Nanning, China.
| | - Ninghua Tan
- Department of Resources Science of Traditional Chinese Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, China.
| | - Lingyun Chen
- Department of Resources Science of Traditional Chinese Medicines, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing 211198, China; Medical Botanical Garden, China Pharmaceutical University, Nanjing 211198, China.
|
19
|
Park A, Koslicki D. Prokrustean Graph: A substring index for rapid k-mer size analysis. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.11.21.568151. [PMID: 38853857 PMCID: PMC11160577 DOI: 10.1101/2023.11.21.568151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
Despite the widespread adoption of k-mer-based methods in bioinformatics, understanding the influence of k-mer sizes remains a persistent challenge. Selecting an optimal k-mer size or employing multiple k-mer sizes is often arbitrary, application-specific, and fraught with computational complexities. Typically, the influence of k-mer size is obscured by the outputs of complex bioinformatics tasks, such as genome analysis, comparison, assembly, alignment, and error correction. However, it is frequently overlooked that every method is built above a well-defined k-mer-based object like Jaccard Similarity, de Bruijn graphs, k-mer spectra, and Bray-Curtis Dissimilarity. Despite these objects offering a clearer perspective on the role of k-mer sizes, the dynamics of k-mer-based objects with respect to k-mer sizes remain surprisingly elusive. This paper introduces a computational framework that generalizes the transition of k-mer-based objects across k-mer sizes, utilizing a novel substring index, the Prokrustean graph. The primary contribution of this framework is to compute quantities associated with k-mer-based objects for all k-mer sizes, where the computational complexity depends solely on the number of maximal repeats and is independent of the range of k-mer sizes. For example, counting vertices of compacted de Bruijn graphs for k = 1 , … , 100 can be accomplished in mere seconds with our substring index constructed on a gigabase-sized read set. Additionally, we derive a space-efficient algorithm to extract the Prokrustean graph from the Burrows-Wheeler Transform. It becomes evident that modern substring indices, mostly based on longest common prefixes of suffix arrays, inherently face difficulties at exploring varying k-mer sizes due to their limitations at grouping co-occurring substrings. We have implemented four applications that utilize quantities critical in modern pangenomics and metagenomics. 
The code for these applications and the construction algorithm is available at https://github.com/KoslickiLab/prokrustean.
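To make the idea of a k-mer-based quantity that changes with k concrete, the following minimal sketch counts the distinct k-mers of a toy read set for each k in a range by brute-force enumeration. It is not the Prokrustean graph algorithm; it is the naive per-k recomputation that the abstract says the index avoids. The function name `distinct_kmer_counts` and the toy reads are illustrative assumptions.

```python
def distinct_kmer_counts(reads, k_min, k_max):
    """Return {k: number of distinct k-mers} by naive enumeration.

    Hypothetical helper for illustration only: every k is recomputed
    from scratch, so the cost grows with the size of the k range.
    """
    counts = {}
    for k in range(k_min, k_max + 1):
        kmers = set()
        for read in reads:
            # Slide a window of width k across the read.
            for i in range(len(read) - k + 1):
                kmers.add(read[i:i + k])
        counts[k] = len(kmers)
    return counts


# Toy input (assumed data, not from the paper).
toy_reads = ["ACGTAC", "GTACCA"]
profile = distinct_kmer_counts(toy_reads, 1, 3)
print(profile)
```

The same loop structure could be reused for other per-k quantities (for example, a Jaccard similarity between two read sets), which is what makes the dependence on the k range costly in the naive approach.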
Affiliation(s)
- Adam Park
- Computer Science and Engineering in Pennsylvania State University, PA, USA
- David Koslicki
- Computer Science and Engineering in Pennsylvania State University, PA, USA
- Biology in Pennsylvania State University, PA, USA
- Huck Institutes of the Life Sciences in Pennsylvania State University, PA, USA
|
20
|
Palmer Droguett DH, Fletcher M, Alston BT, Kocher S, Cabral-de-Mello DC, Wright AE. Neo-Sex Chromosome Evolution in Treehoppers Despite Long-Term X Chromosome Conservation. Genome Biol Evol 2024; 16:evae264. [PMID: 39657114 DOI: 10.1093/gbe/evae264] [Received: 06/19/2024] [Revised: 11/20/2024] [Accepted: 12/04/2024] [Indexed: 12/17/2024] Open
Abstract
Sex chromosomes follow distinct evolutionary trajectories compared to the rest of the genome. In many cases, sex chromosomes (X and Y or Z and W) significantly differentiate from one another resulting in heteromorphic sex chromosome systems. Such heteromorphic systems are thought to act as an evolutionary trap that prevents subsequent turnover of the sex chromosome system. For old, degenerated sex chromosome systems, chromosomal fusion with an autosome may be one way that sex chromosomes can "refresh" their sequence content. We investigated these dynamics using treehoppers (hemipteran insects of the family Membracidae), which ancestrally have XX/X0 sex chromosomes. We assembled the most complete reference assembly for treehoppers to date for Umbonia crassicornis and employed comparative genomic analyses of 12 additional treehopper species to analyze X chromosome variation across different evolutionary timescales. We find that the X chromosome is largely conserved, with one exception being an X-autosome fusion in Calloconophora caliginosa. We also compare the ancestral treehopper X with other X chromosomes in Auchenorrhyncha (the clade containing treehoppers, leafhoppers, spittlebugs, cicadas, and planthoppers), revealing X conservation across more than 300 million years. These findings shed light on chromosomal evolution dynamics in treehoppers and the role of chromosomal rearrangements in sex chromosome evolution.
Affiliation(s)
- Daniela H Palmer Droguett
- Ecology and Evolutionary Biology, School of Biosciences, The University of Sheffield, Sheffield, UK
- Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI, USA
- Micah Fletcher
- Department of Ecology and Evolutionary Biology, the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Ben T Alston
- Ecology and Evolutionary Biology, School of Biosciences, The University of Sheffield, Sheffield, UK
- Sarah Kocher
- Department of Ecology and Evolutionary Biology, the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Diogo C Cabral-de-Mello
- Department of General and Biology, Institute of Biosciences, São Paulo State University (UNESP), Rio Claro, Brazil
- Alison E Wright
- Ecology and Evolutionary Biology, School of Biosciences, The University of Sheffield, Sheffield, UK
|
21
|
Akutsu M, Shinozawa A, Nishiyama T, Sakata Y, Hiwatashi Y. De novo sequencing allows genome-wide identification of genes involved in galactomannan synthesis in locust bean (Ceratonia siliqua). DNA Res 2024; 31:dsae033. [PMID: 39673409 DOI: 10.1093/dnares/dsae033] [Received: 10/11/2024] [Revised: 11/08/2024] [Accepted: 11/28/2024] [Indexed: 12/16/2024] Open
Abstract
Locust bean (Ceratonia siliqua) accumulates the galactomannan (GM) locust bean gum (LBG) in its seeds. LBG is a major industrial raw material used as a food thickener and gelling agent, whose unique properties mean that it cannot be readily replaced by other GMs. Whereas much is known about GM accumulation and the genes associated with GM biosynthesis in legumes, the genes involved in GM biosynthesis in C. siliqua are largely unknown. Here, we present a genome-wide list of genes predicted to be associated with the GM biosynthesis pathway in C. siliqua. We confirmed high GM accumulation in endosperm using a newly established GM quantification method involving LC-MS/MS. Through de novo draft genome assembly, we comprehensively identified genes predicted to be related to the GM biosynthesis pathway in C. siliqua by identifying orthologous groups. In particular, we identified all genes predicted to encode mannan synthase (ManS) and galactomannan galactosyltransferase (GMGT), enzymes functioning in the final step of GM biosynthesis, from the C. siliqua draft genome. ManS and the GMGT paralogs were predominantly expressed in endosperm. The genome and transcriptome produced in this study should facilitate research examining why C. siliqua produces LBG, unlike other legumes.
Affiliation(s)
- Mitsuaki Akutsu
- Graduate School of Food, Agricultural and Environmental Sciences, Miyagi University, Sendai 982-0215, Japan
- Aoba Kasei Co., Ltd, Sendai 981-3137, Japan
- Akihisa Shinozawa
- Department of Bioscience, Tokyo University of Agriculture, Tokyo 156-8502, Japan
- Tomoaki Nishiyama
- Research Center for Experimental Modeling of Human Disease, Kanazawa University, Kanazawa, Ishikawa 920-0934, Japan
- School of Science, Academic Assembly, University of Toyama, Toyama, 930-8555, Japan
- Yoichi Sakata
- Department of Bioscience, Tokyo University of Agriculture, Tokyo 156-8502, Japan
- Yuji Hiwatashi
- Graduate School of Food, Agricultural and Environmental Sciences, Miyagi University, Sendai 982-0215, Japan
|
22
|
Moeckel C, Mareboina M, Konnaris MA, Chan CS, Mouratidis I, Montgomery A, Chantzi N, Pavlopoulos GA, Georgakopoulos-Soares I. A survey of k-mer methods and applications in bioinformatics. Comput Struct Biotechnol J 2024; 23:2289-2303. [PMID: 38840832 PMCID: PMC11152613 DOI: 10.1016/j.csbj.2024.05.025] [Received: 03/13/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 06/07/2024] Open
Abstract
The rapid progression of genomics and proteomics has been driven by the advent of advanced sequencing technologies, large, diverse, and readily available omics datasets, and the evolution of computational data processing capabilities. The vast amount of data generated by these advancements necessitates efficient algorithms to extract meaningful information. K-mers serve as a valuable tool when working with large sequencing datasets, offering several advantages in computational speed and memory efficiency and carrying the potential for intrinsic biological functionality. This review provides an overview of the methods, applications, and significance of k-mers in genomic and proteomic data analyses, as well as the utility of absent sequences, including nullomers and nullpeptides, in disease detection, vaccine development, therapeutics, and forensic science. Therefore, the review highlights the pivotal role of k-mers in addressing current genomic and proteomic problems and underscores their potential for future breakthroughs in research.
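The notion of absent sequences (nullomers) mentioned above can be illustrated with a short sketch that lists every length-k string over the DNA alphabet that does not occur in a given sequence. This is a simplified illustration under stated assumptions: the function name `nullomers` and the toy sequence are hypothetical, only one strand is scanned, and exhaustive enumeration is practical only for small k.

```python
from itertools import product


def nullomers(sequence, k, alphabet="ACGT"):
    """Return the sorted length-k strings over `alphabet` absent from `sequence`.

    Hypothetical helper: scans a single strand and enumerates all
    len(alphabet)**k candidates, so it only suits small k.
    """
    # Collect every k-mer that actually occurs in the sequence.
    present = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
    candidates = ("".join(letters) for letters in product(alphabet, repeat=k))
    return sorted(kmer for kmer in candidates if kmer not in present)


# Toy input (assumed data).
toy_sequence = "ACGTTGCA"
absent_2mers = nullomers(toy_sequence, 2)
print(absent_2mers)
```

For a real genome one would typically also account for the reverse-complement strand and use a compact k-mer counter rather than Python sets.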
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Manvita Mareboina
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Maxwell A. Konnaris
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Candace S.Y. Chan
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Ioannis Mouratidis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
- Austin Montgomery
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Nikol Chantzi
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institute of the Life Sciences, Penn State University, University Park, Pennsylvania, USA
|
23
|
Balao F, Medrano M, Bazaga P, Paun O, Alonso C. Long-term methylome changes after experimental seed demethylation and their interaction with recurrent water stress in Erodium cicutarium (Geraniaceae). Plant Biol (Stuttg) 2024; 26:1199-1212. [PMID: 39250311 DOI: 10.1111/plb.13713] [Received: 06/20/2024] [Accepted: 08/09/2024] [Indexed: 09/11/2024]
Abstract
The frequencies and lengths of drought periods are increasing in subtropical and temperate regions worldwide. Epigenetic responses to water stress could be key for plant resilience to these largely unpredictable challenges. Experimental DNA demethylation, together with application of a stress factor is an appropriate strategy to reveal the contribution of epigenetics to plant responses to stress. We analysed leaf cytosine methylation changes in adult plants of the annual Mediterranean herb, Erodium cicutarium, in a greenhouse, after seed demethylation with 5-Azacytidine and/or recurrent water stress. We used bisulfite RADseq (BsRADseq) and a newly reported reference genome for E. cicutarium to characterize methylation changes in a 2 × 2 factorial design, controlling for plant relatedness. In the long term, 5-Azacytidine treatment alone caused both hypo- and hyper-methylation at individual cytosines, with substantial hypomethylation in CG contexts. In control conditions, drought resulted in a decrease in methylation in all but CHH contexts. In contrast, the genome of plants that experienced recurrent water stress and had been treated with 5-Azacytidine increased DNA methylation level by ca. 5%. Seed demethylation and recurrent drought produced a highly significant interaction in terms of global and context-specific cytosine methylation. Most methylation changes occurred around genic regions and within Transposable Elements. The annotation of these Differentially Methylated Regions associated with genes included several with a potential role in stress responses (e.g., PAL, CDKC, and ABCF), confirming an epigenetic contribution in response to stress at the molecular level.
Affiliation(s)
- F Balao
- Departamento de Biología Vegetal y Ecología, Universidad de Sevilla, Sevilla, Spain
- M Medrano
- Estación Biológica de Doñana, CSIC, Sevilla, Spain
- P Bazaga
- Estación Biológica de Doñana, CSIC, Sevilla, Spain
- O Paun
- Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- C Alonso
- Estación Biológica de Doñana, CSIC, Sevilla, Spain
|
24
|
Guo C, Wang X, Ren H. Databases and computational methods for the identification of piRNA-related molecules: A survey. Comput Struct Biotechnol J 2024; 23:813-833. [PMID: 38328006 PMCID: PMC10847878 DOI: 10.1016/j.csbj.2024.01.011] [Received: 09/11/2023] [Revised: 12/31/2023] [Accepted: 01/15/2024] [Indexed: 02/09/2024] Open
Abstract
Piwi-interacting RNAs (piRNAs) are a class of small non-coding RNAs (ncRNAs) that play important roles in many biological processes and in the diagnosis and treatment of major cancers, and have thus become a hot research topic. This study aims to provide an in-depth review of computational piRNA-related research, including databases and computational models. Herein, we perform literature analysis and use comparative evaluation methods to summarize and analyze three aspects of computational piRNA-related research: (i) computational models for piRNA-related molecular identification tasks, (ii) computational models for piRNA-disease association prediction tasks, and (iii) computational resources and evaluation metrics for these tasks. This study shows that computational piRNA-related research has progressed significantly and exhibited promising performance in recent years, although it still faces the emerging challenges of inconsistent naming systems and a lack of data. Unlike other reviews on piRNA-related identification tasks, which focus on the organization of datasets and computational methods, we pay more attention to the analysis of computational models, algorithms, and performance, aiming to provide valuable references for computational piRNA-related identification tasks. This study will benefit the theoretical development and practical application of piRNAs by improving the understanding of computational models and resources used to investigate the biological functions and clinical implications of piRNAs.
Affiliation(s)
- Chang Guo
- Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510420, China
- Xiaoli Wang
- Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Han Ren
- Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510420, China
- Laboratory of Language and Artificial Intelligence, Guangdong University of Foreign Studies, Guangzhou 510420, China
|
25
|
Locatelli NS, Kitchen SA, Stankiewicz KH, Osborne CC, Dellaert Z, Elder H, Kamel B, Koch HR, Fogarty ND, Baums IB. Chromosome-level genome assemblies and genetic maps reveal heterochiasmy and macrosynteny in endangered Atlantic Acropora. BMC Genomics 2024; 25:1119. [PMID: 39567907 PMCID: PMC11577847 DOI: 10.1186/s12864-024-11025-3] [Received: 06/02/2024] [Accepted: 11/08/2024] [Indexed: 11/22/2024] Open
Abstract
BACKGROUND: Over their evolutionary history, corals have adapted to sea level rise and increasing ocean temperatures; however, it is unclear how quickly they may respond to rapid change. Genome structure and genetic diversity contained within may highlight their adaptive potential. RESULTS: We present chromosome-scale genome assemblies and linkage maps of the critically endangered Atlantic acroporids, Acropora palmata and A. cervicornis. Both assemblies and linkage maps were resolved into 14 chromosomes with their gene content and colinearity. Repeats and chromosome arrangements were largely preserved between the species. The family Acroporidae and the genus Acropora exhibited many phylogenetically significant gene family expansions. Macrosynteny decreased with phylogenetic distance. Nevertheless, scleractinians shared six of the 21 cnidarian ancestral linkage groups as well as numerous fission and fusion events compared to other distantly related cnidarians. Genetic linkage maps were constructed from one A. palmata family and 16 A. cervicornis families using a genotyping array. The consensus maps span 1,013.42 cM and 927.36 cM for A. palmata and A. cervicornis, respectively. Both species exhibited high genome-wide recombination rates (3.04 to 3.53 cM/Mb) and pronounced sex-based differences, known as heterochiasmy, with 2 to 2.5X higher recombination rates estimated in the female maps. CONCLUSIONS: Together, the chromosome-scale assemblies and genetic maps we present here are the first detailed look at the genomic landscapes of the critically endangered Atlantic acroporids. These data sets revealed that adaptive capacity of Atlantic acroporids is not limited by their recombination rates. The sister species maintain macrosynteny with few genes with high sequence divergence that may act as reproductive barriers between them.
In the Atlantic Acropora, hybridization between the two sister species yields an F1 hybrid with limited fertility despite the high levels of macrosynteny and gene colinearity of their genomes. Together, these resources now enable genome-wide association studies and discovery of quantitative trait loci, two tools that can aid in the conservation of these species.
Affiliation(s)
- Nicolas S Locatelli
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
- Sheila A Kitchen
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
- Department of Marine Biology, Texas A&M University at Galveston, Galveston, TX, USA
- Kathryn H Stankiewicz
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
- Institute for Systems Biology, Seattle, WA, USA
- C Cornelia Osborne
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
- Zoe Dellaert
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
- Holland Elder
- Australian Institute of Marine Science, Townsville, QLD, Australia
- Bishoy Kamel
- Lawrence Berkeley National Laboratory, Joint Genome Institute, Berkeley, CA, USA
- Hanna R Koch
- Mote Marine Laboratory, Coral Reef Restoration Program, Summerland Key, FL, USA
- Nicole D Fogarty
- Department of Biology and Marine Biology, University of North Carolina Wilmington, Wilmington, NC, USA
- Iliana B Baums
- Department of Biology, The Pennsylvania State University, University Park, PA, USA.
- Helmholtz Institute for Functional Marine Biodiversity at the University of Oldenburg (HIFMB), Heerstraße 231, Oldenburg, Ammerländer, 26129, Germany.
- Alfred Wegener Institute, Helmholtz-Centre for Polar and Marine Research (AWI), Am Handelshafen, Bremerhaven, Germany.
- Institute for Chemistry and Biology of the Marine Environment (ICBM), School of Mathematics and Science, Carl Von Ossietzky Universität Oldenburg, Ammerländer Heerstraße 114-118, Oldenburg, 26129, Germany.
|
26
|
Köhler G, Sameit J, Seipp R, Geiss K. A new species of giant gecko of the genus Rhacodactylus from New Caledonia (Squamata, Gekkota, Diplodactylidae). Zootaxa 2024; 5538:301-321. [PMID: 39645701 DOI: 10.11646/zootaxa.5538.4.1] [Received: 11/15/2024] [Indexed: 12/10/2024]
Abstract
We describe a new species of giant gecko, Rhacodactylus willihenkeli sp. nov., from New Caledonia. The new species is most similar in external appearance and molecular data (16S and ND2 sequences) to R. leachianus, from which it differs in coloration and by a genetic distance of 5.0% in the 16S gene fragment and 8.5% in the ND2 fragment.
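As a rough illustration of how a percentage genetic distance between two aligned gene fragments can be computed, the sketch below uses the uncorrected p-distance (the proportion of differing sites). The abstract does not state which distance measure was used, so p-distance is an assumption here, and the function name `p_distance` and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance between two aligned sequences.

    Assumed measure for illustration: sites where either sequence has
    a gap are skipped, and the result is differences / compared sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned fragments (assumed data): 1 mismatch over 20 sites.
toy_a = "ACGTACGTACGTACGTACGT"
toy_b = "ACGTACGAACGTACGTACGT"
distance = p_distance(toy_a, toy_b)
print(f"{distance:.1%}")
```

Model-corrected distances (for example, Kimura two-parameter) would give slightly different values for the same alignment.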
Affiliation(s)
- Gunther Köhler
- Senckenberg Forschungsinstitut und Naturmuseum; Senckenberganlage 25; 60325 Frankfurt a.M.; Germany.
- Robert Seipp
- Alt Praunheim 45; 60488 Frankfurt a.M.; Germany.
- Katharina Geiss
- Senckenberg Forschungsinstitut und Naturmuseum; Senckenberganlage 25; 60325 Frankfurt a.M.; Germany.
|
27
|
Shi Q, Zheng K, Li H, Wang B, Liang X, Li X, Wang J. LKLPDA: A Low-Rank Fast Kernel Learning Approach for Predicting piRNA-Disease Associations. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:2179-2187. [PMID: 39213276 DOI: 10.1109/tcbb.2024.3452055] [Indexed: 09/04/2024]
Abstract
Piwi-interacting RNAs (piRNAs) are increasingly recognized as potential biomarkers for various diseases. Investigating the complex relationship between piRNAs and diseases through computational methods can reduce the costs and risks associated with biological experiments. Fast kernel learning (FKL) is a classical method for multi-source data fusion that is widely employed in association prediction research. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper the effectiveness of the network-based ideal kernel. The conventional FKL method does not address this issue. In this study, we propose a low-rank fast kernel learning (LRFKL) algorithm, which consists of low-rank representation (LRR) and the FKL algorithm. The LRFKL algorithm is designed to mitigate the effects of noise on the network-based ideal kernel. Using LRFKL, we propose a novel approach for predicting piRNA-disease associations called LKLPDA. Specifically, we first compute the similarity matrices for piRNAs and diseases. Then we use the LRFKL to fuse the similarity matrices for piRNAs and diseases separately. Finally, the LKLPDA employs AutoGluon-Tabular for predictive analysis. Computational results show that LKLPDA effectively predicts piRNA-disease associations with higher accuracy compared to previous methods. In addition, case studies confirm the reliability of the model in predicting piRNA-disease associations.
|
28
|
Fu L, Xie Y, Ling S, Wang Y, Wang B, Du H, Peng Q, Sun H. findGSEP: estimating genome size of polyploid species using k-mer frequencies. Bioinformatics 2024; 40:btae647. [PMID: 39475440 PMCID: PMC11552620 DOI: 10.1093/bioinformatics/btae647] [Received: 02/02/2024] [Revised: 08/05/2024] [Accepted: 10/29/2024] [Indexed: 11/13/2024] Open
Abstract
SUMMARY: Estimating genome size using k-mer frequencies, which plays a fundamental role in designing genome sequencing and analysis projects, has remained challenging for polyploid species, i.e., ploidy p > 2. To address this, we introduce "findGSEP," which is designed based on iterative curve fitting of k-mer frequencies. Specifically, it first disentangles up to p normal distributions by analyzing k-mer frequencies in whole genome sequencing of the focal species. Second, it computes the sizes of genomic regions related to 1∼p (homologous) chromosome(s) using each respective curve fitting, from which it infers the full polyploid and average haploid genome size. "findGSEP" can handle any level of ploidy p and infers more accurate genome sizes than other well-known tools, as shown by tests using simulated and real genomic sequencing data of various species including octoploids. AVAILABILITY AND IMPLEMENTATION: "findGSEP" was implemented as a web server, which is freely available at http://146.56.237.198:3838/findGSEP/. "findGSEP" was also implemented as an R package for parallel processing of multiple samples. Source code and a tutorial on its installation and usage are available at https://github.com/sperfu/findGSEP.
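For orientation, the classic single-peak k-mer estimate of genome size divides the total number of k-mer occurrences by the depth at the main peak of the k-mer frequency histogram. The sketch below implements only that simple baseline, not findGSEP's iterative fitting of up to p normal distributions; the function name `estimate_genome_size`, the `min_depth` error cutoff, and the toy histogram are illustrative assumptions.

```python
def estimate_genome_size(histogram, min_depth=2):
    """Single-peak genome size estimate from a k-mer frequency histogram.

    `histogram` maps depth -> number of distinct k-mers seen at that depth.
    Depths below `min_depth` are dropped as likely sequencing errors
    (an assumed, simplistic filter). The estimate is
    total k-mer occurrences / depth of the most populated bin.
    """
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no histogram bins at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram (assumed data): an error spike at depth 1 and a peak at depth 10.
toy_histogram = {1: 5000, 9: 100, 10: 800, 11: 100}
size = estimate_genome_size(toy_histogram)
print(size)
```

In a polyploid genome the histogram shows several peaks at multiples of the base depth, which is why a single-peak estimate becomes unreliable and a multi-peak fitting approach such as the one described above is needed.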
Affiliation(s)
- Laiyi Fu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Research Institute of Xi’an Jiaotong University, Zhejiang, Hangzhou 311200, China
- Sichuan Digital Economy Industry Development Research Institute, Chengdu 610036, China
- Yanxin Xie
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Shunkang Ling
- College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832000, China
- Ying Wang
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Binzhong Wang
- Hubei Key Laboratory of Three Gorges Project for Conservation of Fishes, Yichang, Hubei 443100, China
- Hejun Du
- Hubei Key Laboratory of Three Gorges Project for Conservation of Fishes, Yichang, Hubei 443100, China
- Qinke Peng
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Hequan Sun
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne 50829, Germany
|
29
|
Cossette ML, Stewart DT, Shafer ABA. Comparative Genomics of the World's Smallest Mammals Reveals Links to Echolocation, Metabolism, and Body Size Plasticity. Genome Biol Evol 2024; 16:evae225. [PMID: 39431406 PMCID: PMC11544316 DOI: 10.1093/gbe/evae225] [Received: 05/02/2024] [Revised: 10/01/2024] [Accepted: 10/03/2024] [Indexed: 10/22/2024] Open
Abstract
Originating 30 million years ago, shrews (Soricidae) have diversified into around 400 species worldwide. Shrews display a wide array of adaptations, with some species having developed distinctive traits such as echolocation, underwater diving, and venomous saliva. Accordingly, these tiny insectivores are ideal to study the genomic mechanisms of evolution and adaptation. We conducted a comparative genomic analysis of four shrew species and 16 other mammals to identify genomic variations unique to shrews. Using two existing shrew genomes and two de novo assemblies for the maritime (Sorex maritimensis) and smoky (Sorex fumeus) shrews, we identified mutations in conserved regions of the genomes, also known as accelerated regions, gene families that underwent significant expansion, and positively selected genes. Our analyses unveiled shrew-specific genomic variants in genes associated with the nervous, metabolic, and auditory systems, which can be linked to unique traits in shrews. Notably, genes suggested to be under convergent evolution in echolocating mammals exhibited accelerated regions in shrews, and pathways linked to putative body size plasticity were detected. These findings provide insight into the evolutionary mechanisms shaping shrew species, shedding light on their adaptation and divergence over time.
Affiliation(s)
- Marie-Laurence Cossette
- Department of Environmental Life Sciences Graduate Program, Trent University, Peterborough, ON, Canada
- Aaron B A Shafer
- Department of Environmental Life Sciences Graduate Program, Trent University, Peterborough, ON, Canada
- Department of Forensic Science, Trent University, Peterborough, ON, Canada
|
30
|
SoundharaPandiyan N, Alphonse CRW, Thanumalaya S, Vincent SGP, Kannan RR. Genome sequencing of Caridina pseudogracilirostris and its comparative analysis with malacostracan crustaceans. 3 Biotech 2024; 14:276. [PMID: 39464522 PMCID: PMC11499489 DOI: 10.1007/s13205-024-04121-4] [Received: 03/13/2023] [Accepted: 10/04/2024] [Indexed: 10/29/2024] Open
Abstract
The shrimp Caridina pseudogracilirostris is commonly found in the brackish waters of the southwestern coastal regions of India. This study provides a comprehensive genomic investigation of C. pseudogracilirostris, offering insights into its genetic makeup, evolutionary dynamics, and functional annotations. The genomic DNA was isolated from tissue samples, sequenced using next-generation sequencing (NGS), and deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (Accession No: PRJNA847710). De novo sequencing indicated a genome size of 1.31 Gbp with a low heterozygosity of about 0.81%. Repeat masking and annotation revealed that repeated elements constitute 24.60% of the genome, with simple sequence repeats (SSRs) accounting for 7.26%. Gene prediction identified 14,101 genes, with functional annotations indicating involvement in critical biological processes such as development, cellular function, immunological responses, and reproduction. Furthermore, phylogenetic analysis revealed genomic links among Malacostraca species, indicating gene duplication as a strategy for genetic diversity and adaptation. C. pseudogracilirostris has 1,856 duplicated genes, reflecting a distinct genomic architecture and evolutionary strategy within the Malacostraca branch. These findings enhance our understanding of the genetic characteristics and evolutionary relationships of C. pseudogracilirostris, providing significant insights into the overall evolutionary dynamics of the Malacostraca group. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-024-04121-4.
Affiliation(s)
- NandhaGopal SoundharaPandiyan
- Centre for Molecular and Nanomedical Sciences, Centre for Nanoscience and Nanotechnology, School of Bio and Chemical Engineering, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu 600119 India
- Carlton Ranjith Wilson Alphonse
- Centre for Molecular and Nanomedical Sciences, Centre for Nanoscience and Nanotechnology, School of Bio and Chemical Engineering, Sathyabama Institute of Science and Technology, Chennai, Tamil Nadu 600119 India
- Rajaretinam Rajesh Kannan
- Department of Biotechnology, Sharda School of Engineering and Technology, Sharda University, Plot No, 32, 34, Knowledge Park III, Greater Noida, Uttar Pradesh 201306 India
|
31
|
Wade KJ, Suseno R, Kizer K, Williams J, Boquett J, Caillier S, Pollock NR, Renschen A, Santaniello A, Oksenberg JR, Norman PJ, Augusto DG, Hollenbach JA. MHConstructor: a high-throughput, haplotype-informed solution to the MHC assembly challenge. Genome Biol 2024; 25:274. [PMID: 39420419 PMCID: PMC11484429 DOI: 10.1186/s13059-024-03412-6] [Received: 01/29/2024] [Accepted: 09/30/2024] [Indexed: 10/19/2024] Open
Abstract
The extremely high levels of genetic polymorphism within the human major histocompatibility complex (MHC) limit the usefulness of reference-based alignment methods for sequence assembly. We incorporate a short-read, de novo assembly algorithm into a workflow for novel application to the MHC. MHConstructor is a containerized pipeline designed for high-throughput, haplotype-informed, reproducible assembly of both whole genome sequencing and target capture short-read data in large population cohorts. To date, no other self-contained tool exists for the generation of de novo MHC assemblies from short-read data. MHConstructor facilitates widespread access to high-quality, alignment-free MHC sequence analysis.
Collapse
Affiliation(s)
- Kristen J Wade
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Rayo Suseno
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Kerry Kizer
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Jacqueline Williams
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Juliano Boquett
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Stacy Caillier
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Nicholas R Pollock
- Department of Biomedical Informatics, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA
- Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA
| | - Adam Renschen
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Adam Santaniello
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Jorge R Oksenberg
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA
| | - Paul J Norman
- Department of Biomedical Informatics, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA
- Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA
| | - Danillo G Augusto
- Department of Biological Sciences, University of North Carolina Charlotte, Charlotte, NC, USA
- Programa de Pós-Graduação em Genética, Universidade Federal do Paraná, Curitiba, Brazil
| | - Jill A Hollenbach
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA.
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA.
| |
|
32
|
Bessa MH, Gottschalk MS, Robe LJ. Whole genome phylogenomics helps to resolve the phylogenetic position of the Zygothrica genus group (Diptera, Drosophilidae) and the causes of previous incongruences. Mol Phylogenet Evol 2024; 199:108158. [PMID: 39025321 DOI: 10.1016/j.ympev.2024.108158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Revised: 06/28/2024] [Accepted: 07/14/2024] [Indexed: 07/20/2024]
Abstract
Incomplete Lineage Sorting (ILS) and introgression are two of the main factors causing incongruence between gene and species trees. Advances in phylogenomic studies have allowed us to overcome most of these issues, providing reliable phylogenetic hypotheses while revealing the underlying evolutionary scenario. Across the last century, many incongruent phylogenetic reconstructions were recovered for Drosophilidae, employing a limited sampling of genetic markers or species. In these studies, the monophyly and the phylogenetic positioning of the Zygothrica genus group stood out as one of the most controversial questions. Thus, here, we addressed these issues using a phylogenomic approach, while assessing the influence of ILS and introgressions on the diversification of these species and addressing the spatio-temporal scenario associated with their evolution. For this task, the genomes of nine specimens from six Neotropical species belonging to the Zygothrica genus group were sequenced and evaluated in a phylogenetic framework encompassing 39 other species of Drosophilidae. Nucleotide and amino acid sequences recovered for a set of 2,534 single-copy genes by BUSCO were employed to reconstruct maximum likelihood (ML) concatenated and multi-species coalescent (MSC) trees. Likelihood mapping, quartet sampling, and reticulation tests were employed to infer the level and causes of incongruence. Lastly, a penalized-likelihood molecular clock strategy with fossil calibrations was performed to infer divergence times. Taken together, our results recovered the subdivision of Drosophila into six different lineages, one of which clusters species of the Zygothrica genus group (except for H. duncani). The divergence of this lineage was dated to Oligocene ∼ 31 Mya and seems to have occurred in the same timeframe as other key diversifications within Drosophila. According to the concatenated and MSC strategies, this lineage is sister to the clade joining Drosophila (Siphlodora) with the Hawaiian Drosophila and Scaptomyza. Likelihood mapping, quartet sampling, reticulation reconstructions as well as introgression tests revealed that this lineage was the target of several hybridization events involving the ancestors of different Drosophila lineages. Thus, our results generally show introgression as a major source of previous incongruence. Nevertheless, the similar diversification times recovered for several of the Neotropical Drosophila lineages also support the scenario of multiple and simultaneous diversifications taking place at the base of Drosophilidae phylogeny, at least in the Neotropics.
Affiliation(s)
- Maiara Hartwig Bessa
- Programa de Pós-Graduação em Biodiversidade Animal (PPGBA), Universidade Federal de Santa Maria (UFSM), Brazil
| | - Marco Silva Gottschalk
- Programa de Pós-Graduação em Biodiversidade Animal (PPGBDiv), Instituto de Biologia, Universidade Federal de Pelotas (UFPel), Brazil
| | - Lizandra Jaqueline Robe
- Programa de Pós-Graduação em Biodiversidade Animal (PPGBA), Universidade Federal de Santa Maria (UFSM), Brazil.
| |
|
33
|
Li W, Almirantis Y, Provata A. Range-limited Heaps' law for functional DNA words in the human genome. J Theor Biol 2024; 592:111878. [PMID: 38901778 DOI: 10.1016/j.jtbi.2024.111878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Revised: 05/31/2024] [Accepted: 06/10/2024] [Indexed: 06/22/2024]
Abstract
Heaps' or Herdan-Heaps' law is a linguistic law describing the relationship between the vocabulary/dictionary size (type) and word counts (token) as a power-law function. Its existence in genomes with a certain definition of DNA words is unclear, partly because the dictionary size in a genome could be much smaller than that in a human language. We define a DNA word as a coding region in a genome that codes for a protein domain. Using human chromosomes and chromosome arms as individual samples, we establish the existence of Heaps' law in the human genome within a limited range. Our definition of words in a genomic or proteomic context is different from other definitions such as over-represented k-mers, which are much shorter in length. Although an approximate power-law distribution of protein domain sizes due to gene duplication and the related Zipf's law is well known, their translation to Heaps' law in DNA words is not automatic. Several other animal genomes are shown herein also to exhibit range-limited Heaps' law with our definition of DNA words, though with various exponents. When tokens were randomly sampled and sample sizes reached the maximum level, a deviation from Heaps' law was observed, but a quadratic regression in the log-log type-token plot fits the data perfectly. Investigation of the type-token plot and its regression coefficients could provide an alternative narrative of reusage and redundancy of protein domains as well as creation of new protein domains from a linguistic perspective.
Affiliation(s)
- Wentian Li
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA; The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA.
| | - Yannis Almirantis
- Theoretical Biology and Computational Genomics Laboratory, Institute of Bioscience and Applications, National Center for Scientific Research "Demokritos", 15341 Athens, Greece
| | - Astero Provata
- Statistical Mechanics and Dynamical Systems Laboratory, Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", 15341 Athens, Greece
| |
|
34
|
Tomihara K, Llopart A, Yamamoto D. A chromosome-level genome assembly of Drosophila madeirensis, a fruit fly species endemic to the island of Madeira. G3 (Bethesda) 2024; 14:jkae167. [PMID: 39031588 PMCID: PMC11373663 DOI: 10.1093/g3journal/jkae167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 02/20/2024] [Accepted: 07/11/2024] [Indexed: 07/22/2024]
Abstract
Drosophila subobscura is distributed across Europe, the Near East, and the Americas, while its sister species, Drosophila madeirensis, is endemic to the island of Madeira in the Atlantic Ocean. D. subobscura is known for its strict light-dependence in mating and its unique courtship displays, including nuptial gift-giving. D. subobscura has also attracted the interest of researchers because of its abundant variations in chromosomal polymorphisms correlated to the latitude and season, which have been used as a tool to track global climate warming. Although D. madeirensis can be an important resource for understanding the evolutionary underpinning of these genetic characteristics of D. subobscura, little work has been done on the biology of this species. Here, we used a HiFi long-read sequencing data set to produce a de novo genome assembly for D. madeirensis. This assembly comprises a total of 111 contigs spanning 135.5 Mb and has an N50 of 24.2 Mb and a BUSCO completeness score of 98.6%. Each of the 6 chromosomes of D. madeirensis consisted of a single contig except for some centromeric regions. Breakpoints of the chromosomal inversions between D. subobscura and D. madeirensis were characterized using this genome assembly, updating some of the previously identified locations.
Affiliation(s)
- Kenta Tomihara
- Advanced ICT Research Institute, National Institute of Information and Communications Technology, Kobe, Hyogo 651-2492, Japan
| | - Ana Llopart
- Interdisciplinary Graduate Program in Genetics, University of Iowa, Iowa City, IA 52242, USA
- Department of Biology, University of Iowa, Iowa City, IA 52242, USA
| | - Daisuke Yamamoto
- Advanced ICT Research Institute, National Institute of Information and Communications Technology, Kobe, Hyogo 651-2492, Japan
| |
|
35
|
Jackson TK, Rhode C. Comparative genomics of dusky kob (Argyrosomus japonicus, Sciaenidae) conspecifics: Evidence for speciation and the genetic mechanisms underlying traits. J Fish Biol 2024; 105:841-857. [PMID: 38885946 DOI: 10.1111/jfb.15844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 04/17/2024] [Accepted: 05/28/2024] [Indexed: 06/20/2024]
Abstract
Dusky kob (Argyrosomus japonicus) is a commercially important finfish, indigenous to South Africa, Australia, and China. Previous studies highlighted differences in genetic composition, life history, and morphology of the species across geographic regions. A draft genome sequence of 0.742 Gb (N50 = 5.49 Mb; BUSCO completeness = 97.8%) and 22,438 predicted protein-coding genes was generated for the South African (SA) conspecific. A comparison with the Chinese (CN) conspecific revealed a core set of 32,068 orthologous protein clusters across both genomes. The SA genome exhibited 440 unique clusters compared to 1928 unique clusters in the CN genome. Transportation and immune response processes were overrepresented among the SA accessory genome, whereas the CN accessory genome was enriched for immune response, DNA transposition, and sensory detection (FDR-adjusted p < 0.01). These unique clusters may represent an adaptive component of the species' pangenome that could explain population divergence due to differential environmental specialisation. Furthermore, 700 single-copy orthologues (SCOs) displayed evidence of positive selection between the SA and CN genomes, and globally these genomes shared only 92% similarity, suggesting they might be distinct species. These genes primarily play roles in metabolism and digestion, illustrating the evolutionary pathways that differentiate the species. Understanding these genomic mechanisms underlying adaptation and evolution within and between species provides valuable insights into growth and maturation of kob, traits that are particularly relevant to commercial aquaculture.
Affiliation(s)
- Tassin Kim Jackson
- Department of Genetics, Stellenbosch University, Stellenbosch, South Africa
| | - Clint Rhode
- Department of Genetics, Stellenbosch University, Stellenbosch, South Africa
| |
|
36
|
Beeloo R, Zomer A, Deorowicz S, Dutilh B. Graphite: painting genomes using a colored de Bruijn graph. NAR Genom Bioinform 2024; 6:lqae142. [PMID: 39445080 PMCID: PMC11497850 DOI: 10.1093/nargab/lqae142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 08/02/2024] [Accepted: 10/05/2024] [Indexed: 10/25/2024] Open
Abstract
The recent growth of microbial sequence data allows comparisons at unprecedented scales, enabling the tracking of strains, mobile genetic elements, or genes. Querying a genome against a large reference database can easily yield thousands of matches that are tedious to interpret and pose computational challenges. We developed Graphite, which uses a colored de Bruijn graph (cDBG) to paint query genomes, selecting the local best matches along the full query length. By focusing on the best genomic match of each query region, Graphite reduces the number of matches while providing the most promising leads for sequence tracking or genomic forensics. When applied to hundreds of Campylobacter genomes, we found extensive gene sharing, including a previously undetected C. coli plasmid that matched a C. jejuni chromosome. Together, genome painting using cDBGs, as enabled by Graphite, can reveal new biological phenomena by mitigating computational hurdles.
Affiliation(s)
- Rick Beeloo
- Theoretical Biology and Bioinformatics, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
| | - Aldert L Zomer
- Department of Infectious Diseases and Immunology, Faculty of Veterinary Medicine, Utrecht University, 3584 Utrecht, The Netherlands
| | - Sebastian Deorowicz
- Department of Algorithmics and Software, Silesian University of Technology, Akademicka 16, Gliwice PL-44100, Poland
| | - Bas E Dutilh
- Theoretical Biology and Bioinformatics, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University Jena, 07743 Jena, Germany
| |
|
37
|
Wang C, Liu L, Yin M, Liu B, Wu Y, Eller F, Gao Y, Brix H, Wang T, Guo W, Salojärvi J. Chromosome-level genome assemblies reveal genome evolution of an invasive plant Phragmites australis. Commun Biol 2024; 7:1007. [PMID: 39154094 PMCID: PMC11330502 DOI: 10.1038/s42003-024-06660-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 07/30/2024] [Indexed: 08/19/2024] Open
Abstract
Biological invasions pose a significant threat to ecosystems, disrupting local biodiversity and ecosystem functions. The genomic underpinnings of invasiveness, however, are still largely unknown, making it difficult to predict and manage invasive species effectively. The common reed (Phragmites australis) is a dominant grass species in wetland ecosystems and has become particularly invasive when transferred from Europe to North America. Here, we present a high-quality gap-free, telomere-to-telomere genome assembly of Phragmites australis consisting of 24 pseudochromosomes and a B chromosome. Fully phased subgenomes demonstrated considerable subgenome dominance and revealed the divergence of diploid progenitors approximately 30.9 million years ago. Comparative genomics using chromosome-level scaffolds for three other lineages and a previously published draft genome assembly of an invasive lineage revealed that gene family expansions in the form of tandem duplications may have contributed to the invasiveness of the lineage. This study sheds light on the genome evolution of Arundinoideae grasses and suggests that genetic drivers, such as gene family expansions and tandem duplications, may underlie the processes of biological invasion in plants. These findings provide a crucial step toward understanding and managing the genetic basis of invasiveness in plant species.
Affiliation(s)
- Cui Wang
- Key Laboratory of Ecological Prewarning, Protection and Restoration of Bohai Sea, Ministry of Natural Resources, School of Life Sciences, Shandong University, Qingdao, PR China
- Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
| | - Lele Liu
- Key Laboratory of Ecological Prewarning, Protection and Restoration of Bohai Sea, Ministry of Natural Resources, School of Life Sciences, Shandong University, Qingdao, PR China
| | - Meiqi Yin
- Key Laboratory of Ecological Prewarning, Protection and Restoration of Bohai Sea, Ministry of Natural Resources, School of Life Sciences, Shandong University, Qingdao, PR China
| | - Bingbing Liu
- Institute of Loess Plateau, Shanxi University, Taiyuan, China
| | - Yiming Wu
- Key Laboratory of Ecological Prewarning, Protection and Restoration of Bohai Sea, Ministry of Natural Resources, School of Life Sciences, Shandong University, Qingdao, PR China
| | | | - Yingqi Gao
- Institute of Loess Plateau, Shanxi University, Taiyuan, China
| | - Hans Brix
- Department of Biology, Aarhus University, Aarhus, Denmark
| | - Tong Wang
- College of Landscape Architecture and Forestry, Qingdao Agricultural University, Qingdao, China
| | - Weihua Guo
- Key Laboratory of Ecological Prewarning, Protection and Restoration of Bohai Sea, Ministry of Natural Resources, School of Life Sciences, Shandong University, Qingdao, PR China.
| | - Jarkko Salojärvi
- Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland.
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore.
| |
|
38
|
Mira-Jover A, Graciá E, Giménez A, Fritz U, Rodríguez-Caro RC, Bourgeois Y. Taking advantage of reference-guided assembly in a slowly-evolving lineage: Application to Testudo graeca. PLoS One 2024; 19:e0303408. [PMID: 39121089 PMCID: PMC11315351 DOI: 10.1371/journal.pone.0303408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 07/22/2024] [Indexed: 08/11/2024] Open
Abstract
BACKGROUND: Obtaining de novo chromosome-level genome assemblies greatly enhances conservation and evolutionary biology studies. For many research teams, long-read sequencing technologies (that produce highly contiguous assemblies) remain unaffordable or impractical. For the groups that display high synteny conservation, these limitations can be overcome by a reference-guided assembly using a close relative genome. Among chelonians, tortoises (Testudinidae) are considered one of the most endangered taxa, which calls for more genomic resources. Here we make the most of high synteny conservation in chelonians to produce the first chromosome-level genome assembly of the genus Testudo with one of the most iconic tortoise species in the Mediterranean basin: Testudo graeca. RESULTS: We used high-quality, paired-end Illumina sequences to build a reference-guided assembly with the chromosome-level reference of Gopherus evgoodei. We reconstructed a 2.29 Gb haploid genome with a scaffold N50 of 107.598 Mb and 5.37% gaps. We sequenced 25,998 protein-coding genes, and identified 41.2% of the assembly as repeats. Demographic history reconstruction based on the genome revealed two events (population decline and recovery) that were consistent with previously suggested phylogeographic patterns for the species. This outlines the value of such reference-guided assemblies for phylogeographic studies. CONCLUSIONS: Our results highlight the value of using close relatives to produce de novo draft assemblies in species where such resources are unavailable. Our annotated genome of T. graeca paves the way to delve deeper into the species' evolutionary history and provides a valuable resource to enhance direct conservation efforts on their threatened populations.
Affiliation(s)
- Andrea Mira-Jover
- Ecology Area, University Institute for Agro-food and Agro-environmental Research and Innovation (CIAGRO), Miguel Hernández University, Elche, Carretera de Beniel, Orihuela (Alicante), Spain
| | - Eva Graciá
- Ecology Area, University Institute for Agro-food and Agro-environmental Research and Innovation (CIAGRO), Miguel Hernández University, Elche, Carretera de Beniel, Orihuela (Alicante), Spain
| | - Andrés Giménez
- Ecology Area, University Institute for Agro-food and Agro-environmental Research and Innovation (CIAGRO), Miguel Hernández University, Elche, Carretera de Beniel, Orihuela (Alicante), Spain
| | - Uwe Fritz
- Museum of Zoology, Senckenberg Dresden, Dresden, Germany
| | | | | |
|
39
|
Hashiguchi Y, Mishina T, Takeshima H, Nakayama K, Tanoue H, Takeshita N, Takahashi H. Draft Genome of Akame (Lates Japonicus) Reveals Possible Genetic Mechanisms for Long-Term Persistence and Adaptive Evolution with Low Genetic Diversity. Genome Biol Evol 2024; 16:evae174. [PMID: 39109913 PMCID: PMC11346364 DOI: 10.1093/gbe/evae174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/03/2024] [Indexed: 08/27/2024] Open
Abstract
It is known that some endangered species have persisted for thousands of years despite their very small effective population sizes and low levels of genetic polymorphisms. To understand the genetic mechanisms of long-term persistence in threatened species, we determined the whole genome sequences of akame (Lates japonicus), which has survived for a long time with extremely low genetic variations. Genome-wide heterozygosity in akame was estimated to be 3.3 to 3.4 × 10⁻⁴/bp, one of the smallest values in teleost fishes. Analysis of demographic history revealed that the effective population size in akame was around 1,000 from 30,000 years ago to the recent past. The relatively high ratio of nonsynonymous to synonymous heterozygosity in akame indicated an increased genetic load. However, a detailed analysis of genetic diversity in the akame genome revealed that multiple genomic regions, including genes involved in immunity, synaptic development, and olfactory sensory systems, have retained relatively high nucleotide polymorphisms. This implies that the akame genome has preserved the functional genetic variations by balancing selection, to avoid a reduction in viability and loss of adaptive potential. Analysis of synonymous and nonsynonymous nucleotide substitution rates has detected signs of positive selection in many akame genes, suggesting adaptive evolution to temperate waters after the speciation of akame and its close relative, barramundi (Lates calcarifer). Our results indicate that the functional genetic diversity likely contributed to the long-term persistence of this species by avoiding the harmful effects of the population size reduction.
Affiliation(s)
- Yasuyuki Hashiguchi
- Department of Biology, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Takatsuki, Osaka 569-0801, Japan
| | - Tappei Mishina
- Laboratory for Chromosome Segregation, RIKEN Center for Biosystems Dynamics Research (BDR), Chuo-ku, Kobe 650-0047, Japan
- Faculty of Agriculture, Kyushu University, Nishi-ku, Fukuoka 819-0395, Japan
| | - Hirohiko Takeshima
- Faculty of Marine Bioscience, Research Center for Marine Biosciences, Fukui Prefectural University, Obama, Fukui 917-0003, Japan
| | - Kouji Nakayama
- Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University, Kyoto 606-8502, Japan
| | - Hideaki Tanoue
- Operations Evaluation Division, General Planning and Coordination Department, Headquarters, Japan Fisheries Research and Education Agency, Yokohama, Kanagawa 221-8529, Japan
| | - Naohiko Takeshita
- Department of Applied Aquabiology, National Fisheries University, Shimonoseki, Yamaguchi 759-6595, Japan
| | - Hiroshi Takahashi
- Department of Applied Aquabiology, National Fisheries University, Shimonoseki, Yamaguchi 759-6595, Japan
| |
|
40
|
Jackson AC, Carine MA, Chapman MA. Genomics of ecological adaptation in Canary Island Descurainia (Brassicaceae) and comparisons with other Brassicaceae. Ecol Evol 2024; 14:e70144. [PMID: 39119179 PMCID: PMC11307170 DOI: 10.1002/ece3.70144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 07/16/2024] [Accepted: 07/25/2024] [Indexed: 08/10/2024] Open
Abstract
Oceanic archipelagos provide striking examples of lineages that have radiated over pronounced ecological gradients. Accompanying this diversification, lineages have evolved adaptations allowing survival in extreme environments. Here, we investigate the genomic basis of ecological adaptation in Canary Island Descurainia (Brassicaceae), an island relative of Arabidopsis. The seven endemic species have diversified in situ along an elevational and ecological gradient, from low-elevation scrub to high-elevation sub-alpine desert. We first generated a reference genome for Descurainia millefolia, phylogenetic analysis of which placed it as sister to D. sophioides. Ninety-six gene families were found to be specific to D. millefolia and a further 1087 and 1469 gene families have expanded or contracted in size, respectively, along the D. millefolia branch. We then employed genome re-sequencing to sample 14 genomes across the seven species of Canary Island Descurainia and an outgroup. Phylogenomic analyses were consistent with previous reconstructions of Canary Island Descurainia in resolving low- and high-elevation clades. Using the branch-site dN/dS method, we detected positive selection for 275 genes on the branch separating the low- and high-elevation species and these positively selected genes (PSGs) were significantly enriched for functions related to reproduction and stress tolerance. Comparing PSGs to those in analyses of adaptation to elevation and/or latitude in other Brassicaceae, we found little evidence of widespread convergence and gene reuse, except for two examples, one of which was a significant overlap between Descurainia and Draba nivalis, a species restricted to high latitudes. The study of Canary Island Descurainia suggests that the transition to high-elevation environments such as those found in the high mountains of the Canary Islands involves selection on genes related to reproduction and stress tolerance but that repeated evolution across different lineages that have evolved into similar habitats is limited, indicating substantially different molecular trajectories to adaptation.
Affiliation(s)
- Amy C. Jackson
- Biological Sciences, University of Southampton, Southampton, UK
- Algae, Fungi and Plants Division, The Natural History Museum, London, UK
- Present address: Royal Botanic Gardens, Kew, Kew Green, Richmond, Surrey, UK
| | - Mark A. Carine
- Algae, Fungi and Plants Division, The Natural History Museum, London, UK
| | | |
|
41
|
Todd A, Bhide K, Hayford R, Ayyappan V, Subramani M, Chintapenta LK, Thimmapuram J, Ozbay G, Kalavacharla V(K. Development of a Reference Transcriptome and Identification of Differentially Expressed Genes Linked to Salt Stress in Salt Marsh Grass (Sporobolus alterniflorus) along Delaware Coastal Regions. Plants (Basel) 2024; 13:2008. [PMID: 39065534 PMCID: PMC11280579 DOI: 10.3390/plants13142008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2024] [Revised: 07/16/2024] [Accepted: 07/16/2024] [Indexed: 07/28/2024]
Abstract
Salt marsh grass (Sporobolus alterniflorus) plays a crucial role in Delaware coastal regions by serving as a physical barrier between land and water along the inland bays and beaches. This vegetation helps to stabilize the shoreline and prevent erosion, protecting the land from the powerful forces of the waves and tides. In addition to providing a physical barrier, salt marsh grass is responsible for filtering nutrients in the water, offering an environment for aquatic species and presenting a focal point of study for high salt tolerance in plants. As seawater concentrations vary along the Delaware coast from low to medium to high salinity, our study seeks to identify the impact of salt tolerance in marsh grass and to identify genes associated with salt tolerance levels. We developed more than 211,000 next-generation-sequencing (Illumina) transcriptomic reads to create a reference transcriptome from low-, medium-, and high-salinity marsh grass leaf samples collected from the Delaware coastline. Contiguous sequences were annotated based on a homology search using BLASTX against rice (Oryza sativa), foxtail millet (Setaria italica), and non-redundant species within the Viridiplantae database. Additionally, we identified differentially expressed genes related to salinity stress as candidates for salt stress qPCR analysis. The data generated from this study may help to elucidate the genetic signatures and physiological responses of plants to salinity stress, thereby offering valuable insight into the use of innovative approaches for gene expression studies in crops that are less salt tolerant.
Affiliation(s)
- Antonette Todd
- College of Agriculture, Science and Technology, Delaware State University, Dover, DE 19901, USA
| | - Ketaki Bhide
- Bioinformatics Core, Purdue University, West Lafayette, IN 47907, USA
| | - Rita Hayford
- College of Agriculture, Science and Technology, Delaware State University, Dover, DE 19901, USA
| | - Vasudevan Ayyappan
- College of Agriculture, Science and Technology, Delaware State University, Dover, DE 19901, USA
| | - Mayavan Subramani
- College of Agriculture, Science and Technology, Delaware State University, Dover, DE 19901, USA
| | - Lathadevi Karuna Chintapenta
- Department of Biology, College of Arts and Sciences, University of Wisconsin River Falls, River Falls, WI 54022, USA
| | - Jyothi Thimmapuram
- Bioinformatics Core, Purdue University, West Lafayette, IN 47907, USA
| | - Gulnihal Ozbay
- College of Agriculture, Science and Technology, Delaware State University, Dover, DE 19901, USA
| | - Venu (Kal) Kalavacharla
- College of Agriculture, Science and Technology, Delaware State University, Dover, DE 19901, USA
| |
|
42
|
Shi CY, Qin GL, Qin YC, Lu LY, Guan DL, Gao LX. A high-quality chromosome-level genome assembly of the endangered tree Kmeria septentrionalis. Sci Data 2024; 11:775. [PMID: 39003271 PMCID: PMC11246460 DOI: 10.1038/s41597-024-03617-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 07/05/2024] [Indexed: 07/15/2024] Open
Abstract
Kmeria septentrionalis is a critically endangered tree endemic to Guangxi, China, and is listed on the International Union for Conservation of Nature's Red List. The lack of genetic information and high-quality genome data has hindered conservation efforts and studies on this species. In this study, we present a chromosome-level genome assembly of K. septentrionalis. The genome was initially assembled to be 2.57 Gb, with a contig N50 of 11.93 Mb. Hi-C guided genome assembly allowed us to anchor 98.83% of the total length of the initial contigs onto 19 pseudochromosomes, resulting in a scaffold N50 of 135.08 Mb. The final chromosome-level genome, spanning 2.54 Gb, achieved a BUSCO completeness of 98.9% and contained 1.67 Gb of repetitive elements and 35,927 coding genes. This high-quality genome assembly provides a valuable resource for understanding the genetic basis of conservation-related traits and biological properties of this endangered tree species. Furthermore, it lays a critical foundation for evolutionary studies within the Magnoliaceae family.
Grants
- This study was supported by the Scientific research project of Hechi University (Grant No: 2021GCC023, 2021GCC017, 2023GCC017), and Research platform of “Northwest Guangxi characteristic plant resources development and function research center”, “Northwest Guangxi Economic Plant Biotechnology Research Center” and “Screening and Breeding of high-value Medicinal plants in Krast”.
- National Key Research and Development Program of China (Grant No.2022YFC2601400), the National Nature Science Foundation (Grant No: 32102205), the Nanfan special project, CAAS (Grant No: ZDXM2312), and the Program of Beijing Academy of Agriculture and Forestry Sciences (Grant No: JKZX202208).
Affiliation(s)
- Chen-Yu Shi
- Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, School of Chemistry and Bioengineering, Hechi University, Hechi, 546300, China
| | - Guo-Le Qin
- Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, School of Chemistry and Bioengineering, Hechi University, Hechi, 546300, China
| | - Ying-Can Qin
- Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, School of Chemistry and Bioengineering, Hechi University, Hechi, 546300, China
| | - Lin-Yuan Lu
- Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, School of Chemistry and Bioengineering, Hechi University, Hechi, 546300, China
| | - De-Long Guan
- Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, School of Chemistry and Bioengineering, Hechi University, Hechi, 546300, China.
| | - Li-Xia Gao
- Guangxi Key Laboratory of Sericulture Ecology and Applied Intelligent Technology, School of Chemistry and Bioengineering, Hechi University, Hechi, 546300, China.
| |
|
43
|
Guguchkin E, Kasianov A, Belenikin M, Zobkova G, Kosova E, Makeev V, Karpulevich E. Enhancing SNV identification in whole-genome sequencing data through the incorporation of known genetic variants into the minimap2 index. BMC Bioinformatics 2024; 25:238. [PMID: 39003441 PMCID: PMC11246581 DOI: 10.1186/s12859-024-05862-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 07/10/2024] [Indexed: 07/15/2024] Open
Abstract
MOTIVATION Alignment of reads to a reference genome sequence is one of the key steps in the analysis of human whole-genome sequencing data obtained through Next-generation sequencing (NGS) technologies. The quality of the subsequent steps of the analysis, such as the results of clinical interpretation of genetic variants or the results of a genome-wide association study, depends on the correct identification of the position of the read as a result of its alignment. The amount of human NGS whole-genome sequencing data is constantly growing. There are a number of human genome sequencing projects worldwide that have resulted in the creation of large-scale databases of genetic variants of sequenced human genomes. Such information about known genetic variants can be used to improve the quality of alignment at the read alignment stage when analysing sequencing data obtained for a new individual, for example, by creating a genomic graph. While existing methods for aligning reads to a linear reference genome have high alignment speed, methods for aligning reads to a genomic graph have greater accuracy in variable regions of the genome. The development of a read alignment method that takes into account known genetic variants in the linear reference sequence index allows combining the advantages of both sets of methods. RESULTS In this paper, we present the minimap2_index_modifier tool, which enables the construction of a modified index of a reference genome using known single nucleotide variants and insertions/deletions (indels) specific to a given human population. The use of the modified minimap2 index improves variant calling quality without modifying the bioinformatics pipeline and without significant additional computational overhead. 
Using the PrecisionFDA Truth Challenge V2 benchmark data (for HG002 short-read data aligned to the GRCh38 linear reference (GCA_000001405.15) with parameters k = 27 and w = 14) it was demonstrated that the number of false negative genetic variants decreased by more than 9500, and the number of false positives decreased by more than 7000 when modifying the index with genetic variants from the Human Pangenome Reference Consortium.
Affiliation(s)
- Egor Guguchkin
- Ivannikov Institute for System Programming, Moscow, Russia.
| | - Artem Kasianov
- Institute for Information Transmission Problems, Moscow, Russia
| | - Vsevolod Makeev
- Vavilov Institute of General Genetics, Moscow, Russia
- Institute of Biochemistry and Genetics of Ufa Scientific Centre, Ufa, Russia
- Cancer Research UK National Biomarker Centre, University of Manchester, Manchester, M20 4BX, UK
| |
|
44
|
Kitchen SA, Naragon TH, Brückner A, Ladinsky MS, Quinodoz SA, Badroos JM, Viliunas JW, Kishi Y, Wagner JM, Miller DR, Yousefelahiyeh M, Antoshechkin IA, Eldredge KT, Pirro S, Guttman M, Davis SR, Aardema ML, Parker J. The genomic and cellular basis of biosynthetic innovation in rove beetles. Cell 2024; 187:3563-3584.e26. [PMID: 38889727 PMCID: PMC11246231 DOI: 10.1016/j.cell.2024.05.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 02/29/2024] [Accepted: 05/06/2024] [Indexed: 06/20/2024]
Abstract
How evolution at the cellular level potentiates macroevolutionary change is central to understanding biological diversification. The >66,000 rove beetle species (Staphylinidae) form the largest metazoan family. Combining genomic and cell type transcriptomic insights spanning the largest clade, Aleocharinae, we retrace evolution of two cell types comprising a defensive gland, a putative catalyst behind staphylinid megadiversity. We identify molecular evolutionary steps leading to benzoquinone production by one cell type via a mechanism convergent with plant toxin release systems, and synthesis by the second cell type of a solvent that weaponizes the total secretion. This cooperative system has been conserved since the Early Cretaceous as Aleocharinae radiated into tens of thousands of lineages. Reprogramming each cell type yielded biochemical novelties enabling ecological specialization, most dramatically in symbionts that infiltrate social insect colonies via host-manipulating secretions. Our findings uncover cell type evolutionary processes underlying the origin and evolvability of a beetle chemical innovation.
Affiliation(s)
- Sheila A Kitchen
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Thomas H Naragon
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Adrian Brückner
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Mark S Ladinsky
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Sofia A Quinodoz
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Jean M Badroos
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Joani W Viliunas
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Yuriko Kishi
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Julian M Wagner
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - David R Miller
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Mina Yousefelahiyeh
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Igor A Antoshechkin
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - K Taro Eldredge
- Museum of Zoology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Stacy Pirro
- Iridian Genomes, 613 Quaint Acres Dr., Silver Spring, MD 20904, USA
| | - Mitchell Guttman
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Steven R Davis
- Division of Invertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA
| | - Matthew L Aardema
- Department of Biology, Montclair State University, Montclair, NJ 07043, USA
| | - Joseph Parker
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.
| |
|
45
|
Gaspar J, Trewick SA, Gibb GC. De-novo assembly of four rail (Aves: Rallidae) genomes: A resource for comparative genomics. Ecol Evol 2024; 14:e11694. [PMID: 39026944 PMCID: PMC11255403 DOI: 10.1002/ece3.11694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 06/19/2024] [Accepted: 06/24/2024] [Indexed: 07/20/2024] Open
Abstract
Rails are a phenotypically diverse family of birds that includes 130 species and is widely distributed around the world. Here we present annotated genome assemblies for four rails from Aotearoa New Zealand: two native volant species, pūkeko Porphyrio melanotus and mioweka Gallirallus philippensis, and two endemic flightless species, takahē Porphyrio hochstetteri and weka Gallirallus australis. Using the sequence read data, heterozygosity was found to be lowest in the endemic flightless species, which probably reflects their relatively small populations. The quality checks and comparison with other rallid genomes showed that the new assemblies were of good quality. This study significantly increases the number of available rallid genomes and will enable future genomic studies on the evolution of this family.
Affiliation(s)
- Julien Gaspar
- School of Food Technology and Natural Sciences, Wildlife and Ecology Group, Massey University, Palmerston North, New Zealand
- Royal Belgian Institute of Natural Sciences, Brussels, Belgium
| | - Steve A. Trewick
- School of Food Technology and Natural Sciences, Wildlife and Ecology Group, Massey University, Palmerston North, New Zealand
| | - Gillian C. Gibb
- School of Food Technology and Natural Sciences, Wildlife and Ecology Group, Massey University, Palmerston North, New Zealand
| |
|
46
|
Serajian M, Marini S, Alanko JN, Noyes NR, Prosperi M, Boucher C. Scalable de novo classification of antibiotic resistance of Mycobacterium tuberculosis. Bioinformatics 2024; 40:i39-i47. [PMID: 38940175 PMCID: PMC11211809 DOI: 10.1093/bioinformatics/btae243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION The World Health Organization estimates that there were over 10 million cases of tuberculosis (TB) worldwide in 2019, resulting in over 1.4 million deaths, with a worrisome increasing trend yearly. The disease is caused by Mycobacterium tuberculosis (MTB) through airborne transmission. Treatment of TB is estimated to be 85% successful; however, this drops to 57% if MTB exhibits multiple antimicrobial resistance (AMR), for which fewer treatment options are available. RESULTS We develop a robust machine-learning classifier using both linear and nonlinear models (i.e. LASSO logistic regression (LR) and random forests (RF)) to predict the phenotypic resistance of Mycobacterium tuberculosis (MTB) for a broad range of antibiotic drugs. We use data from the CRyPTIC consortium to train our classifier, which consists of whole genome sequencing and antibiotic susceptibility testing (AST) phenotypic data for 13 different antibiotics. To train our model, we assemble the sequence data into genomic contigs, identify all unique 31-mers in the set of contigs, and build a feature matrix M, where M[i, j] is equal to the number of times the ith 31-mer occurs in the jth genome. Due to the size of this feature matrix (over 350 million unique 31-mers), we build and use a sparse matrix representation. Our method, which we refer to as MTB++, leverages compact data structures and iterative methods to allow for the screening of all the 31-mers in the development of both LASSO LR and RF. MTB++ is able to achieve high discrimination (F-1 >80%) for the first-line antibiotics. Moreover, MTB++ had the highest F-1 score in all but three classes and was the most comprehensive since it had an F-1 score >75% in all but four (rare) antibiotic drugs. We use our feature selection to contextualize the 31-mers that are used for the prediction of phenotypic resistance, leading to some insights about sequence similarity to genes in MEGARes.
Lastly, we give an estimate of the amount of data that is needed in order to provide accurate predictions. AVAILABILITY The models and source code are publicly available on Github at https://github.com/M-Serajian/MTB-Pipeline.
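The feature matrix described in this abstract can be illustrated with a small sketch. This is not the MTB++ implementation; the function name `kmer_count_matrix` and the dictionary-of-coordinates sparse layout are assumptions made for illustration, and the sketch ignores reverse complements, which the abstract does not discuss. It only shows how M[i, j], the count of the ith k-mer in the jth genome, might be stored without materialising zero entries.

```python
from collections import Counter


def kmer_count_matrix(genomes, k=31):
    """Build a sparse k-mer count matrix (hypothetical helper, not from MTB++).

    genomes: list of genomes, each given as a list of contig strings.
    Returns (kmers, entries), where kmers is the sorted list of unique k-mers
    and entries[(i, j)] is the number of times kmers[i] occurs in genomes[j].
    Zero counts are simply absent from entries.
    """
    per_genome = []
    for contigs in genomes:
        counts = Counter()
        for contig in contigs:
            # Slide a window of width k over the contig.
            for start in range(len(contig) - k + 1):
                counts[contig[start:start + k]] += 1
        per_genome.append(counts)

    # Row index i is the rank of the k-mer in sorted order.
    kmers = sorted(set().union(*per_genome))
    row_of = {kmer: i for i, kmer in enumerate(kmers)}

    entries = {}
    for j, counts in enumerate(per_genome):
        for kmer, n in counts.items():
            entries[(row_of[kmer], j)] = n
    return kmers, entries


# Toy example with k=3 instead of 31 so the output stays readable.
kmers, entries = kmer_count_matrix([["ACGTAC"], ["CGTACG", "ACG"]], k=3)
print(kmers)            # ['ACG', 'CGT', 'GTA', 'TAC']
print(entries[(0, 1)])  # 2: 'ACG' occurs twice in the second genome
```

With hundreds of millions of unique 31-mers, a plain Python dictionary like this would not scale; the abstract states that the authors rely on compact data structures for that reason, so the sketch should be read only as a definition of the matrix entries.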
Affiliation(s)
- Mohammadali Serajian
- Department of Computer and Information Science and Engineering, University of Florida, 1889 Museum Road, Gainesville, Florida 32611, United States
| | - Simone Marini
- Department of Epidemiology, University of Florida, PO Box 100231, Gainesville, Florida 32601, United States
| | - Jarno N Alanko
- Department of Computer Science, University of Helsinki, P.O. Box 4, Helsinki 00014, Finland
| | - Noelle R Noyes
- Department of Veterinary Population Medicine, University of Minnesota, 1365 Gortner Avenue, St. Paul, Minnesota 55108, United States
| | - Mattia Prosperi
- Department of Epidemiology, University of Florida, PO Box 100231, Gainesville, Florida 32601, United States
| | - Christina Boucher
- Department of Computer and Information Science and Engineering, University of Florida, 1889 Museum Road, Gainesville, Florida 32611, United States
| |
|
47
|
Chin KL, Suing EJ, Andong R, Foo CH, Chan SK, Jani J, Ahmed K, Mustapha ZA. First whole genome sequencing data of a Mycobacterium tuberculosis STB-T1A strain isolated from a spinal tuberculosis patient in Sabah, Malaysia. Data Brief 2024; 54:110476. [PMID: 38725551 PMCID: PMC11079456 DOI: 10.1016/j.dib.2024.110476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 04/22/2024] [Indexed: 05/12/2024] Open
Abstract
Spinal tuberculosis, also referred to as Pott's disease, presents a significant risk of severe paralysis if not promptly detected and treated, owing to complications such as spinal cord compression and deformity. This article presents the genetic analysis of a Mycobacterium tuberculosis STB-T1A strain, isolated from the spine of a 29-year-old female diagnosed with spinal tuberculosis. Genomic DNA was extracted from pure culture and subjected to sequencing using the Illumina NovaSeq 6000 sequencing system. The genome of the M. tuberculosis STB-T1A strain spans 4,367,616 base pairs with a G+C content of 65.56% and 4,174 protein-coding genes. Comparative genomic analysis, conducted via single nucleotide polymorphism (SNP)-based phylogenetic analysis using the Maximum Likelihood method, revealed that the strain falls within the Indo-Oceanic lineage (Lineage 1). It clusters with the M. tuberculosis 43-16836 strain, which was isolated from the cerebrospinal fluid of a patient with tuberculous meningitis in Thailand. The complete genome sequence has been deposited at the National Center for Biotechnology Information (NCBI) GenBank database with the accession number JBBMVZ000000000.
Affiliation(s)
- Kai Ling Chin
- Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
- Borneo Medical and Health Research Centre, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
| | - Eraniyah Jastan Suing
- Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
| | - Ruhini Andong
- Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
| | - Choong Hoon Foo
- Department of Orthopaedics, Queen Elizabeth Hospital, Ministry of Health Malaysia, Kota Kinabalu, Sabah, Malaysia
| | - Sook Kwan Chan
- Department of Orthopaedics, Queen Elizabeth Hospital, Ministry of Health Malaysia, Kota Kinabalu, Sabah, Malaysia
| | - Jaeyres Jani
- Borneo Medical and Health Research Centre, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
| | - Kamruddin Ahmed
- Borneo Medical and Health Research Centre, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
- Department of Pathology and Microbiology, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
| | - Zainal Arifin Mustapha
- Department of Medical Education, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu, Sabah, Malaysia
| |
|
48
|
Wang ZF, Wu LF, Chen L, Zhu WG, Yu EP, Xu FX, Cao HL. Genome assembly of Ottelia alismoides, a multiple-carbon utilisation aquatic plant. BMC Genom Data 2024; 25:48. [PMID: 38783174 PMCID: PMC11118731 DOI: 10.1186/s12863-024-01230-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2024] [Accepted: 05/15/2024] [Indexed: 05/25/2024] Open
Abstract
OBJECTIVES Ottelia Pers. is in the Hydrocharitaceae family. Species in the genus are aquatic, and China is their centre of origin in Asia. Ottelia alismoides (L.) Pers., which is distributed worldwide, is a distinguishing element in China, while other species of this genus are endemic to China. However, O. alismoides is also considered endangered due to habitat loss and pollution in some Asian countries. Ottelia alismoides is the only submerged macrophyte that contains three carbon dioxide-concentrating mechanisms, i.e. bicarbonate (HCO3-) use, crassulacean acid metabolism and the C4 pathway. In this study, we present its first genome assembly to help illustrate the various carbon metabolism mechanisms and to enable genetic conservation in the future. DATA DESCRIPTION Using DNA and RNA extracted from one O. alismoides leaf, this work produced ∼ 73.4 Gb HiFi reads, ∼ 126.4 Gb whole genome sequencing short reads and ∼ 21.9 Gb RNA-seq reads. The de novo genome assembly was 6,455,939,835 bp in length, with 11,923 scaffolds/contigs and an N50 of 790,733 bp. Genome assembly completeness assessment with Benchmarking Universal Single-Copy Orthologs revealed a score of 94.4%. The repetitive sequence in the assembly was 4,875,817,144 bp (75.5%). A total of 116,176 genes were predicted. The protein sequences were functionally annotated against multiple databases, facilitating comparative genomic analysis.
Affiliation(s)
- Zheng-Feng Wang
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.
- Key Laboratory of National Forestry and Grassland Administration on Plant Conservation and Utilization in Southern China, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.
- South China National Botanical Garden, Guangzhou, 510650, China.
| | - Lin-Fang Wu
- Guangzhou Linfang Ecological Technology Co., Ltd, Guangzhou, 510000, China
| | - Lei Chen
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Key Laboratory of National Forestry and Grassland Administration on Plant Conservation and Utilization in Southern China, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- South China National Botanical Garden, Guangzhou, 510650, China
| | - Wei-Guang Zhu
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Key Laboratory of National Forestry and Grassland Administration on Plant Conservation and Utilization in Southern China, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- South China National Botanical Garden, Guangzhou, 510650, China
| | - En-Ping Yu
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Key Laboratory of National Forestry and Grassland Administration on Plant Conservation and Utilization in Southern China, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- South China National Botanical Garden, Guangzhou, 510650, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Feng-Xia Xu
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Key Laboratory of National Forestry and Grassland Administration on Plant Conservation and Utilization in Southern China, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- South China National Botanical Garden, Guangzhou, 510650, China
| | - Hong-Lin Cao
- Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.
- Key Laboratory of National Forestry and Grassland Administration on Plant Conservation and Utilization in Southern China, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.
- South China National Botanical Garden, Guangzhou, 510650, China.
| |
|
49
|
Wade KJ, Suseno R, Kizer K, Williams J, Boquett J, Caillier S, Pollock NR, Renschen A, Santaniello A, Oksenberg JR, Norman PJ, Augusto DG, Hollenbach JA. MHConstructor: A high-throughput, haplotype-informed solution to the MHC assembly challenge. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.20.595060. [PMID: 38826378 PMCID: PMC11142050 DOI: 10.1101/2024.05.20.595060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
The extremely high levels of genetic polymorphism within the human major histocompatibility complex (MHC) limit the usefulness of reference-based alignment methods for sequence assembly. We incorporate a short read de novo assembly algorithm into a workflow for novel application to the MHC. MHConstructor is a containerized pipeline designed for high-throughput, haplotype-informed, reproducible assembly of both whole genome sequencing and target-capture short read data in large population cohorts. To date, no other self-contained tool exists for the generation of de novo MHC assemblies from short read data. MHConstructor facilitates widespread access to high quality, alignment-free MHC sequence analysis.
Affiliation(s)
- Kristen J. Wade
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Rayo Suseno
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Kerry Kizer
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Jacqueline Williams
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Juliano Boquett
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Stacy Caillier
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Nicholas R. Pollock
- Department of Biomedical Informatics, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
- Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
| | - Adam Renschen
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Adam Santaniello
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Jorge R. Oksenberg
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
| | - Paul J. Norman
- Department of Biomedical Informatics, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
- Department of Immunology and Microbiology, Anschutz Medical Campus, University of Colorado, Aurora, Colorado, USA
| | - Danillo G. Augusto
- Department of Biological Sciences, University of North Carolina Charlotte, Charlotte, NC, United States
- Programa de Pós-Graduação em Genética, Universidade Federal do Paraná, Curitiba, Brazil
| | - Jill A. Hollenbach
- Weill Institute for Neurosciences, Department of Neurology, University of California San Francisco, San Francisco, CA, United States
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, United States
| |
|
50
|
Opulente DA, LaBella AL, Harrison MC, Wolters JF, Liu C, Li Y, Kominek J, Steenwyk JL, Stoneman HR, VanDenAvond J, Miller CR, Langdon QK, Silva M, Gonçalves C, Ubbelohde EJ, Li Y, Buh KV, Jarzyna M, Haase MAB, Rosa CA, Čadež N, Libkind D, DeVirgilio JH, Hulfachor AB, Kurtzman CP, Sampaio JP, Gonçalves P, Zhou X, Shen XX, Groenewald M, Rokas A, Hittinger CT. Genomic factors shape carbon and nitrogen metabolic niche breadth across Saccharomycotina yeasts. Science 2024; 384:eadj4503. [PMID: 38662846 PMCID: PMC11298794 DOI: 10.1126/science.adj4503] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 06/28/2023] [Accepted: 03/22/2024] [Indexed: 05/03/2024]
Abstract
Organisms exhibit extensive variation in ecological niche breadth, from very narrow (specialists) to very broad (generalists). Two general paradigms have been proposed to explain this variation: (i) trade-offs between performance efficiency and breadth and (ii) the joint influence of extrinsic (environmental) and intrinsic (genomic) factors. We assembled genomic, metabolic, and ecological data from nearly all known species of the ancient fungal subphylum Saccharomycotina (1154 yeast strains from 1051 species), grown in 24 different environmental conditions, to examine niche breadth evolution. We found that large differences in the breadth of carbon utilization traits between yeasts stem from intrinsic differences in genes encoding specific metabolic pathways, but we found limited evidence for trade-offs. These comprehensive data argue that intrinsic factors shape niche breadth variation in microbes.
Affiliation(s)
- Dana A. Opulente
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Biology Department, Villanova University, Villanova, PA 19085, USA
| | - Abigail Leavitt LaBella
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- North Carolina Research Center (NCRC), Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, 150 Research Campus Drive, Kannapolis, NC 28081, USA
| | - Marie-Claire Harrison
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
| | - John F. Wolters
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Chao Liu
- College of Agriculture and Biotechnology and Centre for Evolutionary & Organismal Biology, Zhejiang University, Hangzhou 310058, China
- Yonglin Li
- Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
- Jacek Kominek
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- LifeMine Therapeutics, Inc., Cambridge, MA 02140, USA
- Jacob L. Steenwyk
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Hayley R. Stoneman
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- University of Colorado - Anschutz Medical Campus, Aurora, CO 80045, USA
- Jenna VanDenAvond
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Caroline R. Miller
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Quinn K. Langdon
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- Margarida Silva
- UCIBIO, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Associate Laboratory i4HB, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Carla Gonçalves
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- UCIBIO, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Associate Laboratory i4HB, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Emily J. Ubbelohde
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Yuanning Li
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Institute of Marine Science and Technology, Shandong University, Qingdao 266237, China
- Laboratory for Marine Biology and Biotechnology, Qingdao Marine Science and Technology Center, Qingdao 266237, China
- Kelly V. Buh
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- Martin Jarzyna
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- Graduate Program in Neuroscience and Department of Biology, Washington University School of Medicine, St. Louis, MO 63130, USA
- Max A. B. Haase
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Vilcek Institute of Graduate Biomedical Sciences and Institute for Systems Genetics, NYU Langone Health, New York, NY 10016, USA
- Department of Mechanistic Cell Biology, Max Planck Institute of Molecular Physiology, 44227 Dortmund, Germany
- Carlos A. Rosa
- Departamento de Microbiologia, ICB, C.P. 486, Universidade Federal de Minas Gerais, Belo Horizonte, MG, 31270-901, Brazil
- Neža Čadež
- Food Science and Technology Department, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Diego Libkind
- Centro de Referencia en Levaduras y Tecnología Cervecera (CRELTEC), Instituto Andino Patagónico de Tecnologías Biológicas y Geoambientales (IPATEC), Universidad Nacional del Comahue, CONICET, CRUB, Quintral 1250, San Carlos de Bariloche, 8400, Río Negro, Argentina
- Jeremy H. DeVirgilio
- Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
- Amanda Beth Hulfachor
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA
- Cletus P. Kurtzman
- Mycotoxin Prevention and Applied Microbiology Research Unit, National Center for Agricultural Utilization Research, Agricultural Research Service, U.S. Department of Agriculture, Peoria, IL 61604, USA
- José Paulo Sampaio
- UCIBIO, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Associate Laboratory i4HB, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Paula Gonçalves
- UCIBIO, Department of Life Sciences, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Associate Laboratory i4HB, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, Portugal
- Xiaofan Zhou
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Center, South China Agricultural University, Guangzhou 510642, China
- Xing-Xing Shen
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- College of Agriculture and Biotechnology and Centre for Evolutionary & Organismal Biology, Zhejiang University, Hangzhou 310058, China
- Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA
- Chris Todd Hittinger
- Laboratory of Genetics, Wisconsin Energy Institute, Center for Genomic Science Innovation, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Madison, WI 53726, USA
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI 53726, USA